Data protection for remote artificial intelligence models

US12549347B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12549347-B2
Application number: US-202117457717-A
Country: US
Kind code: B2
Filing date: Dec 6, 2021
Priority date: Dec 6, 2021
Publication date: Feb 10, 2026
Grant date: Feb 10, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, computer system, and a computer program product for data protection is provided. The present invention may include, generating an encoder network. The present invention may also include, encoding a training data using the generated encoder network, wherein the training data includes natural language data. The present invention may further include, training a deep learning model using the encoded training data.
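The encoder-generation and encoding steps in the abstract can be illustrated with a minimal sketch. It assumes the encoder is a fixed random linear map whose weights are seeded from a user key and a timestamp (as the dependent claims describe) and that it projects a low-dimensional embedding to a higher dimension. The names `make_encoder` and `embed`, the SHA-256 seeding, and the hashed bag-of-words embedding are hypothetical choices for illustration only; the patent text shown here does not specify them.

```python
import hashlib
import random


def make_encoder(user_key: str, timestamp: str, in_dim: int, out_dim: int):
    """Build a fixed random linear map seeded by a user key and a timestamp."""
    # Hash the key and timestamp into an integer seed (an illustrative choice).
    digest = hashlib.sha256(f"{user_key}|{timestamp}".encode("utf-8")).digest()
    rng = random.Random(int.from_bytes(digest, "big"))
    # The seeded weights are random numbers that stay on the local device.
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]

    def encode(embedding):
        # Project one low-dimensional embedding to the higher dimension.
        return [sum(w * x for w, x in zip(row, embedding)) for row in weights]

    return encode


def embed(text: str, dim: int = 4):
    """Toy stand-in for a real text embedding: a hashed bag of words."""
    vec = [0.0] * dim
    for token in text.lower().split():
        h = int.from_bytes(hashlib.sha256(token.encode("utf-8")).digest()[:4], "big")
        vec[h % dim] += 1.0
    return vec


encoder = make_encoder("user-123", "2021-12-06T00:00:00Z", in_dim=4, out_dim=16)
encoded = encoder(embed("the quick brown fox"))
print(len(encoded))  # 16
```

In this sketch the same key and timestamp always reproduce the same encoder, while a different key yields a different projection, so only the holder of the seeds can regenerate the mapping.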

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method, comprising: enhancing data protection of training data in remote training of machine learning (ML) models configured on remote artificial intelligence (AI) platforms, the enhancing comprising: generating an encoder network according to a set of seeding network parameters; encoding a training data using the generated encoder network, wherein the training data includes natural language data; configuring a deep learning (DL) model as an integration of a decoder network and a classifier network, wherein the decoder network comprises a first subset of a set of hidden layers of the DL model and the classifier network comprises a second subset of the set of hidden layers of the DL model; and executing a training mode of the DL model using the encoded training data such that the training optimizes the decoder network component of the DL model while omitting using the set of seeding network parameters in training the decoder network component of the DL model, the executing outputting a trained DL model.

2. The method of claim 1, wherein generating the encoder network further comprises generating a user-specific neural network on a local device associated with a user, and wherein training the deep learning model using the encoded training data further comprises transmitting the encoded training data to a remote device storing the deep learning model.

3. The method of claim 1, wherein generating the encoder network further comprises: receiving a unique user key and a current timestamp; and generating a neural network based on the received unique user key and the current timestamp.

4. The method of claim 3, wherein generating the neural network based on the received unique user key and the current timestamp further comprises: initializing the set of seeding network parameters for the generated neural network based on the received unique user key and the current timestamp as seeds, such that the initialized set of seeding network parameters includes a plurality of random numbers.

5. The method of claim 1, wherein encoding the training data using the generated encoder network further comprises: converting the natural language data of the training data to a plurality of embeddings; and mapping the plurality of embeddings from a relatively low-dimensional embedding to a high-dimensional embedding to increase data privacy of the training data.

6. The method of claim 5, wherein mapping the plurality of embeddings from the relatively low-dimensional embedding to the high-dimensional embedding further comprises: determining a set of semantic features captured by the relatively low-dimensional embedding; and maintaining the determined set of semantic features in the high-dimensional embedding.

7. The method of claim 1, wherein the training data comprises at least one input and at least one output, and wherein encoding the training data further comprises: converting the at least one input from the natural language data to an embedding; and maintaining the natural language data of the at least one output.

8. The method of claim 2, further comprising: encoding a prediction request on the local device using the generated user-specific neural network; transmitting the encoded prediction request from the local device to the trained deep learning model stored on the remote device; and receiving, on the local device, a natural language output from the trained deep learning model responsive to the encoded prediction request.

9. A computer system for data protection, comprising: one or more processors, one or more computer-readable memories, one or more computer-readable tangible storage media, and program instructions stored on at least one of the one or more computer-readable tangible storage media for execution by at least one of the one or more processors via at least one of the one or more memories, wherein the computer system is capable of performing a method comprising: enhancing data protection of training data in remote training of machine learning (ML) models configured on remote artificial intelligence (AI) platforms, the enhancing comprising: generating an encoder network according to a set of seeding network parameters; encoding a training data using the generated encoder network, wherein the training data includes natural language data; configuring a deep learning (DL) model as an integration of a decoder network and a classifier network, wherein the decoder network comprises a first subset of a set of hidden layers of the DL model and the classifier network comprises a second subset of the set of hidden layers of the DL model; and executing a training mode of the DL model using the encoded training data such that the training optimizes the decoder network component of the DL model while omitting using the set of seeding network parameters in training the decoder network component of the DL model, the executing outputting a trained DL model.

10. The computer system of claim 9, wherein generating the encoder network further comprises generating a user-specific neural network on a local device associated with a user, and wherein training the deep learning model using the encoded training data further comprises transmitting the encoded training data to a remote device storing the deep learning model.

11. The computer system of claim 9, wherein generating the encoder network further comprises: receiving a unique user key and a current timestamp; and generating a neural network based on the received unique user key and the current timestamp.

12. The computer system of claim 11, wherein generating the neural network based on the received unique user key and the current timestamp further comprises: initializing the set of seeding network parameters for the generated neural network based on the received unique user key and the current timestamp as seeds, such that the initialized set of seeding network parameters includes a plurality of random numbers.

13. The computer system of claim 9, wherein encoding the training data using the generated encoder network further comprises: converting the natural language data of the training data to a plurality of embeddings; and mapping the plurality of embeddings from a relatively low-dimensional embedding to a high-dimensional embedding to increase data privacy of the training data.

14. The computer system of claim 13, wherein mapping the plurality of embeddings from the relatively low-dimensional embedding to the high-dimensional embedding further comprises: determining a set of semantic features captured by the relatively low-dimensional embedding; and maintaining the determined set of semantic features in the high-dimensional embedding.

15. The computer system of claim 9, wherein the training data comprises at least one input and at least one output, and wherein encoding the training data further comprises: converting the at least one input from the natural language data to an embedding; and maintaining the natural language data of the at least one output.

16. The computer system of claim 10, further comprising: encoding a prediction request on the local device using the generated user-specific neural network; transmitting the encoded prediction request from the local device to the trained deep learning model stored on the remote device; and receiving, on the local device, a natural language output from the trained deep learning model responsive to the encoded prediction request.
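The split between a local encoder and a remote model made of a decoder and a classifier can be sketched as follows. This is a toy illustration under stated assumptions, not the patented implementation: the encoder is a fixed seeded random projection, the decoder is a single tanh layer, the classifier is a single sigmoid unit, and binary labels stand in for the natural language outputs the claims describe. All function names (`make_encoder`, `forward`, `train`) and hyperparameters are hypothetical.

```python
import math
import random


def make_encoder(seed, in_dim, out_dim):
    """Local side: fixed random projection; its weights are never shared."""
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)] for _ in range(out_dim)]
    return lambda x: [sum(a * b for a, b in zip(row, x)) for row in w]


def forward(model, x):
    """Remote side: decoder layer (tanh) followed by classifier (sigmoid)."""
    dec, clf = model
    h = [math.tanh(sum(a * b for a, b in zip(row, x))) for row in dec]
    p = 1.0 / (1.0 + math.exp(-sum(c * hi for c, hi in zip(clf, h))))
    return h, p


def train(encoded, labels, hidden=8, epochs=300, lr=0.05, seed=0):
    """Fit decoder and classifier jointly with plain SGD on the log loss."""
    rng = random.Random(seed)
    n_in = len(encoded[0])
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(hidden)]
    clf = [rng.uniform(-0.1, 0.1) for _ in range(hidden)]
    for _ in range(epochs):
        for x, y in zip(encoded, labels):
            h, p = forward((dec, clf), x)
            err = p - y  # gradient of the log loss with respect to the logit
            for j in range(hidden):
                grad_pre = err * clf[j] * (1.0 - h[j] ** 2)  # backprop through tanh
                clf[j] -= lr * err * h[j]
                for i in range(n_in):
                    dec[j][i] -= lr * grad_pre * x[i]
    return dec, clf


# Local device: encode toy 2-d "embeddings" before anything leaves the device.
encode = make_encoder(seed=42, in_dim=2, out_dim=6)
inputs = [[-1.0, -1.0], [-1.0, -0.5], [-0.5, -1.0], [1.0, 1.0], [1.0, 0.5], [0.5, 1.0]]
labels = [0, 0, 0, 1, 1, 1]
encoded = [encode(x) for x in inputs]

# Remote platform: sees only `encoded` and `labels`, not the seed or weights.
model = train(encoded, labels)

# Prediction request: encoded locally with the same encoder, then sent.
_, p_pos = forward(model, encode([0.8, 0.9]))
_, p_neg = forward(model, encode([-0.8, -0.9]))
print(round(p_pos, 3), round(p_neg, 3))
```

The point of the sketch is the data flow: `train` never receives the encoder seed or weights, so the decoder layer has to learn to work with the encoded vectors directly, which mirrors the "omitting using the set of seeding network parameters" language of claims 1 and 9.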

Assignees

IBM

Inventors

Classifications

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Combinations of networks · CPC title

  • Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Processing or translation of natural language (natural language analysis G06F40/20; semantic analysis G06F40/30) · CPC title

  • Key transport or distribution, i.e. key establishment techniques where one party creates or otherwise obtains a secret value, and securely transfers it to the other(s) (network architectures or network communication protocols for key distribution in a packet data network H04L63/062) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12549347B2 cover?
A method, computer system, and a computer program product for data protection is provided. The present invention may include, generating an encoder network. The present invention may also include, encoding a training data using the generated encoder network, wherein the training data includes natural language data. The present invention may further include, training a deep learning model using …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification H04L9/0869. Mapped technology areas include Electricity.
When was this patent published?
Publication date Feb 10, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).