Verifying that the influence of a user data point has been removed from a machine learning classifier
US-10225277-B1 · Mar 5, 2019 · US
US10733292B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10733292-B2 |
| Application number | US-201816031330-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 10, 2018 |
| Priority date | Jul 10, 2018 |
| Publication date | Aug 4, 2020 |
| Grant date | Aug 4, 2020 |
Official abstract text for this publication.
Mechanisms are provided for protecting a neural network model against model inversion attacks. The mechanisms generate a decoy dataset comprising decoy data for each class recognized by a neural network model. The mechanisms further configure the neural network model to generate a modified output based on the decoy dataset that directs a gradient of the modified output to the decoy dataset. The neural network model receives and processes input data to generate an actual output. The neural network model modifies one or more actual elements of the actual output to be one or more corresponding modified elements of the modified output, and returns the one or more corresponding modified elements, instead of the one or more actual elements, to the source computing device.
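The output-modification step in the abstract can be illustrated with a minimal sketch. It shows only one property: the returned confidence scores differ from the actual ones while the top-ranked class stays the same. The blending formula, the `mix` parameter, and the `decoy_scores` vector are assumptions made for illustration; the record does not state how the modified output is computed.

```python
from typing import List


def modify_output(actual: List[float], decoy_scores: List[float],
                  mix: float = 0.5) -> List[float]:
    """Blend actual scores with hypothetical decoy-derived scores.

    Illustrative only: the blend is an assumed stand-in, not the
    patented computation. The top class of `actual` is preserved.
    """
    if len(actual) != len(decoy_scores) or len(actual) < 2:
        raise ValueError("need two score vectors of equal length >= 2")
    if not 0.0 <= mix <= 1.0:
        raise ValueError("mix must be in [0, 1]")

    # Remember which class the model actually ranked first.
    top = max(range(len(actual)), key=actual.__getitem__)

    # Pull every score toward the decoy-derived score.
    blended = [(1.0 - mix) * a + mix * d
               for a, d in zip(actual, decoy_scores)]

    # If blending displaced the original top class, lift it just above
    # the best competitor so the returned label is unchanged.
    best_other = max(v for i, v in enumerate(blended) if i != top)
    if blended[top] <= best_other:
        blended[top] = best_other + 1e-6

    # Renormalize so the scores sum to 1 (this keeps their ordering).
    total = sum(blended)
    return [v / total for v in blended]
```

For example, `modify_output([0.7, 0.2, 0.1], [0.1, 0.1, 0.8])` still ranks class 0 first, but its margin over class 2 is far smaller than in the actual output, so the returned scores reveal less about the true confidence surface.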
Opening claim text (preview).
What is claimed is:

1. A method for protecting a neural network model against model inversion attacks, the method being performed in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to specifically configure the at least one processor to implement the neural network model and a targeted deceptive gradient engine, the method comprising: generating, by the targeted deceptive gradient engine, a decoy dataset comprising decoy data for each class recognized by the neural network model; configuring, by the targeted deceptive gradient engine, a first neural network model to generate a modified output based on the decoy dataset that directs a gradient of the modified output to the decoy dataset; receiving, by the first neural network model, from a source computing device, input data to be processed by the first neural network model; processing, by the first neural network model, the input data to generate an actual output; modifying, by the first neural network model, one or more actual elements of the actual output to be one or more corresponding modified elements of the modified output; and returning, by the first neural network model, the one or more corresponding modified elements instead of the one or more actual elements, to the source computing device.

2. The method of claim 1, wherein the modified output obscures a gradient of a loss function of the first neural network model.

3. The method of claim 1, wherein the one or more modified elements of the modified output provide a correct classification of the input data, but modified confidence scores associated with the classifications that direct a gradient of a loss function of the first neural network model towards the decoy dataset.

4. The method of claim 1, wherein the modified output equates a gradient of a loss function of the first neural network model to a difference between the decoy data and training data used to train the first neural network model for each class recognized by the first neural network model.

5. The method of claim 1, wherein the modified output maintains a largest class label between the modified output and actual output of the first neural network model to be the same largest class label.

6. The method of claim 1, further comprising: training a second neural network model with an original training dataset and the decoy dataset to identify input data as being either actual input data corresponding to the original training dataset or decoy data corresponding to the decoy dataset; and determining, by the second neural network model, whether the received input data, for processing by the first neural network model, approximates decoy data in the decoy dataset.

7. The method of claim 6, wherein the first neural network model processes the input data in response to the second neural network model determining that the received input data does not approximate decoy data in the decoy dataset.

8. The method of claim 6, further comprising: performing, by a protective action logic engine executing in the data processing system, a protective action in response to a determination by the second neural network model that the received input data approximates decoy data in the decoy dataset.

9. The method of claim 8, wherein the protective action comprises at least one of logging a request associated with the received input data, sending a notification message to a system administrator, or preventing access to a protected resource.

10. The method of claim 1, wherein the data processing system is a cloud computing system comprising a plurality of server computing devices, and wherein the at least one processor and at least one memory comprise at least one processor and at least one memory in each server computing device in the plurality of server computing devices.

11. A computer program product comprising a computer readable storage medium having a computer readable program stored therein, wherein the computer readable program, when executed on a data processing system, causes the data processing system to specifically configure the data processing system to implement a first neural network model and a targeted deceptive gradient engine, the data processing system being further configured by the computer readable program to: generate, by the targeted deceptive gradient engine, a decoy dataset comprising decoy data for each class recognized by the neural network model; configure, by the targeted deceptive gradient engine, a first neural network model to generate a modified output based on the decoy dataset that directs a gradient of the modified output to the decoy dataset; receive, by the first neural network model, from a source computing device, input data to be processed by the first neural network model; process, by the first neural network model, the input data to generate an actual output; modify, by the first neural network model, one or more actual elements of the actual output to be one or more corresponding modified elements of the modified output; and return, by the first neural network model, the one or more corresponding modified elements instead of the one or more actual elements, to the source computing device.

12. The computer program product of claim 11, wherein the modified output obscures a gradient of a loss function of the first neural network model.

13. The computer program product of claim 11, wherein the one or more modified elements of the modified output provide a correct classification of the input data, but modified confidence scores associated with the classifications that direct a gradient of a loss function of the first neural network model towards the decoy dataset.

14. The computer program product of claim 11, wherein the modified output equates a gradient of a loss function of the first neural network model to a difference between the decoy data and training data used to train the first neural network model for each class recognized by the first neural network model.

15. The computer program product of claim 11, wherein the modified output maintains a largest class label between the modified output and actual output of the first neural network model to be the same largest class label.

16. The computer program product of claim 11, wherein the data processing system is further configured by the computer readable program to: train a second neural network model with an original training dataset and the decoy dataset to identify input data as being either actual input data corresponding to the original training dataset or decoy data corresponding to the decoy dataset; and determine, by the second neural network model, whether the received input data, for processing by the first neural network model, approximates decoy data in the decoy dataset.

17. The computer program product of claim 16, wherein the first neural network model processes the input data in response to the second neural network model determining that the received input data does not approximate decoy data in the decoy dataset.

18. The computer program product of claim 16, wherein the data processing system is further configured by the computer readable program to: perform, by a protective action logic engine executing in the data processing system, a protective action in response to a determination by the second neural network model that the received input data approximates decoy data in the decoy dataset.

19. The computer program produc
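The gating flow described in claims 6 to 9 (a second model checks whether the input approximates decoy data, then either a protective action runs or the first model processes the input) might be sketched as follows. Both models are replaced by plain functions here; `GatedClassifier`, `toy_classify`, `toy_looks_like_decoy`, the decoy point, and the distance threshold are all hypothetical names and values chosen for illustration, and logging is the only protective action shown.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence


@dataclass
class GatedClassifier:
    # Stand-in for the first neural network model (hypothetical).
    classify: Callable[[Sequence[float]], List[float]]
    # Stand-in for the second, decoy-detecting model (hypothetical).
    looks_like_decoy: Callable[[Sequence[float]], bool]
    audit_log: List[str] = field(default_factory=list)

    def handle(self, request_id: str,
               x: Sequence[float]) -> Optional[List[float]]:
        """Return scores, or None when the input is treated as an attack."""
        if self.looks_like_decoy(x):
            # Protective action: log the request and withhold the output.
            self.audit_log.append(
                f"request {request_id}: input approximates decoy data")
            return None
        # Input does not approximate decoy data, so the first model runs.
        return self.classify(x)


def toy_classify(x: Sequence[float]) -> List[float]:
    # Toy two-class scorer based on the sign of the first feature.
    return [0.9, 0.1] if x[0] >= 0 else [0.1, 0.9]


def toy_looks_like_decoy(x: Sequence[float],
                         decoy: Sequence[float] = (5.0, 5.0),
                         radius: float = 1.0) -> bool:
    # Toy detector: flags inputs within `radius` of one assumed decoy point.
    dist_sq = sum((a - b) ** 2 for a, b in zip(x, decoy))
    return dist_sq <= radius ** 2
```

With `gate = GatedClassifier(toy_classify, toy_looks_like_decoy)`, a call such as `gate.handle("r1", [1.0, 0.0])` returns scores, while `gate.handle("r2", [5.2, 4.9])` returns `None` and appends one entry to `gate.audit_log`. A real deployment would use a trained detector and could add the other actions listed in claim 9, such as notifying an administrator.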
Combinations of networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Supervised learning · CPC title
Learning methods · CPC title
involving event detection and direct action · CPC title