Spatial Attention Model for Image Captioning
US-2018143966-A1 · May 24, 2018 · US
US11640527B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11640527-B2 |
| Application number | US-201916658399-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 21, 2019 |
| Priority date | Sep 25, 2019 |
| Publication date | May 2, 2023 |
| Grant date | May 2, 2023 |
Official abstract text for this publication.
Systems and methods are provided for a near-zero-cost (NZC) query framework or approach for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models trained on partitions of the data to a student model. Privacy guarantees may be understood intuitively and expressed rigorously in terms of differential privacy. Other features are also provided.
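For readers unfamiliar with the term, the standard textbook definition of differential privacy can be written as below. This is general background rather than text from the patent; the symbols $\mathcal{M}$ (a randomized mechanism), $D$ and $D'$ (datasets differing in one record), $S$ (a set of outputs), and the parameters $\varepsilon$ and $\delta$ are notation assumed here for illustration.

```latex
% (\varepsilon, \delta)-differential privacy: for all adjacent datasets D, D'
% and every set of outputs S \subseteq \mathrm{Range}(\mathcal{M}),
\Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[\mathcal{M}(D') \in S] + \delta
```

Smaller values of $\varepsilon$ and $\delta$ mean the output distribution changes less when a single record is added or removed, which is the sense in which the guarantee is "expressed rigorously".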
Opening claim text (preview).
What is claimed is:

1. A system for private deep learning, comprising: a communication interface that receives model is used to process input data including sensitive information; a memory storing a machine learning model and a plurality of processor-executable instructions; and one or more processors that execute the plurality of processor-executable instructions, wherein during a training process of the machine learning model, to: partition a dataset including the sensitive information into a plurality of data splits; train each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generate, by each trained teacher model, a respective prediction; aggregate the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturb the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and train a student model using the perturbed label count vector for the teacher ensemble.

2. The system of claim 1, wherein aggregating the predictions is responsive to a query from the student model.

3. The system of claim 1, wherein the noise used to perturb the label count vector is random.

4. The system of claim 1, wherein the machine learning model is configured to use a query function.

5. The system of claim 1, wherein during a training process, the machine learning model performs a noisy ArgMax operation on the label count vector for the teacher ensemble.

6. The system of claim 1, wherein during a training process, the machine learning model performs an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.

7. A method for training a machine learning model with private deep learning, comprising: partitioning a dataset including the sensitive information into a plurality of data splits; training each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generating, by each trained teacher model, a respective prediction; aggregating the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturbing the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and training a student model using the perturbed label count vector for the teacher ensemble.

8. The method of claim 7, wherein aggregating the predictions is responsive to a query from the student model.

9. The method of claim 7, wherein the noise used to perturb the label count vector is random.

10. The method of claim 7, comprising the student model making a query function to the label count vector.

11. The method of claim 7, comprising performing a noisy ArgMax operation on the label count vector for the teacher ensemble.

12. The method of claim 7, comprising performing an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.

13. A non-transitory machine-readable medium comprising a plurality of machine-readable instructions which, when executed by one or more processors, are adapted to cause the one or more processors to perform a method for training a machine learning model with private deep learning comprising: partitioning a dataset including the sensitive information into a plurality of data splits; training each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generating, by each trained teacher model, a respective prediction; aggregating the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturbing the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and training a student model using the perturbed label count vector for the teacher ensemble.

14. The non-transitory machine-readable medium of claim 13, wherein aggregating the predictions is responsive to a query from the student model.

15. The non-transitory machine-readable medium of claim 13, wherein the noise used to perturb the label count vector is random.

16. The non-transitory machine-readable medium of claim 13, comprising the student model making a query function to the label count vector.

17. The non-transitory machine-readable medium of claim 13, comprising performing a noisy ArgMax operation on the label count vector for the teacher ensemble.

18. The non-transitory machine-readable medium of claim 13, comprising performing an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.
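The steps recited in the independent claims (partition the data, aggregate teacher predictions into a label count vector, add a constant to the top vote, add noise, take the ArgMax) might be sketched roughly as follows. This is an illustrative reading only, not the patented implementation: the function names, the round-robin partitioning, the choice of Laplace noise, and the parameters `constant` and `noise_scale` are all assumptions introduced here, and teacher training is omitted (teacher outputs are taken as given probability lists).

```python
import random


def partition_dataset(records, num_splits):
    """Split records into num_splits disjoint parts (round-robin; an assumed scheme)."""
    return [records[i::num_splits] for i in range(num_splits)]


def aggregate_label_counts(teacher_probs):
    """Entry j is the sum over teachers of the predicted probability for label j."""
    num_labels = len(teacher_probs[0])
    return [sum(probs[j] for probs in teacher_probs) for j in range(num_labels)]


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb_label_counts(counts, constant, noise_scale, rng):
    """Add a constant to the highest counted class, then add noise to every entry."""
    top = max(range(len(counts)), key=counts.__getitem__)
    boosted = list(counts)
    boosted[top] += constant
    # Laplace noise is an assumption; the claims only say the noise is random.
    return [value + laplace_noise(noise_scale, rng) for value in boosted]


def noisy_argmax(perturbed_counts):
    """Return the label index with the largest perturbed count."""
    return max(range(len(perturbed_counts)), key=perturbed_counts.__getitem__)


# Hypothetical outputs of three trained teachers for one student query.
example_teacher_probs = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
]
example_counts = aggregate_label_counts(example_teacher_probs)
example_label = noisy_argmax(
    perturb_label_counts(example_counts, constant=100.0, noise_scale=1.0,
                         rng=random.Random(0))
)
```

In this sketch the student would be trained on labels such as `example_label`. Making `constant` large relative to `noise_scale` makes it very unlikely that the noise changes which label wins, which is one way to read the "immutable noisy ArgMax" wording of claims 6, 12, and 18.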
Feedforward networks · CPC title
Supervised learning · CPC title
Artificial neural networks [ANN] · CPC title
Architecture, e.g. interconnection topology · CPC title
Training; Learning · CPC title