Near-zero-cost differentially private deep learning with teacher ensembles

US11640527B2 · US · B2

Patent metadata
Publication number: US-11640527-B2
Application number: US-201916658399-A
Country: US
Kind code: B2
Filing date: Oct 21, 2019
Priority date: Sep 25, 2019
Publication date: May 2, 2023
Grant date: May 2, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods are provided for near-zero-cost (NZC) query framework or approach for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models trained on partitions of the data to a student model. Privacy guarantees may be understood intuitively and expressed rigorously in terms of differential privacy. Other features are also provided.
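
The abstract refers to differential privacy without stating the definition. For orientation, the standard (ε, δ) formulation is given below; the symbols M, D, D', S, ε and δ are the conventional textbook ones and are not taken from this page.

```latex
% Standard definition of (\varepsilon, \delta)-differential privacy.
% M is a randomized mechanism; D and D' are adjacent datasets
% (they differ in a single record); S is any set of possible outputs.
\[
  \Pr\bigl[ M(D) \in S \bigr]
  \;\le\;
  e^{\varepsilon} \, \Pr\bigl[ M(D') \in S \bigr] + \delta
\]
```

Smaller ε and δ mean that the output distribution changes less when any one record is added or removed, which is the sense in which the training data is protected.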

First claim

Opening claim text (preview).

What is claimed is:

1. A system for private deep learning, comprising: a communication interface that receives model is used to process input data including sensitive information; a memory storing a machine learning model and a plurality of processor-executable instructions; and one or more processors that execute the plurality of processor-executable instructions, wherein during a training process of the machine learning model, to: partition a dataset including the sensitive information into a plurality of data splits; train each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generate, by each trained teacher model, a respective prediction; aggregate the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturb the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and train a student model using the perturbed label count vector for the teacher ensemble.

2. The system of claim 1, wherein aggregating the predictions is responsive to a query from the student model.

3. The system of claim 1, wherein the noise used to perturb the label count vector is random.

4. The system of claim 1, wherein the machine learning model is configured to use a query function.

5. The system of claim 1, wherein during a training process, the machine learning model performs a noisy ArgMax operation on the label count vector for the teacher ensemble.

6. The system of claim 1, wherein during a training process, the machine learning model performs an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.

7. A method for training a machine learning model with private deep learning, comprising: partitioning a dataset including the sensitive information into a plurality of data splits; training each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generating, by each trained teacher model, a respective prediction; aggregating the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturbing the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and training a student model using the perturbed label count vector for the teacher ensemble.

8. The method of claim 7, wherein aggregating the predictions is responsive to a query from the student model.

9. The method of claim 7, wherein the noise used to perturb the label count vector is random.

10. The method of claim 7, comprising the student model making a query function to the label count vector.

11. The method of claim 7, comprising performing a noisy ArgMax operation on the label count vector for the teacher ensemble.

12. The method of claim 7, comprising performing an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.

13. A non-transitory machine-readable medium comprising a plurality of machine-readable instructions which, when executed by one or more processors, are adapted to cause the one or more processors to perform a method for training a machine learning model with private deep learning comprising: partitioning a dataset including the sensitive information into a plurality of data splits; training each of a plurality of teacher models in an ensemble with a respective one of the data splits; in response to a query, generating, by each trained teacher model, a respective prediction; aggregating the predictions generated by the teacher models in the ensemble into a label count vector by computing each entry in the label count vector based on a sum of predicted probabilities from the plurality of teacher models corresponding to a respective label; perturbing the label count vector for the teacher ensemble with noise and by adding a constant to a highest counted class vote in the label count vector; and training a student model using the perturbed label count vector for the teacher ensemble.

14. The non-transitory machine-readable medium of claim 13, wherein aggregating the predictions is responsive to a query from the student model.

15. The non-transitory machine-readable medium of claim 13, wherein the noise used to perturb the label count vector is random.

16. The non-transitory machine-readable medium of claim 13, comprising the student model making a query function to the label count vector.

17. The non-transitory machine-readable medium of claim 13, comprising performing a noisy ArgMax operation on the label count vector for the teacher ensemble.

18. The non-transitory machine-readable medium of claim 13, comprising performing an immutable noisy ArgMax operation on the label count vector for the teacher ensemble.
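
The aggregation and perturbation steps recited in the independent claims can be illustrated with a short NumPy sketch. This is an informal reading aid, not the patented implementation and not a construction of the claims. The function names, the Laplace noise distribution, and the parameters `constant` and `noise_scale` are assumptions made for illustration; the claims only say that the noise is random and that a constant is added to the highest counted class vote.

```python
import numpy as np


def partition_indices(n_samples, n_teachers, rng):
    """Split shuffled sample indices into disjoint parts, one per teacher."""
    return np.array_split(rng.permutation(n_samples), n_teachers)


def label_count_vector(teacher_probs):
    """Sum the predicted probabilities of all teachers for each label.

    teacher_probs has shape (n_teachers, n_labels); the result has
    shape (n_labels,).
    """
    return np.asarray(teacher_probs, dtype=float).sum(axis=0)


def perturb_counts(counts, constant, noise_scale, rng):
    """Add a constant to the top-voted entry, then add random noise.

    Laplace noise is an assumption of this sketch; the claims do not
    name a distribution.
    """
    boosted = np.array(counts, dtype=float)
    boosted[np.argmax(boosted)] += constant
    noise = rng.laplace(loc=0.0, scale=noise_scale, size=boosted.shape)
    return boosted + noise


def noisy_label(teacher_probs, constant, noise_scale, rng):
    """Answer one student query with the ArgMax of the perturbed counts."""
    counts = label_count_vector(teacher_probs)
    return int(np.argmax(perturb_counts(counts, constant, noise_scale, rng)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical softmax outputs of three teachers for one query.
    probs = [[0.7, 0.2, 0.1],
             [0.6, 0.3, 0.1],
             [0.2, 0.5, 0.3]]
    print(label_count_vector(probs))
    print(noisy_label(probs, constant=100.0, noise_scale=1.0, rng=rng))
```

In this sketch, a constant that is large relative to the noise scale makes it very unlikely that the noise changes which label wins, which is one way to read the "immutable noisy ArgMax" wording of the dependent claims. The student model would then be trained on the labels returned by `noisy_label`.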

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11640527B2 cover?
Systems and methods are provided for near-zero-cost (NZC) query framework or approach for differentially private deep learning. To protect the privacy of training data during learning, the near-zero-cost query framework transfers knowledge from an ensemble of teacher models trained on partitions of the data to a student model. Privacy guarantees may be understood intuitively and expressed rigor…
Who is the assignee on this patent?
Salesforce Com Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/082. Mapped technology areas include Physics.
When was this patent published?
Publication date May 2, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).