Hyperparameter determination for a differentially private federated learning process

US11941520B2 · US · B2

Patent metadata
Publication number: US-11941520-B2
Application number: US-202016738114-A
Country: US
Kind code: B2
Filing date: Jan 9, 2020
Priority date: Jan 9, 2020
Publication date: Mar 26, 2024
Grant date: Mar 26, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques regarding determining hyperparameters for a differentially private federated learning process are provided. For example, one or more embodiments described herein can comprise a system, which can comprise a memory that can store computer executable components. The system can also comprise a processor, operably coupled to the memory, and that can execute the computer executable components stored in the memory. The computer executable components can comprise a hyperparameter advisor component that determines a hyperparameter for a model of a differentially private federated learning process based on a defined numeric relationship between a privacy budget, a learning rate schedule, and a batch size.
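The abstract does not state the numeric relationship itself. As a minimal sketch only, the Python below uses a placeholder rule chosen to agree with two properties recited in the claims: noise grows with batch size and grows as the learning rate shrinks. It also treats the privacy budget as a cap on the total noise added across iterations, following the definition in claim 2. The names `ClientConfig` and `noise_schedule`, the constant `c`, and the formula `c * batch_size / lr` are all assumptions of this sketch, not the patented computation.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class ClientConfig:
    """Per-model inputs named in the abstract (field names are hypothetical)."""
    privacy_budget: float            # total noise that may be added over training
    learning_rates: Sequence[float]  # learning rate schedule, one rate per iteration
    batch_size: int                  # data points used per iteration


def noise_schedule(cfg: ClientConfig, c: float = 1.0) -> List[float]:
    """Return one noise amount per iteration.

    Placeholder relationship, NOT the formula from the patent:
    the proposed noise is c * batch_size / learning_rate, clipped so the
    running total never exceeds the privacy budget.
    """
    if cfg.batch_size <= 0 or cfg.privacy_budget < 0:
        raise ValueError("batch size must be positive and budget non-negative")
    remaining = cfg.privacy_budget
    schedule = []
    for lr in cfg.learning_rates:
        if lr <= 0:
            raise ValueError("learning rates must be positive")
        proposed = c * cfg.batch_size / lr  # larger batch or smaller rate -> more noise
        sigma = min(proposed, remaining)    # never spend more than the budget allows
        schedule.append(sigma)
        remaining -= sigma
    return schedule
```

Calling `noise_schedule(ClientConfig(privacy_budget=20.0, learning_rates=(0.5, 0.25), batch_size=4))` yields a larger value for the second iteration than for the first, because the learning rate has dropped, and the second value is clipped by what remains of the budget.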

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a memory that stores computer executable components; and a processor, operably coupled to the memory, and that executes the computer executable components stored in the memory, wherein the computer executable components comprise: a hyperparameter advisory component that iteratively trains an overall machine learning model of a differentially private federated learning process, wherein the training comprises, at each iteration: determining respective values of a hyperparameter for machine learning models distributed on computing devices based on respective privacy budgets, respective learning rate schedules, and respective batch sizes associated with the machine learning models, wherein the respective values of the hyperparameter indicate respective amounts of noise to introduce to respective derivatives of the machine learning models from training to achieve respective defined amounts of privacy of respective training data employed for the training of the machine learning models; transmitting the respective values of the hyperparameter to the computing devices to train the machine learning models and introduce the respective amounts of noise to the respective derivatives of the machine learning models; receiving the respective derivatives of the machine learning models from the computing devices; and aggregating the respective derivatives of the machine learning models to update the overall machine learning model, wherein the respective derivatives comprise at least model weights.

2. The system of claim 1, wherein the respective privacy budgets associated with the machine learning models define respective total amounts of noise that can be added during the training of the associated machine learning models, wherein the respective learning rate schedules associated with the machine learning models define respective learning rates used at the iterations during the training of the associated machine learning models, and wherein the respective batch sizes associated with the machine learning models define respective amounts of data points used at the iterations during the training of the associated machine learning models.

3. The system of claim 1, wherein the hyperparameter advisory component further determines respective batch size computations associated with the machine learning models based on the respective privacy budgets and the respective learning rate schedules associated with the machine learning models, wherein the respective batch size computations comprise at least one of the respective batch sizes, respective first factors by which the respective batch sizes change in relation to respective changes in the respective learning rate schedules, and respective second factors by which the respective batch sizes change in relation to respective changes in the respective privacy budgets.

4. The system of claim 1, wherein the hyperparameter advisory component further determines respective learning rate computations associated with the machine learning models based on the respective privacy budgets and the respective batch sizes associated with the machine learning models, wherein the respective learning rate computations comprise at least one of the respective learning rate schedules, respective first factors by which the respective learning rate schedules change in relation to respective changes in the respective batch sizes, and respective second factors by which the respective learning rate schedules change in relation to respective changes in the respective privacy budgets.

5. The system of claim 1, wherein the hyperparameter advisory component further determines respective privacy computations associated with the machine learning models based on the respective batch sizes and the learning rate schedules associated with the machine learning models, wherein the respective privacy computations [comprise] at least one of the respective privacy budgets, respective first factors by which the respective privacy budgets change in relation to respective changes in the respective learning rate schedules, and respective second factors by which the respective privacy budgets change in relation to respective changes in the respective batch sizes.

6. The system of claim 1, wherein the respective amounts of noise increase as the respective batch sizes increase.

7. The system of claim 1, wherein the respective amounts of noise increase as respective learning rates defined in the respective learning rate schedules decrease.

8. A system, comprising: a memory that stores computer executable components; and a processor, operably coupled to the memory, and that executes the computer executable components stored in the memory, wherein the computer executable components comprise: a model component that iteratively trains a machine learning model of a differentially private federated learning process, wherein the training comprises, at each iteration: receiving, from a server device, a hyperparameter that was determined based on a privacy budget, a learning rate schedule, and a batch size associated with the machine learning model, wherein the value of the hyperparameter indicates an amount of noise to introduce to derivatives of the machine learning model from training to achieve a defined amount of privacy of training data employed for the training of the machine learning model, wherein the derivatives comprise at least model weights; training the machine learning model using the training data; introducing the amount of noise to the derivatives of the machine learning models from the training; and sending the derivatives to the server device for training an overall machine learning model of the differentially private federated learning process.

9. The system of claim 8, wherein the privacy budget associated with the machine learning model defines a total amount of noise that can be added during the training of the machine learning model, wherein the learning rate schedule associated with the machine learning model defines learning rates used at the iterations during the training of the machine learning model, and wherein the batch size associated with the machine learning model defines amounts of data points used at the iterations during the training of the machine learning model.

10. The system of claim 8, wherein the amount of noise increases as the batch size increases.

11. The system of claim 8, wherein the amount of noise increases as a learning rate defined in the learning rate schedule decreases.

12. A computer-implemented method, comprising: iteratively training, by a system operatively coupled to a processor, an overall machine learning model of a differentially private federated learning process, wherein the training comprises, at each iteration: determining respective values of a hyperparameter for machine learning models distributed on computing devices [based on] respective privacy budgets, respective learning rate schedules, and respective batch sizes associated with the machine learning models, wherein the respective values of the hyperparameter indicate respective amounts of noise to introduce to respective derivatives of the machine learning models from training to achieve respective defined amounts of privacy of respective training data employed for the training of the machine learning models; transmitting the respective values of the hyperparameter to the computing devices to train the machine learning models and introduce the respective amounts of noise to the respective derivatives of the machine learning models; receiving the respective derivatives of the machine learning models from the computing devices; and aggregating the resp
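One iteration of the claimed loop (the server sends a noise value to each device, each device trains locally and adds noise to its model weights, the server aggregates the returned weights) can be sketched as follows. This is a toy illustration under stated assumptions: the claims say only "noise" and "aggregating", so the Gaussian noise, the plain averaging, the one-variable linear model, and every function name here are hypothetical choices, not the patent's method.

```python
import random
from typing import List, Sequence, Tuple

Weights = List[float]
Example = Tuple[float, float]  # (x, y) pair for a toy model y = w0 + w1 * x


def local_update(weights: Weights, data: Sequence[Example], lr: float,
                 batch_size: int, rng: random.Random) -> Weights:
    """One SGD step on a sampled batch (toy stand-in for device-side training)."""
    batch = rng.sample(list(data), min(batch_size, len(data)))
    g0 = g1 = 0.0
    for x, y in batch:
        err = weights[0] + weights[1] * x - y
        g0 += err
        g1 += err * x
    n = len(batch)
    return [weights[0] - lr * g0 / n, weights[1] - lr * g1 / n]


def add_noise(weights: Weights, sigma: float, rng: random.Random) -> Weights:
    """Perturb each weight; Gaussian noise is an assumption of this sketch."""
    return [w + rng.gauss(0.0, sigma) for w in weights]


def aggregate(updates: Sequence[Weights]) -> Weights:
    """Average the weights coordinate-wise; averaging is an assumption."""
    return [sum(ws) / len(updates) for ws in zip(*updates)]


def federated_round(global_weights: Weights, clients: Sequence[Sequence[Example]],
                    sigmas: Sequence[float], lr: float, batch_size: int,
                    rng: random.Random) -> Weights:
    """Server view of one iteration: send sigma, collect noisy weights, aggregate."""
    noisy_updates = []
    for data, sigma in zip(clients, sigmas):
        trained = local_update(list(global_weights), data, lr, batch_size, rng)
        noisy_updates.append(add_noise(trained, sigma, rng))
    return aggregate(noisy_updates)
```

The per-device `sigmas` are taken as given here; in the claims they are the hyperparameter values the server determines from each device's privacy budget, learning rate schedule, and batch size.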

Assignees

IBM

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Supervised learning · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11941520B2 cover?
It covers techniques for determining hyperparameters for a differentially private federated learning process. In one or more embodiments, a system with a memory and a processor executes a hyperparameter advisor component that determines a hyperparameter for a model based on a defined numeric relationship between a privacy budget, a learning rate schedule, and a batch size.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 26, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).