Compression switching for federated learning

US11790039B2 · US · B2

Patent metadata
Publication number: US-11790039-B2
Application number: US-202017083459-A
Country: US
Kind code: B2
Filing date: Oct 29, 2020
Priority date: Oct 29, 2020
Publication date: Oct 17, 2023
Grant date: Oct 17, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods for compression switching that includes distributing a model to client nodes, which use the model to generate a gradient vector (GV) based on a client node data set. The method includes receiving a model update that includes a gradient sign vector (GSV) based on the gradient vector; generating an updated model using the GSV; and distributing the updated model to the client nodes. The client node uses the updated model to generate a second GV based on a second client node data set. The method also includes a determination that a compression switch condition exists; based on the determination, transmitting an instruction to the client node to perform a compression switch; receiving, in response to the instruction, another model update including a subset GSV based on the second gradient vector; generating a second updated model using the subset GSV; and distributing the second updated model to the client nodes.
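The first half of the abstract (sign compression on the client, model update on the coordinator) can be illustrated with a minimal sketch. This is not the patent's implementation: the function names, the choice to map a zero gradient to +1, and the averaging of sign vectors across clients are all assumptions made for illustration. The scaling factor corresponds to the one mentioned in claim 5.

```python
from typing import List, Sequence


def gradient_sign_vector(gradient: Sequence[float]) -> List[int]:
    """Client side: compress a gradient vector to one sign per element."""
    # Assumption: a zero gradient maps to +1 so each element fits in one bit.
    return [1 if g >= 0 else -1 for g in gradient]


def apply_sign_update(
    weights: Sequence[float],
    sign_vectors: Sequence[Sequence[int]],
    scale: float,
) -> List[float]:
    """Coordinator side: combine client sign vectors and update the model.

    Assumption: the coordinator averages the sign vectors element-wise and
    steps against the average, scaled by `scale`.
    """
    n = len(sign_vectors)
    averaged = [sum(column) / n for column in zip(*sign_vectors)]
    return [w - scale * s for w, s in zip(weights, averaged)]


# Two hypothetical clients report sign vectors for a two-parameter model.
client_a = gradient_sign_vector([0.7, -0.2])
client_b = gradient_sign_vector([0.1, 0.4])
updated = apply_sign_update([1.0, 1.0], [client_a, client_b], scale=0.5)
```

Sending only signs reduces each gradient element to a single bit, which is why the coordinator needs a scaling factor to turn the signs back into a step of useful size.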

First claim

Opening claim text (preview).

What is claimed is:

1. A method for compression switching during model training, the method comprising: distributing, by a model coordinator, a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receiving, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generating, by the model coordinator, a first updated model using the gradient sign vector; distributing the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; making a first determination, by the model coordinator and after distributing the first updated model, that a compression switch condition exists corresponding to the client node; based on the first determination, transmitting, from the model coordinator, an instruction to the client node to perform a compression switch; receiving, by the model coordinator, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generating, by the model coordinator, a second updated model using the gradient sign subset vector; and distributing the second updated model to the plurality of client nodes.

2. The method of claim 1, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.

3. The method of claim 2, wherein the second determination comprises determining that the client node is overfitting based on an analysis of past validation losses.

4. The method of claim 1, wherein the compression switch condition comprises a second determination that network conditions have fallen below a network conditions threshold.

5. The method of claim 1, wherein generating, by the model coordinator, the first updated model using the gradient sign vector comprises applying a scaling factor to the gradient sign vector.

6. The method of claim 1, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises applying a scaling factor to non-null values of the gradient sign subset vector.

7. The method of claim 1, wherein, before receiving, from the client node, the first model update comprising the gradient sign vector based on the gradient vector, a shared random seed is transmitted from the model coordinator to the client node.

8. The method of claim 7, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises using the shared random seed to identify index positions associated with the gradient sign subset vector.

9. A non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for compression switching during model training, the method comprising: distributing, by a model coordinator, a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receiving, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generating, by the model coordinator, a first updated model using the gradient sign vector; distributing the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; making a first determination, by the model coordinator and after distributing the first updated model, that a compression switch condition exists corresponding to the client node; based on the first determination, transmitting, from the model coordinator, an instruction to the client node to perform a compression switch; receiving, by the model coordinator, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generating, by the model coordinator, a second updated model using the gradient sign subset vector; and distributing the second updated model to the plurality of client nodes.

10. The non-transitory computer readable medium of claim 9, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.

11. The non-transitory computer readable medium of claim 10, wherein the second determination comprises determining that the client node is overfitting based on an analysis of past validation losses.

12. The non-transitory computer readable medium of claim 9, wherein the compression switch condition comprises a second determination that network conditions have fallen below a network conditions threshold.

13. The non-transitory computer readable medium of claim 9, wherein generating, by the model coordinator, the first updated model using the gradient sign vector comprises applying a scaling factor to the gradient sign vector.

14. The non-transitory computer readable medium of claim 9, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises applying a scaling factor to non-null values of the gradient sign subset vector.

15. The non-transitory computer readable medium of claim 9, wherein, before receiving, from the client node, the first model update comprising the gradient sign vector based on the gradient vector, a shared random seed is transmitted from the model coordinator to the client node.

16. The non-transitory computer readable medium of claim 15, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises using the shared random seed to identify index positions associated with the gradient sign subset vector.

17. A system for compression switching during model training, the system comprising: a model coordinator, executing on a processor comprising circuitry, and configured to: distribute a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receive, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generate a first updated model using the gradient sign vector; distribute the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; make a first determination, after distributing the first updated model, that a compression switch condition exists corresponding to the client node; transmit, based on the first determination, an instruction to the client node to perform a compression switch; receive, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generate a second updated model using the gradient sign subset vector; and distribute the second updated model to the plurality of client nodes.

18. The system of claim 17, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.
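The dependent claims mention three mechanisms that a short sketch may make more concrete: a shared random seed used to identify index positions (claims 7 and 8), a scaling factor applied only to non-null values (claim 6), and an overfitting check based on past validation losses (claim 3). The code below is one possible reading under stated assumptions, not the claimed implementation. All function names, the way the seed is mixed with a round number, and the "loss rose for several consecutive rounds" rule are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def subset_indices(seed: int, round_id: int, dim: int, k: int) -> List[int]:
    """Derive the same k index positions on the client and the coordinator."""
    # Assumption: the shared seed is mixed with a round number so that each
    # round selects a different subset without extra communication.
    rng = random.Random(seed * 1_000_003 + round_id)
    return sorted(rng.sample(range(dim), k))


def gradient_sign_subset(gradient: Sequence[float], indices: Sequence[int]) -> List[int]:
    """Client side: send signs only for the selected index positions."""
    return [1 if gradient[i] >= 0 else -1 for i in indices]


def expand_subset(signs: Sequence[int], indices: Sequence[int], dim: int) -> List[Optional[int]]:
    """Coordinator side: place received signs at their positions, None elsewhere."""
    full: List[Optional[int]] = [None] * dim
    for i, s in zip(indices, signs):
        full[i] = s
    return full


def apply_subset_update(weights: Sequence[float], expanded: Sequence[Optional[int]], scale: float) -> List[float]:
    """Apply the scaling factor to non-null entries only; leave the rest unchanged."""
    return [w if s is None else w - scale * s for w, s in zip(weights, expanded)]


def is_overfitting(val_losses: Sequence[float], patience: int = 3) -> bool:
    """Hypothetical switch condition: validation loss rose `patience` rounds in a row."""
    if len(val_losses) <= patience:
        return False
    recent = val_losses[-(patience + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


# Usage: both sides derive identical positions from the shared seed.
dim, k, seed, round_id = 10, 4, 42, 7
client_idx = subset_indices(seed, round_id, dim, k)
coord_idx = subset_indices(seed, round_id, dim, k)
gradient = [0.3, -0.1, 0.0, 0.9, -0.5, 0.2, -0.7, 0.4, -0.3, 0.6]
signs = gradient_sign_subset(gradient, client_idx)
expanded = expand_subset(signs, coord_idx, dim)
new_weights = apply_subset_update([0.0] * dim, expanded, scale=0.5)
```

Because the coordinator can regenerate the index positions itself, the client only needs to transmit k signs rather than k (index, sign) pairs, which is the bandwidth saving a compression switch aims for.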

Assignees

Inventors

Classifications

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Graphical models, e.g. Bayesian networks · CPC title

  • by checking functioning · CPC title

  • Threshold monitoring · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11790039B2 cover?
Methods for compression switching that includes distributing a model to client nodes, which use the model to generate a gradient vector (GV) based on a client node data set. The method includes receiving a model update that includes a gradient sign vector (GSV) based on the gradient vector; generating an updated model using the GSV; and distributing the updated model to the client nodes. The cl…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F18/2148. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 17, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).