Federated learning method and apparatus, and chip
US-2023116117-A1 · Apr 13, 2023 · US
US11790039B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11790039-B2 |
| Application number | US-202017083459-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 29, 2020 |
| Priority date | Oct 29, 2020 |
| Publication date | Oct 17, 2023 |
| Grant date | Oct 17, 2023 |
Official abstract text for this publication.
A method for compression switching includes distributing a model to client nodes, which use the model to generate a gradient vector (GV) based on a client node data set. The method includes receiving a model update that includes a gradient sign vector (GSV) based on the gradient vector; generating an updated model using the GSV; and distributing the updated model to the client nodes. The client node uses the updated model to generate a second GV based on a second client node data set. The method also includes making a determination that a compression switch condition exists; based on the determination, transmitting an instruction to the client node to perform a compression switch; receiving, in response to the instruction, another model update including a subset GSV based on the second gradient vector; generating a second updated model using the subset GSV; and distributing the second updated model to the client nodes.
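The two client-side compression modes named in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the function names, the mapping of zero to +1, and the way the shared seed is combined with a round number are all assumptions made for illustration.

```python
import random
from typing import List


def gradient_sign_vector(gradient: List[float]) -> List[int]:
    """Compress a gradient vector to one sign (+1 or -1) per element."""
    # Assumption: a zero gradient is encoded as +1 so each entry fits in one bit.
    return [1 if g >= 0 else -1 for g in gradient]


def subset_indices(length: int, subset_size: int, seed: int, round_id: int) -> List[int]:
    """Choose index positions reproducibly from a shared random seed."""
    # Hypothetical scheme: derive a per-round generator from the shared seed so
    # both sides can recompute the same positions without transmitting them.
    rng = random.Random(seed * 1_000_003 + round_id)
    return sorted(rng.sample(range(length), subset_size))


def gradient_sign_subset_vector(
    gradient: List[float], subset_size: int, seed: int, round_id: int
) -> List[int]:
    """Send signs only for the seeded subset of positions (the subset GSV)."""
    indices = subset_indices(len(gradient), subset_size, seed, round_id)
    return [1 if gradient[i] >= 0 else -1 for i in indices]


if __name__ == "__main__":
    gv = [0.5, -1.2, 0.0, 3.0, -0.1]
    print(gradient_sign_vector(gv))
    print(gradient_sign_subset_vector(gv, subset_size=2, seed=42, round_id=1))
```

In this sketch the subset vector carries only the signs, not the positions, which is why the index choice has to be reproducible from the seed.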
Opening claim text (preview).
What is claimed is:
1. A method for compression switching during model training, the method comprising: distributing, by a model coordinator, a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receiving, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generating, by the model coordinator, a first updated model using the gradient sign vector; distributing the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; making a first determination, by the model coordinator and after distributing the first updated model, that a compression switch condition exists corresponding to the client node; based on the first determination, transmitting, from the model coordinator, an instruction to the client node to perform a compression switch; receiving, by the model coordinator, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generating, by the model coordinator, a second updated model using the gradient sign subset vector; and distributing the second updated model to the plurality of client nodes.
2. The method of claim 1, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.
3. The method of claim 2, wherein the second determination comprises determining that the client node is overfitting based on an analysis of past validation losses.
4. The method of claim 1, wherein the compression switch condition comprises a second determination that network conditions have fallen below a network conditions threshold.
5. The method of claim 1, wherein generating, by the model coordinator, the first updated model using the gradient sign vector comprises applying a scaling factor to the gradient sign vector.
6. The method of claim 1, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises applying a scaling factor to non-null values of the gradient sign subset vector.
7. The method of claim 1, wherein, before receiving, from the client node, the first model update comprising the gradient sign vector based on the gradient vector, a shared random seed is transmitted from the model coordinator to the client node.
8. The method of claim 7, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises using the shared random seed to identify index positions associated with the gradient sign subset vector.
9. A non-transitory computer readable medium comprising computer readable program code, which when executed by a computer processor enables the computer processor to perform a method for compression switching during model training, the method comprising: distributing, by a model coordinator, a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receiving, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generating, by the model coordinator, a first updated model using the gradient sign vector; distributing the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; making a first determination, by the model coordinator and after distributing the first updated model, that a compression switch condition exists corresponding to the client node; based on the first determination, transmitting, from the model coordinator, an instruction to the client node to perform a compression switch; receiving, by the model coordinator, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generating, by the model coordinator, a second updated model using the gradient sign subset vector; and distributing the second updated model to the plurality of client nodes.
10. The non-transitory computer readable medium of claim 9, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.
11. The non-transitory computer readable medium of claim 10, wherein the second determination comprises determining that the client node is overfitting based on an analysis of past validation losses.
12. The non-transitory computer readable medium of claim 9, wherein the compression switch condition comprises a second determination that network conditions have fallen below a network conditions threshold.
13. The non-transitory computer readable medium of claim 9, wherein generating, by the model coordinator, the first updated model using the gradient sign vector comprises applying a scaling factor to the gradient sign vector.
14. The non-transitory computer readable medium of claim 9, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises applying a scaling factor to non-null values of the gradient sign subset vector.
15. The non-transitory computer readable medium of claim 9, wherein, before receiving, from the client node, the first model update comprising the gradient sign vector based on the gradient vector, a shared random seed is transmitted from the model coordinator to the client node.
16. The non-transitory computer readable medium of claim 15, wherein generating, by the model coordinator, the second updated model using the gradient sign subset vector comprises using the shared random seed to identify index positions associated with the gradient sign subset vector.
17. A system for compression switching during model training, the system comprising: a model coordinator, executing on a processor comprising circuitry, and configured to: distribute a current model to a plurality of client nodes comprising a client node, wherein the client node uses the current model to generate a gradient vector based on a client node data set; receive, from the client node, a first model update comprising a gradient sign vector based on the gradient vector; generate a first updated model using the gradient sign vector; distribute the first updated model to the plurality of client nodes, wherein the client node uses the first updated model to generate a second gradient vector based on a second client node data set; make a first determination, after distributing the first updated model, that a compression switch condition exists corresponding to the client node; transmit, based on the first determination, an instruction to the client node to perform a compression switch; receive, from the client node, and in response to the instruction, a second model update comprising a gradient sign subset vector based on the second gradient vector; generate a second updated model using the gradient sign subset vector; and distribute the second updated model to the plurality of client nodes.
18. The system of claim 17, wherein the compression switch condition comprises a second determination that the client node is overfitting the current model.
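The coordinator-side steps in the dependent claims (a scaling factor applied to non-null values, index positions recovered from a shared random seed, and a switch condition based on past validation losses or network conditions) might look roughly like the sketch below. Everything specific here is an assumption for illustration and not the patented implementation: the seeding scheme, the "losses rose for several consecutive rounds" overfitting test, the bandwidth threshold, and the single-client update without aggregation.

```python
import random
from typing import List, Optional


def expand_subset(
    signs: List[int], length: int, seed: int, round_id: int
) -> List[Optional[int]]:
    """Place received signs at the index positions implied by the shared seed."""
    # Hypothetical seeding scheme; it must match whatever the client used.
    rng = random.Random(seed * 1_000_003 + round_id)
    indices = sorted(rng.sample(range(length), len(signs)))
    full: List[Optional[int]] = [None] * length  # None marks a null (unsent) entry
    for i, s in zip(indices, signs):
        full[i] = s
    return full


def apply_update(
    weights: List[float], signs: List[Optional[int]], scale: float
) -> List[float]:
    """Step each weight against its sign, scaled; null entries are left untouched."""
    return [w if s is None else w - scale * s for w, s in zip(weights, signs)]


def is_overfitting(val_losses: List[float], patience: int = 3) -> bool:
    """Assumed heuristic: validation loss rose for `patience` consecutive rounds."""
    if len(val_losses) <= patience:
        return False
    recent = val_losses[-(patience + 1):]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def should_switch(
    val_losses: List[float], bandwidth_mbps: float, min_bandwidth_mbps: float
) -> bool:
    """Switch compression if the client overfits or the network degrades."""
    return is_overfitting(val_losses) or bandwidth_mbps < min_bandwidth_mbps


if __name__ == "__main__":
    full = expand_subset([1, -1], length=5, seed=42, round_id=1)
    print(apply_update([1.0] * 5, full, scale=0.5))
    print(should_switch([1.0, 0.9, 0.8, 0.85, 0.9, 0.95], 10.0, 5.0))
```

A full sign vector can be passed to `apply_update` directly, since it simply contains no null entries; the same scaling step then covers both compression modes.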
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Graphical models, e.g. Bayesian networks · CPC title
by checking functioning · CPC title
Threshold monitoring · CPC title