System
US-2019188563-A1 · Jun 20, 2019 · US
US12242952B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12242952-B2 |
| Application number | US-201816129161-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 12, 2018 |
| Priority date | Dec 18, 2017 |
| Publication date | Mar 4, 2025 |
| Grant date | Mar 4, 2025 |
Official abstract text for this publication.
According to one embodiment, in nth (n is a natural number) processing, a first node calculates a first gradient to update a first weight and a second node calculates a second gradient to update the first weight. In mth (m is a natural number) processing, a third node calculates a third gradient to update a third weight and a fourth node calculates a fourth gradient to update the third weight. If the calculation by the first and second nodes is faster than the calculation by the third and fourth nodes, in n+1th processing, a second weight updated from the first weight is further updated using the first and second gradients, and, in m+1th processing, a fourth weight updated from the third weight is further updated using the first to fourth gradients.
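The update order described in the abstract can be traced with a small simulation: nodes inside a group compute gradients from the same stored weight (synchronous), while each group's averaged gradient is applied to the shared weight whenever that group finishes (asynchronous across groups). This is only a sketch under stated assumptions. The toy loss `0.5 * (weight - x)**2`, the learning rate `lr`, the `Server` class, the `run` helper, and the `schedule` list that stands in for real timing are all hypothetical and do not come from the patent text.

```python
from statistics import fmean


def gradient(weight, shard):
    """Gradient of the assumed toy loss 0.5 * (weight - x)**2, averaged over one node's shard."""
    return fmean(weight - x for x in shard)


class Server:
    """Hypothetical server node holding the shared weight."""

    def __init__(self, weight, lr):
        self.weight = weight
        self.lr = lr  # learning rate: an assumption, not stated in the source

    def apply_group_update(self, grads):
        # Update the shared weight from the average of one group's gradients.
        self.weight -= self.lr * fmean(grads)
        return self.weight


def run(schedule, shards, w0=0.0, lr=0.5):
    """Apply group updates in the order the groups finish.

    schedule: group names in completion order (stands in for real timing).
    shards:   {group name: [shard for node 1, shard for node 2, ...]}.
    """
    server = Server(w0, lr)
    stored = {group: w0 for group in shards}  # weight currently stored by each group's nodes
    for group in schedule:
        # Synchronous inside the group: every node uses the same stored weight.
        grads = [gradient(stored[group], shard) for shard in shards[group]]
        # Asynchronous across groups: the server may already have moved on.
        stored[group] = server.apply_group_update(grads)
    return server.weight, stored


shards = {"first": [[1.0], [3.0]], "second": [[2.0], [6.0]]}
final_weight, stored = run(["first", "second"], shards)
print(final_weight, stored)
```

With the schedule `["first", "second"]`, the second group's gradients are computed from the initial weight but applied to the weight the first group already updated, which mirrors the first-to-second and second-to-third weight transitions in claim 1.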
Opening claim text (preview).
What is claimed is:

1. A system comprising:
a first node and a second node of a first group;
a third node and a fourth node of a second group; and
a server node, wherein,
each of the first to fourth nodes includes a processor and stores a first weight of objective function,
the server node includes a processor,
the processor included in the first node is configured to obtain a first gradient by performing a first training process using first training data allocated to the first node and the first weight, and to transmit the first gradient to the server node,
the processor included in the second node is configured to obtain a second gradient by performing a second training process using second training data allocated to the second node and the first weight, and to transmit the second gradient to the server node,
the first training process and the second training process are performed in synchronization,
the processor included in the server node is configured to perform a first update process to update the first weight to a second weight based on an average value of the first and second gradients, and to transmit the second weight to the first node and the second node,
the second weight is stored in each of the first and second nodes,
the processor included in the third node is configured to obtain a third gradient by performing a third training process using third training data allocated to the third node and the first weight, and to transmit the third gradient to the server node,
the processor included in the fourth node is configured to obtain a fourth gradient by performing a fourth training process using fourth training data allocated to the fourth node and the first weight, and to transmit the fourth gradient to the server node,
the third training process and the fourth training process are performed in synchronization,
the processor included in the server node is configured to perform a second update process to update the second weight to a third weight based on an average value of the third and fourth gradients, and to transmit the third weight to the third node and the fourth node,
the third weight is stored in each of the third and fourth nodes, and
the first update process and the second update process are performed in non-synchronization.

2. The system of claim 1, wherein the first weight is updated using a fifth gradient calculated from the first and second gradients, and the second weight is updated using a sixth gradient calculated from the third and fourth gradients.

3. The system of claim 1, wherein, if the calculation of gradient by the processors included in the first node and the second node is faster than the calculation of gradient by the processors included in the third node and the fourth node, the first and second gradients are transmitted to the third node and the fourth node, and, the third weight is further updated using the first to fourth gradients, and if the calculation of gradient by the processors included in the third node and the fourth node is faster than the calculation of gradient by the processors included in the first node and the second node, the third and fourth gradients are transmitted to the first node and the second node, and, the second weight is further updated using the first to fourth gradients.

4. The system of claim 1, wherein a difference of processing speed between the first node and the second node of the first group is a first threshold value or less, and a difference of processing speed between the third node and the fourth node of the second group is a second threshold value or less.

5. The system of claim 4, wherein a plurality of nodes including the first node and the second node are in the first group, a plurality of nodes including the third node and the fourth node are in the second group, and if the processing speed of the third node and the fourth node is slower than the processing speed of the first node and the second node, a first number of the plurality of nodes of the first group is less than a second number of the plurality of nodes of the second group.

6. The system of claim 4, wherein, if the processing speed of the first node and the second node is slower than the processing speed of the third node and the fourth node, an amount of processing of each of the third node and the fourth node of the second group in the parallel distributed processing is less than the amount of processing of the first node and the second node of the first group.
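Claim 4 requires that the nodes in a group differ in processing speed by no more than a threshold. One way such groups might be formed is a greedy pass over nodes sorted by speed, sketched below. The function name `group_by_speed`, the single shared `threshold` (the claim allows a separate threshold per group), and the speed figures are illustrative assumptions. The patent text shown here does not say how groups are actually built.

```python
def group_by_speed(speeds, threshold):
    """Greedily partition nodes so that speeds within a group differ by at most `threshold`.

    speeds:    {node name: measured processing speed}, units assumed (e.g. samples per second).
    threshold: maximum allowed speed difference inside one group (one value for all groups,
               a simplification of the per-group thresholds in claim 4).
    Returns a list of groups, fastest group first.
    """
    groups = []
    # Visit nodes from fastest to slowest so the first member of a group is its fastest node.
    for node, speed in sorted(speeds.items(), key=lambda kv: kv[1], reverse=True):
        if groups and groups[-1][0][1] - speed <= threshold:
            groups[-1].append((node, speed))  # still within threshold of the group's fastest node
        else:
            groups.append([(node, speed)])    # too slow for the current group: start a new one
    return [[node for node, _ in group] for group in groups]


speeds = {"n1": 100.0, "n2": 95.0, "n3": 60.0, "n4": 58.0}
print(group_by_speed(speeds, threshold=10.0))
```

Because each candidate is compared with the fastest member of the current group, the gap between the fastest and slowest node in any group never exceeds the threshold.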
Distributed learning, e.g. federated learning · CPC title
Supervised learning · CPC title
Machine learning · CPC title
For evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context: G06Q10/04) · CPC title
Combinations of networks · CPC title