Systems and methods of distributed optimization
US-10402469-B2 · Sep 3, 2019 · US
US11487698B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11487698-B2 |
| Application number | US-202117216322-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 29, 2021 |
| Priority date | Jun 1, 2017 |
| Publication date | Nov 1, 2022 |
| Grant date | Nov 1, 2022 |
Official abstract text for this publication.
Disclosed herein are a parameter server and a method for sharing distributed deep-learning parameters using the parameter server. The method for sharing distributed deep-learning parameters using the parameter server includes initializing a global weight parameter in response to an initialization request by a master process; performing an update by receiving a learned local gradient parameter from the worker process, which performs deep-learning training after updating a local weight parameter using the global weight parameter; accumulating the gradient parameters in response to a request by the master process; and performing an update by receiving the global weight parameter from the master process that calculates the global weight parameter using the accumulated gradient parameters of the one or more worker processes.
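The update cycle described in the abstract can be illustrated with a small in-process sketch. It is not the patented implementation: the class `ParameterServer`, the functions `worker_step` and `master_update`, the toy quadratic loss, and the averaged gradient-descent rule with learning rate `lr` are all assumptions made for illustration. The abstract says only that the master calculates the global weight parameter from the accumulated gradients, not how.

```python
from typing import Dict, List


class ParameterServer:
    """Toy stand-in for the parameter server (hypothetical, in-process only)."""

    def __init__(self) -> None:
        self.global_weights: List[float] = []
        self.worker_grads: Dict[int, List[float]] = {}

    def initialize(self, weights: List[float]) -> None:
        # Initialization request from the master process.
        self.global_weights = list(weights)
        self.worker_grads = {}

    def push_gradient(self, worker_id: int, grad: List[float]) -> None:
        # A worker hands over its learned local gradient parameter.
        self.worker_grads[worker_id] = list(grad)

    def accumulate(self) -> List[float]:
        # Accumulation request from the master: sum all worker gradients.
        total = [0.0] * len(self.global_weights)
        for grad in self.worker_grads.values():
            for i, g in enumerate(grad):
                total[i] += g
        return total

    def set_global_weights(self, weights: List[float]) -> None:
        # The master sends back the newly calculated global weights.
        self.global_weights = list(weights)


def worker_step(server: ParameterServer, worker_id: int, target: List[float]) -> None:
    # Update the local weights from the global weights, then "train".
    local_weights = list(server.global_weights)
    # Assumed toy loss 0.5 * (w - target)^2, whose gradient is (w - target).
    grad = [w - t for w, t in zip(local_weights, target)]
    server.push_gradient(worker_id, grad)


def master_update(server: ParameterServer, num_workers: int, lr: float) -> List[float]:
    # Assumed rule: plain gradient descent on the averaged gradient.
    total = server.accumulate()
    new_weights = [w - lr * g / num_workers
                   for w, g in zip(server.global_weights, total)]
    server.set_global_weights(new_weights)
    return new_weights


server = ParameterServer()
server.initialize([0.0, 0.0])
worker_step(server, worker_id=0, target=[1.0, 3.0])
worker_step(server, worker_id=1, target=[3.0, 5.0])
updated = master_update(server, num_workers=2, lr=0.5)
print(updated)
```

In this sketch the server only stores and sums parameters, while the master owns the update rule, which mirrors the division of roles in the abstract.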
Opening claim text (preview).
What is claimed is:

1. A method for sharing distributed deep-learning parameters, performed by a parameter server, comprising: creating and allocating shared memory in response to a first request from a plurality of distributed deep-learning processes, which include a master process and one or more worker processes; initializing a master weight parameter area in the shared memory; enabling the plurality of distributed deep-learning processes to perform distributed deep-learning training using deep-learning parameters shared through the shared memory; and deallocating and deleting the shared memory that is no longer used after the distributed deep-learning training is finished, wherein deallocating and deleting the shared memory comprises: receiving a second request to deallocate the shared memory from the one or more worker processes; deallocating the shared memory in response to the second request; receiving a third request to delete the shared memory from the master process when the shared memory is deallocated; and deleting the shared memory in response to the third request.

2. The method of claim 1, wherein creating and allocating the shared memory comprises: receiving the first request from the master process; creating the shared memory in response to the first request; sending a shared memory creation key and access information corresponding to the created shared memory to the master process; receiving a fourth request to set an event from the master process and setting an event of the shared memory in response to the fourth request; receiving a fifth request to allocate the shared memory from the one or more worker processes, which have received the shared memory creation key from the master process; and allocating the shared memory in response to the fifth request and sending information that is necessary in order to access the allocated shared memory to the one or more worker processes.

3. The method of claim 1, wherein the plurality of distributed deep-learning processes share, using the shared memory, the deep-learning parameters in a synchronous manner or in an asynchronous manner.

4. The method of claim 3, wherein sharing the deep-learning parameters in a synchronous manner comprises: updating, by the one or more worker processes, worker local weight parameter areas of the one or more worker processes using a value of a master weight parameter in the shared memory; accumulating, by the parameter server, gradient parameters by receiving learned worker local gradient parameters from the one or more worker processes that perform the distributed deep-learning training in the synchronous manner; receiving, by the parameter server, an updated master weight parameter, calculated using the accumulated gradient parameters of the one or more worker processes, from the master process, and updating, by the parameter server, the master weight parameter area with the updated master weight parameter; and announcing, by the parameter server, an update of the master weight parameter area to the one or more worker processes.

5. The method of claim 4, wherein accumulating the gradient parameters comprises: storing the worker local gradient parameters, learned by the one or more worker processes that perform the distributed deep-learning training, in worker gradient parameter areas in the shared memory; receiving a request to accumulate the worker local gradient parameters from the one or more worker processes; accumulating the worker local gradient parameters stored in the shared memory, which correspond to the request to accumulate the worker local gradient parameters, into the updated master gradient parameter; and announcing completion of the accumulation to the master process.

6. The method of claim 3, wherein sharing the deep-learning parameters in an asynchronous manner comprises: updating, by the one or more worker processes, worker local weight parameter areas of the one or more worker processes using a value of a master weight parameter in the shared memory; updating, by the one or more worker processes, a worker gradient parameter in the shared memory; updating, by the parameter server, the master weight parameter area in response to a request to update the master weight parameter, which is received from the one or more worker processes; and announcing, by the parameter server, an update of the master weight parameter area to the one or more worker processes.
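The shared-memory lifecycle in claims 1 and 2 (create, hand out a key, attach, deallocate, delete) can be loosely illustrated with Python's standard `multiprocessing.shared_memory` module. This is only an analogy under stated assumptions: the claims do not name any particular API, the mapping of "deallocate" to `close()` and "delete" to `unlink()` is an interpretation, the block name stands in for the "shared memory creation key", and the helpers `create_master_area`, `attach_worker`, `read_weights`, and `run_lifecycle` are hypothetical. For simplicity the master and worker roles run in one process, and the event-setting step of claim 2 is omitted.

```python
import struct
from multiprocessing import shared_memory
from typing import List

NUM_WEIGHTS = 4                      # assumed size of the master weight area
FMT = f"{NUM_WEIGHTS}d"              # NUM_WEIGHTS doubles in native byte order
SIZE = struct.calcsize(FMT)


def create_master_area(initial: List[float]) -> shared_memory.SharedMemory:
    # First request: create the shared memory and initialize
    # the master weight parameter area.
    shm = shared_memory.SharedMemory(create=True, size=SIZE)
    struct.pack_into(FMT, shm.buf, 0, *initial)
    return shm


def attach_worker(key: str) -> shared_memory.SharedMemory:
    # Worker-side allocation: attach to the existing block by its key.
    return shared_memory.SharedMemory(name=key)


def read_weights(shm: shared_memory.SharedMemory) -> List[float]:
    # Copy the master weights out of shared memory into a local list.
    return list(struct.unpack_from(FMT, shm.buf, 0))


def run_lifecycle() -> List[float]:
    master = create_master_area([0.1, 0.2, 0.3, 0.4])
    key = master.name                # stands in for the creation key
    worker = attach_worker(key)
    local_weights = read_weights(worker)   # worker updates its local copy
    worker.close()                   # worker request: deallocate (detach)
    master.close()
    master.unlink()                  # master request: delete the block
    return local_weights


if __name__ == "__main__":
    print(run_lifecycle())
```

Separating detach from delete follows the ordering in claim 1, where the master's delete request arrives only once the shared memory has been deallocated.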
Distributed learning, e.g. federated learning · CPC title
in which an application is distributed across nodes in the network (software deployment G06F8/60; multiprogramming arrangements G06F9/46) · CPC title
Allocation of resources, e.g. of the central processing unit [CPU] · CPC title
Buffers; Shared memory; Pipes · CPC title
Learning methods · CPC title