Implementing parameter server in networking infrastructure for high-performance computing
US-2019325302-A1 · Oct 24, 2019 · US
US11915147B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11915147-B2 |
| Application number | US-202218048203-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 20, 2022 |
| Priority date | Nov 5, 2018 |
| Publication date | Feb 27, 2024 |
| Grant date | Feb 27, 2024 |
Official abstract text for this publication.
Techniques that facilitate model support in deep learning are provided. In one example, a system includes a graphics processing unit and a central processing unit memory. The graphics processing unit processes data to train a deep neural network. The central processing unit memory stores a portion of the data to train the deep neural network. The graphics processing unit provides, during a forward pass process of the deep neural network that traverses through a set of layers for the deep neural network from a first layer of the set of layers to a last layer of the set of layers that provides a set of outputs for the deep neural network, input data for a layer from the set of layers for the deep neural network to the central processing unit memory.
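The data flow described in the abstract can be illustrated with a small sketch. It is not the patented implementation: it uses a toy chain of scalar layers `y = tanh(w * x)`, and the names `OffloadingTrainer`, `gpu_mem`, and `cpu_mem` are hypothetical, with plain dictionaries standing in for GPU and CPU memory. The rule for deciding what to offload (every intermediate layer's input) is also an assumption made for illustration.

```python
import math


class OffloadingTrainer:
    """Toy chain of scalar layers y = tanh(w * x).

    gpu_mem and cpu_mem are plain dicts standing in for device and host
    memory (hypothetical names, not taken from the patent).
    """

    def __init__(self, weights):
        self.weights = list(weights)
        self.gpu_mem = {}  # layer index -> input kept on the "GPU"
        self.cpu_mem = {}  # layer index -> input offloaded to the "CPU"

    def forward(self, x):
        last = len(self.weights) - 1
        for i, w in enumerate(self.weights):
            # Each layer's input is needed again during the backward pass.
            if 0 < i < last:
                self.cpu_mem[i] = x  # intermediate layer: offload to host
            else:
                self.gpu_mem[i] = x  # first or last layer: keep on device
            x = math.tanh(w * x)
        return x

    def backward(self, grad_out):
        grads = [0.0] * len(self.weights)
        for i in reversed(range(len(self.weights))):
            # Fetch the saved input back from wherever it was stored.
            if i in self.cpu_mem:
                x = self.cpu_mem.pop(i)
            else:
                x = self.gpu_mem.pop(i)
            w = self.weights[i]
            y = math.tanh(w * x)
            local = grad_out * (1.0 - y * y)  # d tanh(z)/dz = 1 - tanh(z)^2
            grads[i] = local * x              # gradient w.r.t. this weight
            grad_out = local * w              # gradient w.r.t. this input
        return grads


trainer = OffloadingTrainer([0.5, -1.0, 2.0, 0.3])
output = trainer.forward(0.7)
offloaded_layers = sorted(trainer.cpu_mem)
weight_grads = trainer.backward(1.0)
```

After the forward pass, only the inputs of the two intermediate layers sit in the simulated CPU memory. The backward pass consumes them in reverse layer order, which mirrors the last-to-first traversal the abstract and claims describe.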
Opening claim text (preview).
What is claimed is:
1. A graphics processing unit, comprising: a graphics processing unit cache memory, wherein the graphics processing unit is communicatively coupled to a central processing unit comprising a central processing unit cache memory, wherein the graphics processing unit, during a forward pass process of training a deep neural network that traverses through a set of layers of the deep neural network from a first layer of the set of layers to a last layer of the set of layers, transmits, to the central processing unit for storage in the central processing unit cache memory, data from the graphics processing unit cache memory employed for the training by an intermediate layer of the set of layers between the first layer and the last layer, and wherein the graphics processing unit has determined that at least a portion of the data will be employed by the intermediate layer during a backward pass process of training the deep neural network that traverses from the last layer to the first layer.
2. The graphics processing unit of claim 1, wherein the graphics processing unit receives, from the central processing unit, during the backward pass process, at least the portion of the data.
3. The graphics processing unit of claim 1, wherein the intermediate layer employs, during the backward pass process, at least the portion of the data.
4. The graphics processing unit of claim 1, wherein the graphics processing unit transmits the data to the central processing unit using a compression scheme.
5. The graphics processing unit of claim 1, wherein the graphics processing unit transmits the data to the central processing unit using a half-precision floating-point format.
6. The graphics processing unit of claim 1, wherein the data comprises gradient data.
7. The graphics processing unit of claim 1, wherein the data comprises parameter data.
8. A computer-implemented method, comprising: training, by a graphics processing unit using a graphics processing unit cache memory of the graphics processing unit, a deep neural network that comprises a set of layers, wherein the training comprises: determining, by the graphics processing unit, during a forward pass process of training the deep neural network that traverses through the set of layers from a first layer of the set of layers to a last layer of the set of layers, that data from the graphics processing unit cache memory employed for the training by an intermediate layer of the set of layers between the first layer and the last layer, will be employed by the intermediate layer during a backward pass process of training the deep neural network that traverses from the last layer to the first layer; and transmitting, by the graphics processing unit during the forward pass process, the data to a central processing unit for storage in a central processing unit cache memory of the central processing unit.
9. The computer-implemented method of claim 8, receiving, by the graphics processing unit, from the central processing unit, the data during the backward pass process.
10. The computer-implemented method of claim 8, employing, by the graphics processing unit, via the intermediate layer, the data during the backward pass process.
11. The computer-implemented method of claim 8, wherein the graphics processing unit transmits the data to the central processing unit using a compression scheme.
12. The computer-implemented method of claim 8, wherein the graphics processing unit transmits the data to the central processing unit using a half-precision floating-point format.
13. The computer-implemented method of claim 8, wherein the data comprises gradient data.
14. The computer-implemented method of claim 8, wherein the data comprises parameter data.
15. A computer program product for model support in deep learning, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a graphics processing unit to cause the graphics processing unit to: train, using a graphics processing unit cache memory of the graphics processing unit, a deep neural network that comprises a set of layers, wherein the training comprises: a determination, during a forward pass process of training the deep neural network that traverses through the set of layers from a first layer of the set of layers to a last layer of the set of layers, that data from the graphics processing unit cache memory employed for the training by an intermediate layer of the set of layers between the first layer and the last layer, will be employed by the intermediate layer during a backward pass process of training the deep neural network that traverses from the last layer to the first layer; and a transmission, during the forward pass process, of the data to a central processing unit for storage in a central processing unit cache memory of the central processing unit.
16. The computer program product of claim 15, wherein the program instructions are further executable by the graphics processing unit to cause the graphics processing unit to: receive, from the central processing unit, the data during the backward pass process.
17. The computer program product of claim 15, wherein the program instructions are further executable by the graphics processing unit to cause the graphics processing unit to: employ, via the intermediate layer, the data during the backward pass process.
18. The computer program product of claim 15, wherein the graphics processing unit transmits the data to the central processing unit using a compression scheme.
19. The computer program product of claim 15, wherein the graphics processing unit transmits the data to the central processing unit using a half-precision floating-point format.
20. The computer program product of claim 15, wherein the data comprises gradient data.
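Claims 5, 12, and 19 recite transmitting the data in a half-precision floating-point format. The sketch below shows, under stated assumptions, what that trade-off looks like: it uses Python's standard `struct` module, whose `e` format character encodes IEEE 754 half precision. The helper names `pack_fp16`, `unpack_fp16`, and `pack_fp32` are hypothetical, and the byte string merely stands in for a GPU-to-CPU transfer. The patent text shown here does not specify an encoding procedure.

```python
import struct


def pack_fp16(values):
    """Encode floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack(f"<{len(values)}e", *values)


def unpack_fp16(payload):
    """Decode a half-precision payload back into Python floats."""
    return list(struct.unpack(f"<{len(payload) // 2}e", payload))


def pack_fp32(values):
    """Encode floats as little-endian single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


# Hypothetical activations saved by an intermediate layer.
activations = [0.5, -1.25, 3.0, 0.1]

half_payload = pack_fp16(activations)    # what would be sent to the CPU
single_payload = pack_fp32(activations)  # uncompressed baseline
restored = unpack_fp16(half_payload)     # what the backward pass would read
```

The half-precision payload is half the size of the single-precision one, at the cost of rounding error for values such as 0.1 that half precision cannot represent exactly. Note that `struct.pack` raises `OverflowError` for magnitudes beyond the half-precision range, so a real pipeline would need scaling or clamping.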
Supervised learning · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
Feedforward networks · CPC title
Backpropagation, e.g. using gradient descent · CPC title
on a serial bus, e.g. I2C bus, SPI bus (on daisy chain buses G06F13/4247) · CPC title