Deep learning via dynamic root solvers

US10169084B2 · US · B2

Patent metadata
Field · Value
Publication number · US-10169084-B2
Application number · US-201715857765-A
Country · US
Kind code · B2
Filing date · Dec 29, 2017
Priority date · Mar 3, 2017
Publication date · Jan 1, 2019
Grant date · Jan 1, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention provides a computer implemented method, system, and computer program product of deep learning via dynamic root solvers. In an embodiment, the present invention includes (1) forming an initial set of GPUs into an initial binary tree architecture, where the initial set includes initially idle GPUs and an initial root solver GPU as the root of the initial binary tree architecture, (2) calculating initial gradients and initial adjusted weight data, (3) choosing a first currently idle GPU as a current root solver GPU, (4) forming a current set of GPUs into a current binary tree architecture, where the current set includes the additional currently idle GPUs and the current root solver GPU as the root of the current binary tree architecture, (5) calculating current gradients and current adjusted weight data, and (6) transmitting an initial update to the weight data to the available GPUs.
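Steps (1), (3), and (4) of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the utilization threshold, the function names `find_idle_gpus` and `form_binary_tree`, and the heap-style parent indexing are all assumptions made for illustration, and the sketch ignores link topology entirely.

```python
from typing import Dict, List, Optional

# Hypothetical cutoff (percent utilization) below which a GPU counts as idle.
# The abstract does not specify a threshold.
IDLE_THRESHOLD = 5.0


def find_idle_gpus(utilization: Dict[int, float],
                   threshold: float = IDLE_THRESHOLD) -> List[int]:
    """Return the ids of GPUs whose utilization is at or below the threshold."""
    return sorted(g for g, pct in utilization.items() if pct <= threshold)


def form_binary_tree(idle_gpus: List[int]) -> Dict[int, Optional[int]]:
    """Map each GPU id to its parent id; the first idle GPU is the root solver.

    Heap-style indexing (the parent of position i is (i - 1) // 2) is an
    assumption of this sketch, chosen only because it yields a binary tree.
    """
    parents: Dict[int, Optional[int]] = {}
    for i, gpu in enumerate(idle_gpus):
        parents[gpu] = None if i == 0 else idle_gpus[(i - 1) // 2]
    return parents


# Made-up utilization readings for six available GPUs.
utilization = {0: 97.0, 1: 0.0, 2: 3.0, 3: 88.0, 4: 1.0, 5: 0.0}
idle = find_idle_gpus(utilization)   # GPUs 1, 2, 4, 5 are idle
tree = form_binary_tree(idle)        # GPU 1 becomes the root solver
```

Calling the same two functions again once other GPUs become idle would yield a new tree with a different root, which is the sense in which the root solver is "dynamic" in this sketch.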

First claim

Opening claim text (preview).

What is claimed is:

1. A computer implemented method comprising:
    identifying, by a host computer processor, graphic processor units (GPUs) that are available (available GPUs);
    identifying, by the host computer processor, GPUs that are idle (initially idle GPUs) among the available GPUs for an initial iteration of deep learning;
    choosing, by the host computer processor, one of the initially idle GPUs as an initial root solver GPU for the initial iteration;
    initializing, by the host computer processor, weight data for an initial set of multidimensional data;
    transmitting, by the host computer processor, the initial set of multidimensional data to the available GPUs;
    forming, by the host computer processor, an initial set of GPUs into an initial binary tree architecture, wherein the initial set comprises the initially idle GPUs and the initial root solver GPU, wherein the initial root solver GPU is the root of the initial binary tree architecture;
    calculating, by the initial set of GPUs, initial gradients and a set of initial adjusted weight data with respect to the weight data and the initial set of multidimensional data via the initial binary tree architecture;
    in response to the calculating the initial gradients and the initial adjusted weight data, identifying, by the host computer processor, a first GPU among the available GPUs to become idle (first currently idle GPU) for a current iteration of deep learning;
    choosing, by the host computer processor, the first currently idle GPU as a current root solver GPU for the current iteration;
    transmitting, by the host computer processor, a current set of multidimensional data to the current root solver GPU;
    in response to the identifying the first currently idle GPU, identifying, by the host computer processor, additional GPUs that are currently idle (additional currently idle GPUs) among the available GPUs;
    transmitting, by the host computer processor, the current set of multidimensional data to the additional currently idle GPUs;
    forming, by the host computer processor, a current set of GPUs into a current binary tree architecture, wherein the current set comprises the additional currently idle GPUs and the current root solver GPU, wherein the current root solver GPU is the root of the current binary tree architecture;
    calculating, by the current set of GPUs, current gradients and a set of current adjusted weight data with respect to at least the weight data and the current set of multidimensional data via the current binary tree architecture;
    in response to the initial root solver GPU receiving a set of calculated initial adjusted weight data, transmitting, by the initial root solver GPU, an initial update to the weight data to the available GPUs;
    in response to the current root solver GPU receiving a set of current initial adjusted weight data, transmitting, by the current root solver GPU, a current update to the weight data to the available GPUs; and
    repeating the identifying, the choosing, the transmitting, the forming, and the calculating with respect to the weight data, updates to the weight data, and subsequent sets of multidimensional data.

2. The method of claim 1 wherein the identifying GPUs that are idle among the available GPUs comprises executing, by the host computer processor, a run command from a central processing unit (CPU) of each of the available GPUs to determine a percentage of the each of the available GPUs being utilized.

3. The method of claim 1 wherein the initializing comprises setting, by the host computer processor, the weight data in a random manner.

4. The method of claim 1 wherein the initializing comprises setting, by the host computer processor, the weight data in accordance with input received from a user.

5. The method of claim 1 wherein the forming the initial set of GPUs into the initial binary tree architecture comprises logically connecting, by the host computer processor, a first GPU among the initially idle GPUs as a leaf node to a second GPU among the initially idle GPUs as a parent node if a fast communication link exists between the first GPU and the second GPU.

6. The method of claim 5 wherein the fast communication link comprises a peer-to-peer connection.

7. The method of claim 1 wherein the calculating the initial gradients and the set of initial adjusted weight data with respect to the weight data and the initial set of multidimensional data via the initial binary tree architecture comprises:
    distributing, by the initial root solver GPU, the weight data to the initially idle GPUs within the initial set of GPUs via the initial binary tree architecture;
    calculating, by each of the initially idle GPUs within the initial set of GPUs, an initial gradient with respect to the initial set of multidimensional data and the weight data;
    transmitting, by each of the initially idle GPUs within the initial set of GPUs, the calculated initial gradient to a corresponding initial parent GPU within the initial set of GPUs via the initial binary tree architecture;
    calculating, by the corresponding initial parent GPU, initial adjusted weight data with respect to the calculated initial gradient; and
    transmitting, by the corresponding initial parent GPU, the calculated initial adjusted weight data to a parent GPU of the corresponding initial parent GPU via the initial binary tree architecture, wherein the parent GPU is within the initial set of GPUs.

8. The method of claim 1 wherein the identifying a first GPU among the available GPUs to become idle comprises executing, by the host computer processor, a run command from a central processing unit (CPU) of each of the available GPUs to determine a percentage of the each of the available GPUs being utilized.

9. The method of claim 1 wherein the identifying additional GPUs that are currently idle among the available GPUs comprises executing, by the host computer processor, a run command from a central processing unit (CPU) of each of the available GPUs to determine a percentage of the each of the available GPUs being utilized.

10. The method of claim 1 wherein the forming the current set of GPUs into the current binary tree architecture comprises logically connecting, by the host computer processor, a first GPU among the additional currently idle GPUs as a leaf node to a second GPU among the additional currently idle GPUs and the current root solver GPU as a parent node if a fast communication link exists between the first GPU and the second GPU.

11. The method of claim 10 wherein the fast communication link comprises a peer-to-peer connection.

12. The method of claim 1 wherein the calculating the current gradients and the set of current adjusted weight data with respect to at least the weight data and the current set of multidimensional data via the current binary tree architecture comprises:
    distributing, by the current root solver GPU, the weight data to the additional currently idle GPUs within the current set of GPUs via the current binary tree architecture;
    calculating, by each of the additional currently idle GPUs within the current set of GPUs, a current gradient with respect to the current set of multidimensional data and the weight data;
    transmitting, by each of the additional currently idle GPUs within the current set of GPUs, the calculated current gradient to a corresponding current parent GPU within the initial set of GPUs via the current binary tree architecture;
    calculating, by the corresponding current parent GPU, current adjusted weight data with respect to the calculated current gradient; and
    transmitting, by the corresponding current parent GPU, the calculated current adjusted weight data to a parent GPU
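The data flow described in claims 7 and 12 (gradients computed at each node, passed to the parent, and combined on the way up to the root solver) can be sketched in plain Python as a tree reduction. This is a simplified illustration rather than the claimed method: summing gradients, letting every node (including the root) contribute a local gradient, and applying a single averaged gradient-descent step at the root are assumptions, and the names `children_of`, `reduce_gradients`, `root_update`, and `lr` are hypothetical. The claims instead have each parent compute adjusted weight data before passing it upward.

```python
from typing import Dict, List, Optional


def children_of(parents: Dict[int, Optional[int]]) -> Dict[int, List[int]]:
    """Invert a child -> parent map into a parent -> children map."""
    kids: Dict[int, List[int]] = {gpu: [] for gpu in parents}
    for gpu, parent in parents.items():
        if parent is not None:
            kids[parent].append(gpu)
    return kids


def reduce_gradients(node: int,
                     kids: Dict[int, List[int]],
                     local_grads: Dict[int, List[float]]) -> List[float]:
    """Sum this node's gradient with the gradients passed up by its subtree."""
    total = list(local_grads[node])
    for child in kids[node]:
        child_total = reduce_gradients(child, kids, local_grads)
        total = [a + b for a, b in zip(total, child_total)]
    return total


def root_update(weights: List[float],
                parents: Dict[int, Optional[int]],
                local_grads: Dict[int, List[float]],
                lr: float = 0.5) -> List[float]:
    """Apply one averaged gradient-descent step at the root solver (assumed rule)."""
    root = next(g for g, p in parents.items() if p is None)
    summed = reduce_gradients(root, children_of(parents), local_grads)
    n = len(parents)
    return [w - lr * g / n for w, g in zip(weights, summed)]


# Illustrative tree: GPU 1 is the root solver; 2 and 4 are its children; 5 is under 2.
parents = {1: None, 2: 1, 4: 1, 5: 2}
# Made-up per-GPU gradients for a two-parameter model.
local_grads = {1: [1.0, 0.0], 2: [1.0, 2.0], 4: [1.0, 2.0], 5: [1.0, 0.0]}
new_weights = root_update([0.5, -0.5], parents, local_grads)
```

In the claimed method the root solver would then transmit the update to all available GPUs; here `new_weights` simply stands in for that update.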

Assignees

  • IBM

Inventors

Classifications

  • Combinations of networks · CPC title

  • G06F9/50 · Primary

    Allocation of resources, e.g. of the central processing unit [CPU] · CPC title

  • Physics · mapped topic

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10169084B2 cover?
The present invention provides a computer implemented method, system, and computer program product of deep learning via dynamic root solvers. In an embodiment, the present invention includes (1) forming an initial set of GPUs into an initial binary tree architecture, where the initial set includes initially idle GPUs and an initial root solver GPU as the root of the initial binary tree architec…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/50. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 1, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).