Dynamic composition of data pipeline in accelerator-as-a-service computing environment

US10776164B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10776164-B2
Application number: US-201816206857-A
Country: US
Kind code: B2
Filing date: Nov 30, 2018
Priority date: Nov 30, 2018
Publication date: Sep 15, 2020
Grant date: Sep 15, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques are provided to decouple data pipeline tasks from an execution flow of a high-performance computing task (e.g., distributed deep model training) in a distributed computing system. For example, a method includes receiving a client request to provision resources for executing a computing job, provisioning accelerator resources of one or more accelerator server nodes in the distributed computing system to perform tasks associated with an execution flow of the computing job, and provisioning logical nodes within the distributed computing system to compose a data flow pipeline which is configured to perform data flow operations associated with the computing job for providing data to the provisioned accelerator resources to perform the tasks associated with the execution flow of the computing job. The data flow operations include, e.g., data storage input/output operations, data pre-processing operations, and data staging operations, which are decoupled from the execution flow of the computing job.
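The decoupling described in the abstract can be illustrated with a small single-process sketch. This is not the patented implementation: threads and in-memory queues stand in for the logical storage, pre-processing, and staging nodes, and every name here (`storage_node`, `preprocess_node`, `run_job`, `staging_capacity`, `train_step`) is hypothetical. The point it shows is that data loading and pre-processing run ahead on their own, while the execution flow only reads ready-made data from staging.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the data stream


def storage_node(raw_items, out_q):
    """Hypothetical data storage node: streams raw records (storage I/O)."""
    for item in raw_items:
        out_q.put(item)
    out_q.put(SENTINEL)


def preprocess_node(in_q, staging_q, transform):
    """Hypothetical pre-processing node: transforms raw records and
    writes the results into staging memory (a bounded queue here)."""
    while True:
        item = in_q.get()
        if item is SENTINEL:
            staging_q.put(SENTINEL)
            return
        staging_q.put(transform(item))


def run_job(raw_items, transform, train_step, staging_capacity=4):
    """Execution flow: consumes staged data only. Loading and
    pre-processing run in their own threads, independent of this loop."""
    raw_q = queue.Queue()
    staging_q = queue.Queue(maxsize=staging_capacity)  # staging memory
    workers = [
        threading.Thread(target=storage_node, args=(raw_items, raw_q)),
        threading.Thread(target=preprocess_node,
                         args=(raw_q, staging_q, transform)),
    ]
    for worker in workers:
        worker.start()

    results = []
    while True:
        batch = staging_q.get()  # stands in for a copy from staging memory
        if batch is SENTINEL:
            break
        results.append(train_step(batch))

    for worker in workers:
        worker.join()
    return results
```

For example, `run_job([1, 2, 3], lambda x: x * 2, lambda b: b + 1)` pushes three records through the pipeline in order. The bounded staging queue is a design choice of this sketch: it lets pre-processing run ahead of the consumer without growing memory without limit.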

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving a client request from a client node to provision resources for executing a computing job in a distributed computing system; provisioning a plurality of accelerator resources of one or more server nodes in the distributed computing system to perform tasks associated with an execution flow of the computing job; and provisioning a plurality of logical nodes within the distributed computing system to compose a data flow pipeline which is configured to perform data flow operations associated with the computing job for providing data to the provisioned accelerator resources to perform the tasks associated with the execution flow of the computing job; wherein the data flow operations of the computing job comprise data storage input/output operations, data pre-processing operations, and data staging operations, which are decoupled from the execution flow of the computing job.

2. The method of claim 1, wherein the computing job comprises a distributed deep learning model training job, wherein the execution flow comprises application programming interface (API) calls that are directed to the one or more server nodes from the client node to perform deep learning model training tasks using data provided by the data flow pipeline.

3. The method of claim 1, wherein the logical nodes of the data flow pipeline comprise: at least one data storage node to store raw data associated with the computing job; at least one data pre-processing node comprising processors which execute data pre-processing functions on raw data accessed from the at least one data storage node to generate pre-processed data; and at least one data staging node comprising staging memory to store the pre-processed data generated by the at least one data pre-processing node.

4. The method of claim 3, further comprising performing a memory copy operation by the one or more server nodes to copy the pre-processed data from the staging memory of the at least one data staging node to a memory of one or more accelerator devices, in response to a memory copy API request issued by the client node.

5. The method of claim 3, further comprising the at least one data pre-processing node continually pre-loading and pre-processing raw data accessed from the at least one data storage node independently of the execution flow between the client node and the one or more server nodes.

6. The method of claim 3, wherein the at least one data storage node, the at least one data pre-processing node, and the at least one data staging node reside on different server nodes of the distributed computing system.

7. The method of claim 3, wherein the at least one data pre-processing node and the at least one data staging node reside on a same server node of the distributed computing system.

8. The method of claim 3, wherein the at least one data storage node and the at least one data pre-processing node reside on a same storage node of a data storage system within the distributed computing system.

9. The method of claim 3, wherein the at least one data staging node resides on at least one of the one or more server nodes of the distributed computing system.

10. The method of claim 1, wherein the plurality of logical nodes within the distributed computing system are provisioned to compose the data flow pipeline based on one or more predefined data pipeline composition policies which specify at least one of (i) a number of instances of the logical nodes to be provisioned and (ii) which logical nodes should be co-located on a same physical node.

11. The method of claim 1, further comprising initiating a remote process mode on the client node to bypass local data loading and pre-processing operations by processors on the client node.

12. The method of claim 11, wherein responsive to the remote process mode, the client node: intercepting an API call issued during execution of an application associated with the computing job on a processor of the client node; and sending a coordination message to the one or more server nodes to enable the one or more server nodes to execute a task associated with the intercepted API call based on information contained in the coordination message.

13. The method of claim 1, wherein the distributed computing system comprises one of an accelerator-as-a-service system and a graphics processing unit (GPU)-as-a-service system.

14. An article of manufacture comprising a non-transitory processor-readable storage medium having stored program code of one or more software programs, wherein the program code is executable by one or more processors to implement method steps comprising: receiving a client request from a client node to provision resources for executing a computing job in a distributed computing system; provisioning a plurality of accelerator resources of one or more server nodes in the distributed computing system to perform tasks associated with an execution flow of the computing job; and provisioning a plurality of logical nodes within the distributed computing system to compose a data flow pipeline which is configured to perform data flow operations associated with the computing job for providing data to the provisioned accelerator resources to perform the tasks associated with the execution flow of the computing job; wherein the data flow operations of the computing job comprise data storage input/output operations, data pre-processing operations, and data staging operations, which are decoupled from the execution flow of the computing job.

15. The article of manufacture of claim 14, wherein the computing job comprises a distributed deep learning model training job, wherein the execution flow comprises application programming interface (API) calls that are directed to the one or more server nodes from the client node to perform deep learning model training tasks using data provided by the data flow pipeline.

16. The article of manufacture of claim 14, wherein the logical nodes of the data flow pipeline comprise: at least one data storage node to store raw data associated with the computing job; at least one data pre-processing node comprising processors which execute data pre-processing functions on raw data accessed from the at least one data storage node to generate pre-processed data; and at least one data staging node comprising staging memory to store the pre-processed data generated by the at least one data pre-processing node.

17. The article of manufacture of claim 14, wherein the plurality of logical nodes within the distributed computing system are provisioned to compose the data flow pipeline based on one or more predefined data pipeline composition policies which specify at least one of (i) a number of instances of the logical nodes to be provisioned and (ii) which logical nodes should be co-located on a same physical node.

18. A distributed computing system, comprising: a server cluster comprising a plurality of server nodes, wherein the plurality of server nodes comprise accelerator resources; a control server node comprising a memory to store program instructions, and a processor to execute the stored program instructions to cause the control server node to perform a process which comprises: receiving a client request from a client node to provision resources for executing a computing job in the distributed computing system; provisioning a plurality of accelerator resources of one or more server nodes of the plurality of server nodes in the distributed computing system to perform tasks associated with an ex
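Claims 10 and 17 describe composition policies that set how many instances of each logical node to provision and which logical nodes share a physical node, and claim 7 gives one such arrangement (pre-processing and staging on the same server node). A minimal sketch of how a policy of that kind might be turned into a placement follows. The `CompositionPolicy` class, the `compose_pipeline` function, the role names, and the round-robin assignment are all assumptions made for illustration; the claims do not specify a placement algorithm.

```python
from dataclasses import dataclass, field


@dataclass
class CompositionPolicy:
    """Hypothetical policy: instance counts per logical-node role, plus
    groups of roles that should be co-located on one physical node."""
    instances: dict                      # e.g. {"storage": 1, "staging": 2}
    colocate: list = field(default_factory=list)  # e.g. [("preprocess", "staging")]


def compose_pipeline(policy, physical_nodes):
    """Map logical nodes ("role-index") to physical nodes.

    Assumption of this sketch: instance i of every role in a co-location
    group lands on the same host, and hosts are picked round-robin.
    """
    if not physical_nodes:
        raise ValueError("at least one physical node is required")

    # Roles outside any co-location group are placed as singleton groups.
    grouped = {role for group in policy.colocate for role in group}
    groups = [tuple(group) for group in policy.colocate]
    groups += [(role,) for role in policy.instances if role not in grouped]

    placement = {}
    slot = 0
    for group in groups:
        count = max(policy.instances.get(role, 0) for role in group)
        for i in range(count):
            host = physical_nodes[slot % len(physical_nodes)]
            slot += 1
            for role in group:
                if i < policy.instances.get(role, 0):
                    placement[f"{role}-{i}"] = host
    return placement
```

With `instances={"storage": 1, "preprocess": 2, "staging": 2}`, `colocate=[("preprocess", "staging")]`, and three physical nodes, each pre-processing instance shares a host with the staging instance of the same index, and the storage node gets a host of its own.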

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10776164B2 cover?
Techniques are provided to decouple data pipeline tasks from an execution flow of a high-performance computing task (e.g., distributed deep model training) in a distributed computing system. For example, a method includes receiving a client request to provision resources for executing a computing job, provisioning accelerator resources of one or more accelerator server nodes in the distributed …
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/5061. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 15, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).