Scheduling computation graph heterogeneous computer system

US2020249998A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2020249998-A1
Application numberUS-201916265868-A
CountryUS
Kind codeA1
Filing dateFeb 1, 2019
Priority dateFeb 1, 2019
Publication dateAug 6, 2020
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present disclosure relates to a method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph. The computation graph includes a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes. The method comprises partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes, and generating one or more task allocation models for each subset of the plurality of subsets. Wherein a task allocation model of the one or more task allocation models includes information of an execution order of operations represented by the at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations. The method further comprises determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models, and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph.

First claim

Opening claim text (preview).

1 . A method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the method comprising: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 2 . The method of claim 1 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 3 . The method of claim 1 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 4 . The method of claim 1 , further comprising: replacing a subgraph including at least two nodes among the plurality of nodes included in the computation graph with a single node before partitioning the computation graph. 5 . The method of claim 4 , wherein a target device among the one or more target devices for executing the single node replaced from the subgraph is determined based on a prior execution history. 6 . The method of claim 1 , wherein: determining the optimized task allocation model is performed based on reinforcement learning using a policy network, the policy network receives the task allocation model as an input and outputs an action among possible actions based on probability distribution over the actions, the action corresponding to a change on at least one of the execution order of the operations or the target device for executing one or more of the operations, the policy network is updated according to a reward determined by performance evaluation of the action in runtime environments for executing the computation graph. 7 . The method of claim 6 , wherein the reward is determined based on execution delay or memory usage efficiency. 8 . The method of claim 1 , wherein the task allocation model further includes information of a processing element of the target device for executing each of the operations, and the task allocation model is represented by a sequence of nodes and a sequence of processing elements. 9 . An apparatus for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the apparatus comprising: a memory storing a set of instructions; and one or more processors configured to execute the set of instructions to cause the apparatus to perform: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 10 . The apparatus of claim 9 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 11 . The apparatus of claim 9 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 12 . The apparatus of claim 9 , wherein the one or more processors are configured to execute the set of instructions to cause the apparatus to further perform: replacing a subgraph including at least two nodes among the plurality of nodes included in the computation graph with a single node before partitioning the computation graph. 13 . The apparatus of claim 12 , wherein a target device among the one or more target devices for executing the single node replaced from the subgraph is determined based on a prior execution history. 14 . The apparatus of claim 9 , wherein: determining the optimized task allocation model is performed based on reinforcement learning using a policy network, the policy network receives the task allocation model as an input and outputs an action among possible actions based on probability distribution over the actions, the action corresponding to a change on at least one of the execution order of the operations or the target device for executing one or more of the operations, the policy network is updated according to a reward determined by performance evaluation of the action in runtime environments for executing the computation graph. 15 . The apparatus of claim 14 , wherein the reward is determined based on execution delay or memory usage efficiency. 16 . The apparatus of claim 9 , wherein the task allocation model further includes information of a processing element of the target device for executing each of the operations, and the task allocation model is represented by a sequence of nodes and a sequence of processing elements. 17 . A non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the method comprising: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 18 . The computer readable medium of claim 17 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 19 . The computer readable medium of claim 17 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 20 . The computer readable

Assignees

Inventors

Classifications

  • G06N3/08Primary

    Learning methods · CPC title

  • Activation functions · CPC title

  • Combinations of networks · CPC title

  • Reinforcement learning · CPC title

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020249998A1 cover?
The present disclosure relates to a method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph. The computation graph includes a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes. The method comprises partitioning the computation graph into a plurality of subsets…
Who is the assignee on this patent?
Alibaba Group Holding Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Aug 06 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).