Deep Reinforcement Learning for Workflow Optimization
US-2019325304-A1 · Oct 24, 2019 · US
US2020249998A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020249998-A1 |
| Application number | US-201916265868-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 1, 2019 |
| Priority date | Feb 1, 2019 |
| Publication date | Aug 6, 2020 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The present disclosure relates to a method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph. The computation graph includes a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes. The method comprises partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes, and generating one or more task allocation models for each subset of the plurality of subsets. Wherein a task allocation model of the one or more task allocation models includes information of an execution order of operations represented by the at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations. The method further comprises determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models, and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph.
Opening claim text (preview).
1 . A method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the method comprising: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 2 . The method of claim 1 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 3 . The method of claim 1 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 4 . The method of claim 1 , further comprising: replacing a subgraph including at least two nodes among the plurality of nodes included in the computation graph with a single node before partitioning the computation graph. 5 . The method of claim 4 , wherein a target device among the one or more target devices for executing the single node replaced from the subgraph is determined based on a prior execution history. 6 . The method of claim 1 , wherein: determining the optimized task allocation model is performed based on reinforcement learning using a policy network, the policy network receives the task allocation model as an input and outputs an action among possible actions based on probability distribution over the actions, the action corresponding to a change on at least one of the execution order of the operations or the target device for executing one or more of the operations, the policy network is updated according to a reward determined by performance evaluation of the action in runtime environments for executing the computation graph. 7 . The method of claim 6 , wherein the reward is determined based on execution delay or memory usage efficiency. 8 . The method of claim 1 , wherein the task allocation model further includes information of a processing element of the target device for executing each of the operations, and the task allocation model is represented by a sequence of nodes and a sequence of processing elements. 9 . An apparatus for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the apparatus comprising: a memory storing a set of instructions; and one or more processors configured to execute the set of instructions to cause the apparatus to perform: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 10 . The apparatus of claim 9 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 11 . The apparatus of claim 9 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 12 . The apparatus of claim 9 , wherein the one or more processors are configured to execute the set of instructions to cause the apparatus to further perform: replacing a subgraph including at least two nodes among the plurality of nodes included in the computation graph with a single node before partitioning the computation graph. 13 . The apparatus of claim 12 , wherein a target device among the one or more target devices for executing the single node replaced from the subgraph is determined based on a prior execution history. 14 . The apparatus of claim 9 , wherein: determining the optimized task allocation model is performed based on reinforcement learning using a policy network, the policy network receives the task allocation model as an input and outputs an action among possible actions based on probability distribution over the actions, the action corresponding to a change on at least one of the execution order of the operations or the target device for executing one or more of the operations, the policy network is updated according to a reward determined by performance evaluation of the action in runtime environments for executing the computation graph. 15 . The apparatus of claim 14 , wherein the reward is determined based on execution delay or memory usage efficiency. 16 . The apparatus of claim 9 , wherein the task allocation model further includes information of a processing element of the target device for executing each of the operations, and the task allocation model is represented by a sequence of nodes and a sequence of processing elements. 17 . A non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method for scheduling a computation graph on a heterogeneous computing resource including one or more target devices for executing the computation graph, the computation graph including a plurality of nodes and edges, each edge connecting two nodes among the plurality of nodes, the method comprising: partitioning the computation graph into a plurality of subsets, each subset includes at least two nodes; generating one or more task allocation models for each subset of the plurality of subsets, wherein a task allocation model includes information of an execution order of operations represented by at least two nodes of the corresponding subset and of a target device of the one or more target devices for executing each of the operations; determining an optimized task allocation model for each of the plurality of subsets based on the generated one or more task allocation models; and combining each determined optimized task allocation model for each of the plurality of subsets into a combined model corresponding to the computation graph. 18 . The computer readable medium of claim 17 , wherein the task allocation model is represented by a sequence of nodes and a sequence of target devices. 19 . The computer readable medium of claim 17 , wherein partitioning the computation graph is performed by cutting a single edge connecting two subsets of the plurality of the subsets. 20 . The computer readable
Learning methods · CPC title
Activation functions · CPC title
Combinations of networks · CPC title
Reinforcement learning · CPC title
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.