Neural network processing system having host controlled kernel accelerators
US-2019114535-A1 · Apr 18, 2019 · US
US2020301739A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020301739-A1 |
| Application number | US-201916358547-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 19, 2019 |
| Priority date | Mar 19, 2019 |
| Publication date | Sep 24, 2020 |
| Grant date | — |
Official abstract text for this publication.
The present disclosure relates to a method for allocating resources of an accelerator to two or more neural networks for execution. The two or more neural networks may include a first neural network and a second neural network. The method comprises analyzing workloads of the first neural network and the second neural network, wherein the first neural network and second neural network each includes multiple computational layers, evaluating computational resources of the accelerator for executing each computational layer of the first and second neural networks, and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
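As an illustration of the "evaluating" step, the sketch below checks whether two networks can each keep dedicated resources or must share the accelerator. It is a minimal sketch under stated assumptions, not the patent's implementation: the `Layer` type, the `cores` unit, and the function names are hypothetical, and the total demand is assumed to be the sum of per-layer requirements, as if every layer held its own resources in a pipeline.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Layer:
    """One computational layer (hypothetical model, not from the patent)."""
    name: str
    cores: int  # assumed unit of computational resource


def total_demand(network: Sequence[Layer]) -> int:
    # Assumption: each layer of a pipelined network holds its own cores,
    # so the network's demand is the sum over its layers.
    return sum(layer.cores for layer in network)


def needs_sharing(first: Sequence[Layer], second: Sequence[Layer],
                  available_cores: int) -> bool:
    # Sharing is needed when the combined demand of both networks
    # exceeds what the accelerator can offer.
    return total_demand(first) + total_demand(second) > available_cores


first_net = [Layer("a1", 4), Layer("a2", 2)]
second_net = [Layer("b1", 3)]
print(needs_sharing(first_net, second_net, available_cores=8))
```

When the check returns `True`, a scheduler would have to time-share resources between the two networks rather than give each its own allocation.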
Opening claim text (preview).
1. A method comprising: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
2. The method of claim 1, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
3. The method of claim 1, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
4. The method of claim 1, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
5. The method of claim 4, wherein scheduling computational resources of the accelerator is performed when the total amount of computational resources for executing the first and second neural networks is bigger than the total amount of computational resources of the accelerator.
6. The method of claim 1, further comprising: determining a time period that the computational resources assigned for executing the first neural network are not used during execution of the first neural network, wherein the one or more computational layers of the second neural network are executed within the time period.
7. The method of claim 1, wherein the first neural network has a longer pipeline interval than the second neural network.
8. The method of claim 1, wherein the computational resources of the accelerator are scheduled to execute the one computational layer of the first neural network and the one or more computational layers of the second neural network before executing another computational layer subsequent to the one computation layer of the first neural network.
9. An apparatus comprising: a memory storing a set of instructions; and one or more processors configured to execute the set of instructions to cause the apparatus to perform: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
10. The apparatus of claim 9, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
11. The apparatus of claim 9, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
12. The apparatus of claim 9, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
13. The apparatus of claim 9, wherein scheduling computational resources of the accelerator is performed when the total amount of computational resources for executing the first and second neural networks is bigger than the total amount of computational resources of the accelerator.
14. The apparatus of claim 9, wherein the one or more processors are configured to execute the set of instructions to cause the apparatus to further perform: determining a time period that the computational resources assigned for executing the first neural network are not used during execution of the first neural network, wherein the one or more computational layers of the second neural network are executed within the time period.
15. The apparatus of claim 9, wherein the first neural network has a longer pipeline interval than the second neural network.
16. A non-transitory computer readable medium that stores a set of instructions that is executable by at least one processor of a computing device to cause the computing device to perform a method comprising: analyzing workloads of a first neural network and a second neural network, wherein the first neural network and second neural network each includes multiple computational layers; evaluating computational resources of an accelerator for executing each computational layer of the first and second neural networks; and scheduling computational resources of the accelerator to execute one computational layer of the multiple computation layers of the first neural network and to execute one or more computational layers of the multiple computational layers of the second neural network.
17. The computer readable medium of claim 16, wherein the first neural network has a first pipeline interval, wherein an execution time for the one computational layer of the first neural network is shorter than the first pipeline interval, and wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network during a time period corresponding to a difference between the first pipeline interval and the execution time.
18. The computer readable medium of claim 16, wherein scheduling computational resources comprises: scheduling the computational resources of the accelerator to execute the one or more computational layers of the second neural network before executing the one computational layer of the first neural network.
19. The computer readable medium of claim 16, wherein evaluating computation resources of the accelerator further comprises: comparing a total amount of computational resources for executing the first and second neural networks with a total amount of available computational resources of the accelerator.
20. The computer readable medium of
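The idle-time scheduling described in claims 2, 10, and 17 can be sketched as follows. This is an illustrative sketch only: the greedy in-order packing, the function name `fill_slack`, and the abstract time units are assumptions, since the claims state only that layers of the second network run during the difference between the first pipeline interval and the first layer's execution time.

```python
from typing import List, Sequence


def fill_slack(pipeline_interval: int, first_layer_time: int,
               second_layer_times: Sequence[int]) -> List[int]:
    """Return indices of second-network layers that fit in the idle slot.

    The idle slot is the pipeline interval minus the execution time of
    the first network's layer. Layers of the second network are taken
    in order (assumed greedy policy) until the next one would not fit.
    """
    slack = pipeline_interval - first_layer_time
    scheduled: List[int] = []
    used = 0
    for index, duration in enumerate(second_layer_times):
        if used + duration > slack:
            break  # next layer would overrun the idle slot
        scheduled.append(index)
        used += duration
    return scheduled


# Interval of 10 units, first-network layer takes 4, so 6 units are idle.
print(fill_slack(10, 4, [2, 3, 2]))  # → [0, 1]
```

Layers are kept in order because a later layer of the second network presumably depends on the output of the earlier ones; a scheduler free of that constraint could pack the slot differently.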
Combinations of networks · CPC title
Feedforward networks · CPC title
Techniques for rebalancing the load in a distributed system · CPC title
using electronic means · CPC title
considering hardware capabilities · CPC title