US2019266015A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2019266015-A1 |
| Application number | US-201815906963-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 27, 2018 |
| Priority date | Feb 27, 2018 |
| Publication date | Aug 29, 2019 |
| Grant date | — |
Official abstract text for this publication.
Systems, methods, and computer-executable instructions for scheduling neural network workloads on an edge device. A performance model for each neural network model is received. Parameters for each neural network workload are determined based on an associated performance model. Processing core assignments are determined from a plurality of processing cores for each neural network workload based on the corresponding performance model and processing core utilization. Image streams are received and associated with a neural network workload. Each neural network workload is scheduled to run on the processing cores based on the processing core assignments.
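The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the `Workload` class, the callable form of the performance model, the latency budget, the candidate batch sizes, and the fixed 0.5 load increment per assigned workload are all assumptions made for illustration. The sketch picks, for each workload, the least-utilized core and the largest batch size whose predicted latency fits the budget.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical performance model: maps (batch_size, core_utilization)
# to a predicted latency in milliseconds. The patent does not fix this form.
PerfModel = Callable[[int, float], float]


@dataclass
class Workload:
    name: str
    perf_model: PerfModel
    latency_budget_ms: float  # assumed per-workload constraint


def choose_batch_size(
    w: Workload, utilization: float, candidates: Sequence[int] = (8, 4, 2, 1)
) -> int:
    """Return the largest candidate batch size predicted to meet the budget."""
    for batch in candidates:
        if w.perf_model(batch, utilization) <= w.latency_budget_ms:
            return batch
    return candidates[-1]  # nothing fits: fall back to the smallest batch


def assign_cores(
    workloads: List[Workload], core_utilization: List[float]
) -> Dict[str, Dict[str, int]]:
    """Greedy assignment: each workload goes to the least-utilized core."""
    util = list(core_utilization)  # copy so the caller's list is untouched
    plan: Dict[str, Dict[str, int]] = {}
    for w in workloads:
        core = min(range(len(util)), key=lambda i: util[i])
        batch = choose_batch_size(w, util[core])
        plan[w.name] = {"core": core, "batch_size": batch}
        util[core] += 0.5  # assumed fixed load added by each workload
    return plan
```

A greedy pass is used here only because it is short; the claims leave the assignment strategy open beyond requiring that it depend on the performance model and on core utilization.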
Opening claim text (preview).
1. A method for scheduling neural network workloads on an edge device, the method comprising operations performed using an electronic processor, the operations comprising: receiving a performance model for each of a plurality of neural network models; determining parameters for each of a plurality of neural network workloads, wherein each of the neural network workloads is associated with one of the plurality of neural network models, and the parameters are determined based on a performance model associated with the neural network model associated with the neural network workload; determining processing core assignments from a plurality of processing cores for each of the neural network workloads based on the corresponding performance model and processing core utilization; receiving, from a plurality of cameras, a plurality of image streams, wherein each image stream is associated with one of the neural network workloads; and scheduling each of the neural network workloads to run on the plurality of processing cores based on the processing core assignments to enable each of the neural network workloads to execute with the determined parameters on the determined parameters.

2. The method of claim 1, further comprising executing each of the neural network workloads with the determined parameters on the determined processing core based on the scheduling and the assigned image stream.

3. The method of claim 1, further comprising modeling the performance model for each of the plurality neural network models.

4. The method of claim 3, wherein the modeling comprises: running each of the plurality of neural network models a plurality of times with varying parameters and varying processing core utilizations to generate training data; and training the performance model based on the generated training data.

5. The method of claim 1, wherein the parameters comprise a batch size, a sampling rate, and precision.

6. The method of claim 1, wherein the scheduling comprises: determining a first neural network workload has an increased core processing need for a first initial period of time based on a configuration of a first neural network model associated with the first neural network workload; determining a second neural network workload has an increased core processing need for a second initial period of time based on a configuration of a second neural network model associated with the second neural network workload; and scheduling the second neural network workload to start execution after the first initial period of time.

7. The method of claim 6, wherein the first and second neural networks are convolutional neural networks.

8. The method of claim 1, wherein a first neural network workload is dependent on a second neural network workload, wherein determining the parameters comprises: detecting, using the second neural network workload, a match in the assigned image stream; and increasing a memory usage parameter of the first neural network workload based on the second neural network detecting the match.

9. The method of claim 8, wherein the first neural network uses a first neural network framework, and wherein the second neural network uses a second, different neural network framework.

10. The method of claim 1, wherein the determining processing core assignments further comprises: determining a plurality of unassigned processing cores, wherein a number of neural network workloads is less than a number of available processing cores; assigning additional processing cores to a neural network workload based on a runtime value from the corresponding performance model.

11. A system for scheduling neural network workloads on an edge device, the system comprising: an allocator configured to: receive a performance model for each of a plurality of neural network models; determine parameters for each of a plurality of neural network workloads, wherein each of the neural network workloads is associated with one of the plurality of neural network models, and the parameters are determined based on a performance model associated with the neural network model associated with the neural network workload, determine processing core assignments from a plurality of processing cores for each of the neural network workloads based on the corresponding performance model and processing core utilization; and receive, from a plurality of cameras, a plurality of image streams, wherein each image stream is associated with one of the neural network workloads; and a scheduler configured to schedule each of the neural network workloads to run on the plurality of processing cores based on the processing core assignments to enable each of the neural network workloads to execute with the determined parameters on the determined parameters; and a plurality of processing cores configured to execute each of the neural networks with the determined parameters on the determined processing core based on the scheduling and the assigned image stream.

12. The system of claim 11, further comprising a profiler configured to model the performance model for each of the plurality neural network models.

13. The system of claim 12, wherein to model the performance model the profiler is further configured to: run each of the plurality of neural network models a plurality of times with varying parameters and varying processing core utilizations to generate training data; and train the performance model based on the generated training data.

14. The system of claim 11, wherein the parameters comprise a batch size, a sampling rate, and precision.

15. The system of claim 11, wherein to schedule the scheduler is further configured to: determine a first neural network workload has an increased core processing need for a first initial period of time based on a configuration of a first neural network model associated with the first neural network workload; determine a second neural network workload has an increased core processing need for a second initial period of time based on a second neural network model associated with the second neural network workload; and schedule the second neural network workload to start execution after the first initial period of time.

16. The system of claim 15, wherein the first and second neural networks are convolutional neural networks.

17. A computer-readable storage media storing computer-executable instructions for scheduling neural network workloads on an edge device, the stored instructions comprising: instructions receive a performance model for each of a plurality of neural network models; instructions to determine parameters for each of a plurality of neural network workloads, wherein each of the neural network workloads is associated with one of the plurality of neural network models, and the parameters are determined based on a performance model associated with the neural network model associated with the neural network workload, instructions to determine processing core assignments from a plurality of processing cores for each of the neural network workloads based on the corresponding performance model and processing core utilization; and instructions to receive, from a plurality of cameras, a plurality of image streams, wherein each image stream is associated with one of the neural network workloads; and instructions to schedule each of the neural network workloads to run on the plurality of processing cores based on the processing core assignments to enable each of the neural network workloads to execute with the de
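The staggered start recited in claims 6 and 15 can be sketched as follows. This is an illustrative reading only: the function name `staggered_start_times`, the representation of each workload as a `(name, initial_period_seconds)` pair, and the choice to chain every workload strictly after the previous one's initial period are assumptions, not details taken from the claims.

```python
from typing import Dict, List, Tuple


def staggered_start_times(
    workloads: List[Tuple[str, float]]
) -> Dict[str, float]:
    """Compute start offsets so that initial high-load periods do not overlap.

    Each entry is (workload_name, initial_period_seconds), where the initial
    period is the assumed span of increased core demand at startup.
    """
    starts: Dict[str, float] = {}
    offset = 0.0
    for name, initial_period_s in workloads:
        starts[name] = offset        # start once earlier warm-ups are done
        offset += initial_period_s   # next workload waits out this period
    return starts
```

For example, `staggered_start_times([("detector", 2.0), ("classifier", 3.0)])` starts the hypothetical `classifier` workload 2.0 seconds after `detector`, which is after the first workload's initial period has elapsed.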
using electronic means · CPC title
Combinations of networks · CPC title
considering hardware capabilities · CPC title
the resource being the memory · CPC title
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title