Compiling models for dedicated hardware

US12020168B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12020168-B2
Application number: US-201916262807-A
Country: US
Kind code: B2
Filing date: Jan 30, 2019
Priority date: Sep 11, 2018
Publication date: Jun 25, 2024
Grant date: Jun 25, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject technology runs a compiled neural network (NN) model on a particular processor with multiple priority queues for executing different processes, the compiled NN model being assigned to a particular priority queue, and the compiled NN model includes context switch instructions that were previously inserted into a neural network (NN) model from which the compiled NN model was compiled. The subject technology determines that a particular context switch instruction has been executed by the particular processor. The subject technology determines that a different process is waiting to be executed, the different process being assigned to a different priority queue and the different process being a higher priority process than the running compiled NN model. In response to executing the particular context switch instruction, the subject technology performs a context switch to the different process assigned to the different priority queue when the different process is waiting to be executed.
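The runtime behavior summarized in the abstract can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the patented implementation: the `Process` and `Processor` classes, the `CTX_SWITCH` marker, the tick-based `arrivals` map, and the convention that a smaller number means higher priority are all hypothetical names and choices made for illustration. The running model only yields when it executes an inserted switch instruction and a higher-priority process is waiting.

```python
from collections import deque
from dataclasses import dataclass

CTX_SWITCH = "ctx_switch"  # hypothetical marker for a compiler-inserted switch point


@dataclass
class Process:
    name: str
    priority: int  # assumption: 0 is the highest priority level
    ops: list
    pc: int = 0    # index of the next operation to execute


class Processor:
    """Simulated processor with one FIFO queue per priority level."""

    def __init__(self, levels):
        self.queues = [deque() for _ in range(levels)]
        self.trace = []

    def submit(self, proc):
        # A process gets its priority by virtue of the queue it is placed in.
        self.queues[proc.priority].append(proc)

    def _pop_highest(self, above=None):
        # Pop the highest-priority waiting process. If `above` is given,
        # only consider levels strictly higher in priority than `above`.
        limit = len(self.queues) if above is None else above
        for level in range(limit):
            if self.queues[level]:
                return self.queues[level].popleft()
        return None

    def run(self, arrivals=None):
        arrivals = dict(arrivals or {})  # tick -> Process arriving at that tick
        tick = 0
        current = self._pop_highest()
        while current is not None:
            if tick in arrivals:
                self.submit(arrivals.pop(tick))
            if current.pc == len(current.ops):
                # Finished: pick the next process, which resumes a preempted
                # one or falls through to a lower-priority queue.
                current = self._pop_highest()
                continue
            op = current.ops[current.pc]
            current.pc += 1
            tick += 1
            if op == CTX_SWITCH:
                waiting = self._pop_highest(above=current.priority)
                if waiting is not None:
                    # Park the current process at the front of its own queue.
                    self.queues[current.priority].appendleft(current)
                    current = waiting
            else:
                self.trace.append((current.name, op))
        return self.trace


model = Process("model", 1, ["conv1", CTX_SWITCH, "conv2", CTX_SWITCH, "softmax"])
urgent = Process("urgent", 0, ["u1", "u2"])
background = Process("background", 2, ["b1"])

cpu = Processor(levels=3)
cpu.submit(model)
cpu.submit(background)
trace = cpu.run(arrivals={2: urgent})
for entry in trace:
    print(entry)
```

In this run the `urgent` process arrives while `conv2` is about to execute, but it is only scheduled once the model reaches its next switch point. The model then resumes with `softmax`, and the lower-priority `background` process runs last.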

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: running a compiled neural network (NN) model on a particular processor, the particular processor having multiple priority queues for executing different processes and the compiled NN model being assigned to a particular priority queue from the multiple priority queues, wherein the compiled NN model includes one or more context switch instructions that were previously inserted, by a compiler, into the compiled NN model based at least in part on a latency parameter value included in source code of a neural network (NN) model from which the compiled NN model was compiled and includes one or more annotations that were previously inserted, by the compiler, into the compiled NN model and indicate a subset of a plurality of operations of the compiled NN model should be performed by the particular processor, the subset of the plurality of operations including the one or more context switch instructions, wherein each respective priority queue is associated with a respective priority level and a process is assigned the associated respective priority level by virtue of the process being included in the respective priority queue; determining that a particular context switch instruction, in the running compiled NN model, has been executed by the particular processor; determining that a different process is waiting to be executed, the different process being assigned to a different priority queue of the particular processor and the different process being a higher priority process than the running compiled NN model; and in response to executing the particular context switch instruction, performing a context switch to the different process assigned to the different priority queue of the particular processor when the different process is waiting to be executed.

2. The method of claim 1, further comprising: running the different process assigned to the different priority queue; determining that the different process has completed execution; and in response to determining that the different process has completed, resuming execution of the compiled NN model.

3. The method of claim 2, wherein the different process corresponds to a different compiled model than the compiled NN model.

4. The method of claim 2, further comprising: determining that the compiled NN model has completed execution; determining that a second different process assigned to a second different priority queue of the particular processor is a lower priority process; and running the second different process.

5. The method of claim 4, wherein the second different process comprises a different NN compiled model than the compiled NN model.

6. The method of claim 1, wherein the latency parameter value indicates a period of latency in which a particular operation can be delayed before continuing to execute.

7. The method of claim 1, wherein the particular processor is a neural processor.

8. The method of claim 7, wherein the compiled NN model was compiled by a compiler running locally on an electronic device including the neural processor.

9. The method of claim 8, wherein the compiled NN model was loaded from a cache provided by the electronic device, the cache storing different compiled NN models.

10. A system comprising; a processor; a memory device containing instructions, which when executed by the processor cause the processor to: run a compiled neural network (NN) model on a particular processor, the particular processor having multiple priority queues for executing different processes and the compiled NN model being assigned to a particular priority queue from the multiple priority queues, wherein the compiled NN model includes one or more context switch instructions that were previously inserted into the compiled NN model based at least in part on a latency parameter value included in source code of a neural network (NN) model from which the compiled NN model was compiled and includes one or more annotations that were previously inserted, by the compiler, into the compiled NN model and indicate a subset of a plurality of operations of the compiled NN model should be performed by the particular processor, the subset of the plurality of operations including the one or more context switch instructions; determine that a particular context switch instruction, in the running compiled NN model, has been executed by the particular processor; determine that a different process is waiting to be executed, the different process being assigned to a different priority queue of the particular processor and the different process being a different priority process than the running compiled NN model; and in response to executing the particular context switch instruction, perform a context switch to the different process assigned to the different priority queue of the particular processor when the different process is waiting to be executed.

11. The system of claim 10, wherein the memory device contains further instructions, which when executed by the processor, further cause the processor to: run the different process assigned to the different priority queue; determine that the different process has completed execution; and in response to determining that the different process has completed, resume execution of the compiled NN model.

12. The system of claim 11, wherein the different process corresponds to a different compiled model than the compiled NN model.

13. The system of claim 11, wherein the memory device contains further instructions, which when executed by the processor, further cause the processor to: determine that the compiled NN model has completed execution; determine that a second different process assigned to a second different priority queue of the particular processor is a lower priority process; and run the second different process.

14. The system of claim 13, wherein the second different process comprises a different NN compiled model than the compiled NN model.

15. The system of claim 10, wherein the one or more context switch instructions were previously inserted based at least in part on a parameter indicating a period of latency in which an operation can be delayed before continuing to execute.

16. The system of claim 10, wherein the different priority process comprises a higher priority process.

17. The system of claim 16, wherein the compiled NN model was compiled by a compiler running locally on an electronic device including a neural processor.

18. The system of claim 17, wherein the compiled NN model was loaded from a cache provided by the electronic device, the cache storing different compiled NN models.

19. A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising: running a compiled neural network (NN) model on a particular processor, the particular processor having multiple priority queues for executing different processes and the compiled NN model being assigned to a particular priority queue from the multiple priority queues, wherein the compiled NN model includes one or more context switch instructions that were previously inserted into the compiled NN model based at least in part on a latency parameter value included in source code of a neural network (NN) model from which the compiled NN model was compiled and includes one or more annotations that were previously inserted, by the compiler, into the compiled NN model and indicate a subset of a plurality of operations of the
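Claims 1 and 6 describe a compiler that inserts context switch instructions based on a latency parameter value, where the value indicates how long an operation can be delayed. The claim preview does not say how the compiler maps that value to insertion points, so the following is only one plausible reading, offered as a sketch: it assumes each operation carries an estimated cost (a hypothetical input) and inserts a switch point whenever the next operation would push the time since the last switch point past the latency budget. The function name `insert_switch_points` and the `CTX_SWITCH` marker are illustrative, not taken from the patent.

```python
CTX_SWITCH = "ctx_switch"  # hypothetical marker for an inserted switch instruction


def insert_switch_points(ops, latency_budget):
    """Return op names with switch markers inserted.

    ops: list of (name, estimated_cost) pairs; the costs are an assumption
         of this sketch, in the same units as latency_budget.
    latency_budget: longest stretch a waiting process should be delayed.
    """
    if latency_budget <= 0:
        raise ValueError("latency_budget must be positive")
    compiled = []
    elapsed = 0  # estimated cost accumulated since the last switch point
    for name, cost in ops:
        # Insert a switch point before an op that would exceed the budget,
        # but never at the very start or directly after another switch point.
        if elapsed > 0 and elapsed + cost > latency_budget:
            compiled.append(CTX_SWITCH)
            elapsed = 0
        compiled.append(name)
        elapsed += cost
    return compiled


ops = [("conv1", 3), ("relu1", 1), ("conv2", 4), ("pool", 1), ("fc", 2)]
compiled = insert_switch_points(ops, latency_budget=5)
print(compiled)
```

A single operation whose cost exceeds the budget is left intact in this sketch, since a switch point can only fall between operations.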

Assignees

Apple Inc

Inventors

Classifications

  • considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • Saving or restoring of program or task context · CPC title

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • G06N3/10 (Primary)

    Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title

  • G06N3/063 (Primary)

    using electronic means · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12020168B2 cover?
The subject technology runs a compiled neural network (NN) model on a particular processor with multiple priority queues for executing different processes, the compiled NN model being assigned to a particular priority queue, and the compiled NN model includes context switch instructions that were previously inserted into a neural network (NN) model from which the compiled NN model was compiled.…
Who is the assignee on this patent?
Apple Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/10. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 25, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).