What technology area does this patent fall under?

Primary CPC classification G06N3/10. Mapped technology areas include Physics.

When was this patent published?

Publication date: Oct 11, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Compiling models for dedicated hardware

US11468338B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11468338-B2
Application number	US-201916262809-A
Country	US
Kind code	B2
Filing date	Jan 30, 2019
Priority date	Sep 11, 2018
Publication date	Oct 11, 2022
Grant date	Oct 11, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The subject technology provides receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations and some of the operations being executable on multiple processors of the target platform. The subject technology further sorts the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors. The subject technology determines, based at least in part on a cost of transferring the operations between the multiple processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations. Further, for each layer of the NN model, the subject technology includes an annotation to indicate the processor assigned for each of the operations.
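
The objective described in the abstract can be written compactly. This is only one possible reading, and the symbols are assumptions introduced here rather than notation from the patent: $p_i$ is the processor assigned to the $i$-th sorted operation, $P_i$ is the set of processors on which that operation is executable, $c_i(p)$ is a hypothetical cost of executing operation $i$ on processor $p$, and $t(p, q)$ is a hypothetical cost of transferring from processor $p$ to processor $q$ between consecutive operations.

```latex
\min_{p_1 \in P_1, \dots, p_n \in P_n} \;
  \sum_{i=1}^{n} c_i(p_i) \;+\; \sum_{i=2}^{n} t(p_{i-1}, p_i)
```

Under this reading, keeping consecutive operations on the same processor avoids the transfer term, which is presumably why the operations are first grouped by the processor that can execute them.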

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, at least some of the operations being executable on multiple processors of the target platform, the multiple processors comprising at least a CPU, a GPU, and a neural processor, wherein the CPU, the GPU, and the neural processor each have different computational specifications or capabilities; sorting the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors; determining, based at least in part on a cost of transferring the operations between the multiple processors and a cost of performing the operations at the respective processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations; and for each layer of the NN model, including an annotation to indicate the processor assigned for each of the operations.

2. The method of claim 1, wherein determining, based at least in part on the cost of transferring the operations between the multiple processors, the assignment of one of the multiple processors for each of the sorted operations of each of the layers further comprises: generating a graph with operations sorted by an order of execution based on the sorted operations from the multiple layers; determining a path through nodes of the graph with an overall smallest cost to execute the operations from the multiple layers of the NN; and determining the assignment of one of the multiple processors for each of the sorted operations of each of the layers based at least in part on the determined path through the nodes of the graph.

3. The method of claim 2, wherein each node in the graph represents a cost of an operation, from a particular layer, performed on a respective processor from the multiple processors of the target platform on which the operation is executable, and each edge in the graph represents a cost of transferring the operation from a first processor at a first layer to a second processor at a second layer of the NN.

4. The method of claim 1, wherein the cost of transferring the operations comprises an amount of latency for transferring the operations between the multiple processors.

5. The method of claim 2, wherein determining the path through nodes of the graph comprises determining a shortest path based on the overall smallest cost for traversing through each node of the graph, the shortest path corresponding to performing each operation in the multiple layers of the NN model.

6. The method of claim 1, wherein the neural processor is specifically configured to perform operations related to neural network models.

7. The method of claim 6, wherein the neural processor utilizes a lower amount of power when performing the operations when compared to the CPU or the GPU performing the operations.

8. The method of claim 1, wherein the target platform comprises a mobile electronic device, and the mobile electronic device executes the NN model based at least in part on the annotation to indicate the processor assigned for each of the operations.

9. A system comprising; a processor; a memory device containing instructions, which when executed by the processor cause the processor to: receive a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, at least some of the operations being executable on multiple processors of the target platform, the multiple processors comprising at least a CPU, a GPU, and a neural processor, wherein the CPU, the GPU, and the neural processor each have different computational specifications or capabilities; sort the operations from the multiple layers in a particular order based at least in part on grouping the operations that are executable by a particular processor of the multiple processors; determine, based at least in part on a cost of transferring the operations between the multiple processors and a cost of performing the operations at the respective processors, an assignment of one of the multiple processors for each of the sorted operations of each of the layers in a manner that minimizes a total cost of executing the operations; and for each layer of the NN model, include an annotation to indicate the processor assigned for each of the operations.

10. The system of claim 9, wherein to determine, based at least in part on the cost of transferring the operations between the multiple processors, the assignment of one of the multiple processors for each of the sorted operations of each of the layers further causes the processor to: generate a graph with operations sorted by an order of execution based on the sorted operations from the multiple layers; determine a path through nodes of the graph with an overall smallest cost to execute the operations from the multiple layers of the NN; and determine the assignment of one of the multiple processors for each of the sorted operations of each of the layers based at least in part on the determined path through the nodes of the graph.

11. The system of claim 10, wherein each node in the graph represents a cost of an operation, from a particular layer, performed on a respective processor from the multiple processors of the target platform on which the operation is executable, and each edge in the graph represents a cost of transferring the operation from a first processor at a first layer to a second processor at a second layer of the NN.

12. The system of claim 10, wherein to determine the path through nodes of the graph with the overall smallest cost to execute the operations further causes the processor to: determine an amount of latency for transferring the operation performed on the respective processor to another processor.

13. The system of claim 10, wherein to determine the path through nodes of the graph comprises determining a shortest path based on the overall smallest cost for traversing through each node of the graph, the shortest path corresponding to performing each operation in the multiple layers of the NN model.

14. The system of claim 9, wherein the neural processor is configured to perform operations related to neural network models.

15. The system of claim 14, wherein the neural processor utilizes a lower amount of power when performing the operations when compared to the CPU or the GPU performing the operations.

16. The system of claim 9, wherein the target platform comprises a mobile electronic device, and the mobile electronic device executes the NN model based at least in part on the annotation to indicate the processor assigned for each of the operations.

17. A non-transitory computer-readable medium comprising instructions, which when executed by a computing device, cause the computing device to perform operations comprising: receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations, at least some of the operations being executable on multiple processors of the target platform, the multiple processors comprising at least a CPU, a GPU, and a neural processor, wherein the CPU, the GPU, and the neural processor each have different computational specifications or capabilities; sorting the operations from the multiple layers in a particular order based at least in part on gro
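
The graph described in claims 2, 3, and 5 (nodes carry the cost of running an operation on a given processor, edges carry the cost of moving between processors, and the assignment follows the cheapest path) can be illustrated with a short dynamic-programming sketch. This is not the patent's implementation. The function name `assign_processors`, the operation names, the `"npu"` label for the neural processor, and every cost value are hypothetical, and the sketch assumes a simple chain of already-sorted operations in which staying on the same processor costs nothing.

```python
from typing import Dict, List, Tuple

Processor = str  # hypothetical labels: "cpu", "gpu", "npu"


def assign_processors(
    exec_cost: List[Dict[Processor, float]],
    transfer_cost: Dict[Tuple[Processor, Processor], float],
) -> Tuple[float, List[Processor]]:
    """Return (total cost, processor per operation) for the cheapest path.

    exec_cost[i] maps each processor able to run operation i to the cost of
    running it there (a graph node). transfer_cost[(a, b)] is the cost of
    moving from processor a to processor b (a graph edge).
    """
    if not exec_cost:
        return 0.0, []

    def hop(a: Processor, b: Processor) -> float:
        # Assumption: no transfer cost when the processor does not change.
        return 0.0 if a == b else transfer_cost[(a, b)]

    # best[p]: cheapest cost of any path ending with the current operation on p.
    best = dict(exec_cost[0])
    back: List[Dict[Processor, Processor]] = []  # predecessor tables

    for costs in exec_cost[1:]:
        nxt: Dict[Processor, float] = {}
        prev_of: Dict[Processor, Processor] = {}
        for p, c in costs.items():
            q = min(best, key=lambda cand: best[cand] + hop(cand, p))
            nxt[p] = best[q] + hop(q, p) + c
            prev_of[p] = q
        best = nxt
        back.append(prev_of)

    # Recover the path by walking the predecessor tables backwards.
    last = min(best, key=lambda cand: best[cand])
    total = best[last]
    path = [last]
    for prev_of in reversed(back):
        path.append(prev_of[path[-1]])
    path.reverse()
    return total, path


# Made-up example: four operations, one of which only the CPU can run.
op_names = ["conv", "relu", "custom_op", "softmax"]
exec_cost = [
    {"cpu": 8.0, "gpu": 3.0, "npu": 1.0},
    {"cpu": 2.0, "gpu": 1.0, "npu": 0.5},
    {"cpu": 4.0},
    {"cpu": 3.0, "gpu": 1.5, "npu": 0.5},
]
procs = ["cpu", "gpu", "npu"]
transfer_cost = {(a, b): 2.0 for a in procs for b in procs if a != b}

total, path = assign_processors(exec_cost, transfer_cost)

# Annotate each operation with its assigned processor.
annotations = [
    {"operation": name, "processor": proc} for name, proc in zip(op_names, path)
]
print(total, path)  # 10.0 ['npu', 'npu', 'cpu', 'npu']
```

In this made-up example the cheapest path leaves the neural processor only for the CPU-only operation and then returns, because the two transfers cost less than running the final operation on the CPU.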

Assignees

Apple Inc

Inventors

Classifications

G06N3/0499
Feedforward networks · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/10 (Primary)
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title
G06N3/063 (Primary)
using electronic means · CPC title

Patent family

Related publications grouped by family.

View patent family 69719947

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11468338B2 cover?: The subject technology provides receiving a neural network (NN) model to be executed on a target platform, the NN model including multiple layers that include operations and some of the operations being executable on multiple processors of the target platform. The subject technology further sorts the operations from the multiple layers in a particular order based at least in part on grouping th…
Who is the assignee on this patent?: Apple Inc

Related patents

Quantization for dnn accelerators

Machine learning accelerator mechanism

On-chip computational network

Multi-memory on-chip computational network

Neural network processing system having host controlled kernel acclerators

Static block scheduling in massively parallel software defined hardware systems

Neural network system and operating method of neural network system