What technology area does this patent fall under?

Primary CPC classification G06F9/3001. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Oct 10 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Reconfigurable neural network processing based on subgraph recognition

US11782706B1 · US · B1

Patent metadata
Field	Value
Publication number	US-11782706-B1
Application number	US-202117361992-A
Country	US
Kind code	B1
Filing date	Jun 29, 2021
Priority date	Jun 29, 2021
Publication date	Oct 10, 2023
Grant date	Oct 10, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other operators of the computational dataflow graph targeted at the second execution unit.

First claim

Opening claim text (preview).

What is claimed is: 1. A method comprising: receiving input codes of a neural network, the neural network comprising neural network operators; compiling the input codes to generate an input data set representing a first dataflow graph of the neural network, wherein the first dataflow graph includes nodes connected by edges, each node comprising a neural network operator of the neural network operators, each edge between two nodes indicating a data dependency between two neural network operators represented by the two nodes; traversing the first dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the first dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise neural network operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor having a same number of data elements as the single input tensor; determining merged operators for the SESE subgraphs; generating a second dataflow graph based on replacing the SESE subgraphs in the first dataflow graph with the merged operators, wherein each merged operator has the single input tensor of the replaced SESE subgraph as an input and the single output tensor of the replaced SESE subgraph as an output; generating executable instructions of the second dataflow graph to be executed by a neural network hardware accelerator, wherein the executable instructions comprise first executable instructions for each of the merged operators targeted at a first execution unit, and second executable instructions for each of the neural network operators targeted at a second execution unit of the neural network accelerator; generating a schedule of execution of the executable instructions by the neural network hardware accelerator based on data dependencies among the neural network operators and the merged operators; and generating a program based on the executable instructions and the schedule of execution. 2. The method of claim 1 , wherein the merged operator is determined from a SESE-subgraph-to-merged-operator mapping table that maps merged operators to topologies of SESE subgraphs; and wherein each topology lists the element-wise neural network operators included in the SESE subgraph and edge connectivity among the nodes representing the element-wise neural network operators. 3. The method of claim 1 , wherein one of the first executable instructions includes: a first opcode that refers to a first mapping table of a first piecewise polynomial function of merged operator mapping tables of the neural network hardware accelerator, wherein the first piecewise polynomial function approximates a first relationship between an input and an output of a first merged operator; and a first operand as the input to the first merged operator. 4. The method of claim 3 , wherein one of the second executable instructions includes: a second opcode that refers to a set of arithmetic operations to be performed by a hardware computation engine of the neural network hardware accelerator; and a second operand as input to the second execution unit. 5. The method of claim 3 , further comprising: providing the program to the neural network hardware accelerator for execution; at a first time: storing a first mapping between the first opcode and a first SESE-subgraph-to-merged-operator mapping table of the neural network hardware accelerator; and executing the one of the first executable instructions based on the first mapping; and at a second time: storing a second mapping between the first opcode and a second SESE-subgraph-to-merged-operator mapping table of the neural network hardware accelerator; and executing the one of the first executable instructions based on the second mapping. 6. The method of claim 5 , wherein the first execution unit includes mappings representing piecewise polynomial functions; and wherein each piecewise polynomial function of a piecewise polynomial approximates a merged operator of the merged operators. 7. The method of claim 6 , wherein the first execution unit includes merged operator mapping tables, each merged operator mapping table comprising programmable registers that store candidate output values and a multiplexor to select one of the candidate output values based on an input value to implement a mapping representing a respective piecewise polynomial function; and wherein each of the first executable instructions includes a first opcode that identifies the first executable instruction, and a parameter that references one of the mapping tables. 8. The method of claim 7 , wherein the hardware accelerator further comprises a programmable opcode mapping memory that stores mapping between opcodes and the mapping tables. 9. The method of claim 8 , wherein the programmable opcode mapping memory maps the first opcode to a first mapping table representing a first merged operator, and a second opcode to a second mapping table representing a second merged operator; wherein the first merged operator represents a first number of operators; and wherein the second merged operator represents a second number of operators. 10. The method of claim 9 , further comprising: at a first time: storing a first mapping between the first opcode and the first mapping table in the opcode mapping memory; and causing the first execution unit to execute one of the first executable instructions including the first opcode based on the first mapping; and at a second time: storing a second mapping between the first opcode and the second mapping table in the opcode mapping memory; and causing the first execution unit to execute the one of the first executable instructions including the first opcode based on the second mapping. 11. The method of claim 8 , further comprising: generating an opcode for a SESE subgraph; generating candidate output values representing a merged operator; storing the opcode at the programmable opcode mapping memory; and storing the candidate output values at the first execution unit. 12. The method of claim 5 , wherein the second execution unit includes arithmetic circuits to perform arithmetic operations of the other neural network operators; and wherein each of the second executable instructions includes a second opcode that indicates a sequence of arithmetic operations to be performed by the arithmetic circuits. 13. A method comprising: receiving input codes, wherein the input codes represent a computational dataflow graph, wherein the computational dataflow graph includes nodes connected by edges, each node comprising an operator of neural network operators, each edge between two nodes indicating a data dependency between two neural network operators represented by the two nodes; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other op

Assignees

Amazon Tech Inc

Inventors

Classifications

G06F9/30036
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
G06F9/3001Primary
Arithmetic instructions · CPC title
G06F7/548Primary
Trigonometric functions; Co-ordinate transformations · CPC title
G06F8/433
Dependency analysis; Data or control flow analysis · CPC title
G06F8/443
Optimisation · CPC title

Patent family

Related publications grouped by family.

View patent family 88242291

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11782706B1 cover?: In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise op…
Who is the assignee on this patent?: Amazon Tech Inc
What technology area does this patent fall under?: Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Oct 10 2023 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

How to read this patent

Abstract

First claim

Assignees

Inventors

Classifications

Patent family

External sources

Related patents

Graph model build and scoring engine

Restructuring deep neural networks to reduce the number of parameters

Data representation for dynamic precision in neural network cores

Multi-agent instruction execution engine for neural inference processing

Apparatus, methods, and systems for unstructured data flow in a configurable spatial accelerator

Processing computational graphs

Processing data from multiple sources

Frequently asked questions