Graph model build and scoring engine
US-2021390461-A1 · Dec 16, 2021 · US
US11782706B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11782706-B1 |
| Application number | US-202117361992-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 29, 2021 |
| Priority date | Jun 29, 2021 |
| Publication date | Oct 10, 2023 |
| Grant date | Oct 10, 2023 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
In one example, a method comprises: receiving input codes, wherein the input codes represent a computational dataflow graph; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other operators of the computational dataflow graph targeted at the second execution unit.
Opening claim text (preview).
What is claimed is: 1. A method comprising: receiving input codes of a neural network, the neural network comprising neural network operators; compiling the input codes to generate an input data set representing a first dataflow graph of the neural network, wherein the first dataflow graph includes nodes connected by edges, each node comprising a neural network operator of the neural network operators, each edge between two nodes indicating a data dependency between two neural network operators represented by the two nodes; traversing the first dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the first dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise neural network operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor having a same number of data elements as the single input tensor; determining merged operators for the SESE subgraphs; generating a second dataflow graph based on replacing the SESE subgraphs in the first dataflow graph with the merged operators, wherein each merged operator has the single input tensor of the replaced SESE subgraph as an input and the single output tensor of the replaced SESE subgraph as an output; generating executable instructions of the second dataflow graph to be executed by a neural network hardware accelerator, wherein the executable instructions comprise first executable instructions for each of the merged operators targeted at a first execution unit, and second executable instructions for each of the neural network operators targeted at a second execution unit of the neural network accelerator; generating a schedule of execution of the executable instructions by the neural network hardware accelerator based on data dependencies among the neural network operators and the merged operators; and generating a program based on the executable instructions and the schedule of execution. 2. The method of claim 1 , wherein the merged operator is determined from a SESE-subgraph-to-merged-operator mapping table that maps merged operators to topologies of SESE subgraphs; and wherein each topology lists the element-wise neural network operators included in the SESE subgraph and edge connectivity among the nodes representing the element-wise neural network operators. 3. The method of claim 1 , wherein one of the first executable instructions includes: a first opcode that refers to a first mapping table of a first piecewise polynomial function of merged operator mapping tables of the neural network hardware accelerator, wherein the first piecewise polynomial function approximates a first relationship between an input and an output of a first merged operator; and a first operand as the input to the first merged operator. 4. The method of claim 3 , wherein one of the second executable instructions includes: a second opcode that refers to a set of arithmetic operations to be performed by a hardware computation engine of the neural network hardware accelerator; and a second operand as input to the second execution unit. 5. The method of claim 3 , further comprising: providing the program to the neural network hardware accelerator for execution; at a first time: storing a first mapping between the first opcode and a first SESE-subgraph-to-merged-operator mapping table of the neural network hardware accelerator; and executing the one of the first executable instructions based on the first mapping; and at a second time: storing a second mapping between the first opcode and a second SESE-subgraph-to-merged-operator mapping table of the neural network hardware accelerator; and executing the one of the first executable instructions based on the second mapping. 6. The method of claim 5 , wherein the first execution unit includes mappings representing piecewise polynomial functions; and wherein each piecewise polynomial function of a piecewise polynomial approximates a merged operator of the merged operators. 7. The method of claim 6 , wherein the first execution unit includes merged operator mapping tables, each merged operator mapping table comprising programmable registers that store candidate output values and a multiplexor to select one of the candidate output values based on an input value to implement a mapping representing a respective piecewise polynomial function; and wherein each of the first executable instructions includes a first opcode that identifies the first executable instruction, and a parameter that references one of the mapping tables. 8. The method of claim 7 , wherein the hardware accelerator further comprises a programmable opcode mapping memory that stores mapping between opcodes and the mapping tables. 9. The method of claim 8 , wherein the programmable opcode mapping memory maps the first opcode to a first mapping table representing a first merged operator, and a second opcode to a second mapping table representing a second merged operator; wherein the first merged operator represents a first number of operators; and wherein the second merged operator represents a second number of operators. 10. The method of claim 9 , further comprising: at a first time: storing a first mapping between the first opcode and the first mapping table in the opcode mapping memory; and causing the first execution unit to execute one of the first executable instructions including the first opcode based on the first mapping; and at a second time: storing a second mapping between the first opcode and the second mapping table in the opcode mapping memory; and causing the first execution unit to execute the one of the first executable instructions including the first opcode based on the second mapping. 11. The method of claim 8 , further comprising: generating an opcode for a SESE subgraph; generating candidate output values representing a merged operator; storing the opcode at the programmable opcode mapping memory; and storing the candidate output values at the first execution unit. 12. The method of claim 5 , wherein the second execution unit includes arithmetic circuits to perform arithmetic operations of the other neural network operators; and wherein each of the second executable instructions includes a second opcode that indicates a sequence of arithmetic operations to be performed by the arithmetic circuits. 13. A method comprising: receiving input codes, wherein the input codes represent a computational dataflow graph, wherein the computational dataflow graph includes nodes connected by edges, each node comprising an operator of neural network operators, each edge between two nodes indicating a data dependency between two neural network operators represented by the two nodes; traversing the computational dataflow graph to identify single-entry-single-exit (SESE) subgraphs of the computational dataflow graph, wherein each SESE subgraph has a sequence of nodes comprising a root node and a child node and representing a sequence of element-wise operators, wherein the root node receives a single input tensor, and wherein the child node outputs a single output tensor; determining a merged operator for each SESE subgraph; and generating executable instructions for the computational dataflow graph to be executed by a hardware accelerator having a first execution unit and a second execution unit, wherein the executable instructions comprise first executable instructions for the merged operators targeted at the first execution unit, and second executable instructions for other op
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Arithmetic instructions · CPC title
Trigonometric functions; Co-ordinate transformations · CPC title
Dependency analysis; Data or control flow analysis · CPC title
Optimisation · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.