System-on-chip interface architecture
US-2019303328-A1 · Oct 3, 2019 · US
US11366649B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11366649-B2 |
| Application number | US-202016865852-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 4, 2020 |
| Priority date | Jan 3, 2019 |
| Publication date | Jun 21, 2022 |
| Grant date | Jun 21, 2022 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method for generating a program to run on multiple tiles. The method comprises: receiving an input graph comprising data nodes, compute vertices and edges; receiving an initial tile-mapping specifying which data nodes and vertices are allocated to which tile; and determining a subgraph of the input graph that meets one or more heuristic rules. The rules comprises: the subgraph comprises at least one data node, the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and the subgraph comprises at least a minimum number of edges outputting to one or more vertices on one or more other tiles. The method further comprises adapting the initial mapping to migrate the data nodes and any vertices of the determined subgraph to said one or more other tiles.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method for generating an executable program to run on a processing system having a plurality of tiles, wherein each one of the tiles comprises its own respective processing unit and a memory, the method comprising: receiving an initial tile-mapping based on an input graph, the initial tile-mapping allocating a plurality of data nodes and a plurality of vertices to the plurality of tiles in the processing system, wherein the initial tile-mapping further includes a plurality of directional edges representing inputs and outputs between the plurality of data nodes and the plurality of vertices, wherein the input graph comprises a neural network, and the executable program comprises an algorithm configured to perform machine learning using the neural network; determining a subgraph of the input graph that meets a rule: the subgraph comprises a subset of the data nodes and has at least a first data node of the plurality of data nodes, wherein determining the subgraph comprises performing a search comprising: selecting one of the data nodes as a starting point; and expanding a candidate subgraph from the starting point and terminating expansion of the candidate subgraph upon encountering a node that fails to match the rule or any additional rules; generating an adapted mapping from the initial tile-mapping by migrating the subset of the data nodes to a subset of the tiles, the adapted mapping including a duplication of at least one portion of code of the executable program or a data node of the subset of the data nodes across two or more tiles of the subset of the tiles; and compiling the executable program to run on the subset of the tiles as specified by the adapted mapping. 2. The method of claim 1 , wherein each data node of the plurality of data nodes represents an item selected from a list consisting of: a variable and a constant. 3. The method of claim 1 , wherein a first one of the vertices represents a computation to perform on its respective input to result in its respective output. 4. The method of claim 1 , wherein determining the subgraph further includes the following rules: a) the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and b) the subgraph comprises at least a minimum number of edges outputting to one or more vertices on the subset of the tiles. 5. The method of claim 4 , wherein the threshold number of tiles is one. 6. The method of claim 4 , wherein the minimum number of edges is one. 7. The method of claim 1 , further comprising dividing the vertices among a plurality of compute sets ordered according to an order of execution; and wherein determining the subgraph further includes the following rule: any vertices in the subgraph are in a same compute set. 8. A computer comprising a processor and memory, the memory storing a software tool, the software tool comprising software configured so as when run on the computer, causes the computer to generate an executable program according to a method comprising: receiving an initial tile-mapping based on an input graph, the initial tile-mapping allocating a plurality of data nodes and a plurality of vertices to a plurality of tiles in a processing system, wherein each tile comprises its own respective processor and a memory, wherein the initial tile-mapping further includes a plurality of directional edges representing inputs and outputs between the plurality of data nodes and the plurality of vertices, wherein the input graph comprises a neural network, and the executable program comprises an algorithm configured to perform machine learning using the neural network; determining a subgraph of the input graph that meets a rule specifying that the subgraph comprises a subset of the data nodes and has at least a first data node of the plurality of data nodes, wherein determining the subgraph comprises performing a search comprising: selecting one of the data nodes as a starting point; and expanding a candidate subgraph from the starting point and terminating expansion of the candidate subgraph upon encountering a node that fails to match the rule or any additional rules; generating an adapted mapping from the initial tile-mapping by migrating the subset of the data nodes to a subset of the tiles, the adapted mapping including a duplication of at least one portion of code of the executable program or a data node of the subset of the data nodes across two or more tiles of the subset of the tiles; and compiling the executable program to run on the subset of the tiles as specified by the adapted mapping. 9. The computer of claim 8 , wherein each data node of the plurality of data nodes represents an item selected from a list consisting of: a variable and a constant. 10. The computer of claim 8 , wherein a first one of the vertices represents a computation to perform on its respective input to result in its respective output. 11. The computer of claim 8 , wherein the software causes the computer to determine the subgraph further using the following rules: a) the subgraph spans no more than a threshold number of tiles in the initial tile-mapping, and b) the subgraph comprises at least a minimum number of edges outputting to one or more vertices on the subset of the tiles. 12. The computer of claim 11 , wherein the threshold number of tiles is one. 13. The computer of claim 11 , wherein the minimum number of edges is one. 14. The computer of claim 8 , wherein the software causes the computer to divide the vertices among a plurality of compute sets ordered according to an order of execution; and wherein determining the subgraph further includes the following rule: any vertices in the subgraph are in a same compute set. 15. A non-transitory computer-readable medium having stored thereon computer-executable code which, when executed by a computer, causes the computer to generate an executable program according to a method comprising: receiving an initial tile-mapping based on an input graph, the initial tile-mapping allocating a plurality of data nodes and a plurality of vertices to a plurality of tiles in a processing system, wherein each tile comprises its own respective processor and a memory, wherein the initial tile-mapping further includes a plurality of directional edges representing inputs and outputs between the plurality of data nodes and the plurality of vertices, wherein the input graph comprises a neural network, and the executable program comprises an algorithm configured to perform machine learning using the neural network; determining a subgraph of the input graph that meets a rule specifying that the subgraph comprises a subset of the data nodes and has at least a first data node of the plurality of data nodes, wherein determining the subgraph comprises performing a search comprising: selecting one of the data nodes as a starting point; and expanding a candidate subgraph from the starting point and terminating expansion of the candidate subgraph upon encountering a node that fails to match the rule or any additional rules; generating an adapted mapping from the initial tile-mapping by migrating the subset of the data nodes to a subset of the tiles, the adapted mapping including a duplication of at least one portion of code of the executable program or a data node of the subset of the data nodes across two or more tiles of the subset of the tiles; and compiling the executable program to run on the subset of the tiles as specified by the adapted mapping. 16. The non-transitory computer-readable medium of claim 15 , wherein each data node of th
Synchronisation, e.g. post-wait, barriers, locks (synchronisation among tasks G06F9/52) · CPC title
Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title
Learning methods · CPC title
Reducing the execution time required by the program code · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.