Static versioning in the polyhedral model
US-2022229641-A1 · Jul 21, 2022 · US
US12106097B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12106097-B2 |
| Application number | US-202217649322-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 28, 2022 |
| Priority date | Feb 19, 2021 |
| Publication date | Oct 1, 2024 |
| Grant date | Oct 1, 2024 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A computer-implemented method of implementing a computation using a hardware-accelerated instruction of a processor system by solving a constraint satisfaction problem. A solution to the constraint satisfaction problem represents a possible invocation of the hardware-accelerated instruction in the computation. The constraint satisfaction problem assigns nodes of a data flow graph of the computation to nodes of a data flow graph of the instruction. The constraint satisfaction problem comprises constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, and constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, with restrictions being imposed by the hardware-accelerated instruction and/or its programming interface.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method of implementing a computation using a hardware-accelerated instruction of a processor system, the method comprising the following steps: obtaining computation data, wherein the computation data defines a computation data flow graph representing the computation, and each node of a plurality of nodes of the computation data flow graph represents an input or an operation of the computation; obtaining instruction data, wherein the instruction data defines an instruction data flow graph representing the hardware-accelerated instruction that is accelerated by a hardware accelerator, and each node of a plurality of nodes of the instruction data flow graph represents an input or an operation of the hardware-accelerated instruction; based on the computation data and the instruction data, defining a constraint satisfaction problem, wherein a solution to the constraint satisfaction problem represents an invocation of the hardware-accelerated instruction in the computation, wherein the constraint satisfaction problem assigns nodes of the computation data flow graph to nodes of the instruction data flow graph, and wherein the constraint satisfaction problem includes at least: one or more constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, one or more constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, wherein the constraints are imposed by the hardware-accelerated instruction and/or a programming interface of the hardware-accelerated instruction; solving the constraint satisfaction problem to determine a solution corresponding to the invocation of the hardware-accelerated instruction in the computation, and outputting data defining the invocation. 2. The method of claim 1 , further comprising: generating machine-executable instructions implementing the computation, wherein the machine-executable instructions invoke the hardware-accelerated instruction according to the determined invocation. 3. The method of claim 1 , wherein the operation of the computation and/or the operation of the hardware-accelerated instruction is a scalar operation. 4. The method of claim 3 , wherein the hardware-accelerated instruction implements one or more of: a matrix multiplication or a convolution or a dot product or a matrix-vector product or a cumulative sum or a pooling or a Hadamard product. 5. The method of claim 4 , wherein the computation includes applying one or more of a convolution, a matrix multiplication, and a pooling operation of a neural network layer. 6. The method of claim 1 , wherein the instruction data flow graph includes nodes for only a subset of outputs of the hardware-accelerated instruction, and the method further comprises inferring from the invocation a mapping of further outputs of the hardware-accelerated instruction to operations of the computation. 7. The method of claim 1 , wherein the computation data includes a polyhedral representation symbolically defining a set of nodes of the computation data flow graph, and wherein the solving of the constraint satisfaction problem includes instantiating the polyhedral representation to obtain a node of the set of nodes and mapping a node of the instruction data flow graph to the obtained node. 8. The method of claim 1 , wherein the method further comprises defining one or more constraints to enforce that inputs of the hardware-accelerated instruction have an allowed memory layout and/or memory access pattern. 9. The method of claim 1 , wherein the method further comprises defining one or more constraints to enforce that pairs of mutually parallelizable operations of the hardware-accelerated instruction are mapped to pairs of mutually parallelizable operations of the computation. 10. The method of claim 1 , wherein a node of the instruction data flow graph represents a commutative reduction operation, and wherein the method further comprises defining one or more constraints to enforce that the node representing the commutative reduction operation is mapped to a corresponding node of the computation data flow graph. 11. The method of claim 1 , wherein the method further comprises defining the constraint satisfaction problem to allow mapping of operations of the hardware-accelerated instruction to dummy operations that do not affect the computation. 12. The method of claim 1 , wherein the method further comprises including in the constraint satisfaction problem one or more constraints enforcing inputs of the hardware-accelerated instruction to form a hyper-rectangle. 13. The method of claim 1 , further comprising: determining a finite number of multiple solutions to the constraint satisfaction problem; evaluating performance of the multiple solutions; and selecting a solution from the multiple solutions based on the performance evaluation. 14. The method of claim 1 , wherein the hardware accelerator is separate from a CPU of the processing system, and wherein the hardware accelerator includes one of a coprocessor, an application-specific instruction set processor (ASIP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or a complex programmable logic device (CPLD). 15. A system for implementing a computation using a hardware-accelerated instruction of a processor system, the system comprising: a data interface configured to access computation data and instruction data, wherein the computation data defines a computation data flow graph representing the computation, wherein each node of a plurality of nodes of the computation data flow graph represents an input or an operation of the computation, and wherein the instruction data defines an instruction data flow graph representing the hardware-accelerated instruction that is accelerated by a hardware accelerator, wherein each node of a plurality of nodes of the instruction data flow graph represents an input or an operation of the hardware-accelerated instruction; a processor subsystem configured to: obtain the computation data and the instruction data, based on the computation data and the instruction data, define a constraint satisfaction problem, wherein a solution to the constraint satisfaction problem represents an invocation of the hardware-accelerated instruction in the computation, wherein the constraint satisfaction problem assigns nodes of the computation data flow graph to nodes of the instruction data flow graph, and wherein the constraint satisfaction problem includes at least: one or more constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, one or more constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, wherein the constraints are imposed by the hardware-accelerated instruction and/or its programming interface; solve the constraint satisfaction problem to determine the invocation of the hardware-accelerated instruction in the computation, and output data defining the invocation. 16. The system of claim 15 , wherein the hardware accelerator is separate from a CPU of the processing system, and wherein the hardware accelerator includes one of a coprocessor, an application-specific instruction set processor (ASIP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or a complex programmable logic device (CP
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions · CPC title
Operand accessing · CPC title
Decoding the operand specifier, e.g. specifier format · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.