Using hardware-accelerated instructions

US12106097B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-12106097-B2
Application numberUS-202217649322-A
CountryUS
Kind codeB2
Filing dateJan 28, 2022
Priority dateFeb 19, 2021
Publication dateOct 1, 2024
Grant dateOct 1, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method of implementing a computation using a hardware-accelerated instruction of a processor system by solving a constraint satisfaction problem. A solution to the constraint satisfaction problem represents a possible invocation of the hardware-accelerated instruction in the computation. The constraint satisfaction problem assigns nodes of a data flow graph of the computation to nodes of a data flow graph of the instruction. The constraint satisfaction problem comprises constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, and constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, with restrictions being imposed by the hardware-accelerated instruction and/or its programming interface.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method of implementing a computation using a hardware-accelerated instruction of a processor system, the method comprising the following steps: obtaining computation data, wherein the computation data defines a computation data flow graph representing the computation, and each node of a plurality of nodes of the computation data flow graph represents an input or an operation of the computation; obtaining instruction data, wherein the instruction data defines an instruction data flow graph representing the hardware-accelerated instruction that is accelerated by a hardware accelerator, and each node of a plurality of nodes of the instruction data flow graph represents an input or an operation of the hardware-accelerated instruction; based on the computation data and the instruction data, defining a constraint satisfaction problem, wherein a solution to the constraint satisfaction problem represents an invocation of the hardware-accelerated instruction in the computation, wherein the constraint satisfaction problem assigns nodes of the computation data flow graph to nodes of the instruction data flow graph, and wherein the constraint satisfaction problem includes at least: one or more constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, one or more constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, wherein the constraints are imposed by the hardware-accelerated instruction and/or a programming interface of the hardware-accelerated instruction; solving the constraint satisfaction problem to determine a solution corresponding to the invocation of the hardware-accelerated instruction in the computation, and outputting data defining the invocation. 2. The method of claim 1 , further comprising: generating machine-executable instructions implementing the computation, wherein the machine-executable instructions invoke the hardware-accelerated instruction according to the determined invocation. 3. The method of claim 1 , wherein the operation of the computation and/or the operation of the hardware-accelerated instruction is a scalar operation. 4. The method of claim 3 , wherein the hardware-accelerated instruction implements one or more of: a matrix multiplication or a convolution or a dot product or a matrix-vector product or a cumulative sum or a pooling or a Hadamard product. 5. The method of claim 4 , wherein the computation includes applying one or more of a convolution, a matrix multiplication, and a pooling operation of a neural network layer. 6. The method of claim 1 , wherein the instruction data flow graph includes nodes for only a subset of outputs of the hardware-accelerated instruction, and the method further comprises inferring from the invocation a mapping of further outputs of the hardware-accelerated instruction to operations of the computation. 7. The method of claim 1 , wherein the computation data includes a polyhedral representation symbolically defining a set of nodes of the computation data flow graph, and wherein the solving of the constraint satisfaction problem includes instantiating the polyhedral representation to obtain a node of the set of nodes and mapping a node of the instruction data flow graph to the obtained node. 8. The method of claim 1 , wherein the method further comprises defining one or more constraints to enforce that inputs of the hardware-accelerated instruction have an allowed memory layout and/or memory access pattern. 9. The method of claim 1 , wherein the method further comprises defining one or more constraints to enforce that pairs of mutually parallelizable operations of the hardware-accelerated instruction are mapped to pairs of mutually parallelizable operations of the computation. 10. The method of claim 1 , wherein a node of the instruction data flow graph represents a commutative reduction operation, and wherein the method further comprises defining one or more constraints to enforce that the node representing the commutative reduction operation is mapped to a corresponding node of the computation data flow graph. 11. The method of claim 1 , wherein the method further comprises defining the constraint satisfaction problem to allow mapping of operations of the hardware-accelerated instruction to dummy operations that do not affect the computation. 12. The method of claim 1 , wherein the method further comprises including in the constraint satisfaction problem one or more constraints enforcing inputs of the hardware-accelerated instruction to form a hyper-rectangle. 13. The method of claim 1 , further comprising: determining a finite number of multiple solutions to the constraint satisfaction problem; evaluating performance of the multiple solutions; and selecting a solution from the multiple solutions based on the performance evaluation. 14. The method of claim 1 , wherein the hardware accelerator is separate from a CPU of the processing system, and wherein the hardware accelerator includes one of a coprocessor, an application-specific instruction set processor (ASIP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or a complex programmable logic device (CPLD). 15. A system for implementing a computation using a hardware-accelerated instruction of a processor system, the system comprising: a data interface configured to access computation data and instruction data, wherein the computation data defines a computation data flow graph representing the computation, wherein each node of a plurality of nodes of the computation data flow graph represents an input or an operation of the computation, and wherein the instruction data defines an instruction data flow graph representing the hardware-accelerated instruction that is accelerated by a hardware accelerator, wherein each node of a plurality of nodes of the instruction data flow graph represents an input or an operation of the hardware-accelerated instruction; a processor subsystem configured to: obtain the computation data and the instruction data, based on the computation data and the instruction data, define a constraint satisfaction problem, wherein a solution to the constraint satisfaction problem represents an invocation of the hardware-accelerated instruction in the computation, wherein the constraint satisfaction problem assigns nodes of the computation data flow graph to nodes of the instruction data flow graph, and wherein the constraint satisfaction problem includes at least: one or more constraints enforcing that the assigned nodes of the computation data flow graph have equivalent data flow to the instruction data flow graph, one or more constraints restricting which nodes of the computation data flow graph can be assigned to the inputs of the hardware-accelerated instruction, wherein the constraints are imposed by the hardware-accelerated instruction and/or its programming interface; solve the constraint satisfaction problem to determine the invocation of the hardware-accelerated instruction in the computation, and output data defining the invocation. 16. The system of claim 15 , wherein the hardware accelerator is separate from a CPU of the processing system, and wherein the hardware accelerator includes one of a coprocessor, an application-specific instruction set processor (ASIP), a field-programmable gate array (FPGA), an application-specific integrated circuit (ASIC), or a complex programmable logic device (CP

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions · CPC title

  • Operand accessing · CPC title

  • Decoding the operand specifier, e.g. specifier format · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12106097B2 cover?
A computer-implemented method of implementing a computation using a hardware-accelerated instruction of a processor system by solving a constraint satisfaction problem. A solution to the constraint satisfaction problem represents a possible invocation of the hardware-accelerated instruction in the computation. The constraint satisfaction problem assigns nodes of a data flow graph of the computa…
Who is the assignee on this patent?
Bosch Gmbh Robert
What technology area does this patent fall under?
Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 01 2024 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).