Processor with an expandable instruction set architecture for dynamically configuring execution resources
US-2017161067-A1 · Jun 8, 2017 · US
US2020183686A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020183686-A1 |
| Application number | US-201816211820-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 6, 2018 |
| Priority date | Dec 6, 2018 |
| Publication date | Jun 11, 2020 |
| Grant date | — |
Official abstract text for this publication.
Provided are techniques for a hardware accelerator with locally stored macros. A plurality of macros are stored in a lookup memory of a hardware accelerator. In response to receiving an operation code, the operation code is mapped to one or more macros of the plurality of macros, wherein each of the one or more macros includes micro-instructions. Each of the micro-instructions of the one or more macros is routed to a function block of a plurality of function blocks. Each of the micro-instructions is processed with the plurality of function blocks. Data from the processing of each of the micro-instructions is stored in an accelerator memory of the hardware accelerator. The data is moved from the accelerator memory to a host memory.
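The flow described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the disclosed hardware design: the class name `AcceleratorModel`, the `(function_block, operand)` encoding of a micro-instruction, and the toy behavior of each function block are all hypothetical. Only the stage order (opcode to macros, macros to micro-instructions, routing to function blocks, storing results in accelerator memory, moving them to host memory) comes from the text.

```python
class AcceleratorModel:
    """Toy software model of the abstract's data flow (all names hypothetical)."""

    def __init__(self, macros, opcode_map):
        # Lookup memory: macro name -> list of (function_block, operand) pairs.
        self.lookup_memory = {name: list(body) for name, body in macros.items()}
        # Opcode map: operation code -> names of the macros it expands to.
        self.opcode_map = {op: list(names) for op, names in opcode_map.items()}
        # Accelerator memory: results written by the function blocks.
        self.accelerator_memory = []
        # Function blocks with invented behavior, for illustration only.
        self.function_blocks = {
            "general_purpose_registers": lambda n: n + 1,
            "vector_string_register_file": lambda v: [x * 2 for x in v],
            "map_registers": lambda n: f"r{n}",
        }

    def execute(self, opcode):
        """Map an opcode to its macros and route each micro-instruction."""
        for macro_name in self.opcode_map[opcode]:
            for block_name, operand in self.lookup_memory[macro_name]:
                result = self.function_blocks[block_name](operand)
                self.accelerator_memory.append(result)

    def move_to_host(self, host_memory):
        """Stand-in for the DMA step: copy results out, then clear them."""
        host_memory.extend(self.accelerator_memory)
        self.accelerator_memory.clear()


# Example: one opcode that expands to two macros.
model = AcceleratorModel(
    macros={
        "inc": [("general_purpose_registers", 41)],
        "scale_and_map": [
            ("vector_string_register_file", [1, 2]),
            ("map_registers", 7),
        ],
    },
    opcode_map={0x10: ["inc", "scale_and_map"]},
)
host_memory = []
model.execute(0x10)
model.move_to_host(host_memory)
```

The model runs micro-instructions one after another for simplicity; the claims also describe the function blocks processing micro-instructions in parallel, which this sketch does not attempt to reproduce.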
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method, comprising: storing a plurality of macros in a lookup memory of a hardware accelerator; in response to receiving an operation code, mapping the operation code to one or more macros of the plurality of macros, wherein each of the one or more macros includes micro-instructions; routing each of the micro-instructions of the one or more macros to a function block of a plurality of function blocks; processing each of the micro-instructions with the plurality of function blocks; storing data from the processing of each of the micro-instructions in an accelerator memory of the hardware accelerator; and moving the data from the accelerator memory to a host memory.
2. The computer-implemented method of claim 1, further comprising: executing code in the hardware accelerator to modify the plurality of macros by at least one of storing a new macro, updating an existing macro, and removing an existing macro.
3. The computer-implemented method of claim 2, further comprising: receiving a new opcode; and mapping the new opcode to the new macro.
4. The computer-implemented method of claim 1, wherein the plurality of function blocks comprise a map registers function block, a vector string register file function block, and a general purpose registers function block.
5. The computer-implemented method of claim 1, wherein the micro-instructions are processed in parallel by the plurality of function blocks.
6. A computer system, comprising: a processor coupled to a bus; a host memory coupled to the bus; and a hardware accelerator coupled to the bus, wherein the hardware accelerator comprises an engine, and wherein the engine comprises: an instruction queue that stores a plurality of operation codes; a packetizer that stores a plurality of macros in a lookup memory, wherein the packetizer maps an operation code of the plurality of operation codes to one or more macros of the plurality of macros, and wherein each of the one or more macros includes micro-instructions; a dispatcher that routes each of the micro-instructions of the one or more macros to a function block of a plurality of function blocks that is to process that micro-instruction; an accelerator memory that stores data from processing of each of the micro-instructions by the plurality of function blocks; and a direct memory access that moves the data from the accelerator memory to a host memory.
7. The computer system of claim 6, wherein the hardware accelerator executes code to modify the plurality of macros by at least one of storing a new macro, updating an existing macro, and removing an existing macro.
8. The computer system of claim 7, wherein the packetizer maps a new opcode to the new macro.
9. The computer system of claim 6, wherein the plurality of function blocks comprise a map registers function block, a vector string register file function block, and a general purpose registers function block.
10. The computer system of claim 6, wherein the micro-instructions are processed in parallel by the plurality of function blocks.
11. A hardware accelerator in a computer system, wherein the computer system includes a processor and a host memory, comprising: a plurality of engines, wherein each engine includes: an instruction queue; a packetizer; a dispatcher; an accelerator memory; a plurality of function blocks; a direct memory access; and control logic to perform operations, the operations comprising: storing, with the instruction queue, a plurality of operation codes; storing, with the packetizer, a plurality of macros in a lookup memory; mapping, with the packetizer, an operation code of the plurality of operation codes to one or more macros of the plurality of macros, wherein each of the one or more macros includes micro-instructions; routing, with the dispatcher, each of the micro-instructions of the one or more macros to a function block of a plurality of function blocks that is to process that micro-instruction; processing, with the plurality of function blocks, each of the micro-instructions to generate and store data in the accelerator memory; and moving, with the direct memory access, the data from the accelerator memory to the host memory.
12. The hardware accelerator of claim 11, wherein control logic performs operations, the operations comprising: executing code to modify the plurality of macros by at least one of storing a new macro, updating an existing macro, and removing an existing macro.
13. The hardware accelerator of claim 12, wherein control logic performs operations, the operations comprising: mapping, with the packetizer, a new opcode to the new macro.
14. The hardware accelerator of claim 11, wherein the plurality of function blocks comprise a map registers function block, a vector string register file function block, and a general purpose registers function block.
15. The hardware accelerator of claim 11, wherein the micro-instructions are processed in parallel by the plurality of function blocks.
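The dependent claims on modifying the stored macros (storing a new macro, updating an existing one, removing one, and mapping a new opcode to a new macro) can be pictured with a small table model. This is a hedged sketch, not the claimed packetizer: the class `PacketizerModel`, its method names, and the choice to drop opcode mappings that point at a removed macro are assumptions made for illustration.

```python
class PacketizerModel:
    """Hypothetical model of a modifiable macro table (not the claimed hardware)."""

    def __init__(self):
        self.lookup_memory = {}  # macro name -> list of micro-instructions
        self.opcode_map = {}     # opcode -> list of macro names

    def store_macro(self, name, micro_instructions):
        """Store a new macro; refuse to overwrite an existing one."""
        if name in self.lookup_memory:
            raise KeyError(f"macro {name!r} already exists")
        self.lookup_memory[name] = list(micro_instructions)

    def update_macro(self, name, micro_instructions):
        """Replace the body of an existing macro."""
        if name not in self.lookup_memory:
            raise KeyError(f"macro {name!r} does not exist")
        self.lookup_memory[name] = list(micro_instructions)

    def remove_macro(self, name):
        """Remove a macro and (by assumption) any mapping that used it."""
        del self.lookup_memory[name]
        for opcode in list(self.opcode_map):
            remaining = [m for m in self.opcode_map[opcode] if m != name]
            if remaining:
                self.opcode_map[opcode] = remaining
            else:
                del self.opcode_map[opcode]

    def map_opcode(self, opcode, macro_names):
        """Map an opcode to one or more macros that are already stored."""
        missing = [m for m in macro_names if m not in self.lookup_memory]
        if missing:
            raise KeyError(f"unknown macros: {missing}")
        self.opcode_map[opcode] = list(macro_names)

    def expand(self, opcode):
        """Return the flat list of micro-instructions for an opcode."""
        return [
            micro
            for name in self.opcode_map[opcode]
            for micro in self.lookup_memory[name]
        ]
```

Because opcodes refer to macros by name in this sketch, updating a macro changes what every mapped opcode expands to without touching the opcode map. Whether the disclosed design behaves this way is not stated in the claims.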
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Arrangements for communication of instructions and data · CPC title
for non-native instruction execution, e.g. executing a command; for Java instruction set · CPC title
Details of memory controller · CPC title
using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence) · CPC title