Executing instruction sequence code blocks by using virtual cores instantiated by partitionable engines
US-2016210145-A1 · Jul 21, 2016 · US
US10289605B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10289605-B2 |
| Application number | US-201715853323-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2017 |
| Priority date | Apr 12, 2006 |
| Publication date | May 14, 2019 |
| Grant date | May 14, 2019 |
Official abstract text for this publication.
A matrix of execution blocks form a set of rows and columns. The rows support parallel execution of instructions and the columns support execution of dependent instructions. The matrix of execution blocks process a single block of instructions specifying parallel and dependent instructions.
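The row and column arrangement described in the abstract can be illustrated with a small Python sketch. This is not the patented mechanism: the function name `build_instruction_matrix`, the `(name, dependencies)` tuple format, and the `width` parameter (slots per row) are hypothetical choices made for illustration. The sketch only shows the idea that independent instructions share a row and a dependent instruction lands in a later row than every instruction it depends on.

```python
from collections import defaultdict


def build_instruction_matrix(instructions, width):
    """Arrange instructions into rows (illustrative sketch only).

    instructions: list of (name, [names it depends on]) in program order.
    width: assumed number of parallel slots per row (hypothetical parameter).
    Returns a list of rows; each row is a list of instruction names.
    """
    row_of = {}                 # instruction name -> row index it was placed in
    rows = defaultdict(list)    # row index -> names placed in that row
    for name, deps in instructions:
        # Earliest legal row: one below the deepest producer, or row 0.
        row = max((row_of[d] + 1 for d in deps), default=0)
        # If that row has no free slot, move down to the next row.
        while len(rows[row]) >= width:
            row += 1
        row_of[name] = row
        rows[row].append(name)
    return [rows[r] for r in range(len(rows))]


# Made-up example program: c needs a and b, d needs c, e is independent.
program = [("a", []), ("b", []), ("c", ["a", "b"]), ("d", ["c"]), ("e", [])]
matrix = build_instruction_matrix(program, width=2)
for index, row in enumerate(matrix):
    print(index, row)
```

In this example `a` and `b` share the first row, `c` is placed in the second row because it depends on both, and `e` moves to the second row only because the first row has no free slot under the assumed width of 2.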
Opening claim text (preview).
What is claimed is:
1. An apparatus for executing instruction matrices, comprising: a memory to store a plurality of instruction matrices; and a plurality of matrices of execution units to execute the plurality of instruction matrices, wherein each of the plurality of instruction matrices includes rows and columns, wherein instructions in a same row of an instruction matrix are to be executed in parallel and instructions that are dependent upon instructions in a row of an instruction matrix are in a subsequent row of the instruction matrix, wherein the plurality of matrices of execution units is configurable to operate in different execution modes including a first execution mode, a second execution mode, and a third execution mode, wherein in the first execution mode, the plurality of matrices of execution units simultaneously executes a group of instruction matrices that form a super instruction matrix, wherein in the second execution mode, the plurality of matrices of execution units simultaneously executes instructions matrices belonging to separate threads, and wherein in the third execution mode, the plurality of matrices of execution units simultaneously executes non-dependent instruction matrices from a single thread.
2. The apparatus of claim 1, further comprising: a register file having a plurality of register segments to store results for subsequent processing by the plurality of matrices of execution units, wherein the register file is configurable to support different execution modes including the first execution mode, the second execution mode, and the third execution mode, wherein in the first execution mode, the plurality of register segments forms a single register file where each register segment stores sources and results of a Multiple Instructions Multiple Data (MIMD) super instruction issuing simultaneous instructions where each individual instruction is a scalar or Single Instruction Multiple Data (SIMD) instruction, wherein in the second execution mode, the plurality of register segments forms individual independent register files with individual register state to support simultaneous processing of separate threads, wherein each instruction is associated with a separate thread and a separate register file segment, and wherein in the third execution mode, the plurality of register segments forms a single thread register file, wherein register segments are duplicated in multiple register segments of the register file to store results of simultaneously executed non-dependent instructions that are dynamically issued from a single thread instruction sequence.
3. The apparatus of claim 1, wherein a matrix of execution units of the plurality of matrices of execution units includes a first row of execution units and a second row of execution units, wherein execution units in the first row of execution units operate in parallel and execution units in the second row of execution units operate in parallel and operate in dependency upon the execution units in the first row of execution units.
4. The apparatus of claim 3, wherein each instruction matrix of the plurality of instruction matrices includes a first row of instructions and a second row of instructions, wherein instruction in the first row of instructions are executed in parallel by a subset of execution units of the first row of execution units, and wherein instructions in the second row of instructions are executed in parallel by a subset of execution units of the second row of execution units.
5. The apparatus of claim 1, wherein the super instruction matrix is formed by a compiler.
6. The apparatus of claim 1, wherein the plurality of instruction matrices is formed by a run-time system.
7. The apparatus of claim 1, wherein the plurality of instruction matrices is formed by hardware.
8. The apparatus of claim 1, wherein the plurality of instruction matrices is formed by a compiler.
9. The apparatus of claim 1, wherein the plurality of matrices of execution units supports floating point, integer, Single Instruction Multiple Data (SIMD), and Multiple Instruction Multiple Data (MIMD) operations.
10. The apparatus of claim 9, wherein an instruction matrix of the plurality of instruction matrices includes Single Instruction Multiple Data (SIMD) instructions.
11. The apparatus of claim 9, wherein an instruction matrix of the plurality of instruction matrices includes Multiple Instruction Multiple Data (MIMD) instructions.
12. The apparatus of claim 9, wherein an instruction matrix of the plurality of instruction matrices includes a combination of Single Instruction Multiple Data (SIMD) instructions and Multiple Instruction Multiple Data (MIMD) instructions.
13. The apparatus of claim 1, wherein each of the plurality of instruction matrices is assigned a matrix number to enforce dependency maintenance between the plurality of instruction matrices.
14. The apparatus of claim 13, further comprising a scheduler, wherein the scheduler uses matrix numbers to track register references.
15. The apparatus of claim 1, wherein an instruction matrix of the plurality of instruction matrices specifies source operands and destination operands in fixed locations regardless of opcode.
16. The apparatus of claim 1, the non-dependent instruction matrices from the single thread are determined to be non-dependent using a hardware dependency check.
17. A method by a processor for executing instruction matrices, comprising: fetching a plurality of instruction matrices, wherein each of the plurality of instruction matrices includes rows and columns, wherein instructions in a same row of an instruction matrix are to be executed in parallel and instructions that are dependent upon instructions in a row of an instruction matrix are in a subsequent row of the instruction matrix; and executing the plurality of instruction matrices in a first execution mode, a second execution mode, and a third execution mode, wherein in the first execution mode, a plurality of matrices of execution units of the processor simultaneously executes a group of instruction matrices that form a super instruction matrix, wherein in the second execution mode, the plurality of matrices of execution units simultaneously executes instructions matrices belonging to separate threads, and wherein in the third execution mode, the plurality of matrices of execution units simultaneously executes non-dependent instruction matrices from a single thread.
18. The method of claim 17, further comprising: storing a result of executing the plurality of matrices of execution units in a register file having a plurality of register segments, wherein the register file is configurable to support the first execution mode, the second execution mode, and the third execution mode, wherein in the first execution mode, the plurality of register segments forms a single register file where each register segment stores sources and results of a Multiple Instructions Multiple Data (MIMD) super instruction issuing simultaneous instructions where each individual instruction is a scalar or Single Instruction Multiple Data (SIMD) instruction, wherein in the second execution mode, the plurality of register segments forms individual independent register files with individual register state to support simultaneous processing of separate threads, wherein each instruction is associated with a separate thread and a separate register file segment, and wherein in the third execution mode, the plurality of register segments forms a single thread register file, wh
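
The third execution mode (claims 1 and 17), together with the matrix numbers of claim 13, can be pictured with a rough Python sketch. Everything here is an assumption made for illustration: the `InstructionMatrix` class, its `depends_on` field, the `select_non_dependent` function, and the `slots` parameter do not come from the patent, and the claims describe a hardware dependency check rather than software. The sketch only shows the general idea of issuing together those matrices from one thread that do not wait on any matrix still pending.

```python
from dataclasses import dataclass, field


@dataclass
class InstructionMatrix:
    number: int                                   # matrix number (cf. claim 13)
    thread: int                                   # owning thread (hypothetical field)
    depends_on: set = field(default_factory=set)  # matrix numbers it must wait for


def select_non_dependent(pending, completed, slots):
    """Pick up to `slots` pending matrices whose producers have all completed.

    pending: iterable of InstructionMatrix objects from a single thread.
    completed: set of matrix numbers that have already finished.
    slots: assumed number of matrices that can issue together.
    """
    in_order = sorted(pending, key=lambda m: m.number)
    # A matrix is ready when every matrix it depends on is in `completed`;
    # ready matrices cannot depend on each other, since all are still pending.
    ready = [m for m in in_order if m.depends_on <= completed]
    return ready[:slots]


# Made-up example: matrix 3 waits on 1 (done), 4 waits on 3, 5 waits on nothing.
pending = [
    InstructionMatrix(4, thread=0, depends_on={3}),
    InstructionMatrix(3, thread=0, depends_on={1}),
    InstructionMatrix(5, thread=0),
]
issued = select_non_dependent(pending, completed={1}, slots=2)
issued_numbers = [m.number for m in issued]
print(issued_numbers)
```

Matrix 4 is held back in this example because matrix 3 has not completed, while matrices 3 and 5 can issue together.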
with column wise addition of partial products, e.g. using Wallace tree, Dadda counters (G06F7/5324 takes precedence) · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title
single instruction multiple data [SIMD] multiprocessors · CPC title
using instruction pipelines · CPC title
according to context, e.g. thread buffers · CPC title