Gather using index array and finite state machine
US-9753889-B2 · Sep 5, 2017 · US
US10114651B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10114651-B2 |
| Application number | US-201815862407-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 4, 2018 |
| Priority date | Dec 22, 2009 |
| Publication date | Oct 30, 2018 |
| Grant date | Oct 30, 2018 |
Official abstract text for this publication.
According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception.
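The delivery order described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware mechanism: the names `ElementFault`, `run_transfer`, `pending`, and `handler` are hypothetical and do not come from the publication, and a Python exception stands in for a processor exception. The point shown is that, when one element of a multi-element transfer faults, any traps or interrupts already pending reach the handler before the exception itself.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence


class ElementFault(Exception):
    """Hypothetical fault raised while transferring one data element."""


def run_transfer(elements: Sequence[int],
                 transfer_one: Callable[[int], None],
                 pending: Deque[str],
                 handler: Callable[[str], None]) -> bool:
    """Transfer each element in order; on a fault, drain pending events first.

    Returns True if every element transferred, False if a fault was delivered.
    """
    for element in elements:
        try:
            transfer_one(element)
        except ElementFault as fault:
            # Deliver pending traps/interrupts before the exception itself.
            while pending:
                handler(pending.popleft())
            handler(f"exception: {fault}")
            return False
    return True


delivered: List[str] = []


def faulting_transfer(element: int) -> None:
    # Assumption for the demo: element 2 is the one that faults.
    if element == 2:
        raise ElementFault(f"element {element}")


ok = run_transfer([0, 1, 2, 3], faulting_transfer,
                  deque(["trap A", "interrupt B"]), delivered.append)
print(ok, delivered)
```

In this model the handler sees "trap A" and "interrupt B" before the exception for element 2, and the transfer stops at the faulting element.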
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a destination register to store a plurality of data elements; a source register to store a plurality of index values, each of which corresponds to one of the plurality of data elements; a base address register to store a base address; a mask register to store mask values, each mask value corresponding to one of the plurality of data elements; a decoder to decode a vector gather instruction, the vector gather instruction having a scale field to specify a common scaling factor to be applied to the index values; and execution circuitry coupled to the decoder, the execution circuitry to perform operations associated with the vector gather instruction, the operations comprising conditionally accessing, based on corresponding mask values, one or more of the plurality of data elements and storing the one or more of the plurality of data elements in the destination register; wherein the execution circuitry is to scale the index values in accordance with the scale field to generate a corresponding plurality of scaled index values, and add the base address to each of the scaled index values to generate a corresponding plurality of non-contiguous system memory addresses for the data elements to be accessed and stored in the destination register.
2. The processor of claim 1 wherein system memory addresses are determined by adding a displacement value to the combination of the base address and the scaled index values.
3. The processor of claim 1 further comprising: a plurality of cores, wherein the destination register, source register, base address register, mask register, decoder, and execution circuitry are integral to a first core of the plurality of cores.
4. The processor of claim 1 further comprising: fetch circuitry and/or prefetch circuitry to fetch and/or prefetch the vector gather instruction from a system memory.
5. The processor of claim 3 further comprising: a memory controller to couple the cores to a system memory.
6. The processor of claim 1 further comprising: register rename circuitry to map one or more logical registers to physical registers of a register file.
7. The processor of claim 3 further comprising: a level 1 cache integral to the first core to store instructions and data.
8. The processor of claim 1 further comprising: a first interconnect to couple the processor to one or more system components.
9. The processor of claim 2 further comprising: a second interconnect to couple the processor to one or more other processors.
10. The processor of claim 9 further comprising: a third interconnect to couple the processor to a system memory.
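The address generation recited in claims 1 and 2 (scale each index, add the base address, optionally add a displacement, and access only the elements whose mask value is set) can be sketched in software as follows. This is an illustrative model, not the claimed circuitry: the function names, the dictionary used as a stand-in for system memory, and the choice to leave masked-off destination lanes unchanged are all assumptions made for the example.

```python
from typing import Dict, List


def gather_addresses(base: int, indices: List[int], scale: int,
                     displacement: int = 0) -> List[int]:
    """Compute one address per index: base + index * scale + displacement."""
    return [base + index * scale + displacement for index in indices]


def masked_gather(memory: Dict[int, int], dest: List[int], base: int,
                  indices: List[int], mask: List[bool], scale: int,
                  displacement: int = 0) -> List[int]:
    """Return a copy of dest with masked-in lanes loaded from memory.

    Lanes whose mask value is False keep their previous contents
    (an assumption of this sketch).
    """
    addresses = gather_addresses(base, indices, scale, displacement)
    result = list(dest)
    for lane, (address, enabled) in enumerate(zip(addresses, mask)):
        if enabled:
            result[lane] = memory[address]
    return result


# Hypothetical memory image: 16 four-byte elements starting at 0x1000.
memory = {0x1000 + 4 * i: 100 + i for i in range(16)}

loaded = masked_gather(memory, dest=[0, 0, 0, 0], base=0x1000,
                       indices=[7, 0, 3, 12],
                       mask=[True, False, True, True], scale=4)
print(loaded)  # [107, 0, 103, 112]
```

With a scale of 4, the indices 7, 3, and 12 resolve to the non-contiguous addresses 0x101C, 0x100C, and 0x1030, while the lane with index 0 is skipped because its mask value is clear.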
LOAD or STORE instructions; Clear instruction · CPC title
Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
using a mask · CPC title
of multiple operands or results {(addressing multiple banks G06F12/06)} · CPC title