Gathering and scattering multiple data elements

US10114651B2 · US · B2

Patent metadata
Publication number: US-10114651-B2
Application number: US-201815862407-A
Country: US
Kind code: B2
Filing date: Jan 4, 2018
Priority date: Dec 22, 2009
Publication date: Oct 30, 2018
Grant date: Oct 30, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during execution of the single instruction; and in response to the exception, delivering pending traps or interrupts to an exception handler prior to delivering the exception.
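
The delivery order described in the abstract can be pictured with a small model. This is only an illustrative sketch, not the patented mechanism: `deliver_in_order`, the event names, and the use of a plain callback as the "exception handler" are all hypothetical choices made for the example.

```python
from collections import deque
from typing import Callable, Iterable, List


def deliver_in_order(pending_events: Iterable[str],
                     exception: str,
                     handler: Callable[[str], None]) -> None:
    """Hand pending traps/interrupts to the handler, then the exception.

    Hypothetical model: events are strings and the handler is a callback.
    """
    queue = deque(pending_events)
    # Traps or interrupts that were already pending go first ...
    while queue:
        handler(queue.popleft())
    # ... and only then is the newly detected exception delivered.
    handler(exception)


# Usage: record the order in which the handler sees each event.
delivered: List[str] = []
deliver_in_order(["debug trap", "timer interrupt"], "page fault",
                 delivered.append)
print(delivered)
```

The recorded list shows the two pending events ahead of the exception, which is the ordering the abstract states.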

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising:
    a destination register to store a plurality of data elements;
    a source register to store a plurality of index values, each of which corresponds to one of the plurality of data elements;
    a base address register to store a base address;
    a mask register to store mask values, each mask value corresponding to one of the plurality of data elements;
    a decoder to decode a vector gather instruction, the vector gather instruction having a scale field to specify a common scaling factor to be applied to the index values; and
    execution circuitry coupled to the decoder, the execution circuitry to perform operations associated with the vector gather instruction, the operations comprising conditionally accessing, based on corresponding mask values, one or more of the plurality of data elements and storing the one or more of the plurality of data elements in the destination register;
    wherein the execution circuitry is to scale the index values in accordance with the scale field to generate a corresponding plurality of scaled index values, and add the base address to each of the scaled index values to generate a corresponding plurality of non-contiguous system memory addresses for the data elements to be accessed and stored in the destination register.

2. The processor of claim 1 wherein system memory addresses are determined by adding a displacement value to the combination of the base address and the scaled index values.

3. The processor of claim 1 further comprising: a plurality of cores, wherein the destination register, source register, base address register, mask register, decoder, and execution circuitry are integral to a first core of the plurality of cores.

4. The processor of claim 1 further comprising: fetch circuitry and/or prefetch circuitry to fetch and/or prefetch the vector gather instruction from a system memory.

5. The processor of claim 3 further comprising: a memory controller to couple the cores to a system memory.

6. The processor of claim 1 further comprising: register rename circuitry to map one or more logical registers to physical registers of a register file.

7. The processor of claim 3 further comprising: a level 1 cache integral to the first core to store instructions and data.

8. The processor of claim 1 further comprising: a first interconnect to couple the processor to one or more system components.

9. The processor of claim 2 further comprising: a second interconnect to couple the processor to one or more other processors.

10. The processor of claim 9 further comprising: a third interconnect to couple the processor to a system memory.
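
The address arithmetic in claims 1 and 2 (base address plus scaled index, optionally plus a displacement, with each element gated by its mask value) can be sketched in software. This is a rough model under stated assumptions, not the claimed hardware: `vector_gather` is a hypothetical name, system memory is modeled as a `bytes` object, and elements are assumed to be 4-byte little-endian integers.

```python
from typing import List, Sequence


def vector_gather(memory: bytes,
                  base: int,
                  indices: Sequence[int],
                  mask: Sequence[bool],
                  dest: Sequence[int],
                  scale: int = 1,
                  displacement: int = 0,
                  elem_size: int = 4) -> List[int]:
    """Software model of a masked gather (hypothetical helper).

    For each lane whose mask value is set, load one element from
    base + index * scale + displacement. Lanes with a clear mask value
    keep the value already present in the destination.
    """
    result = list(dest)
    for lane, (index, enabled) in enumerate(zip(indices, mask)):
        if not enabled:
            continue  # masked-off lane: no memory access
        address = base + index * scale + displacement
        if address < 0 or address + elem_size > len(memory):
            raise IndexError(f"lane {lane}: address {address} out of range")
        # Assumption: elements are little-endian unsigned integers.
        result[lane] = int.from_bytes(
            memory[address:address + elem_size], "little")
    return result


# Usage: memory holds sixteen 4-byte values 100..115.
memory = b"".join(v.to_bytes(4, "little") for v in range(100, 116))
gathered = vector_gather(memory, base=0, indices=[0, 3, 5, 9],
                         mask=[True, False, True, True],
                         dest=[-1, -1, -1, -1], scale=4)
print(gathered)  # [100, -1, 105, 109]
```

With a scale of 4 the indices select whole 4-byte elements, so the enabled lanes read non-contiguous addresses 0, 20, and 36, while the masked-off lane keeps its prior destination value.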

Assignees

Inventors

Classifications

  • LOAD or STORE instructions; Clear instruction · CPC title

  • G06F9/3861 · Primary

    Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • using a mask · CPC title

  • of multiple operands or results {(addressing multiple banks G06F12/06)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10114651B2 cover?
According to a first aspect, efficient data transfer operations can be achieved by: decoding by a processor device, a single instruction specifying a transfer operation for a plurality of data elements between a first storage location and a second storage location; issuing the single instruction for execution by an execution unit in the processor; detecting an occurrence of an exception during …
Who is the assignee on this patent?
Hughes Christopher J, Chen Yen Kuang Y K, Bomb Mayank, and 17 more
What technology area does this patent fall under?
Primary CPC classification G06F9/3861. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 30, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).