SIMD variable shift and rotate using control manipulation

US10296333B2 · US · B2

Patent metadata
Publication number: US-10296333-B2
Application number: US-201615391695-A
Country: US
Kind code: B2
Filing date: Dec 27, 2016
Priority date: Dec 30, 2011
Publication date: May 21, 2019
Grant date: May 21, 2019

Abstract

Vector single instruction multiple data (SIMD) shift and rotate instructions are provided specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, and a second vector register. Vector data fields of a first element size are duplicated. Duplicate vector data fields are stored as corresponding data fields of twice the first element size. Control logic receives an element size for performing a SIMD shift or rotation operation. Through selectors corresponding to a vector element, portions are selected from the duplicated data fields, the selectors corresponding to any particular vector element select all portions similarly from the duplicated data fields for that particular vector element responsive to the first element size, but selectors corresponding to any particular vector element select at least two portions from the duplicated data fields differently for that particular vector element responsive to a second element size.
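
The duplication idea in the abstract can be illustrated in software. The following is a minimal sketch, not the patented circuit: it assumes that rotating an element amounts to concatenating the element with a copy of itself (a field of twice the element size) and then selecting an element-sized window whose position depends on the shift count. The function names `rotate_left` and `simd_rotate_left` are hypothetical and do not come from the patent, and only rotation is modeled, not shifts.

```python
def rotate_left(value: int, count: int, size: int) -> int:
    """Rotate a `size`-bit value left by `count` bits using a doubled copy."""
    mask = (1 << size) - 1
    value &= mask
    count %= size
    # Duplicate the field: the result occupies twice the element size.
    doubled = (value << size) | value
    # Select a `size`-bit window; its position depends only on the shift count.
    return (doubled >> (size - count)) & mask


def simd_rotate_left(elements, counts, size):
    """Apply rotate_left independently to each element of a vector (a list)."""
    return [rotate_left(v, c, size) for v, c in zip(elements, counts)]


# Two 32-bit elements, each with its own rotate count.
print([hex(x) for x in simd_rotate_left([0x80000001, 0x0000000F], [4, 28], 32)])
# prints ['0x18', '0xf0000000']
```

Because the doubled field already contains the wrapped-around bits, no separate "carry the high bits back to the bottom" step is needed; the rotation reduces to a single window selection.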

First claim

Opening claim text (preview).

What is claimed is:

1. A machine implemented method comprising: decoding an instruction; executing the decoding instruction to perform a shift or rotate operation by: duplicating a plurality of vector data fields of a first element size, storing the duplicate vector data fields as corresponding duplicated data fields of twice the first element size, receiving in a control logic an element size for performing a SIMD rotation on vector elements, and selecting, through a plurality of selectors corresponding to a particular vector element, portions from the duplicated data fields, the plurality of selectors corresponding to any particular vector element selecting portions similarly from the duplicated data fields for that particular vector element responsive to the control logic receiving the first element size for performing the SIMD rotation on vector elements, but the plurality of selectors corresponding to any particular vector element selecting at least two portions from the duplicated data fields differently for that particular vector element responsive to the control logic receiving a second element size for performing the SIMD rotation on vector elements.

2. The machine implemented method of claim 1, wherein which way said portions are selected similarly is dependent upon a corresponding shift count for the rotation of that particular vector element of the first element size.

3. The machine implemented method of claim 2, wherein the first element size is 64 bits.

4. The machine implemented method of claim 2, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two middle portions to be duplicates.

5. The machine implemented method of claim 4, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two end portions to be duplicates.

6. The machine implemented method of claim 1, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count for the rotation of that particular vector element of the second element size.

7. The machine implemented method of claim 6, wherein the second element size is 32 bits.

8. The machine implemented method of claim 6, wherein selecting at least two portions from the duplicated data fields differently for a particular vector element responsive to the control logic receiving a second element size causes at least two portions selected differently to be duplicates.

9. A processing system comprising: a memory; and a first plurality of processors, each of the first plurality of processors comprising: one or more vector registers each comprising a plurality of data fields to store values of vector elements; decode circuitry to decode an instruction specifying: a destination operand of one or more vector registers each comprising a plurality of data fields to store values of vector elements, a first operand of the one or more vector registers, a size of the vector elements, a second operand of the one or more vector registers; and execution circuitry comprising: duplication circuitry to duplicate data fields of a first element size from the second operand and to store the duplicate data fields as corresponding duplicated data fields of twice the first element size; a control circuitry to receive the size of the vector elements; and selector circuitry partitioned into a plurality of selectors corresponding to a particular vector element and coupled with the duplication stage, the plurality of selectors corresponding to any particular vector element to select portions from the duplicated data fields similarly for that particular vector element responsive to the control logic receiving the first element size for the size of the vector elements, but the plurality of selectors corresponding to any particular vector element to select at least two portions differently from the duplicated data fields for that particular vector element responsive to the control logic receiving a second element size for the size of the vector elements.

10. The processing system of claim 9, wherein which way said portions are selected similarly is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the first element size.

11. The processing system of claim 9, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the second element size.

12. An apparatus comprising: decode circuitry to decode an instruction specifying: a destination operand of one or more vector registers each comprising a plurality of data fields to store values of vector elements, a first operand of the one or more vector registers, a size of the vector elements, a second operand of the one or more vector registers; and execution circuitry comprising: duplication circuitry to duplicate data fields of a first element size from the second operand and to store the duplicate data fields as corresponding duplicated data fields of twice the first element size; a control circuitry to receive the size of the vector elements; and selector circuitry partitioned into a plurality of selectors corresponding to a particular vector element and coupled with the duplication stage, the plurality of selectors corresponding to any particular vector element to select portions from the duplicated data fields similarly for that particular vector element responsive to the control logic receiving the first element size for the size of the vector elements, but the plurality of selectors corresponding to any particular vector element to select at least two portions differently from the duplicated data fields for that particular vector element responsive to the control logic receiving a second element size for the size of the vector elements.

13. The apparatus of claim 12, wherein which way said portions are selected similarly is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the first element size.

14. The apparatus of claim 12, wherein the first element size is 64 bits.

15. The apparatus of claim 12, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two middle portions to be duplicates.

16. The apparatus of claim 12, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two end portions to be duplicates.

17. The apparatus of 12, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the second element size.

18. The apparatus of claim 12, wherein said instruction is a rotation instruction.

19. The apparatus of 18, wherein the second element size is 32 bits.

20. The apparatus of 19, wherein selecting at least two portions from the duplicated data fields differently for a particular vector element responsive to the control logic receiving a second element size causes at least tw
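
One way to picture the "selected similarly" versus "selected differently" language of the claims is a bit-level software model. The sketch below is an interpretation for illustration only and is not taken from the patent's description: it assumes a 64-bit field duplicated into 128 bits, one selector per output bit, and a per-selector offset into the duplicated field. The names `LANE_BITS`, `selector_offsets`, and `rotate_lane` are hypothetical.

```python
LANE_BITS = 64  # the "first element size" in this sketch (assumption)


def selector_offsets(elem_size, counts):
    """For each output bit, return the offset into the duplicated field it reads."""
    offsets = []
    for out_bit in range(LANE_BITS):
        elem = out_bit // elem_size        # element this output bit belongs to
        pos = out_bit % elem_size          # bit position inside that element
        count = counts[elem] % elem_size   # that element's own rotate count
        if elem_size == LANE_BITS:
            # Full-width element: one contiguous window, same offset everywhere.
            offsets.append(LANE_BITS - count)
        elif pos >= count:
            # Narrower element: bits that do not wrap read `count` bits lower.
            offsets.append(-count)
        else:
            # Bits that wrap around read from the top of the same element.
            offsets.append(elem_size - count)
    return offsets


def rotate_lane(lane, elem_size, counts):
    """Rotate each `elem_size`-bit element of a 64-bit lane left by its count."""
    lane &= (1 << LANE_BITS) - 1
    doubled = (lane << LANE_BITS) | lane   # duplicated field, twice the size
    result = 0
    for out_bit, offset in enumerate(selector_offsets(elem_size, counts)):
        result |= ((doubled >> (out_bit + offset)) & 1) << out_bit
    return result


# One 64-bit element: every selector uses the same offset.
print(hex(rotate_lane(0x8000000000000001, 64, [4])))     # prints 0x18
# Two 32-bit elements (low element first): selectors within an element differ.
print(hex(rotate_lane(0x8000000100000003, 32, [1, 4])))  # prints 0x1800000006
```

In this model a 64-bit rotation uses a single offset for all 64 selectors, whereas a 32-bit rotation with a nonzero count needs two different offsets within each element, because the doubled 64-bit field does not contain two adjacent copies of a 32-bit half.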

Assignees

Intel Corp

Inventors

Classifications

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title

  • Register renaming · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Physics · mapped topic

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30032. Mapped technology areas include Physics.
When was this patent published?
Publication date May 21, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).