US10296333B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10296333-B2 |
| Application number | US-201615391695-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 27, 2016 |
| Priority date | Dec 30, 2011 |
| Publication date | May 21, 2019 |
| Grant date | May 21, 2019 |
Official abstract text for this publication.
Vector single instruction multiple data (SIMD) shift and rotate instructions are provided specifying: a destination vector register comprising fields to store vector elements, a first vector register, a vector element size, and a second vector register. Vector data fields of a first element size are duplicated. Duplicate vector data fields are stored as corresponding data fields of twice the first element size. Control logic receives an element size for performing a SIMD shift or rotation operation. Through selectors corresponding to a vector element, portions are selected from the duplicated data fields, the selectors corresponding to any particular vector element select all portions similarly from the duplicated data fields for that particular vector element responsive to the first element size, but selectors corresponding to any particular vector element select at least two portions from the duplicated data fields differently for that particular vector element responsive to a second element size.
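The duplicate-then-select idea in the abstract can be illustrated in software. The following is only a minimal sketch, not the patented circuit: it assumes that rotating an element amounts to writing it twice into a field of twice its size and then selecting an element-sized window whose position depends on the shift count. The names `rotate_left_by_duplication` and `simd_rotate_left` are hypothetical and do not come from the patent.

```python
from typing import List


def rotate_left_by_duplication(value: int, count: int, size: int) -> int:
    """Rotate a `size`-bit value left by `count` bits using a doubled field.

    Hypothetical helper: a software analogy for the duplication and
    selection stages, not the hardware described in the claims.
    """
    mask = (1 << size) - 1
    value &= mask
    count %= size
    # Duplication: store the element twice in a field of twice its size.
    doubled = (value << size) | value
    # Selection: pick the size-bit window that starts `count` bits
    # below the top of the doubled field.
    return (doubled >> (size - count)) & mask


def simd_rotate_left(elements: List[int], counts: List[int], size: int) -> List[int]:
    """Apply a per-element rotate count to every element of a vector."""
    return [
        rotate_left_by_duplication(v, c, size)
        for v, c in zip(elements, counts)
    ]


if __name__ == "__main__":
    # Two 64-bit lanes, each with its own rotate count.
    lanes = simd_rotate_left([0x0123456789ABCDEF, 0x1], [8, 63], 64)
    print([hex(x) for x in lanes])
```

The same helper handles 32-bit elements by passing `size=32`. In the claims, the element size instead changes how the hardware selectors pick portions of the shared duplicated fields, which this sketch does not model.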
Opening claim text (preview).
What is claimed is:
1. A machine implemented method comprising: decoding an instruction; executing the decoding instruction to perform a shift or rotate operation by: duplicating a plurality of vector data fields of a first element size, storing the duplicate vector data fields as corresponding duplicated data fields of twice the first element size, receiving in a control logic an element size for performing a SIMD rotation on vector elements, and selecting, through a plurality of selectors corresponding to a particular vector element, portions from the duplicated data fields, the plurality of selectors corresponding to any particular vector element selecting portions similarly from the duplicated data fields for that particular vector element responsive to the control logic receiving the first element size for performing the SIMD rotation on vector elements, but the plurality of selectors corresponding to any particular vector element selecting at least two portions from the duplicated data fields differently for that particular vector element responsive to the control logic receiving a second element size for performing the SIMD rotation on vector elements.
2. The machine implemented method of claim 1, wherein which way said portions are selected similarly is dependent upon a corresponding shift count for the rotation of that particular vector element of the first element size.
3. The machine implemented method of claim 2, wherein the first element size is 64 bits.
4. The machine implemented method of claim 2, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two middle portions to be duplicates.
5. The machine implemented method of claim 4, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two end portions to be duplicates.
6. The machine implemented method of claim 1, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count for the rotation of that particular vector element of the second element size.
7. The machine implemented method of claim 6, wherein the second element size is 32 bits.
8. The machine implemented method of claim 6, wherein selecting at least two portions from the duplicated data fields differently for a particular vector element responsive to the control logic receiving a second element size causes at least two portions selected differently to be duplicates.
9. A processing system comprising: a memory; and a first plurality of processors, each of the first plurality of processors comprising: one or more vector registers each comprising a plurality of data fields to store values of vector elements; decode circuitry to decode an instruction specifying: a destination operand of one or more vector registers each comprising a plurality of data fields to store values of vector elements, a first operand of the one or more vector registers, a size of the vector elements, a second operand of the one or more vector registers; and execution circuitry comprising: duplication circuitry to duplicate data fields of a first element size from the second operand and to store the duplicate data fields as corresponding duplicated data fields of twice the first element size; a control circuitry to receive the size of the vector elements; and selector circuitry partitioned into a plurality of selectors corresponding to a particular vector element and coupled with the duplication stage, the plurality of selectors corresponding to any particular vector element to select portions from the duplicated data fields similarly for that particular vector element responsive to the control logic receiving the first element size for the size of the vector elements, but the plurality of selectors corresponding to any particular vector element to select at least two portions differently from the duplicated data fields for that particular vector element responsive to the control logic receiving a second element size for the size of the vector elements.
10. The processing system of claim 9, wherein which way said portions are selected similarly is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the first element size.
11. The processing system of claim 9, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the second element size.
12. An apparatus comprising: decode circuitry to decode an instruction specifying: a destination operand of one or more vector registers each comprising a plurality of data fields to store values of vector elements, a first operand of the one or more vector registers, a size of the vector elements, a second operand of the one or more vector registers; and execution circuitry comprising: duplication circuitry to duplicate data fields of a first element size from the second operand and to store the duplicate data fields as corresponding duplicated data fields of twice the first element size; a control circuitry to receive the size of the vector elements; and selector circuitry partitioned into a plurality of selectors corresponding to a particular vector element and coupled with the duplication stage, the plurality of selectors corresponding to any particular vector element to select portions from the duplicated data fields similarly for that particular vector element responsive to the control logic receiving the first element size for the size of the vector elements, but the plurality of selectors corresponding to any particular vector element to select at least two portions differently from the duplicated data fields for that particular vector element responsive to the control logic receiving a second element size for the size of the vector elements.
13. The apparatus of claim 12, wherein which way said portions are selected similarly is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the first element size.
14. The apparatus of claim 12, wherein the first element size is 64 bits.
15. The apparatus of claim 12, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two middle portions to be duplicates.
16. The apparatus of claim 12, wherein selecting portions similarly from the duplicated data fields for a particular vector element responsive to the control logic receiving the first element size causes at least two end portions to be duplicates.
17. The apparatus of 12, wherein which way said at least two portions are selected differently is dependent upon a corresponding shift count from the first operand for the rotation of that particular vector element of the second element size.
18. The apparatus of claim 12, wherein said instruction is a rotation instruction.
19. The apparatus of 18, wherein the second element size is 32 bits.
20. The apparatus of 19, wherein selecting at least two portions from the duplicated data fields differently for a particular vector element responsive to the control logic receiving a second element size causes at least tw
Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title
Register renaming · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
controlled by a single instruction for multiple data lanes [SIMD] · CPC title
Physics · mapped topic