Method and apparatus for performing a vector bit shuffle
US-2016188532-A1 · Jun 30, 2016 · US
US2016188335A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016188335-A1 |
| Application number | US-201414583639-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 27, 2014 |
| Priority date | Dec 27, 2014 |
| Publication date | Jun 30, 2016 |
| Grant date | — |
Official abstract text for this publication.
An apparatus and method for performing a vector bit gather. For example, one embodiment of a processor comprises: a first vector register to store one or more source data elements; a second vector register to store one or more control elements, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination vector register and to identify a bit from the one or more source data elements to be copied to each of the particular bit positions; and vector bit gather logic to read each bit field from the second vector register to identify a bit from the one or more source data elements and to responsively copy the bit from each of the one or more source data elements to each of the corresponding bit positions in the destination vector register.
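The gather described in the abstract can be modeled in a few lines of software. The following is a minimal sketch for a single source data element, not the hardware logic itself. The function name `gather_bits` and the representation of the control element as a Python sequence of bit indices are assumptions made for illustration. Entry `k` of the sequence plays the role of the bit field associated with destination bit position `k`.

```python
from typing import Sequence


def gather_bits(source: int, bit_fields: Sequence[int]) -> int:
    """Software model of a bit gather on one source data element.

    bit_fields[k] names the source bit that is copied to destination
    bit position k (hypothetical representation of a control element).
    """
    dest = 0
    for dest_pos, src_pos in enumerate(bit_fields):
        bit = (source >> src_pos) & 1   # read the selected source bit
        dest |= bit << dest_pos         # copy it to its destination position
    return dest


if __name__ == "__main__":
    # Source bits 1, 2 and 7 are set; bit 0 is clear.
    print(bin(gather_bits(0b10100110, [1, 2, 7, 0])))
```

Each destination bit depends on only one source bit, so in hardware the selection for every position can proceed independently, which is consistent with the multiplexer-based selection recited in the dependent claims.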
Opening claim text (preview).
What is claimed is:
1. A processor comprising: a first vector register to store one or more source data elements; a second vector register to store one or more control elements, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination vector register and to identify a bit from the one or more source data elements to be copied to each of the particular bit positions; and vector bit gather logic to read each bit field from the second vector register to identify a bit from the one or more source data elements and to responsively copy the bit from each of the one or more source data elements to each of the corresponding bit positions in the destination vector register.
2. The processor as in claim 1 wherein the vector bit gather logic comprises one or more multiplexers to select a set of bits from each of the source data elements in accordance with the bit fields in each of the control elements.
3. The processor as in claim 1 wherein each of the source data elements comprises a 64-bit data element and wherein each bit field comprises at least 6 bits to identify a bit from each of the 64-bit data elements.
4. The processor as in claim 3 wherein each of the bit fields comprises a control byte and wherein the 6 bits are to be selected from each of the control bytes to identify each bit from each of the one or more 64 bit data elements.
5. The processor as in claim 4 wherein eight bits from each data element are to be selected using eight of the control bytes.
6. The processor as in claim 5 wherein the first vector register is to store eight of the 64 bit data elements and wherein the destination mask register is to store eight corresponding 8-bit values selected from the eight 64-bit data elements.
7. The processor as in claim 6 further comprising: a mask register to store a mask bit associated with each of the 8-bit values in the destination vector register.
8. The processor as in claim 7 wherein the bits within the mask register are to be used to determine whether the corresponding values in the destination vector register should remain unchanged or be set to 0.
9. The processor as in claim 1 wherein the vector bit gather logic is to operate responsive to a vector bit gather instruction decoded by decode logic in the processor and executed by execution logic in the processor.
10. A method comprising: storing a plurality of source data elements in a first vector register; storing a plurality of control elements in a second vector register, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination mask register and to identify a bit from each of the source data elements to be copied to each of the particular bit positions; and reading each bit field from the second vector register to identify a bit from each of the source data elements and responsively copying the bit from each of the source data elements to each of the corresponding bit positions in the destination mask register.
11. The method as in claim 10 further comprising selecting a set of bits from each of the source data elements with one or more multiplexers in accordance with the bit fields in each of the control elements.
12. The method as in claim 10 wherein each of the source data elements comprises a 64-bit data element and wherein each bit field comprises at least 6 bits to identify a bit from each of the 64-bit data elements.
13. The method as in claim 12 wherein each of the bit fields comprises a control byte and wherein the 6 bits are to be selected from each of the control bytes to identify each bit from each of the 64 bit data elements.
14. The method as in claim 13 wherein eight bits from each data element are to be selected using eight of the control bytes.
15. The method as in claim 14 wherein the first vector register is to store eight of the 64 bit data elements and wherein the destination mask register is to store eight corresponding 8-bit values selected from the eight 64-bit data elements.
16. The method as in claim 15 further comprising storing in a mask register a mask bit associated with each of the 8-bit values in the destination vector register.
17. The method as in claim 16 wherein the bits within the mask register are to be used to determine whether the corresponding values in the destination vector register should remain unchanged or be set to 0.
18. A system comprising: a memory to store program code and data; a cache hierarchy comprising multiple cache levels to cache the program code and data in accordance with a specified cache management policy; an input device to receive input from a user; a processor to execute the program code and process the data responsive to the input from the user, the processor comprising: a first vector register to store one or more source data elements; a second vector register to store one or more control elements, each of the control elements comprising a plurality of bit fields, each bit field to be associated with a corresponding bit position in a destination vector register and to identify a bit from the one or more source data elements to be copied to each of the particular bit positions; and vector bit gather logic to read each bit field from the second vector register to identify a bit from the one or more source data elements and to responsively copy the bit from each of the one or more source data elements to each of the corresponding bit positions in the destination vector register.
19. The system as in claim 18 wherein the vector bit gather logic comprises one or more multiplexers to select a set of bits from each of the source data elements in accordance with the bit fields in each of the control elements.
20. The system as in claim 18 wherein each of the source data elements comprises a 64-bit data element and wherein each bit field comprises at least 6 bits to identify a bit from each of the 64-bit data elements.
21. The system as in claim 20 wherein each of the bit fields comprises a control byte and wherein the 6 bits are to be selected from each of the control bytes to identify each bit from each of the one or more 64 bit data elements.
22. The system as in claim 21 wherein eight bits from each data element are to be selected using eight of the control bytes.
23. The system as in claim 22 wherein the first vector register is to store eight of the 64 bit data elements and wherein the destination mask register is to store eight corresponding 8-bit values selected from the eight 64-bit data elements.
24. The system as in claim 23 further comprising: a mask register to store a mask bit associated with each of the 8-bit values in the destination vector register.
25. The system as in claim 24 wherein the bits within the mask register are to be used to determine whether the corresponding values in the destination vector register should remain unchanged or be set to 0.
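The configuration recited in the dependent claims (eight 64-bit source elements, eight control bytes per element, six index bits per control byte, and a per-value mask) can be illustrated with a small software model. This is a hedged sketch, not the claimed hardware. Several details are assumptions because the claims do not fix them: that the six index bits are the low six bits of each control byte, that control byte `j` of element `i` selects destination bit `j` of value `i`, and that a set mask bit means "write the gathered value". The names `vector_bit_gather`, `zeroing`, and the list-based register representation are likewise hypothetical.

```python
ELEMENTS = 8        # eight 64-bit source data elements
CONTROL_BYTES = 8   # eight control bytes per control element
INDEX_MASK = 0x3F   # 6 bits address one of 64 bits (assumed: low 6 bits of the byte)


def vector_bit_gather(src, ctrl, dest, mask=0xFF, zeroing=False):
    """Model the gather over eight elements with per-value masking.

    src, ctrl: lists of eight 64-bit integers (source and control elements).
    dest:      list of eight 8-bit integers (prior destination contents).
    mask:      8-bit integer; bit i governs destination value i (assumed polarity).
    zeroing:   if True, masked-off values become 0; otherwise they are unchanged.
    """
    out = []
    for i in range(ELEMENTS):
        if (mask >> i) & 1:
            value = 0
            for j in range(CONTROL_BYTES):
                control_byte = (ctrl[i] >> (8 * j)) & 0xFF
                bit_index = control_byte & INDEX_MASK    # upper 2 bits ignored
                value |= ((src[i] >> bit_index) & 1) << j
            out.append(value)
        else:
            out.append(0 if zeroing else dest[i])
    return out


def pack_control(byte_values):
    """Pack eight control bytes into one 64-bit control element (helper for the demo)."""
    return sum((b & 0xFF) << (8 * j) for j, b in enumerate(byte_values))


if __name__ == "__main__":
    src = [0x8000000000000001] * ELEMENTS            # only bits 0 and 63 set
    ctrl = [pack_control([0, 63, 1, 62, 0, 0, 63, 63])] * ELEMENTS
    old = [0xAA] * ELEMENTS
    print([hex(v) for v in vector_bit_gather(src, ctrl, old, mask=0b00000101)])
```

In the demo, only values 0 and 2 are enabled by the mask, so the remaining six values either keep their prior contents or are cleared, depending on `zeroing`. This mirrors the "remain unchanged or be set to 0" behavior in claims 8, 17 and 25.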
Register arrangements · CPC title
Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title
using a mask · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Bit or string instructions · CPC title