US9389858B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9389858-B2 |
| Application number | US-201213730846-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 29, 2012 |
| Priority date | Dec 2, 1994 |
| Publication date | Jul 12, 2016 |
| Grant date | Jul 12, 2016 |
Official abstract text for this publication.
An apparatus includes an instruction decoder, first and second source registers and a circuit coupled to the decoder to receive packed data from the source registers and to unpack the packed data responsive to an unpack instruction received by the decoder. A first packed data element and a third packed data element are received from the first source register. A second packed data element and a fourth packed data element are received from the second source register. The circuit copies the packed data elements into a destination register resulting with the second packed data element adjacent to the first packed data element, the third packed data element adjacent to the second packed data element, and the fourth packed data element adjacent to the third packed data element.
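The element ordering described in the abstract can be illustrated with a minimal sketch. It models the two source registers as short Python lists rather than hardware registers; the function name `unpack_interleave` and the labels `e1` to `e4` are hypothetical and do not appear in the patent.

```python
def unpack_interleave(first_src, second_src):
    """Interleave two equal-length element lists, taking the element from
    first_src before the corresponding element from second_src.

    Illustrative model only: lists stand in for packed registers.
    """
    if len(first_src) != len(second_src):
        raise ValueError("source registers must hold the same number of elements")
    result = []
    for a, b in zip(first_src, second_src):
        result.append(a)  # element from the first source register
        result.append(b)  # corresponding element from the second source register
    return result


# The first source register holds the first and third elements;
# the second source register holds the second and fourth elements.
dest = unpack_interleave(["e1", "e3"], ["e2", "e4"])
print(dest)
```

In this model the destination holds the elements in the order first, second, third, fourth, so each element sits next to the one the abstract names as its neighbor.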
Opening claim text (preview).
What is claimed is:

1. A processor comprising: a register file having a plurality of registers; a decoder coupled to the register file, the decoder to decode a first instruction and a packed data pack instruction, the first instruction to: have a 32-bit instruction format, include a first field to identify a first source register of the register file that is to store a first plurality of packed 8-bit integers, include a second field to identify a second source register of the register file that is to store a second plurality of packed 8-bit integers, and include a third field to identify a destination register, wherein each of the first and second pluralities of packed 8-bit integers are to include four packed 8-bit integers; and a functional unit including circuitry coupled to the decoder, the functional unit to generate a result that is to be stored in the destination register responsive to the first instruction, the destination register to have a same number of bits as each of the first and second source registers, wherein the result is to include a third plurality of packed 8-bit integers, the third plurality to include four packed 8-bit integers, the third plurality of packed 8-bit integers to include only half of the 8-bit integers from the first plurality of packed 8-bit integers and only half of the 8-bit integers from the second plurality of packed 8-bit integers, the third plurality of packed 8-bit integers to include two pairs of corresponding 8-bit integers from the first and second pluralities of packed 8-bit integers, the corresponding 8-bit integers of each pair to have same bit positions in the first and second source registers, and wherein the first plurality of packed 8-bit integers is to include a more significant 8-bit integer and a less significant 8-bit integer, wherein the second plurality of packed 8-bit integers is to include a more significant 8-bit integer and a less significant 8-bit integer, wherein the more significant 8-bit integer of the first plurality of packed 8-bit integers is to be stored in a more significant position in the result than the less significant 8-bit integer of the first plurality of packed 8-bit integers, and wherein the more significant 8-bit integer of the second plurality of packed 8-bit integers is to be stored in a more significant position in the result than the less significant 8-bit integer of the second plurality of packed 8-bit integers, and wherein the processor is to perform the packed data pack instruction with saturation.

2. The processor of claim 1, wherein a least significant 8-bit integer of the result is to include a least significant 8-bit integer of the second plurality of packed 8-bit integers.

3. The processor of claim 1, wherein the first and second source registers are 32-bit integer registers.

4. The processor of claim 1, wherein the processor is operable to perform a plurality of other packed data instructions, including at least a packed data addition instruction, a packed data subtraction instruction, and a packed data multiplication instruction.

5. The processor of claim 4, wherein the processor is further operable to perform a packed data shift instruction, and a packed data compare instruction.

6. A system comprising: communications hardware that is operable to couple; a display; a microphone; and a speaker; and a processor coupled with the communications hardware, the processor comprising: a register file having a plurality of registers; a decoder coupled with the register file, the decoder to decode a first instruction and a packed data pack instruction, the first instruction having a 32-bit instruction format, the first instruction having a first field to identify a first source register of the register file that is to store a first plurality of packed 8-bit integers and a second field to identify a second source register of the register file that is to store a second plurality of packed 8-bit integers, the first and second pluralities of packed 8-bit integers each to include four packed 8-bit integers; and a functional unit including circuitry coupled with the decoder, the functional unit to generate a result responsive to the first instruction that is to be stored in a destination register that is to be identified by a third field of the first instruction, the destination register to have a same number of bits as each of the first and second source registers, the result to include a third plurality of packed 8-bit integers, the third plurality to include four packed 8-bit integers, the third plurality of packed 8-bit integers to include only half of the 8-bit integers from the first plurality of packed 8-bit integers and only half of the 8-bit integers from the second plurality of packed 8-bit integers, the third plurality of packed 8-bit integers to include two pairs of corresponding 8-bit integers from the first and second pluralities of packed 8-bit integers, the corresponding 8-bit integers of each pair to have same bit positions in the first and second source registers, and wherein the 8-bit integers from the first plurality of packed 8-bit integers are to be stored in a same order with respect to bit significance in the result as they are to be stored in the first plurality of packed 8-bit integers, and the 8-bit integers from the second plurality of packed 8-bit integers are to be stored in a same order with respect to bit significance in the result as they are to be stored in the second plurality of packed 8-bit integers, and wherein the processor is to perform the packed data pack instruction with saturation.

7. The system of claim 6, wherein a least significant 8-bit integer of the result is to include a least significant 8-bit integer of the second plurality of packed 8-bit integers.

8. The system of claim 6, wherein the processor is operable to perform a plurality of other packed data instructions, including at least a packed data addition instruction, a packed data subtraction instruction, and a packed data multiplication instruction.

9. The system of claim 8, wherein the processor is further operable to perform a packed data shift instruction, and a packed data compare instruction.

10. The system of claim 6, further comprising the display, and wherein the display is to comprise a touch screen display.

11. The system of claim 6, wherein the communications hardware is further operable to couple a video digitizing device, the video digitizing device to capture video images.

12. A method comprising: decoding a packed data pack instruction; performing the packed data pack instruction including saturating at least one source data element identified by the packed data pack instruction to a saturation value; decoding a first instruction, the first instruction having: a 32-bit instruction format, a first field identifying a first source register of a register file that stores a first plurality of packed 8-bit integers, a second field identifying a second source register of the register file that stores a second plurality of packed 8-bit integers, and a third field identifying a destination register, wherein each of the first and second pluralities of packed 8-bit integers include four packed 8-bit integers; and generating a result and storing the result in the destination register responsive to the first instruction, the destination register having a same number of bits as each of the first and second source registers, wherein the result includes a third plurality of packed 8-bit integers, the third plurality including four packed 8-bit integers, wherein the third plurality of packed 8-bit integers includes only half of the 8-bit integers fro
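As a rough illustration of the result layout recited in claims 1 to 3, the sketch below models two 32-bit source registers as Python integers and builds a 32-bit result from the two low-order bytes of each. The claims do not fix which half of each source is used or the exact interleaving, so this is only one arrangement that appears consistent with claims 1 and 2; the names `byte_at` and `unpack_low_bytes` are hypothetical.

```python
def byte_at(reg, index):
    """Return byte `index` of a 32-bit value (index 0 is least significant)."""
    return (reg >> (8 * index)) & 0xFF


def unpack_low_bytes(src1, src2):
    """Build a 32-bit result from the two low-order bytes of each source.

    Assumed layout, most to least significant byte:
        src1.byte1, src2.byte1, src1.byte0, src2.byte0
    """
    a0, a1 = byte_at(src1, 0), byte_at(src1, 1)
    b0, b1 = byte_at(src2, 0), byte_at(src2, 1)
    # Corresponding bytes (same bit positions in both sources) form pairs,
    # and each source's bytes keep their relative order of significance.
    return (a1 << 24) | (b1 << 16) | (a0 << 8) | b0


result = unpack_low_bytes(0xA3A2A1A0, 0xB3B2B1B0)
print(hex(result))  # 0xa1b1a0b0
```

In this arrangement the result holds four bytes, two from each source, the least significant result byte comes from the second source (as in claim 2), and the destination is the same width as each source register.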
of immediate specifier, e.g. constants · CPC title
of variable length instructions · CPC title
Saturation, i.e. clipping the result to a minimum or maximum value · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title
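The saturation behavior named in the CPC titles above, and recited for the packed data pack instruction in the claims, can be sketched as clamping each source element to the range of a narrower destination element. The claims preview does not state the element widths or signedness for the pack instruction, so the signed 8-bit destination assumed here is an illustration only; `saturate_signed` and `pack_with_saturation` are hypothetical names.

```python
def saturate_signed(value, bits):
    """Clamp `value` to the range of a signed two's-complement integer of width `bits`."""
    lowest = -(1 << (bits - 1))
    highest = (1 << (bits - 1)) - 1
    return max(lowest, min(highest, value))


def pack_with_saturation(elements, dest_bits=8):
    """Narrow each element to `dest_bits`, saturating instead of wrapping around."""
    return [saturate_signed(v, dest_bits) for v in elements]


# Values outside [-128, 127] are clipped to the nearest limit.
packed = pack_with_saturation([300, -200, 17, -128])
print(packed)  # [127, -128, 17, -128]
```

Clamping rather than wrapping means an out-of-range value lands on the minimum or maximum representable value, which matches the "clipping the result to a minimum or maximum value" wording of the CPC title.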