US9760371B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9760371-B2 |
| Application number | US-201113976885-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 22, 2011 |
| Priority date | Dec 22, 2011 |
| Publication date | Sep 12, 2017 |
| Grant date | Sep 12, 2017 |
Official abstract text for this publication.
A method of an aspect includes receiving a packed data operation mask register arithmetic combination instruction. The packed data operation mask register arithmetic combination instruction indicates a first packed data operation mask register, indicates a second packed data operation mask register, and indicates a destination storage location. An arithmetic combination of at least a portion of bits of the first packed data operation mask register and at least a corresponding portion of bits of the second packed data operation mask register is stored in the destination storage location in response to the packed data operation mask register arithmetic combination instruction. Other methods, apparatus, systems, and instructions are disclosed.
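The arithmetic combination described in the abstract can be modeled in software. The following is a minimal sketch, not the patented hardware or any real instruction encoding: the names `mask_add` and `mask_sub` are hypothetical, mask registers are modeled as plain Python integers, and wraparound on overflow is an assumption, since the abstract does not say how overflow is handled. The operand widths are the sizes listed in the claims.

```python
ALLOWED_WIDTHS = (8, 16, 32, 64)  # portion sizes named in the claims


def _low_mask(width: int) -> int:
    """Return an integer with the low `width` bits set."""
    if width not in ALLOWED_WIDTHS:
        raise ValueError(f"unsupported portion width: {width}")
    return (1 << width) - 1


def mask_add(src1: int, src2: int, width: int = 16) -> int:
    """Sum the low `width` bits of two mask values (hypothetical model).

    Bits above `width` in the result are zero. Overflow wraps modulo
    2**width, which is an assumption of this sketch.
    """
    low = _low_mask(width)
    return ((src1 & low) + (src2 & low)) & low


def mask_sub(src1: int, src2: int, width: int = 16) -> int:
    """Difference of the low `width` bits of two mask values (hypothetical model)."""
    low = _low_mask(width)
    return ((src1 & low) - (src2 & low)) & low
```

For example, `mask_add(0xFFFF00FF, 0x0001, 16)` uses only the low 16 bits of each source (`0x00FF` and `0x0001`) and leaves every bit above bit 15 of the result cleared.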
Opening claim text (preview).
What is claimed is:

1. A method comprising: receiving a packed data operation mask register binary arithmetic combination instruction, the packed data operation mask register binary arithmetic combination instruction indicating a first packed data operation mask register of a set of architectural packed data operation mask registers, indicating a second packed data operation mask register of a set of packed data operation mask registers, and indicating a destination storage location, wherein the first and second packed data operation mask registers do not store packed data, and wherein the packed data operation mask register binary arithmetic combination instruction is included in an instruction set with a plurality of packed data instructions that indicate registers in the set of packed data operation mask registers as predicate operands to predicate packed data operations; and storing a binary arithmetic combination of at least a portion of bits of the first packed data operation mask register and at least a corresponding portion of bits of the second packed data operation mask register in the destination storage location in response to the packed data operation mask register binary arithmetic combination instruction.
2. The method of claim 1, wherein receiving the instruction comprises receiving a packed data operation mask register addition instruction, and wherein storing the binary arithmetic combination comprises storing a sum of said at least the corresponding portions of the bits of the first and second packed data operation mask registers.
3. The method of claim 1, wherein receiving the instruction comprises receiving a packed data operation mask register subtraction instruction, and wherein storing the binary arithmetic combination comprises storing a difference of said at least the corresponding portions of the bits of the first and second packed data operation mask registers.
4. The method of claim 1, wherein storing the binary arithmetic combination comprises storing a sum of corresponding lowest order portions of the bits of the first and second packed data operation mask registers in a corresponding lowest order portion of bits of the destination storage location, which is a packed data operation mask register, and zeroing a highest order portion of the bits of the destination storage location.
5. The method of claim 1, wherein storing comprises storing the binary arithmetic combination of corresponding same size portions of the bits of the first and second packed data operation mask registers, and wherein the same size portions are one of 8-bits, 16-bits, 32-bits, and 64-bits.
6. The method of claim 1, wherein each bit of the portion of the bits of the first packed data operation mask register corresponds to at least a different data element of a first packed data.
7. The method of claim 1, wherein each bit of the portion of the bits of the first packed data operation mask register comprises carry out information to indicate whether or not a carry out has occurred as a result of an addition of a different corresponding pair of data elements of a first packed data and a second packed data.
8. The method of claim 1, further comprising receiving a second instruction that indicates a packed data register and the first packed data operation mask register as a mask to predicate an operation of the second instruction.
9. The method of claim 1, wherein the binary arithmetic combination comprises a sum, and further comprising using the sum to add large integers that are each 128-bits or wider.
10. An apparatus comprising: a set of general-purpose registers; a set of packed data registers; a set of packed data operation mask registers; a first packed data operation mask register of the set of packed data operation mask registers; a second packed data operation mask register of the set of packed data operation mask registers; and an execution unit coupled with the first and second packed data operation mask registers, the execution unit operable, in response to a packed data operation mask register binary arithmetic combination instruction that is to indicate the first packed data operation mask register, that is to indicate the second packed data operation mask register, and that is to indicate a destination storage location, to store a binary arithmetic combination of at least a portion of bits of the first packed data operation mask register and at least a corresponding portion of bits of the second packed data operation mask register in the destination storage location.
11. The apparatus of claim 10, wherein the instruction comprises a packed data operation mask register addition instruction, and wherein the execution unit is operable, in response to the instruction, to store a sum of at least the corresponding portions of the bits of the first and second packed data operation mask registers in the destination storage location.
12. The apparatus of claim 10, wherein the instruction comprises a packed data operation mask register subtraction instruction, and wherein the execution unit is operable, in response to the instruction, to store a difference of at least the corresponding portions of the bits of the first and second packed data operation mask registers in the destination storage location.
13. The apparatus of claim 10, wherein the execution unit is operable, in response to the instruction, to store a sum of corresponding lowest order portions of the bits of the first and second packed data operation mask registers in a corresponding lowest order portion of bits of the destination storage location, which is a packed data operation mask register, and to zero a highest order portion of the bits of the destination storage location.
14. The apparatus of claim 10, wherein the execution unit is operable, in response to the instruction, to store a binary arithmetic combination of corresponding same size portions of the bits of the first and second packed data operation mask registers, and wherein the same size portions are one of 8-bits, 16-bits, 32-bits, and 64-bits.
15. The apparatus of claim 10, wherein each bit of the portion of the bits of the first packed data operation mask register is to correspond to at least a different data element of a first packed data.
16. The apparatus of claim 10, wherein each bit of the portion of the bits of the first packed data operation mask register is to include carry out information to indicate whether or not a carry out has occurred as a result of an addition of a different corresponding pair of data elements of a first packed data and a second packed data.
17. The apparatus of claim 10, wherein the packed data operation mask register binary arithmetic combination instruction is operable to explicitly specify the first packed data operation mask register, is operable to explicitly specify the second packed data operation mask register, and is operable to explicitly specify the destination storage location which is also a packed data operation mask register.
18. The apparatus of claim 10, wherein the first and second packed data operation mask registers are each 64-bit registers, and wherein the corresponding portions of the bits of the first and second packed data operation mask registers are same size portions one of 8-bits, 16-bits, 32-bits, and 64-bits.
19. The apparatus of claim 10, further comprising a packed data operation mask register file having the first and second packed data operation mask registers.
20. A system comprising: an interconnect; a pr
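Claims 7 and 9 connect the mask sum to carry-out information and to the addition of large integers. The sketch below illustrates one well-known way an add on carry masks can resolve carries across lanes; it is not taken from the patent's description and is not claimed to be the patented method. The names `add_wide` and `to_lanes`, the 64-bit lane width, and the "generate"/"propagate" mask terminology are all assumptions of this example.

```python
LANE_BITS = 64                    # assumed data element width
LANE_MAX = (1 << LANE_BITS) - 1


def to_lanes(value: int, lanes: int) -> list[int]:
    """Split an integer into `lanes` little-endian 64-bit elements (helper)."""
    return [(value >> (LANE_BITS * i)) & LANE_MAX for i in range(lanes)]


def add_wide(a_lanes: list[int], b_lanes: list[int]) -> tuple[list[int], int]:
    """Add two multi-lane integers; return (sum_lanes, carry_out)."""
    n = len(a_lanes)

    # Step 1: independent lane-wise additions, as a packed add would do.
    partial = [(a + b) & LANE_MAX for a, b in zip(a_lanes, b_lanes)]

    # Build two masks with one bit per lane:
    #   generate  - the lane overflowed (carry out occurred)
    #   propagate - the lane sum is all ones, so an incoming carry passes through
    generate = 0
    propagate = 0
    for i, (a, s) in enumerate(zip(a_lanes, partial)):
        if s < a:
            generate |= 1 << i
        if s == LANE_MAX:
            propagate |= 1 << i

    # Step 2: a single addition on the masks ripples carries through every
    # run of propagating lanes; XOR with `propagate` leaves the carry-in bits.
    carry_in = ((generate << 1) + propagate) ^ propagate

    # Step 3: add each lane's carry-in bit; bit n is the overall carry out.
    result = [(s + ((carry_in >> i) & 1)) & LANE_MAX for i, s in enumerate(partial)]
    return result, (carry_in >> n) & 1
```

With two lanes, adding 1 to a 128-bit value of all ones gives `generate = 0b01` and `propagate = 0b10`; the mask addition turns these into carry-in bits for lane 1 and for the overall carry out, so both result lanes become zero and the carry out is 1.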
with variable precision · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Bit or string instructions · CPC title
Arithmetic instructions · CPC title
using a mask · CPC title