US9459866B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9459866-B2 |
| Application number | US-201113993058-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 30, 2011 |
| Priority date | Dec 30, 2011 |
| Publication date | Oct 4, 2016 |
| Grant date | Oct 4, 2016 |
Official abstract text for this publication.
A processor core includes a hardware decode unit to decode a vector frequency compress instruction that includes a source operand and a destination operand. The source operand specifies a source vector register that includes a plurality of source data elements, including one or more runs of identical data elements that are each to be compressed in a destination vector register as a value and run length pair. The destination operand identifies the destination vector register. The processor core also includes an execution engine unit to execute the decoded vector frequency compress instruction, which causes, for each source data element, a value to be copied into the destination vector register to indicate that source data element's value. One or more runs of the source data elements equal to a predetermined compression value are encoded in the destination vector register as the predetermined compression value followed by a run length for that run.
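The encoding described in the abstract can be sketched in software. The following is a minimal Python model, not the patented hardware: the function name `frequency_compress`, the default compression value of 0, and the use of Python lists in place of vector registers are all assumptions made for illustration.

```python
def frequency_compress(src, compression_value=0):
    """Illustrative software model of the encoding in the abstract.

    Elements that differ from compression_value are copied unchanged.
    Each run of elements equal to compression_value is replaced by the
    pair (compression_value, run_length).
    """
    dest = []
    i = 0
    while i < len(src):
        if src[i] == compression_value:
            # Measure the length of the run starting at position i.
            run = 1
            while i + run < len(src) and src[i + run] == compression_value:
                run += 1
            dest.extend([compression_value, run])
            i += run
        else:
            # Any other value is copied straight through.
            dest.append(src[i])
            i += 1
    return dest


print(frequency_compress([5, 0, 0, 0, 7, 0, 9, 9]))
# prints [5, 0, 3, 7, 0, 1, 9, 9]
```

In this model a run of length one still becomes a two-element pair, so an isolated compression value makes the output longer, while longer runs make it shorter.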
Opening claim text (preview).
What is claimed is:
1. A method of performing an instruction in a computer processor, comprising: fetching the instruction that includes a source operand and a destination operand, wherein the source operand specifies a single source vector register that includes a plurality of source data elements including one or more runs of identical data elements, wherein the destination operand identifies a destination vector register and wherein each of the one or more runs of identical values that are to be compressed in the destination vector register as a value and run length pair; decoding the fetched instruction; and executing the decoded instruction causing, for each source data element, a value to be copied into the destination vector register to indicate that source data element's value wherein one or more runs of one or more source data elements equal to a compression value are encoded in the destination vector register as the predetermined compression value followed by a run length for that run.
2. The method of claim 1, wherein the instruction further comprises the compression value that is to be encoded to a value and run length pair.
3. The method of claim 1, wherein the executing the decoded instruction further causes an exception be raised when the source data elements cannot be compressed into the destination vector register because the source data elements do not contain values that are optimized for run length encoding.
4. The method of claim 1, wherein the executing the decoded instruction further causes a value be written in a used element indicator to indicate which elements in the destination vector register were written during compression.
5. The method of claim 4, wherein the fetched instruction further comprises a used element indicator destination to indicate where the used element indicator should be written.
6. The method of claim 1, wherein the fetched instruction further comprises a control mask that indicates one or more values from the source data elements to be copied to the destination vector register.
7. The method of claim 6, wherein the executing the decoded instruction further causes determining the compression value by reading the control mask.
8. A processor core, comprising: a hardware decode unit to decode an instruction, wherein the vector frequency compress instruction includes a single source operand and a destination operand, wherein the source operand specifies a source vector register that includes a plurality of source data elements including one or more runs of identical data elements, wherein the destination operand identifies a destination vector register and wherein each of the one or more runs of identical values that are to be compressed in the destination vector register as a value and run length pair; and an execution engine unit to execute the decoded instruction which causes, for each source data element, a value to be copied into the destination vector register to indicate that source data element's value wherein one or more runs of one or more source data elements equal to a compression value are encoded in the destination vector register as the predetermined compression value followed by a run length for that run.
9. The processor core of claim 8, wherein the instruction further comprises the compression value that is to be encoded to a value and run length pair.
10. The processor core of claim 8, the execution unit further causes an exception be raised when the source data elements cannot be compressed into the destination vector register because the source data elements do not contain values that are optimized for run length encoding.
11. The processor core of claim 8, the execution unit further causes a value be written in a used element indicator to indicate which elements in the destination vector register were written during compression.
12. The processor core of claim 11, wherein the instruction further comprises a used element indicator destination to indicate where the used element indicator should be written.
13. The processor core of claim 8, wherein the instruction further comprises a control mask that indicates one or more values from the source data elements to be copied to the destination vector register.
14. The processor core of claim 13, the execution unit further causes determining the compression value by reading the control mask.
15. An article of manufacture, comprising: a non-transitory machine-readable storage medium having stored thereon an instruction, wherein the instruction includes a single source operand and a destination operand, wherein the source operand specifies a source vector register that includes a plurality of source data elements including one or more runs of identical data elements, wherein the destination operand identifies a destination vector register and wherein each of the one or more runs of identical values that are to be compressed in the destination vector register as a value and run length pair; and wherein the instruction includes an opcode, which instructs a machine to execute the instruction that causes, for each source data element, a value to be copied into the destination vector register to indicate that source data element's value wherein one or more runs of one or more source data elements equal to a compression value are encoded in the destination vector register as the predetermined compression value followed by a run length for that run.
16. The article of manufacture of claim 15, wherein the instruction further comprises the compression value that is to be encoded to a value and run length pair.
17. The article of manufacture of claim 15, wherein the instruction further causes the machine to raise an exception when the source data elements cannot be compressed into the destination vector register because the source data elements do not contain values that are optimized for run length encoding.
18. The article of manufacture of claim 15, wherein the instruction further causes the machine to write a value in a used element indicator to indicate which elements in the destination vector register were written during compression.
19. The article of manufacture of claim 18, wherein the instruction further comprises a used element indicator destination to indicate where the used element indicator should be written.
20. The article of manufacture of claim 15, wherein the instruction further comprises a control mask that indicates one or more values from the source data elements to be copied to the destination vector register.
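The dependent claims add two behaviours that a short model can make concrete: an exception when the encoded data does not fit in the destination register, and a used element indicator recording which destination elements were written. The Python sketch below is one possible reading under stated assumptions. The names `CompressionOverflow` and `frequency_compress_fixed`, the bitmask representation of the indicator, the destination width defaulting to the source width, and zero-filling of unwritten elements are all hypothetical and are not specified in the claims.

```python
from itertools import groupby


class CompressionOverflow(Exception):
    """Hypothetical exception: the encoded data does not fit the destination."""


def frequency_compress_fixed(src, compression_value=0, dest_width=None):
    """Model a fixed-width destination register with a used element indicator.

    Returns (dest, used_mask), where bit k of used_mask is set when
    destination element k was written during compression.
    """
    width = len(src) if dest_width is None else dest_width

    encoded = []
    for value, group in groupby(src):
        count = len(list(group))
        if value == compression_value:
            # A run of the compression value becomes (value, run_length).
            encoded.extend([value, count])
        else:
            # Other values are copied element by element.
            encoded.extend([value] * count)

    if len(encoded) > width:
        # Assumed trigger: the result is wider than the destination.
        raise CompressionOverflow(
            f"need {len(encoded)} elements, destination holds {width}"
        )

    # Assumption: unwritten destination elements are modelled as zero.
    dest = encoded + [0] * (width - len(encoded))
    used_mask = (1 << len(encoded)) - 1
    return dest, used_mask


dest, mask = frequency_compress_fixed([0, 0, 0, 0, 3, 3, 0, 0])
print(dest, bin(mask))
# prints [0, 4, 3, 3, 0, 2, 0, 0] 0b111111
```

An input such as `[0, 1, 0, 1, 0, 1, 0, 1]` raises `CompressionOverflow` in this model, because every isolated compression value expands to a two-element pair and the 12-element result no longer fits in eight destination elements. The indicator is needed because a written pair such as `0, 2` cannot otherwise be told apart from trailing unwritten elements.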
Decoding the operand specifier, e.g. specifier format · CPC title
Decoder aspects · CPC title
Bit or string instructions · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Conversion to or from run-length codes, i.e. by representing the number of consecutive digits, or groups of digits, of the same kind by a code word and a digit indicative of that kind · CPC title