Accelerating eight-way parallel keccak execution
US-2024211268-A1 · Jun 27, 2024 · US
US2017192781A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2017192781-A1 |
| Application number | US-201514984124-A |
| Country | US |
| Kind code | A1 |
| Filing date | Dec 30, 2015 |
| Priority date | Dec 30, 2015 |
| Publication date | Jul 6, 2017 |
| Grant date | — |
Official abstract text for this publication.
Detailed herein are systems, apparatuses, and methods for strided loads. In an embodiment, an apparatus includes a decoder to decode an instruction, wherein the instruction to include fields a starting source memory address operand and a starting destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from contiguous memory beginning at the starting source memory address and, for each type, store the extracted data elements in a packed data register dedicated to that type beginning with starting destination register operand.
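To make the operation in the abstract concrete, here is a minimal software sketch of a strided load. It assumes the "types" are interleaved in memory (for example, x, y, z components stored as xyzxyz...), which is one plausible reading rather than something the abstract states outright. The function name `strided_load` and its parameters are hypothetical and do not come from the patent; Python lists stand in for memory and packed data registers.

```python
from typing import List, Sequence


def strided_load(memory: Sequence[int], start: int, num_types: int,
                 elems_per_reg: int) -> List[List[int]]:
    """Software model of a strided load (hypothetical helper, not the patent's API).

    Reads num_types * elems_per_reg contiguous elements beginning at `start`
    and returns one list ("register") per type.
    """
    # One destination "register" per type, modeled as a Python list.
    regs = [[0] * elems_per_reg for _ in range(num_types)]
    for i in range(elems_per_reg):
        for t in range(num_types):
            # Assumption: element i of type t sits at offset i * num_types + t.
            regs[t][i] = memory[start + i * num_types + t]
    return regs


# Interleaved x, y, z data: x0 y0 z0 x1 y1 z1 ...
memory = [10, 20, 30, 11, 21, 31, 12, 22, 32, 13, 23, 33]
x_reg, y_reg, z_reg = strided_load(memory, start=0, num_types=3, elems_per_reg=4)
print(x_reg)  # [10, 11, 12, 13]
print(y_reg)  # [20, 21, 22, 23]
print(z_reg)  # [30, 31, 32, 33]
```

In this model, the returned list order plays the role of consecutive destination registers beginning with the starting destination register operand.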
Opening claim text (preview).
What is claimed is:
1. An apparatus comprising: a decoder to decode an instruction, wherein the instruction to include fields a starting source memory address operand and a starting destination register operand; and execution circuitry to execute the decoded instruction to extract data elements of a defined number of types from contiguous memory beginning at the starting source memory address and, for each type, store the extracted data elements in a packed data register dedicated to that type beginning with starting destination register operand.
2. The apparatus of claim 1, wherein the instruction to include an opcode indicating the defined number of types.
3. The apparatus of claim 2, wherein the defined number of types are two, three, and four.
4. The apparatus of claim 1, wherein the defined number of types indicates a number of destination packed data registers.
5. The apparatus of claim 1, wherein the instruction to indicate a size of the data elements.
6. The apparatus of claim 1, wherein the instruction to include a writemask operand.
7. The apparatus of claim 7, the execution circuitry to store extracted data element based on values of the writemask operand.
8. A method comprising: decoding an instruction, wherein the instruction to include fields a starting source memory address operand and a starting destination register operand; and executing the decoded instruction to extract data elements of a defined number of types from contiguous memory beginning at the starting source memory address and, for each type, store the extracted data elements in a packed data register dedicated to that type beginning with starting destination register operand.
9. The method of claim 8, wherein the instruction to include an opcode indicating the defined number of types.
10. The method of claim 9, wherein the defined number of types are two, three, and four.
11. The method of claim 8, wherein the defined number of types indicates a number of destination packed data registers.
12. The method of claim 8, wherein the instruction to indicate a size of the data elements.
13. The method of claim 8, wherein the instruction to include a writemask operand.
14. The method of claim 8, wherein the storing of extracted data element is based on values of the writemask operand.
15. A non-transitory machine readable medium storing an instruction, which when executed causes a processor to perform a method, the method comprising: decoding an instruction, wherein the instruction to include fields a starting source memory address operand and a starting destination register operand; and executing the decoded instruction to extract data elements of a defined number of types from contiguous memory beginning at the starting source memory address and, for each type, store the extracted data elements in a packed data register dedicated to that type beginning with starting destination register operand.
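Claims 6, 7, 13, and 14 add a writemask operand that governs which extracted elements are stored. The claims do not spell out the masking semantics, so the sketch below makes two assumptions: one mask bit per element position, applied to every destination register, and merging behavior, where masked-off positions keep their previous contents. The name `strided_load_masked` and its parameters are hypothetical.

```python
from typing import List, Sequence


def strided_load_masked(memory: Sequence[int], start: int,
                        dest_regs: List[List[int]], writemask: int) -> None:
    """Masked strided load model (hypothetical helper, assumed semantics).

    Updates dest_regs in place. Bit i of `writemask` enables the store of
    element position i in every destination register.
    """
    num_types = len(dest_regs)
    for i in range(len(dest_regs[0])):
        if (writemask >> i) & 1:
            for t in range(num_types):
                # Same interleaved layout assumption: type t of group i.
                dest_regs[t][i] = memory[start + i * num_types + t]
        # Assumption: positions with a cleared mask bit are left unchanged.


memory = [1, 2, 3, 4, 5, 6, 7, 8]
regs = [[0, 0, 0, 0], [0, 0, 0, 0]]
strided_load_masked(memory, 0, regs, writemask=0b0101)
print(regs)  # [[1, 0, 5, 0], [2, 0, 6, 0]]
```

A zeroing variant, where masked-off positions are cleared instead of preserved, would be an equally plausible reading of the claim language.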
Register arrangements · CPC title
Decoding the operand specifier, e.g. specifier format · CPC title
using a mask · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
having multiple operands in a single register · CPC title