US9600280B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9600280-B2 |
| Application number | US-201314034651-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 24, 2013 |
| Priority date | Sep 24, 2013 |
| Publication date | Mar 21, 2017 |
| Grant date | Mar 21, 2017 |
Official abstract text for this publication.
A hazard check instruction has operands that specify addresses of vector elements to be read by first and second vector memory operations. The hazard check instruction outputs a dependency vector identifying, for each element position of the first vector corresponding to the first vector memory operation, which element position of the second vector that the element of the first vector depends on (if any). In an embodiment, at least one of the vector memory operations has addresses specified using a scalar address in the operands (and a vector attribute associated with the vector). In an embodiment, the operands may include predicates for one or both of the vector memory operations, indicating which vector elements are active. The dependency vector may be qualified by the predicates, indicating dependencies only for active elements.
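The behavior described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the function name `hazard_check`, the use of `None` for "no dependency", exact address equality as the hazard test, and the rule that an element may only depend on an earlier element position of the second vector (nearest one wins) are all illustrative choices that the abstract does not specify.

```python
from typing import List, Optional, Sequence


def hazard_check(
    first_addrs: Sequence[int],
    second_addrs: Sequence[int],
    first_pred: Optional[Sequence[bool]] = None,
    second_pred: Optional[Sequence[bool]] = None,
) -> List[Optional[int]]:
    """Return a dependency vector for two vector memory operations.

    Entry i is the element position of the second vector that element i
    of the first vector depends on, or None if there is no dependency.

    Assumptions (not taken from the source):
      * a dependency means the two element addresses are equal;
      * only earlier positions (j < i) are considered, nearest first;
      * inactive elements (predicate False) never report or cause one.
    """
    n = len(first_addrs)
    if len(second_addrs) != n:
        raise ValueError("both vectors must have the same number of elements")

    # A missing predicate means every element is active.
    p1 = [True] * n if first_pred is None else list(first_pred)
    p2 = [True] * n if second_pred is None else list(second_pred)
    if len(p1) != n or len(p2) != n:
        raise ValueError("predicates must match the vector length")

    deps: List[Optional[int]] = [None] * n
    for i in range(n):
        if not p1[i]:
            continue  # inactive first-vector element: no dependency reported
        # Scan earlier positions of the second vector, nearest first.
        for j in range(i - 1, -1, -1):
            if p2[j] and second_addrs[j] == first_addrs[i]:
                deps[i] = j
                break
    return deps


first = [0x100, 0x104, 0x108, 0x10C]
second = [0x104, 0x200, 0x10C, 0x300]

unqualified = hazard_check(first, second)
# Deactivating element 0 of the second vector hides the dependency of element 1.
qualified = hazard_check(first, second, second_pred=[False, True, True, True])
```

In this model the predicates qualify the result in the way the abstract describes: a dependency is indicated only when both the dependent element and the element it depends on are active.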
Opening claim text (preview).
What is claimed is:

1. A processor comprising: an execution core configured to execute an instruction having a plurality of operands stored in a plurality of operand registers identified by the instruction, wherein: the plurality of operands of the instruction specify a first one or more addresses and a second one or more addresses; at least the first one or more addresses are contiguous in an address range; the address range begins at a scalar address that is included in the plurality of operands of the instruction; the address range ends at a second address determined from the scalar address and a vector attribute that is stored in one of the plurality of operand registers identified by the instruction, wherein the vector attribute specifies size information of a first vector having vector elements stored in the address range; and the execution core is configured, responsive to executing the instruction, to: detect whether or not a dependency exists between the first one or more addresses specified by the plurality of operand of the instructions and the second one or more addresses specified by the plurality of operands of the instruction; and generate a dependency vector that indicates, for each first element of the first vector stored in the address range that depends on a second element of a second vector having elements stored at the second one or more addresses, which second element that the first element depends on.

2. The processor as recited in claim 1 wherein the plurality of operands include a second scalar address on which the second one or more addresses are based.

3. The processor as recited in claim 1 wherein the second one or more addresses are specified by a vector of addresses in the plurality of operands.

4. The processor as recited in claim 1 wherein the plurality of operands further include a predicate corresponding to the second vector memory operation, wherein the predicate defines active elements of a second vector accessed by the second vector memory, and wherein the execution core is configured to indicate dependencies on the second elements only if the second elements are active.

5. The processor as recited in claim 4 wherein the attribute is stored in a first operand register of the plurality of operand registers, and the first operand register also stores the predicate, and wherein the attribute specifies at least one of a size of a vector element, a size of a vector, or a number of elements in the vector.

6. The processor as recited in claim 4 wherein the plurality of operands further include a second predicate corresponding to the first memory operation.

7. A method comprising executing an instruction in a processor, the instruction having a plurality of operands stored in a plurality of operand registers identified by the instruction, wherein: the plurality of operands of the instruction specify a first one or more addresses and a second one or more addresses; at least the first one or more addresses are contiguous in an address range; the address range begins at a scalar address in the plurality of operands of the instruction; the address range ends at a second address determined from the scalar address and a vector attribute that is stored in one of the plurality of operand registers identified by the instruction; and the vector attribute specifies size information of a first vector having vector elements stored in the address range; and executing the instruction includes: detecting whether or not a dependency exists between one or more addresses specified by the plurality of operands of the instructions and the second one or more addresses specified by the plurality of operands of the instruction; and generating a dependency vector that indicates, where a dependency exists for a first element of the first vector stored in the address range on a second element of a second vector having elements stored at the second one or more addresses, an element position of the second element.

8. The method as recited in claim 7 wherein the scalar address defines a beginning of the address range.

9. The method as recited in claim 8 wherein the vector attribute defines an extent of the address range.

10. The method as recited in claim 7 wherein the plurality of operands further include a predicate corresponding to the second vector memory operation, wherein the predicate defines active elements of a second vector accessed by the second vector memory, and generating the dependency vector comprises indicating dependencies only if the second elements are active.

11. A processor comprising: an execution core configured to execute an instruction having a plurality of operands stored in a plurality of operand registers identified by the instruction, wherein: the plurality of operands of the instruction specify a first one or more and a second one or more addresses; the plurality of operands further include a first predicate from a first register of the plurality of operand registers, the first predicate defining active elements of a vector and the first register stores, in addition to the first predicate, an attribute that defines at least one of a size of vector elements in the corresponding vector memory operation, a number of vector elements in the corresponding vector memory operation, or a size of a vector; and the execution core is configured, responsive to executing the instruction, to: detect whether or not a dependency exists between the first one or more addresses specified by the plurality of operands of the instruction and the second one or more addresses specified by the plurality of operands of the instruction, and the detection of whether or not a dependency exists is based on the attribute, and generate a dependency vector that indicates, for each first element of a first vector stored at the first one or more addresses that depends on a second element of a second vector memory operation stored at the second one or more addresses, which second element that the first element depends on.

12. The processor as recited in claim 11 wherein the first one or more addresses are defined by a scalar address in the one or more operand registers and the attribute.

13. The processor as recited in claim 12 wherein the second one or more addresses are defined by a second scalar address and a second attribute.

14. The processor as recited in claim 12 wherein the second one or more addresses are a vector of addresses.
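Claims 1, 8, and 9 describe a contiguous address range that begins at a scalar address and ends at an address derived from that scalar address and a vector attribute carrying size information. A minimal sketch of that arithmetic follows. The `VectorAttribute` type, its field names, the helper functions, and the choice of an exclusive end address are hypothetical and for illustration only; the claims do not fix an encoding.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class VectorAttribute:
    """Hypothetical attribute: size information for one vector."""
    element_size: int   # bytes per vector element
    num_elements: int   # number of elements in the vector


def address_range(base: int, attr: VectorAttribute) -> Tuple[int, int]:
    """Return (start, end) of the contiguous range, with end exclusive.

    The range begins at the scalar address; its extent is the element
    size multiplied by the element count (an assumed interpretation).
    """
    return base, base + attr.element_size * attr.num_elements


def element_addresses(base: int, attr: VectorAttribute) -> List[int]:
    """Expand the scalar address into one address per vector element."""
    return [base + k * attr.element_size for k in range(attr.num_elements)]


def ranges_overlap(a: Tuple[int, int], b: Tuple[int, int]) -> bool:
    """True if two half-open ranges share at least one byte."""
    return a[0] < b[1] and b[0] < a[1]


attr = VectorAttribute(element_size=4, num_elements=8)
first_range = address_range(0x1000, attr)    # 32 bytes starting at 0x1000
second_range = address_range(0x1010, attr)   # starts inside the first range
addrs = element_addresses(0x1000, attr)
```

Expanding a scalar address into per-element addresses this way lets a scalar-addressed operation be compared, element by element, against either another scalar-addressed operation (claims 2 and 13) or a vector of addresses (claims 3 and 14).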
Compare instructions, e.g. Greater-Than, Equal-To, MINMAX · CPC title
according to data content, e.g. floating-point registers, address registers · CPC title
to perform conditional operations, e.g. using predicates or guards · CPC title
Dependency mechanisms, e.g. register scoreboarding · CPC title
Details on data memory access · CPC title