US2016342418A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016342418-A1 |
| Application number | US-201615226714-A |
| Country | US |
| Kind code | A1 |
| Filing date | Aug 2, 2016 |
| Priority date | Dec 28, 2012 |
| Publication date | Nov 24, 2016 |
| Grant date | — |
Official abstract text for this publication.
An apparatus is described having a functional unit of an instruction execution pipeline. The functional unit has a plurality of compare-and-exchange circuits coupled to network circuitry to implement a vector sorting tree for a vector sorting instruction. Each of the compare-and-exchange circuits has a respective comparison circuit that compares a pair of inputs. Each of the compare-and-exchange circuits have a same sided first output for presenting a higher of the two inputs and a same sided second output for presenting a lower of the two inputs, said comparison circuit to also support said functional unit's execution of a prefix min and/or prefix add instruction.
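The abstract's compare-and-exchange arrangement can be illustrated in software. The sketch below is not taken from the patent: it assumes a Batcher odd-even merge network (one well-known network in which every comparator routes the larger value to the same side), a power-of-two vector length, and hypothetical function names (`compare_exchange`, `sorting_network`, `sort_vector`). The hardware wiring in the actual disclosure may differ.

```python
def compare_exchange(a, b):
    """Model of one compare-and-exchange cell.

    The higher input always appears on the first output and the lower
    input on the second output, mirroring the "same sided" outputs
    described in the abstract.
    """
    return (a, b) if a >= b else (b, a)


def sorting_network(n):
    """Return comparator pairs for a Batcher odd-even merge sort network.

    Assumption: n is a power of two. This is a stand-in network chosen
    for illustration, not the patent's specific tree.
    """
    pairs = []
    p = 1
    while p < n:                      # p: size of the sorted runs being merged
        k = p
        while k >= 1:                 # k: comparator span within this merge
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    # Only compare lanes that fall in the same 2p-wide block.
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        pairs.append((i + j, i + j + k))
            k //= 2
        p *= 2
    return pairs


def sort_vector(vec):
    """Pass a vector through the network; the result is in descending order."""
    out = list(vec)
    for first, second in sorting_network(len(out)):
        out[first], out[second] = compare_exchange(out[first], out[second])
    return out


print(sort_vector([3, 1, 4, 1, 5, 9, 2, 6]))
```

Because the sequence of comparisons is fixed and independent of the data, comparators that touch disjoint lanes could in principle operate in parallel, which is what makes this kind of network a natural fit for a hardware functional unit.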
Opening claim text (preview).
1. An apparatus, comprising: a functional unit of an instruction execution pipeline have a plurality of compare-and-exchange circuits coupled to network circuitry to implement a vector sorting tree for a vector sorting instruction, each of said compare-and-exchange circuits having a respective comparison circuit that compares a pair of inputs, each of said compare-and-exchange circuits having a same sided first output for presenting a higher of the two inputs and a same sided second output for presenting a lower of the two inputs, said comparison circuit to also support said functional unit's execution of a prefix min and/or prefix add instruction.
2. The apparatus of claim 1 wherein said functional unit supports sorting of different sized vectors.
3. The apparatus of claim 2 wherein a particular one of said sizes is specified with an immediate operand of said vector sorting instruction.
4. The apparatus of claim 2 wherein said different sized vectors include 2 elements, 4 elements, 8 elements and 16 elements.
5. The apparatus of claim 2 wherein said functional unit can simultaneously sort two vectors whose size is less than a maximum vector size that can be sorted through said vector sorting tree.
6. The apparatus of claim 1 wherein said network circuitry includes a configurable switching network.
7. The apparatus of claim 6 wherein said functional unit includes a memory circuit containing microcode that present control signals to said configurable switching network for said vector sorting instruction.
8. An apparatus, comprising: a functional unit of an instruction execution pipeline have a plurality of compare-and-exchange circuits coupled to network circuitry to implement a vector sorting tree for a vector sorting instruction, each of said compare-and-exchange circuits having a respective comparison circuit that compares a pair of inputs, each of said compare-and-exchange circuits having a same sided first output for presenting a higher of the two inputs and a same sided second output for presenting a lower of the two inputs, each of said circuits also having any of: an adder to implement a prefix add instruction with said functional unit; a multiplier to implement a prefix multiply instruction with said functional unit.
9. The apparatus of claim 8 wherein said functional unit supports sorting of different sized vectors.
10. The apparatus of claim 9 wherein a particular one of said sizes is specified with an immediate operand of said vector sorting instruction.
11. The apparatus of claim 9 wherein said different sized vectors include 2 elements, 4 elements, 8 elements and 16 elements.
12. The apparatus of claim 9 wherein said functional unit can simultaneously sort two vectors whose size is less than a maximum vector size that can be sorted through said vector sorting tree.
13. The apparatus of claim 8 wherein said network circuitry includes a configurable switching network.
14. The apparatus of claim 13 wherein said functional unit includes a memory circuit containing microcode that present control signals to said configurable switching network for said vector sorting instruction.
15. The apparatus of claim 8 wherein said comparator of each of said circuits is also used to implement any of the following with said functional unit: a prefix min instruction; a prefix max instruction.
16. A method, comprising: performing the following with functional unit circuitry of an instruction execution pipeline to perform a vector sorting instruction: simultaneously receiving a first vector and a second vector; passing elements of said first and second vectors through a plurality of comparison-and-exchange circuits that implement a sorting tree to sort said first and second vectors, wherein, each of said comparison-and-exchange circuits perform the following: compare a pair of said elements; present a higher of the pair of elements on a same sided first output; present a lower of the pair of elements on a same sided second output.
17. The method of claim 16 wherein said instruction specifies a size of said first and second vectors.
18. The method of claim 17 wherein said functional unit uses said size to determine how many stages of said sorting tree said elements are to pass through.
19. The method of claim 16 further comprising executing any of a prefix sum instruction or prefix add instruction with said functional unit.
20. The method of claim 16 further comprising executing any of a prefix min instruction or prefix max instruction with said functional unit.
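Two ideas in the claims lend themselves to a short software sketch: sorting two smaller vectors at once by passing them through only some stages of the tree (claims 5 and 16 to 18), and reusing the same lanes for prefix operations (claims 19 and 20). The code below is an illustration under stated assumptions, not the patent's circuit: it again assumes a Batcher odd-even merge network with power-of-two sizes, models the prefix operations as a generic log-step scan, and uses hypothetical names (`network_stages`, `sort_packed`, `prefix_scan`).

```python
from operator import add


def network_stages(width, size):
    """Comparator pairs for a `width`-lane odd-even merge network, truncated
    to the stages needed to sort each aligned group of `size` lanes.

    Assumption: `width` and `size` are powers of two and size <= width.
    """
    pairs = []
    p = 1
    while p < size:                   # stop early: fewer stages for smaller vectors
        k = p
        while k >= 1:
            for j in range(k % p, width - k, 2 * k):
                for i in range(min(k, width - j - k)):
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        pairs.append((i + j, i + j + k))
            k //= 2
        p *= 2
    return pairs


def sort_packed(lanes, size):
    """Sort every aligned `size`-element vector packed into `lanes`.

    Each group ends up in descending order (higher value on the first output).
    """
    out = list(lanes)
    for first, second in network_stages(len(out), size):
        if out[first] < out[second]:
            out[first], out[second] = out[second], out[first]
    return out


def prefix_scan(lanes, op):
    """Inclusive prefix operation computed in log2(n) lock-step stages.

    `op` stands in for the per-cell adder or comparator (add, min, max).
    """
    out = list(lanes)
    stride = 1
    while stride < len(out):
        prev = list(out)              # every lane reads the previous stage's values
        for i in range(stride, len(out)):
            out[i] = op(prev[i - stride], prev[i])
        stride *= 2
    return out


data = [3, 1, 4, 1, 5, 9, 2, 6]
print(sort_packed(data, 4))           # two 4-element vectors sorted side by side
print(prefix_scan(data, add))         # prefix add
print(prefix_scan(data, min))         # prefix min
```

In this model the `size` argument plays the role of the size specified by the instruction: it only controls how many stages of the network the elements pass through, so the same eight lanes can sort one 8-element vector or two 4-element vectors.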
Sorting, i.e. extracting data from one or more carriers, rearranging the data in numerical or other ordered sequence, and rerecording the sorted data on the original carrier or on a different carrier or set of carriers {sorting methods in general}(G06F7/36 takes precedence) · CPC title
controlled in tandem, e.g. multiplier-accumulator · CPC title
Arithmetic instructions · CPC title
Compare instructions, e.g. Greater-Than, Equal-To, MINMAX · CPC title
Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title