US9606770B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9606770-B2 |
| Application number | US-201414559160-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 3, 2014 |
| Priority date | Sep 24, 2010 |
| Publication date | Mar 28, 2017 |
| Grant date | Mar 28, 2017 |
Official abstract text for this publication.
A method is described that involves executing a first instruction with a functional unit. The first instruction is a multiply-add instruction. The method further includes executing a second instruction with the functional unit. The second instruction is a round instruction.
Opening claim text (preview).
The invention claimed is:
1. A processor, comprising: a decode unit to decode a multiply-add instruction and a round instruction, wherein the round instruction is to have an input term to specify how many places a mantissa value is to be rounded to; an execution unit to execute the multiply-add instruction, the round instruction, and a plurality of other types of floating point instructions, the execution unit having exponent difference calculation logic, a multiplier and an adder to support the multiply-add instruction, the execution unit also having a rounder to support the round instruction that is to have the input term to specify how many binary places the mantissa value is to be rounded to, the exponent difference calculation logic having a shifter to support execution of the multiply-add instruction and the round instruction.
2. The processor of claim 1, wherein the shifter is also to support a scale instruction to be executed by the execution unit.
3. The processor of claim 2, wherein the exponent difference calculation logic includes a second shifter to support execution of the multiply-add instruction, wherein, the second shifter also supports a get exponent instruction to be executed by the execution unit.
4. The processor of claim 1, wherein the exponent difference calculation logic includes a second shifter to support execution of the multiply-add instruction, wherein, the second shifter is also to support a get exponent instruction to be executed by the execution unit.
5. The processor of claim 4, wherein the first shifter, the second shifter and the rounder are also to support a reduce instruction to be executed by the execution unit.
6. The processor of claim 1, wherein the first shifter and the rounder are also to support a reduce instruction to be executed by the execution unit.
7. The processor of claim 1, wherein the round instruction is to specify a packed data operand.
8. The processor of claim 1, wherein the multiply-add instruction is to specify a packed data operand.
9. The processor of claim 1, further comprising a masking layer to enable an operation for at least one element of an operand of the multiply-add instruction and to disable an operation for at least one other element of the operand.
10. A processor comprising: an execution unit to execute a multiply-add instruction and a round instruction, the execution unit having exponent difference calculation logic, a multiplier and an adder to support the multiply-add instruction, the execution unit also having a rounder to support the round instruction that is to specify a number of binary places a mantissa value is to be rounded to, the exponent difference calculation logic having a shifter to support execution of the multiply-add instruction and the round instruction, wherein the exponent difference calculation logic, when being used to execute the round instruction, is to right shift the mantissa value based on the number of binary places the mantissa value is to be rounded to which is to be specified by the round instruction.
11. The processor of claim 10, wherein the shifter is also to support a scale instruction to be executed by the execution unit.
12. The processor of claim 11, wherein the exponent difference calculation logic includes a second shifter to support execution of the multiply-add instruction, wherein, the second shifter is also to support a get exponent instruction to be executed by the execution unit.
13. The processor of claim 10, wherein the exponent difference calculation logic includes a second shifter to support execution of the multiply-add instruction, wherein, the second shifter is also to support a get exponent instruction to be executed by the execution unit.
14. The processor of claim 13, wherein the shifter, the second shifter and the rounder are also to support a reduce instruction to be executed by the execution unit.
15. The processor of claim 10, wherein the execution unit is also to execute a plurality of other types of floating point instructions.
16. The processor of claim 10, wherein the round instruction is to specify a packed data operand.
17. A processor comprising: a decode unit to decode instructions including a multiply-add instruction and a scale instruction, the scale instruction to have a first floating point value and a second floating point value; an execution unit to execute the instructions including the multiply-add instruction and the scale instruction, wherein the execution unit is to execute the scale instruction to scale the first floating point value by a floor of the second floating point value, the execution unit including: an exponent calculation logic; and a mantissa calculation logic including: a multiplier and an adder to support the multiply-add instruction; and an exponent difference logic, the exponent difference logic including a shifter to support execution of the multiply-add instruction and the scale instruction, wherein the exponent difference logic, when being used to execute the scale instruction, is to left shift a mantissa of the second floating point value by an exponent of the second floating point value.
18. The processor of claim 17, wherein the scale instruction is to specify a packed data operand.
19. The processor of claim 17, wherein the execution unit is also to execute a plurality of other types of floating point instructions.
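For readers unfamiliar with the round, scale, and get exponent instructions named in the claims, the following Python sketch models one plausible reading of their numeric results. It is a software illustration only, not the claimed hardware: the function names are hypothetical, the scale semantics (first value times 2 raised to the floor of the second value) and the get exponent semantics (unbiased exponent of the normalized value) are assumptions based on how comparable instructions usually behave, and the ties-to-even rounding mode is also an assumption. Zero, infinities, and NaN are not handled.

```python
import math


def round_to_binary_places(x: float, places: int) -> float:
    """Round x to `places` fractional binary digits (assumed ties-to-even)."""
    # x * 2**places; exact unless the result overflows or underflows.
    scaled = math.ldexp(x, places)
    # Python's built-in round() on a float rounds halves to the even integer.
    return math.ldexp(float(round(scaled)), -places)


def scale(x: float, y: float) -> float:
    """Assumed semantics: x * 2**floor(y)."""
    return math.ldexp(x, math.floor(y))


def get_exponent(x: float) -> int:
    """Assumed semantics: e such that |x| = m * 2**e with 1 <= m < 2."""
    # frexp normalizes to 0.5 <= |m| < 1, so subtract one from its exponent.
    _, e = math.frexp(x)
    return e - 1


if __name__ == "__main__":
    print(round_to_binary_places(1.3, 2))  # 1.25 (nearest multiple of 1/4)
    print(round_to_binary_places(2.5, 0))  # 2.0  (tie goes to the even integer)
    print(scale(3.0, 2.7))                 # 12.0 (3 * 2**2)
    print(scale(3.0, -1.5))                # 0.75 (3 * 2**-2)
    print(get_exponent(10.0))              # 3    (10 = 1.25 * 2**3)
```

In this model all three operations reduce to adjusting a power of two or shifting relative to the binary point, which is consistent with the claims' point that shifters already present for multiply-add can be reused for these instructions.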
in floating-point computations · CPC title
Implementation of IEEE-754 Standard · CPC title
controlled in tandem, e.g. multiplier-accumulator · CPC title
Rounding · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title