US9430190B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9430190-B2 |
| Application number | US-201414169864-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 31, 2014 |
| Priority date | Feb 27, 2013 |
| Publication date | Aug 30, 2016 |
| Grant date | Aug 30, 2016 |
Official abstract text for this publication.
A method for operating a fused-multiply-add pipeline in a floating-point unit of a processor is disclosed. A multiplication is initially performed between a first operand and a second operand in a multiplier block to obtain a set of partial product results. The partial product results are sent to a carry-save adder block. A partial product reduction is performed on the partial product results to generate a carry-save result having a sum term and a carry term. The carry-save result is then formatted to generate a carry-out bit. The carry-save result is added to a third operand to generate a final result.
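The data flow in the abstract (partial products, carry-save reduction to a sum term and a carry term, then addition of the third operand) can be illustrated with a small integer-only sketch. This is not the patented pipeline: it assumes unsigned Python integers of unbounded width, uses plain shift-and-add partial products, and omits the formatting step that generates the carry-out bit. The function names `partial_products`, `csa_3to2`, `reduce_carry_save`, and `fma_int` are hypothetical and chosen for illustration only.

```python
def partial_products(a: int, b: int) -> list:
    """Shift-and-add partial products of a * b for unsigned ints.

    One row per set bit of the multiplier b (illustrative; not Booth-encoded).
    """
    return [a << i for i in range(b.bit_length()) if (b >> i) & 1]


def csa_3to2(x: int, y: int, z: int) -> tuple:
    """3:2 carry-save adder: returns (s, c) with x + y + z == s + c.

    No carry propagates between bit positions; carries are kept
    in a separate word that is shifted left by one.
    """
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def reduce_carry_save(rows: list) -> tuple:
    """Reduce any number of rows to a (sum term, carry term) pair."""
    rows = list(rows)
    while len(rows) > 2:
        s, c = csa_3to2(rows[0], rows[1], rows[2])
        rows = rows[3:] + [s, c]
    while len(rows) < 2:
        rows.append(0)  # pad so a pair is always returned
    return rows[0], rows[1]


def fma_int(a: int, b: int, c: int) -> int:
    """Compute a * b + c with a single carry-propagate addition at the end."""
    s, cy = reduce_carry_save(partial_products(a, b))
    # Fold the third operand in with one more carry-save step,
    # then resolve the carries once.
    s2, c2 = csa_3to2(s, cy, c)
    return s2 + c2


print(fma_int(13, 11, 7))  # 13 * 11 + 7 = 150
```

The point of keeping the product in carry-save form is that only the final `s2 + c2` requires a full carry-propagate adder; every earlier step is a bitwise operation.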
Opening claim text (preview).
What is claimed is:
1. A method for operating a fused-multiply-add pipeline in a floating-point unit of a processor, said method comprising: performing a multiplication between a first operand and a second operand in a multiplier block to obtain a set of partial product results; inputting said partial product results to a carry-save adder block; performing a partial product reduction of said partial product results to generate a carry-save result having a sum term and a carry term; performing an XNOR operation on the sixth most significant bit of said sum term and the six most significant bit of said carry term of said carry-save result to generate a carry-out bit; adding said carry-save result to a third operand to generate a sum; and generating a final result by combining said carry-out bit to said sum.
2. The method of claim 1, wherein said multiplication of said first and second operands is performed by a Booth encoding algorithm.
3. The method of claim 1, wherein said plurality of operands has any number of leading zero bits less than the number of bits per operand.
4. The method of claim 1, wherein said method further includes left aligning said carry-save result by shifting out six leading zero bits of said sum and carry terms.
5. The method of claim 1, wherein said adding further includes suppressing a carry-out bit generated from adding said carry-save result to said third operand.
6. The method of claim 1, wherein said adding further includes suppressing a carry-out bit generated from adding said third operand to said carry-save result.
7. The method of claim 1, wherein said carry-save adder block includes a 4-to-2 carry save adder block.
8. A computer-readable device having a computer program product for operating a fused-multiply-add pipeline in a floating-point unit of a processor, said computer-readable device comprising: program code for performing a multiplication between a first operand and a second operand in a multiplier block to obtain a set of partial product results; program code for inputting said partial product results to a carry-save adder block; program code for performing a partial product reduction of said partial product results to generate a carry-save result having a sum term and a carry term; program code for performing an XNOR operation on the sixth most significant bit of said sum term and the six most significant bit of said carry term of said carry-save result to generate a carry-out bit; program code for adding said carry-save result to a third operand to generate a sum; and program code for generating a final result by combining said carry-out bit to said sum.
9. The computer-readable device of claim 8, wherein said multiplication of said first and second operands is performed by a Booth encoding algorithm.
10. The computer-readable device of claim 8, wherein said plurality of operands has any number of leading zero bits less than the number of bits per operand.
11. The computer-readable device of claim 8, wherein said computer-readable device further includes program code for left aligning said carry-save result by shifting out six leading zero bits of said sum and carry terms.
12. The computer-readable device of claim 8, wherein said program code for adding further includes program code for suppressing a carry-out bit generated from adding said carry-save result to said third operand.
13. The computer-readable device of claim 8, wherein said program code for adding further includes program code for suppressing a carry-out bit generated from adding said third operand to said carry-save result.
14. The computer-readable device of claim 8, wherein said carry-save adder block includes a 4-to-2 carry save adder block.
15. A processor having a fused-multiply-add pipeline within a floating-point unit, said processor comprising: a multiplier for multiplying a first operand and a second operand to obtain a set of partial product results; a carry-save adder for performing a partial product reduction on said partial product results to generate a carry-save result having a sum term and a carry term; an aligner for performing an XNOR operation on the sixth most significant bit of said sum term and the six most significant bit of said carry term of said carry-save result to generate a carry-out bit; an adder for adding said carry-save result to a third operand to generate a sum, and for generating a final result by combining said carry-out bit to said sum.
16. The processor of claim 15, wherein said multiplication of said first and second operands is performed by a Booth encoding algorithm.
17. The processor of claim 15, wherein said plurality of operands has any number of leading zero bits less than the number of bits per operand.
18. The processor of claim 15, wherein said aligner performs said left aligning said carry-save result by shifting out six leading zero bits of said sum and carry terms.
19. The processor of claim 15, wherein said carry-out bit generated from adding said carry-save result to said third operand is suppressed.
20. The processor of claim 15, wherein said 4-to-2 adder further suppresses a carry-out bit generated from adding said third operand to said carry-save result.
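Claims 2, 9, and 16 recite a Booth encoding algorithm without specifying the radix. As a hedged illustration only, the sketch below shows the common radix-4 (modified Booth) recoding, which examines overlapping three-bit groups of a two's-complement multiplier and emits one digit in {-2, -1, 0, 1, 2} per pair of bits, halving the number of partial products. The names `booth_radix4_digits` and `booth_multiply` are hypothetical, and the assumption that the patent uses radix 4 is not confirmed by the claim text.

```python
def booth_radix4_digits(b: int, width: int) -> list:
    """Recode a `width`-bit two's-complement multiplier into radix-4 Booth digits.

    Returns digits in {-2, -1, 0, 1, 2}, least-significant digit first.
    `width` must be even (assumption made to keep the sketch short).
    """
    if width % 2 != 0:
        raise ValueError("width must be even")
    b &= (1 << width) - 1  # view b as a raw bit pattern
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, width, 2):
        b0 = (b >> i) & 1
        b1 = (b >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(a: int, b: int, width: int) -> int:
    """Multiply a by a `width`-bit signed b using one partial product per digit."""
    return sum((d * a) << (2 * k)
               for k, d in enumerate(booth_radix4_digits(b, width)))


print(booth_radix4_digits(7, 4))   # two digits for a 4-bit multiplier
print(booth_multiply(-5, 7, 4))    # equals -5 * 7
```

In hardware each digit selects 0, ±a, or ±2a, all of which are cheap to form (a shift and an optional negation), and the resulting rows are what a carry-save adder block would then reduce.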
each bitgroup having two new bits, e.g. 2nd order MBA · CPC title
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title
using carry save adders · CPC title
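The claims mention a 4-to-2 carry save adder block and the suppression of a carry-out bit. A minimal fixed-width sketch of both ideas follows, assuming a 4:2 stage built from two cascaded 3:2 carry-save adders and a hypothetical `width` parameter; real 4:2 compressor cells are wired differently at the gate level, and this is not taken from the patent's figures. Dropping the bit that shifts out of the top of the carry word keeps the result correct modulo 2**width.

```python
def csa_3to2(x: int, y: int, z: int, width: int) -> tuple:
    """Fixed-width 3:2 carry-save adder.

    Returns (s, c) such that (s + c) == (x + y + z) modulo 2**width.
    Any carry shifted out of the top bit is discarded.
    """
    mask = (1 << width) - 1
    s = (x ^ y ^ z) & mask
    c = (((x & y) | (x & z) | (y & z)) << 1) & mask
    return s, c


def csa_4to2(w: int, x: int, y: int, z: int, width: int) -> tuple:
    """4:2 reduction modeled as two cascaded 3:2 stages (illustrative)."""
    s1, c1 = csa_3to2(w, x, y, width)
    return csa_3to2(s1, c1, z, width)


def resolve(s: int, c: int, width: int) -> int:
    """Final carry-propagate add, with the carry-out bit suppressed."""
    return (s + c) & ((1 << width) - 1)


WIDTH = 8  # hypothetical datapath width for the demonstration
s, c = csa_4to2(200, 100, 50, 25, WIDTH)
print(resolve(s, c, WIDTH) == (200 + 100 + 50 + 25) % (1 << WIDTH))
```

A 4:2 stage halves the number of rows per level, so a reduction tree built from it is shallower than one built from 3:2 stages alone.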