US9436435B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9436435-B2 |
| Application number | US-201113996529-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 23, 2011 |
| Priority date | Dec 23, 2011 |
| Publication date | Sep 6, 2016 |
| Grant date | Sep 6, 2016 |
Abstract
An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register.
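The three operations named in the abstract can be modeled in a few lines of Python. This is only an illustrative sketch, not the patented hardware: the 64-bit lane width, the function names `vmul_lo`, `vmul_hi`, and `vadd_carry`, and the use of a plain integer as the mask register are all assumptions made for the example.

```python
# Assumed lane width; the abstract does not fix one.
LANE_BITS = 64
LANE_MASK = (1 << LANE_BITS) - 1


def vmul_lo(a, b):
    """Multiply corresponding elements and keep the lower half of each product."""
    return [(x * y) & LANE_MASK for x, y in zip(a, b)]


def vmul_hi(a, b):
    """Multiply corresponding elements and keep the upper half of each product."""
    return [(x * y) >> LANE_BITS for x, y in zip(a, b)]


def vadd_carry(a, b):
    """Add corresponding elements; record each lane's carry-out in a mask.

    Bit i of the returned mask (a hypothetical stand-in for the mask
    register) is set when lane i overflowed.
    """
    sums = []
    mask = 0
    for i, (x, y) in enumerate(zip(a, b)):
        total = x + y
        sums.append(total & LANE_MASK)
        mask |= (total >> LANE_BITS) << i
    return sums, mask


a = [2**64 - 1, 3]
b = [2**64 - 1, 5]
lo = vmul_lo(a, b)
hi = vmul_hi(a, b)
sums, mask = vadd_carry([2**64 - 1, 1], [1, 1])
```

Splitting the product into two instructions keeps every result the same width as its inputs, which is presumably why the carry needs a separate home such as a mask register.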
Claims
What is claimed is:
1. A method comprising: decoding a first instruction, a second instruction, a third instruction, a fourth instruction, and a fifth instruction with a hardware decoder of a hardware processor; executing the first instruction with a hardware execution unit of the hardware processor to multiply a first input operand and a second input operand and present a lower portion of a result, said first input operand representing a first digit of a multiplier, said second input operand representing a first digit of a multiplicand; executing the second instruction with the hardware execution unit of the hardware processor to multiply said first input operand and said second input operand and present an upper portion of a result; executing the third instruction with the hardware execution unit of the hardware processor to multiply said first input operand and a third input operand and present a lower portion of a result, said third input operand representing a digit of said multiplicand that neighbors said first digit of said multiplicand; executing the fourth instruction with the hardware execution unit of the hardware processor to multiply said first input operand and said third input operand and present an upper portion of a result; and executing the fifth instruction with the hardware execution unit of the hardware processor to add aligned digits of the upper and lower portions and record a carry term in a mask register.
2. The method of claim 1 wherein the first and second instructions are executed in a same recursion.
3. The method of claim 1 wherein a multiplexer of said hardware execution unit of the hardware processor is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.
4. The method of claim 1 wherein the carry term is one of a plurality of carry terms that are separately tracked in mask register space.
5. The method of claim 1 wherein said executing the fifth instruction with the hardware execution unit of the hardware processor is to also add in an input carry term from the mask register.
6. The method of claim 1 wherein the carry term is more than one bit.
7. The method of claim 6 wherein said carry term is written as least significant bits of a next higher ordered accumulated partial product term.
8. A hardware processor comprising: a hardware decoder to decode a first instruction, a second instruction, and an add instruction; and a hardware execution unit to: execute the first instruction to multiply a first input operand and a second input operand and present a lower portion of a result, said first and second input operands being respective elements of first and second input vectors, execute the second instruction to multiply the first input operand and the second input operand and present an upper portion of a result, said first and second input operands being respective elements of first and second input vectors, and execute the add instruction that is to add aligned digits of the upper and lower portions and cause a carry term of said add instruction's adding to be recorded in a mask register.
9. The hardware processor of claim 8 wherein said add instruction comprises an operand to identify the mask register.
10. The hardware processor of claim 8 wherein the carry term is a plurality of bits.
11. The hardware processor of claim 8 wherein said add instruction accepts an input carry term through said mask register.
12. The hardware processor of claim 11 wherein said add instruction writes said input carry term as least significant bits of its add resultant.
13. The hardware processor of claim 8 wherein a multiplexer of said hardware execution unit is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.
14. The hardware processor of claim 8 wherein said first and second instructions are vector instructions that multiply respective elements of first and second input vectors, said first input operand being an element of said first input vector and said second input operand being an element of a second input vector.
15. The hardware processor of claim 8, wherein the carry term is one of a plurality of carry terms that are separately tracked in mask register space.
16. A non-transitory machine readable medium containing program code that when processed by a processing unit causes a method to be performed, said method comprising: decoding a first instruction, a second instruction, a third instruction, a fourth instruction, and a fifth instruction with a hardware decoder of a hardware processor; executing the first instruction with a hardware execution unit of the hardware processor to multiply a first input operand and a second input operand and present a lower portion of a result, said first input operand representing a first digit of a multiplier, said second input operand representing a first digit of a multiplicand; executing the second instruction with the hardware execution unit of the hardware processor to multiply said first input operand and said second input operand and present an upper portion of a result; executing the third instruction with the hardware execution unit of the hardware processor to multiply said first input operand and a third input operand and present a lower portion of a result, said third input operand representing a digit of said multiplicand that neighbors said first digit of said multiplicand; executing the fourth instruction with the hardware execution unit of the hardware processor to multiply said first input operand and said third input operand and present an upper portion of a result; and executing the fifth instruction with the hardware execution unit of the hardware processor to add aligned digits of the upper and lower portions and record a carry term in a mask register.
17. The non-transitory machine readable medium of claim 16 wherein the first and second instructions are executed in a same recursion.
18. The non-transitory machine readable medium of claim 16 wherein a multiplexer of said hardware execution unit of the hardware processor is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.
19. The non-transitory machine readable medium of claim 16 wherein the carry term is one of a plurality of carry terms that are separately tracked in mask register space.
20. The non-transitory machine readable medium of claim 16 wherein said executing the fifth instruction with the hardware execution unit of the hardware processor is to also add in an input carry term from the mask register.
21. The non-transitory machine readable medium of claim 16 wherein the carry term is more than one bit.
22. The non-transitory machine readable medium of claim 21 wherein said carry term is an input carry term and said input carry term is written as least significant bits of said adding's resultant.
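The five-instruction sequence of claim 1 can be traced with a scalar Python model. This is a hedged sketch under stated assumptions, not the claimed implementation: the 64-bit digit width, the helper names `mul_lo`, `mul_hi`, and `add_with_mask`, and the final carry-consuming add (modeled on the input carry of claim 5) are all hypothetical choices made for illustration.

```python
# Assumed digit width; the claims do not fix one.
LANE_BITS = 64
BASE = 1 << LANE_BITS


def mul_lo(x, y):
    """Lower portion of the product of two digits."""
    return (x * y) % BASE


def mul_hi(x, y):
    """Upper portion of the product of two digits."""
    return (x * y) // BASE


def add_with_mask(x, y, carry_in=0):
    """Add two digits plus an optional input carry.

    Returns (sum digit, carry-out); the carry-out stands in for the bit
    that the claims record in a mask register.
    """
    total = x + y + carry_in
    return total % BASE, total // BASE


def digit_times_two_digits(a0, b0, b1):
    """Multiply multiplier digit a0 by neighboring multiplicand digits b0, b1."""
    lo0 = mul_lo(a0, b0)  # first instruction
    hi0 = mul_hi(a0, b0)  # second instruction
    lo1 = mul_lo(a0, b1)  # third instruction
    hi1 = mul_hi(a0, b1)  # fourth instruction
    # Fifth instruction: hi0 and lo1 carry the same weight (BASE**1),
    # so they are the aligned digits; the carry goes to the mask.
    d1, mask = add_with_mask(hi0, lo1)
    # Follow-up add that consumes the recorded carry (assumed, per claim 5).
    d2, overflow = add_with_mask(hi1, 0, carry_in=mask)
    assert overflow == 0  # a one-digit by two-digit product fits in three digits
    return [lo0, d1, d2]  # least significant digit first


digits = digit_times_two_digits(BASE - 1, BASE - 1, BASE - 1)
value = sum(d * BASE**i for i, d in enumerate(digits))
```

Here `value` reassembles the three result digits so they can be compared against Python's built-in big-integer product of the same inputs.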
Arithmetic instructions · CPC title
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations {(G06F7/49, G06F7/491 take precedence)} · CPC title
controlled in tandem, e.g. multiplier-accumulator · CPC title
using a mask · CPC title