What technology area does this patent fall under?

Primary CPC classification G06F9/3001. Mapped technology areas include Physics.

When was this patent published?

Publication date Jul 31, 2018 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Apparatus and method for vector instructions for large integer arithmetic

US10037210B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10037210-B2
Application number	US-201615257833-A
Country	US
Kind code	B2
Filing date	Sep 6, 2016
Priority date	Dec 23, 2011
Publication date	Jul 31, 2018
Grant date	Jul 31, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and second input vectors; b) execute a second instruction that multiplies a first input operand and a second input operand and presents an upper portion of the result, where, the first and second input operands are respective elements of first and second input vectors; and, c) execute an add instruction where a carry term of the add instruction's adding is recorded in a mask register.
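The three operations named in the abstract can be emulated lane by lane in software. The sketch below is illustrative only: the 64-bit lane width, the function names (`vmul_lo`, `vmul_hi`, `vadd_carry`), and the modeling of the mask register as a plain integer with one carry bit per lane are all assumptions, not details taken from the patent.

```python
from typing import List, Tuple

LANE_BITS = 64                      # assumed lane width; the abstract does not fix one
LANE_MASK = (1 << LANE_BITS) - 1


def vmul_lo(a: List[int], b: List[int]) -> List[int]:
    """Multiply matching lanes and keep the lower half of each product."""
    return [(x * y) & LANE_MASK for x, y in zip(a, b)]


def vmul_hi(a: List[int], b: List[int]) -> List[int]:
    """Multiply matching lanes and keep the upper half of each product."""
    return [((x * y) >> LANE_BITS) & LANE_MASK for x, y in zip(a, b)]


def vadd_carry(a: List[int], b: List[int], k: int) -> Tuple[List[int], int]:
    """Add matching lanes plus a per-lane carry-in bit read from mask k.

    Returns the lane sums and a new mask holding each lane's carry-out bit.
    """
    sums = []
    k_out = 0
    for i, (x, y) in enumerate(zip(a, b)):
        total = x + y + ((k >> i) & 1)          # carry-in comes from bit i of the mask
        sums.append(total & LANE_MASK)
        k_out |= (total >> LANE_BITS) << i      # carry-out goes to bit i of the new mask
    return sums, k_out
```

Splitting the product into two instructions keeps every result the same width as its inputs, and recording carries in a mask avoids a single shared carry flag across lanes.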

First claim

Opening claim text (preview).

What is claimed is:

1. A hardware processor comprising: a hardware decoder to decode a first instruction into a decoded first instruction, a second instruction into a decoded second instruction, and an add instruction into a decoded add instruction; and a hardware execution unit to: execute the decoded first instruction to multiply a first input operand and a second input operand and store a lower portion of a result, said first and second input operands being respective elements of a first input vector and a second input vector, execute the decoded second instruction to multiply the first input operand and the second input operand and store an upper portion of a result, said first and second input operands being the respective elements of the first input vector and the second input vector, and execute the decoded add instruction to add aligned elements of the upper portion and the lower portion with a previous, corresponding carry term from an input operand and store a result.

2. The hardware processor of claim 1, wherein the hardware execution unit is to execute the decoded add instruction to further cause a next carry term of said add instruction's adding to be stored.

3. The hardware processor of claim 2, wherein the next carry term is stored in a same register as the previous, corresponding carry term.

4. The hardware processor of claim 3, wherein said add instruction comprises an input operand to identify the same register.

5. The hardware processor of claim 3, wherein the same register is a mask register.

6. The hardware processor of claim 1, wherein the previous, corresponding carry term is a plurality of bits.

7. The hardware processor of claim 1, wherein a multiplexer of said hardware execution unit is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.

8. A method comprising: decoding a first instruction into a decoded first instruction, a second instruction into a decoded second instruction, and an add instruction into a decoded add instruction with a hardware decoder of a hardware processor; executing the decoded first instruction with a hardware execution unit of the hardware processor to multiply a first input operand and a second input operand and store a lower portion of a result, said first and second input operands being respective elements of a first input vector and a second input vector; executing the decoded second instruction with the hardware execution unit of the hardware processor to multiply the first input operand and the second input operand and store an upper portion of a result, said first and second input operands being the respective elements of the first input vector and the second input vector; and executing the decoded add instruction with the hardware execution unit of the hardware processor to add aligned elements of the upper portion and the lower portion with a previous, corresponding carry term from an input operand and store a result.

9. The method of claim 8, wherein executing the decoded add instruction is to further cause a next carry term of said add instruction's adding to be stored.

10. The method of claim 9, wherein the next carry term is stored in a same register as the previous, corresponding carry term.

11. The method of claim 10, wherein said add instruction comprises an input operand to identify the same register.

12. The method of claim 10, wherein the same register is a mask register.

13. The method of claim 8, wherein the previous, corresponding carry term is a plurality of bits.

14. The method of claim 8, wherein a multiplexer of said hardware execution unit is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.

15. A non-transitory machine readable medium containing program code that when processed by a processing unit causes a method to be performed, said method comprising: decoding a first instruction into a decoded first instruction, a second instruction into a decoded second instruction, and an add instruction into a decoded add instruction with a hardware decoder of a hardware processor; executing the decoded first instruction with a hardware execution unit of the hardware processor to multiply a first input operand and a second input operand and store a lower portion of a result, said first and second input operands being respective elements of a first input vector and a second input vector; executing the decoded second instruction with the hardware execution unit of the hardware processor to multiply the first input operand and the second input operand and store an upper portion of a result, said first and second input operands being the respective elements of the first input vector and the second input vector; and executing the decoded add instruction with the hardware execution unit of the hardware processor to add aligned elements of the upper portion and the lower portion with a previous, corresponding carry term from an input operand and store a result.

16. The non-transitory machine readable medium of claim 15, wherein executing the decoded add instruction is to further cause a next carry term of said add instruction's adding to be stored.

17. The non-transitory machine readable medium of claim 16, wherein the next carry term is stored in a same register as the previous, corresponding carry term.

18. The non-transitory machine readable medium of claim 17, wherein said add instruction comprises an input operand to identify the same register.

19. The non-transitory machine readable medium of claim 17, wherein the same register is a mask register.

20. The non-transitory machine readable medium of claim 15, wherein the previous, corresponding carry term is a plurality of bits.

21. The non-transitory machine readable medium of claim 15, wherein a multiplexer of said hardware execution unit is to output a low half from a multiplier for the first instruction and a high half from the multiplier for the second instruction.
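To make the claim language concrete, the following sketch shows one plausible way the claimed pieces could combine to multiply a multi-limb integer by a single-limb scalar. It is an assumption-laden illustration, not the patent's disclosed algorithm: the 64-bit limb width, the helper names (`multiplier`, `add_with_mask`, `mul_limbs_by_scalar`), and the shift-the-mask loop used to move carries between lanes are all hypothetical. A single `multiplier` with a select flag stands in for the multiplexer of claims 7, 14, and 21, and the carry mask `k` is read and overwritten in place, in the spirit of claims 3 to 5.

```python
from typing import List, Tuple

LANE_BITS = 64                      # assumed limb width
LANE_MASK = (1 << LANE_BITS) - 1


def multiplier(x: int, y: int, select_high: bool) -> int:
    """One full-width multiply; a mux-like flag selects which half is output."""
    product = x * y
    if select_high:
        return (product >> LANE_BITS) & LANE_MASK
    return product & LANE_MASK


def add_with_mask(a: List[int], b: List[int], k: int) -> Tuple[List[int], int]:
    """Lane-wise a + b + carry-in bit from mask k; returns sums and carry-out mask."""
    sums, k_out = [], 0
    for i, (x, y) in enumerate(zip(a, b)):
        total = x + y + ((k >> i) & 1)
        sums.append(total & LANE_MASK)
        k_out |= (total >> LANE_BITS) << i
    return sums, k_out


def mul_limbs_by_scalar(limbs: List[int], scalar: int) -> List[int]:
    """Multiply a little-endian limb vector by one limb; returns n + 1 limbs."""
    n = len(limbs)
    # Align the halves: the high half of limb i has the same weight as the
    # low half of limb i + 1, so the high vector is shifted up by one lane.
    lo = [multiplier(x, scalar, False) for x in limbs] + [0]
    hi = [0] + [multiplier(x, scalar, True) for x in limbs]
    k = 0
    result, k = add_with_mask(lo, hi, k)
    zeros = [0] * (n + 1)
    while k:
        # Hypothetical carry routing: lane i's carry-out feeds lane i + 1.
        k = (k << 1) & ((1 << (n + 1)) - 1)
        result, k = add_with_mask(result, zeros, k)
    return result
```

A quick check against Python's arbitrary-precision integers: recombining the returned limbs with `sum(r << (64 * i) for i, r in enumerate(result))` should equal the ordinary product of the two inputs.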

Assignees

Intel Corp

Inventors

Classifications

G06F9/3001 · Primary
Arithmetic instructions · CPC title
G06F7/57
Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations {(G06F7/49, G06F7/491 take precedence)} · CPC title
G06F9/30036 · Primary
Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title
G06F9/3016 · Primary
Decoding the operand specifier, e.g. specifier format · CPC title
G06F9/3893
controlled in tandem, e.g. multiplier-accumulator · CPC title

Patent family

Related publications grouped by family.

View patent family 48669267

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10037210B2 cover?: An apparatus is described that includes a semiconductor chip having an instruction execution pipeline having one or more execution units with respective logic circuitry to: a) execute a first instruction that multiplies a first input operand and a second input operand and presents a lower portion of the result, where, the first and second input operands are respective elements of first and seco…
Who is the assignee on this patent?: Intel Corp