Systems, apparatuses, and methods for performing a shuffle and operation (shuffle-op)

US9218182B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9218182-B2
Application number: US-201213539116-A
Country: US
Kind code: B2
Filing date: Jun 29, 2012
Priority date: Jun 29, 2012
Publication date: Dec 22, 2015
Grant date: Dec 22, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.
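The behavior summarized in the abstract can be illustrated with a small software sketch. This is not the patented hardware implementation: it assumes that "shuffle" means rotating the lanes of the first source toward lane 0 by the immediate count, which is one possible reading of the text. The function name `shuffle_op` and the list-based register model are hypothetical and chosen only for illustration.

```python
import operator


def shuffle_op(src1, src2, imm, op):
    """Emulate one shuffle-op on Python lists standing in for vector registers.

    Assumption (not confirmed by the abstract): the shuffle is a rotation of
    src1 toward lane 0 by `imm` lanes. src2 is used unshuffled.
    """
    if len(src1) != len(src2):
        raise ValueError("source registers must have the same lane count")
    n = len(src1)
    k = imm % n  # wrap the immediate so any count maps to a valid rotation
    shuffled = src1[k:] + src1[:k]
    # Apply the opcode's operation lane by lane and return the destination.
    return [op(a, b) for a, b in zip(shuffled, src2)]


# One call stands in for the single instruction: shuffle, then add.
dest = shuffle_op([1, 2, 3, 4], [10, 20, 30, 40], 1, operator.add)
print(dest)
```

Passing `operator.xor` or `operator.and_` instead of `operator.add` models a Boolean operation in the same way.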

First claim

Opening claim text (preview).

What is claimed is:
1. A method of performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode, the method comprising steps of: executing the single data element shuffle and an operation instruction to shuffle data elements of the first source register by a number of data elements wherein the number of data elements is defined by the immediate value of the instruction and perform the operation defined by the opcode on the shuffled data elements of the first source vector register with non-shuffled data elements of the second source vector register; and storing a result of each operation in a corresponding packed data element position of the destination vector register.
2. The method of claim 1, wherein the first source vector register and the destination vector register are the same register.
3. The method of claim 1, wherein the operation is a mathematical operation.
4. The method of claim 1, wherein the operation is a Boolean operation.
5. The method of claim 1, wherein the data elements of the source and destination vector registers are 8-bit, 16-bit, 32-bit, or 64-bit in size.
6. The method of claim 1, wherein the source and destination vector registers are 128-bit, 256-bit, or 512-bit in size.
7. The method of claim 1, wherein the instruction includes an opmask register operand and the storing a result of each operation in a corresponding packed data element position of the destination vector register is defined by said opmask register operand.
8. A non-transitory machine-readable storage medium having stored thereon an instruction having a format that specifies as its first source operand a first vector register, as its second source operand a second vector register, as its destination a single destination vector register, and includes an immediate value, wherein when the instruction is executed by a machine, the machine to perform a method comprising: shuffling of data elements of the first source register by a number of data elements wherein the number of data elements is defined by the immediate value and performing of the operation defined by the opcode on the shuffled data elements of the first source vector register with non-shuffled data elements of the second source vector register; and storing a result of each operation in a corresponding packed data element position of the destination vector register.
9. The non-transitory machine-readable storage medium of claim 8, wherein the first source vector register and the destination vector register are the same register.
10. The non-transitory machine-readable storage medium of claim 8, wherein the operation is a mathematical operation.
11. The non-transitory machine-readable storage medium of claim 8, wherein the operation is a Boolean operation.
12. The non-transitory machine-readable storage medium of claim 8, wherein the data elements of the source and destination vector registers are 8-bit, 16-bit, 32-bit, or 64-bit in size.
13. The non-transitory machine-readable storage medium of claim 8, wherein the source and destination vector registers are 128-bit, 256-bit, or 512-bit in size.
14. The non-transitory machine-readable storage medium of claim 8, wherein the instruction includes an opmask register operand and the storing a result of each operation in a corresponding packed data element position of the destination vector register is defined by said opmask register operand.
15. An apparatus comprising: a hardware decoder to decode a data element shuffle and operation instruction, the data element shuffle and operation instruction including a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode; execution logic to execute the data element shuffle and an operation instruction to shuffle data elements of the first source register by a number of data elements wherein the number of data elements is defined by the immediate value of the instruction and perform the operation defined by the opcode on the shuffled data elements of the first source vector register with non-shuffled data elements of the second source vector register, and store a result of each operation in a corresponding packed data element position of the destination vector register.
16. The apparatus of claim 15, wherein the first source vector register and the destination vector register are the same register.
17. The apparatus of claim 15, wherein the operation is a mathematical operation.
18. The apparatus of claim 15, wherein the operation is a Boolean operation.
19. The apparatus of claim 15, wherein the data elements of the source and destination vector registers are 8-bit, 16-bit, 32-bit, or 64-bit in size.
20. The apparatus of claim 15, wherein the source and destination vector registers are 128-bit, 256-bit, or 512-bit in size.
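The dependent claims add element sizes, register widths, and an opmask operand. The sketch below models how those pieces could fit together in software. Several points are assumptions not stated in the claims: the shuffle is modeled as a rotation toward lane 0, lanes whose opmask bit is clear keep the previous destination value (merging behavior), and results wrap as unsigned integers of the element width. All names (`lane_count`, `masked_shuffle_op`) are hypothetical.

```python
import operator

ELEMENT_BITS = (8, 16, 32, 64)    # element sizes listed in the claims
REGISTER_BITS = (128, 256, 512)   # register widths listed in the claims


def lane_count(register_bits, element_bits):
    """Number of packed data elements that fit in one vector register."""
    if register_bits not in REGISTER_BITS or element_bits not in ELEMENT_BITS:
        raise ValueError("unsupported register or element size")
    return register_bits // element_bits


def masked_shuffle_op(dest, src1, src2, imm, op, opmask,
                      register_bits=128, element_bits=32):
    """Shuffle src1, combine with unshuffled src2, store under an opmask.

    Assumptions: rotation toward lane 0 by `imm`; merging masking (a clear
    mask bit leaves the old destination lane); unsigned wraparound.
    """
    n = lane_count(register_bits, element_bits)
    if not (len(dest) == len(src1) == len(src2) == n):
        raise ValueError("register contents do not match the lane count")
    wrap = (1 << element_bits) - 1
    k = imm % n
    shuffled = src1[k:] + src1[:k]
    out = list(dest)  # start from the old destination so masked lanes survive
    for i in range(n):
        if (opmask >> i) & 1:  # bit i of the opmask controls lane i
            out[i] = op(shuffled[i], src2[i]) & wrap
    return out


# Four 32-bit lanes in a 128-bit register; only lanes 0 and 2 are written.
result = masked_shuffle_op([0, 0, 0, 0], [1, 2, 3, 4], [10, 20, 30, 40],
                           1, operator.add, 0b0101)
print(result)
```

Passing the same list as both `dest` and `src1` models the case where the first source and the destination are the same register, since the function copies the destination before writing.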

Assignees

Inventors

Classifications

  • Instruction analysis, e.g. decoding, instruction word fields · CPC title

  • G06F9/3001 · Primary

    Arithmetic instructions · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title

  • using a mask · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9218182B2 cover?
Embodiments of systems, apparatuses, and methods for performing in a computer processor a data element shuffle and an operation on the shuffled data elements in response to a single data element shuffle and an operation instruction that includes a destination vector register operand, a first and second source vector register operands, an immediate value, and an opcode are described.
Who is the assignee on this patent?
Ermolaev Igor, Ould-Ahmed-Vall Elmoustapha, Toll Bret, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 22, 2015 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).