Packed rotate processors, methods, systems, and instructions

US9864602B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9864602-B2
Application number  US-201113977229-A
Country             US
Kind code           B2
Filing date         Dec 30, 2011
Priority date       Dec 30, 2011
Publication date    Jan 9, 2018
Grant date          Jan 9, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the instruction. The result packed data includes result data elements that each correspond to a different one of the mask elements in a corresponding relative position. Result data elements that are not masked out by the corresponding mask element include one of the data elements of the first source packed data in a corresponding position that has been rotated. Result data elements that are masked out by the corresponding mask element include a masked out value. Other methods, apparatus, systems, and instructions are disclosed.
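The behavior summarized in the abstract can be sketched as a small reference model. This is an illustrative sketch only, not the patented implementation: the function names `rotate_left` and `masked_packed_rotate`, the choice of left rotation, the 32-bit default element width, and the list-based representation of packed data are all assumptions made for the example.

```python
def rotate_left(value, amount, width=32):
    """Rotate an unsigned `width`-bit integer left by `amount` bits."""
    limit = (1 << width) - 1
    amount %= width          # rotation amounts wrap at the element width
    value &= limit
    return ((value << amount) | (value >> (width - amount))) & limit


def masked_packed_rotate(src, mask_elements, amounts, width=32, masked_out=0):
    """Hypothetical model of a masked packed rotate.

    src           -- source packed data, one integer per data element
    mask_elements -- one truthy/falsy mask element per data element
    amounts       -- one rotation amount per data element
    masked_out    -- value stored in result elements that are masked out
    """
    if not (len(src) == len(mask_elements) == len(amounts)):
        raise ValueError("src, mask_elements and amounts must have equal length")
    return [
        rotate_left(x, n, width) if m else masked_out
        for x, m, n in zip(src, mask_elements, amounts)
    ]


# Elements 0 and 2 are rotated; elements 1 and 3 are masked out.
result = masked_packed_rotate(
    src=[0x80000001, 0x00000001, 0xF0000000, 0x12345678],
    mask_elements=[1, 0, 1, 0],
    amounts=[1, 4, 4, 8],
)
print([hex(v) for v in result])
```

Each result element is paired with the mask element in the same relative position, so the mask decides per element whether the rotated source value or the masked out value is stored.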

First claim

Opening claim text (preview).

What is claimed is:
1. A method comprising: receiving a masked packed rotate instruction, the masked packed rotate instruction indicating a first source packed data including a plurality of packed data elements, specifying with a field of the masked packed rotate instruction an architectural mask register having a packed data operation mask having a plurality of mask elements that each correspond to a different one of the plurality of packed data elements, indicating at least one rotation amount, and indicating a destination storage location; and storing a result packed data in the destination storage location in response to the masked packed rotate instruction, the result packed data including a plurality of result data elements that each correspond to a different one of the mask elements in the architectural mask register in a corresponding relative position, in which one or more result data elements that are not masked out by the corresponding mask element each include a value of one of the data elements of the first source packed data in a corresponding position that has been rotated, and in which one or more result data elements that are masked out by the corresponding mask element each include a predetermined masked out value.
2. The method of claim 1, wherein storing comprises storing the result packed data in which the result data elements that are masked out include a zeroed value.
3. The method of claim 1, wherein storing comprises storing the result packed data in which the result data elements that are masked out include a merged value that has been merged from a data element in a corresponding position of the source packed data.
4. The method of claim 1, wherein receiving comprises receiving the instruction indicating the packed data operation mask having the mask elements which each comprise a single bit.
5. The method of claim 1, wherein receiving comprises receiving the instruction indicating a second source packed data including a second plurality of packed data elements, each of the data elements of the second plurality representing a rotation amount.
6. The method of claim 1, wherein receiving comprises receiving the instruction having an immediate that indicates a single rotation amount.
7. The method of claim 1, wherein the masked packed rotate instruction is a masked packed rotate with data element broadcast of rotation amounts instruction, and wherein the masked packed rotate with data element broadcast of rotation amounts instruction indicates a single data element representing a rotation amount.
8. The method of claim 7, wherein storing comprises storing result data elements, which are not masked out, which include said one of the data elements of the first source packed data in the corresponding position, which has been rotated by a broadcasted replica of the rotation amount from the single data element.
9. The method of claim 1, wherein storing comprises storing at least sixteen result data elements, and wherein the result data elements each comprise at least 32-bits.
10. An apparatus comprising: a plurality of packed data registers; and an execution unit coupled with the plurality of the packed data registers, the execution unit operable, in response to a masked packed rotate instruction that is to indicate a first source packed data that is to include a plurality of packed data elements, that is to indicate a packed data operation mask in an architectural mask register that is to have a plurality of mask elements that are each to correspond to a different one of the plurality of packed data elements, that is to indicate at least one rotation amount, and that is to indicate a destination storage location, to store a result packed data in the destination storage location, the result packed data to include a plurality of result data elements that each are to correspond to a different one of the mask elements in a corresponding relative position, in which result data elements that are not masked out by the corresponding mask element are to include one of the data elements of the first source packed data in a corresponding position that is to have been rotated, and in which result data elements that are masked out by the corresponding mask element are to include a masked out value.
11. The apparatus of claim 10, wherein the mask elements each comprise a single bit.
12. The apparatus of claim 10, wherein the instruction is to indicate a second source packed data that is to include a plurality of packed data elements that are to represent rotation amounts.
13. The apparatus of claim 10, wherein the instruction comprises an immediate to indicate a single rotation amount.
14. The apparatus of claim 10, wherein the masked packed rotate instruction is a masked packed rotate with data element broadcast of rotation amounts instruction, and wherein the masked packed rotate with data element broadcast of rotation amounts instruction is to indicate a single data element that is to represent a rotation amount that is to be broadcast.
15. The apparatus of claim 14, wherein the execution unit, in response to the instruction, is to store the not masked out result data elements, which are to include said one of the data elements of the first source packed data in the corresponding position, which is to have been rotated by a broadcasted replica of the rotation amount from the single data element.
16. The apparatus of claim 10, wherein the execution unit, in response to the instruction, is to store merged values in the result data elements that are masked out.
17. The apparatus of claim 10, wherein the execution unit, in response to the instruction, is to store at least four result data elements, and wherein the result data elements are to comprise quadwords.
18. A system comprising: an interconnect; a processor coupled with the interconnect, the processor to process a masked packed rotate instruction that is to indicate a first source packed data that is to include a plurality of packed data elements, that is to indicate a packed data operation mask in an architectural mask register that is to have a plurality of mask elements that are each to correspond to a different one of the plurality of packed data elements in a same relative position, that is to indicate at least one rotation amount, and that is to indicate a destination storage location, the processor operable, in response to the masked packed rotate instruction to store a result packed data in the destination storage location, the result packed data to include a plurality of result data elements that each are to correspond to a different one of the mask elements in a corresponding relative position, in which result data elements that are not masked out by the corresponding mask element are to include one of the data elements of the first source packed data in a corresponding position that is to have been rotated, and in which result data elements that are masked out by the corresponding mask element are to include a predetermined value that is not to be a rotated source data element; and a dynamic random access memory (DRAM) coupled with the interconnect.
19. The system of claim 18, wherein the mask elements each comprise a single bit, and wherein the values that are not the rotated source data elements comprise one of zeroed values and merged values.
20. The system of claim 18, wherein the instruction is a masked packed rotate with data element broadcast of rotation amounts instruction that is to indicate a second source single data element tha
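The dependent claims describe several variations: zeroed versus merged masked out values, single-bit mask elements, and rotation amounts supplied per element, as a single immediate, or as one data element broadcast to every position. The sketch below models those variations under stated assumptions. The names `Masking`, `rotl`, `expand_amounts`, and `masked_rotate` are hypothetical, left rotation is assumed, and the mask is modeled as an integer with one bit per data element (bit 0 for element 0).

```python
from enum import Enum


class Masking(Enum):
    ZERO = "zero"    # masked out elements receive a zeroed value
    MERGE = "merge"  # masked out elements keep the source element in that position


def rotl(value, amount, width):
    """Rotate an unsigned `width`-bit integer left by `amount` bits."""
    limit = (1 << width) - 1
    amount %= width
    value &= limit
    return ((value << amount) | (value >> (width - amount))) & limit


def expand_amounts(amount_source, lanes):
    """A single int models an immediate or a broadcast element; a list is per element."""
    if isinstance(amount_source, int):
        return [amount_source] * lanes       # replicate one amount to every position
    if len(amount_source) != lanes:
        raise ValueError("need one rotation amount per data element")
    return list(amount_source)


def masked_rotate(src, mask, amount_source, width=32, masking=Masking.ZERO):
    """Hypothetical model; bit i of `mask` controls result element i."""
    amounts = expand_amounts(amount_source, len(src))
    limit = (1 << width) - 1
    result = []
    for i, x in enumerate(src):
        if (mask >> i) & 1:
            result.append(rotl(x, amounts[i], width))   # not masked out: rotated
        elif masking is Masking.MERGE:
            result.append(x & limit)                    # masked out: merged value
        else:
            result.append(0)                            # masked out: zeroed value
    return result


# Sixteen 32-bit elements, one shared rotation amount of 8, low eight elements enabled.
src16 = list(range(1, 17))
zeroed = masked_rotate(src16, 0x00FF, 8)
merged = masked_rotate(src16, 0x00FF, 8, masking=Masking.MERGE)

# Four quadword elements with per-element rotation amounts.
quads = masked_rotate([1 << 63, 1, 2, 3], 0b0001, [1, 0, 0, 0], width=64)
```

Modeling the mask as an integer keeps the one-bit-per-element correspondence explicit; a real implementation would read the mask from an architectural mask register rather than a Python integer.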

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE

  • using a mask

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9864602B2 cover?
A method of an aspect includes receiving a masked packed rotate instruction. The instruction indicates a first source packed data including a plurality of packed data elements, a packed data operation mask having a plurality of mask elements, at least one rotation amount, and a destination storage location. A result packed data is stored in the destination storage location in response to the in…
Who is the assignee on this patent?
Ould-Ahmed-Vall Elmoustapha, Valentine Robert, Corbal San Andrian Jesus, and 6 more
What technology area does this patent fall under?
Primary CPC classification G06F9/30032. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 9, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).