Systems, apparatuses, and methods for performing vector packed compression and repeat

US9870338B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9870338-B2
Application number  US-201113992209-A
Country             US
Kind code           B2
Filing date         Dec 23, 2011
Priority date       Dec 23, 2011
Publication date    Jan 16, 2018
Grant date          Jan 16, 2018

Abstract

Official abstract text for this publication.

Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed compression and repeat in response to a single vector packed compression and repeat instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode are described.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: decoding an instruction; executing the decoded instruction to determine, for each packed data element position of a first source vector register, a number of times that packed data element position's packed data element is to be stored in a destination vector register based solely on the value of a corresponding packed data element position of a second source vector register, wherein the number of times is up to a plurality of times and store each packed data element of a packed data element position of the first source vector register into the destination vector register the value number of times based on the determination of the corresponding data element of the second source vector register's value.

2. The method of claim 1, wherein the storing begins at the least significant packed data element position of the destination vector register and the packed data elements are stored in consecutive packed data element positions of the destination vector register.

3. The method of claim 1, wherein the executing and storing steps further comprise: determining a value of a least significant packed data element position of the second source vector register; determining if that value is greater than 0; if the value is greater than 0, storing a corresponding packed data element position of the first source vector register's packed data element value number of times, wherein these packed data elements are stored in consecutively beginning at a least significant packed data element position of the destination vector register; and if the value is 0, determining a value of a next least significant packed data element position of the second source vector register; if the value of the next least significant data element position is greater than 0, storing a corresponding packed data element position of the first source vector register's packed data element value number of times, wherein these packed data elements are stored in consecutively beginning at a least significant packed data element position of the destination vector register that has not been previously written to.

4. The method of claim 3, further comprising: repeating the determining and storing steps until all of the packed data element positions of the second source vector register's values have been evaluated.

5. The method of claim 4, further comprising: writing a preset value into all unused packed data element positions of the destination vector register after all of the packed data element positions of the first source vector register have been written into the destination vector register.

6. The method of claim 5, wherein the preset value is a value of all 1s.

7. The method of claim 1, providing a programmer visible exception when all of packed data element positions of the destination vector register have been written to, but there are still packed data elements from the first source vector register that are to be written to the destination vector register.

8. The method of claim 1, wherein the vector registers are all a same size of 128-bit, 256-bit, or 512-bit.

9. An article of manufacture comprising: a non-transitory machine-readable storage medium having stored thereon an occurrence of an instruction, wherein the instruction's format specifies as its source operands a first and second vector register and specifies as its destination a single vector register, and wherein the instruction format includes an opcode which instructs a machine, responsive to the single occurrence of the single instruction, to cause a determination, for each packed data element position of the first source vector register, a number of times that packed data element position's packed data element is to be stored in the destination vector register based solely on the value of a corresponding packed data element position of the second source vector register, storage of each packed data element of a packed data element position of the first source vector register into the destination vector register the value number of times based on the determination of the corresponding data element of the second source vector register's value, wherein the number of times is up to a plurality of times.

10. The article of manufacture of claim 9, wherein the storing begins at the least significant packed data element position of the destination vector register and the packed data elements are stored in consecutive packed data element positions of the destination vector register.

11. The article of manufacture of claim 9, further to cause: a determination of a value of a least significant packed data element position of the second source vector register; a determination of if that value is greater than 0; if the value is greater than 0, storage of a corresponding packed data element position of the first source vector register's packed data element value number of times, wherein these packed data elements are stored in consecutively beginning at a least significant packed data element position of the destination vector register; and if the value is 0, a determination of a value of a next least significant packed data element position of the second source vector register; if the value of the next least significant data element position is greater than 0, storage of a corresponding packed data element position of the first source vector register's packed data element value number of times, wherein these packed data elements are stored in consecutively beginning at a least significant packed data element position of the destination vector register that has not been previously written to.

12. The article of manufacture of claim 9, further to: repeat until all of the packed data element position of the second source vector register's values have been evaluated.

13. The article of manufacture of claim 9, further to: write a preset value into all unused packed data element positions of the destination vector register after all of the packed data element positions of the first source vector register have been written into the destination vector register.

14. The article of manufacture of claim 9, wherein the preset value is a value of all 1s.

15. The article of manufacture of claim 9, wherein the vector registers are all a same size of 128-bit, 256-bit, or 512-bit.

16. An apparatus comprising: a hardware decoder to decode a single instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode; execution circuitry to execute the decoded single instruction to determine, for each packed data element position of the first source vector register, a number of times that packed data element position's packed data element is to be stored in the destination vector register based solely on the value of a corresponding packed data element position of the second source vector register and store each packed data element of a packed data element position of the first source vector register into the destination vector register the value number of times based on the determination of the corresponding data element of the second source vector register's value, wherein the number of times is up to a plurality of times.

17. The apparatus of claim 16, wherein the storage begins at the least significant packed data element position of the destination vector register and the packed data elements are stored in consecutive packed data element positions of the destination vector register.

18. The apparatus of claim 17, wherein the exec
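
The behavior recited in claims 1 through 7 can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware implementation: the function name compress_repeat is hypothetical, Python lists stand in for vector registers with index 0 as the least significant packed data element position, elements are assumed to be 32 bits wide (so the "all 1s" preset value is 0xFFFFFFFF), and OverflowError stands in for the programmer-visible exception of claim 7. Raising before any partial write is also an assumption; the claims do not say what the destination holds when the exception occurs.

```python
from typing import List, Sequence

# Assumption: 32-bit packed data elements, so "all 1s" is 0xFFFFFFFF.
ALL_ONES_32 = 0xFFFFFFFF


def compress_repeat(src1: Sequence[int],
                    src2: Sequence[int],
                    fill: int = ALL_ONES_32) -> List[int]:
    """Software model of a packed compress-and-repeat operation.

    src1 holds the data elements, src2 holds the per-element repeat
    counts, and the returned list models the destination register.
    Index 0 is the least significant element position.
    """
    if len(src1) != len(src2):
        raise ValueError("source registers must have the same element count")

    lanes = len(src1)          # destination has as many positions as the sources
    dest: List[int] = []

    # Walk the count register from least to most significant position.
    for value, count in zip(src1, src2):
        if count < 0:
            raise ValueError("repeat counts must be non-negative")
        if len(dest) + count > lanes:
            # Stand-in for the programmer-visible exception of claim 7.
            raise OverflowError("destination full with elements still pending")
        # A count of 0 drops the element (compression); a count above 0
        # stores it that many times in consecutive positions (repeat).
        dest.extend([value] * count)

    # Unused destination positions receive the preset value (claims 5 and 6).
    dest.extend([fill] * (lanes - len(dest)))
    return dest


# Element 10 is stored twice, 20 and 40 are dropped, 30 is stored once,
# and the one unused position is filled with all 1s.
print(compress_repeat([10, 20, 30, 40], [2, 0, 1, 0]))
# prints [10, 10, 30, 4294967295]
```

With every count equal to 1 the model returns the first source unchanged, and with every count equal to 0 the destination consists only of the preset value.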

Assignees

Intel Corp

Inventors

Ould-Ahmed-Vall Elmoustapha, Willhalm Thomas

Classifications

  • G06F15/78 (Primary): comprising a single central processing unit

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

  • Loop control instructions; iterative instructions, e.g. LOOP, REPEAT

  • using a mask

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9870338B2 cover?
Embodiments of systems, apparatuses, and methods for performing in a computer processor vector packed compression and repeat in response to a single vector packed compression and repeat instruction that includes a first and second source vector register operand, a destination vector register operand, and an opcode are described.
Who is the assignee on this patent?
Intel Corp. The inventors are Ould-Ahmed-Vall Elmoustapha and Willhalm Thomas.
What technology area does this patent fall under?
Primary CPC classification G06F15/78. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 16, 2018 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).