US9354877B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9354877-B2 |
| Application number | US-201113992709-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 23, 2011 |
| Priority date | Dec 23, 2011 |
| Publication date | May 31, 2016 |
| Grant date | May 31, 2016 |
Official abstract text for this publication.
Embodiments of systems, apparatuses, and methods for performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode are described.
Opening claim text (preview).
What is claimed is:

1. A method of performing in a computer processor mask bit compression in response to a single mask bit compression instruction that includes a source writemask register operand, a destination writemask register operand, and an opcode, the method comprising steps of: executing the mask bit compression instruction to determine which writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register, wherein each bit of the source and destination writemask registers is a writemask bit; and storing the determined writemask bits into the least significant bit positions of the destination writemask register.
2. The method of claim 1, wherein the writemask registers are 16-bit registers each having 16 writemasks.
3. The method of claim 1, wherein the writemask registers are 64-bit registers each having 64 writemasks.
4. The method of claim 1, wherein the opcode sets a number writemask bits of the source writemask register to evaluate for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
5. The method of claim 1, wherein all writemask bits of the source writemask register are to be evaluated for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
6. The method of claim 5, wherein only the eight least significant writemask bits of the source writemask register are to be evaluated for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
7. The method of claim 1, further comprising: setting all of the destination writemask's bits to be 0 prior to determining which writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
8. The method of claim 1, wherein the executing and storing steps further comprise: determining if a least significant bit position of the source writemask register is 1; when the a least significant bit position of the source writemask register is 1, writing a 1 into a least significant bit position of the destination writemask register that does not have a 1 already stored there; and when the a least significant bit position of the source writemask register is 0, determining if a next least significant bit position of the source writemask register is 1 when the a least significant bit position of the source writemask register is 0.
9. An article of manufacture comprising: a non-transitory tangible machine-readable storage medium having stored thereon an occurrence of an instruction, wherein the instruction's format specifies as its only source operand a source operand a single writemask register and specifies as its destination a single writemask register, and wherein the instruction format includes an opcode which instructs a machine, responsive to the single occurrence of the single instruction, to cause at least some of the source operand's writemask bits to be written into one or more least significant bits of the destination operand's writemask register.
10. The article of manufacture of claim 9, wherein the writemask registers are 16-bit registers each having 16 writemasks.
11. The article of manufacture of claim 9, wherein the writemask registers are 64-bit registers each having 64 writemasks.
12. The article of manufacture of claim 9, wherein the opcode sets a number writemask bits of the source writemask register to evaluate for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
13. The article of manufacture of claim 12, wherein all writemask bits of the source writemask register are to be evaluated for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
14. The article of manufacture of claim 12, wherein only the eight least significant writemask bits of the source writemask register are to be evaluated for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
15. The article of manufacture of claim 9, further comprising: setting all of the destination writemask's bits to be 0 prior to causing at least some of the source operand's writemask bits to be written into one or more least significant bits of the destination operand's writemask register.
16. An apparatus comprising; a hardware decoder to decode an mask bit compression instruction, wherein the mask bit compression instruction includes a source writemask register operand, a destination writemask register operand, and an opcode; execution logic to determine which writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register, wherein each bit of the source and destination writemask registers is a writemask bit, and store the determined writemask bits into the least significant bit positions of the destination writemask register.
17. The apparatus of claim 16, wherein the writemask registers are 16-bit registers each having 16 writemasks.
18. The apparatus of claim 16, wherein the writemask registers are 64-bit registers each having 64 writemasks.
19. The apparatus of claim 16, wherein the opcode sets a number writemask bits of the source writemask register to evaluate for determining which of the writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
20. The apparatus of claim 16, the execution logic further to set all of the destination writemask's bits to be 0 prior to determining which writemask bits of the source writemask register are to be written into the least significant bit positions of the destination writemask register.
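
The behavior recited in claims 1, 4, 7, and 8 can be illustrated with a small software model. This is only a sketch of one reading of the claim language, not the patented hardware or any real instruction's definition: the function name `mask_bit_compress` and the parameters `eval_bits` and `reg_bits` are hypothetical, and the register is modeled as a plain Python integer. It assumes that the bits "to be written" are the source bits equal to 1, as claim 8 suggests, so the result is a run of ones at the low end of the destination.

```python
def mask_bit_compress(src: int, eval_bits: int = 16, reg_bits: int = 16) -> int:
    """Pack the 1 bits found in the low `eval_bits` of `src` into the
    least significant positions of the returned mask.

    Hypothetical model: `eval_bits` stands in for the number of source
    bits the opcode selects (claim 4); `reg_bits` is the register width
    (16 or 64 in claims 2 and 3).
    """
    if not 0 < eval_bits <= reg_bits:
        raise ValueError("eval_bits must be between 1 and reg_bits")

    dest = 0        # claim 7: destination bits are all set to 0 first
    next_free = 0   # lowest destination position that does not hold a 1 yet

    # Walk the source from its least significant bit upward (claim 8).
    for i in range(eval_bits):
        if (src >> i) & 1:
            # Source bit is 1: write a 1 into the lowest free destination slot.
            dest |= 1 << next_free
            next_free += 1
        # Source bit is 0: nothing is written; continue with the next bit.
    return dest


if __name__ == "__main__":
    src = 0b1010_0110_0001_0010   # six bits set, two of them in the low byte
    print(bin(mask_bit_compress(src)))               # all 16 bits evaluated
    print(bin(mask_bit_compress(src, eval_bits=8)))  # only the low 8 bits evaluated
```

Under this reading the loop is equivalent to computing `(1 << popcount) - 1` over the evaluated source bits; the loop form is kept here because it follows the step-by-step wording of claim 8.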
Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion · CPC title
Instruction operation extension or modification · CPC title
Register arrangements · CPC title
Bit or string instructions · CPC title
using a mask · CPC title