US9665368B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9665368-B2 |
| Application number | US-201213631666-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2012 |
| Priority date | Sep 28, 2012 |
| Publication date | May 30, 2017 |
| Grant date | May 30, 2017 |
Official abstract text for this publication.
Systems, apparatuses, and methods of performing in a computer processor broadcasting data in response to a single vector packed broadcasting instruction that includes a source writemask register operand, a destination vector register operand, and an opcode. In some embodiments, the data of the source writemask register is zero extended prior to broadcasting.
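As a rough illustration of the broadcast described in the abstract, the following Python sketch models the writemask register as an integer and the destination vector register as a list of elements. The function name `broadcast_mask`, its parameters, and the 16-bit mask and 32-bit element defaults are assumptions made for this example; they are not taken from the patent text.

```python
from typing import List


def broadcast_mask(src_mask: int, num_elems: int,
                   mask_bits: int = 16, elem_bits: int = 32) -> List[int]:
    """Zero-extend a writemask value and copy it into every element position.

    All names and default widths here are illustrative assumptions.
    """
    if elem_bits < mask_bits:
        raise ValueError("element width must be at least the mask width")
    # Zero extension: keep only the low mask_bits bits; every higher bit
    # of the wider element is therefore 0.
    extended = src_mask & ((1 << mask_bits) - 1)
    # Broadcast: each destination element receives the same extended value.
    return [extended] * num_elems


if __name__ == "__main__":
    print(broadcast_mask(0b1011, num_elems=4))  # [11, 11, 11, 11]
```

Because Python integers are unbounded, zero extension reduces to masking off everything above the mask width; a hardware implementation would instead pad the upper bits of each element with zeros.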
Opening claim text (preview).
What is claimed is:
1. A method of comprising: executing a single instruction that includes a source writemask register operand, a source vector register operand, a destination writemask register operand, and an opcode to: logically AND data from the source writemask register operand with each data element of the source vector register operand, determine of which of the logical AND operations indicate a conflict to create a conflict check result, and logically AND the conflict check result with the data from the source writemask operand; and storing the result of the logical ANDing of the conflict check result with the data from the source writemask operand into the destination writemask register operand.
2. The method of claim 1, further comprising: zero extending data of the source writemask register operand such that the zero extended data will be of the same size as each data element of the source vector register operand.
3. The method of claim 1, further comprising: broadcasting the zero extended data of the source writemask register operand to a temporary vector register that has a same number and size data elements as the source vector register operand.
4. The method of claim 1, wherein the source vector register operand is of size 128-bit, 256-bit, or 512-bit.
5. The method of claim 1, wherein the destination writemask register operand is 64 bits.
6. The method of claim 1, wherein the destination writemask register operand is 16 bits.
7. The method of claim 1, wherein data elements of the source vector register operand are of 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, or 256-bit in size.
8. An apparatus comprising: decode circuitry to decode a single instruction that includes a source writemask register operand, a source vector register operand, a destination writemask register operand, and an opcode; execution circuitry to execute the decoded single vector packed conflict testing instruction to: logically AND data from the source writemask register operand with each data element of the source vector register operand, determine of which of the logical AND operations indicate a conflict to create a conflict check result, and logically AND the conflict check result with the data from the source writemask operand, store the result of the logical ANDing of the conflict check result with the data from the source writemask operand into the destination writemask register operand.
9. The apparatus of claim 8, wherein the execution circuitry to further: zero extend data of the source writemask register operand such that the zero extended data will be of the same size as each data element of the source vector register operand.
10. The apparatus of claim 8, wherein the execution circuitry to further: broadcast the zero extended data of the source writemask register operand to a temporary vector register that has a same number and size data elements as the source vector register operand.
11. The apparatus of claim 8, wherein the source vector register operand is of size 128-bit, 256-bit, or 512-bit.
12. The apparatus of claim 8, wherein the destination writemask register operand is 64 bits.
13. The apparatus of claim 8, wherein the destination writemask register operand is 16 bits.
14. The apparatus of claim 8, wherein data elements of the source vector register operand are of 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, or 256-bit in size.
15. A non-transitory machine-readable medium storing an instruction which when executed by a hardware processor to cause the hardware processor to perform a method, the method comprising: executing a single instruction that includes a source writemask register operand, a source vector register operand, a destination writemask register operand, and an opcode to: logically AND data from the source writemask register operand with each data element of the source vector register operand, determine of which of the logical AND operations indicate a conflict to create a conflict check result, and logically AND the conflict check result with the data from the source writemask operand; and storing the result of the logical ANDing of the conflict check result with the data from the source writemask operand into the destination writemask register operand.
16. The non-transitory machine-readable medium of claim 15, wherein the method further comprises: zero extending data of the source writemask register operand such that the zero extended data will be of the same size as each data element of the source vector register operand.
17. The non-transitory machine-readable medium of claim 15, wherein the method further comprises: broadcasting the zero extended data of the source writemask register operand to a temporary vector register that has a same number and size data elements as the source vector register operand.
18. The non-transitory machine-readable medium of claim 15, wherein the source vector register operand is of size 128-bit, 256-bit, or 512-bit.
19. The non-transitory machine-readable medium of claim 15, wherein the destination writemask register operand is 64 bits.
20. The non-transitory machine-readable medium of claim 15, wherein the destination writemask register operand is 16 bits.
21. The non-transitory machine-readable medium of claim 15, wherein data elements of the source vector register operand are of 8-bit, 16-bit, 32-bit, 64-bit, 128-bit, or 256-bit in size.
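The sequence recited in the independent claims (AND the source writemask with each element, derive a conflict check result, AND that result with the source writemask, store it in the destination writemask) can be sketched in Python as a software model. This is only an illustration under stated assumptions: the function name `conflict_test` and its parameters are hypothetical, the 16-bit mask and 32-bit element widths are just one of the claimed combinations, and the rule that a nonzero AND result marks a conflict is an assumption, since the claim text does not say how a conflict is encoded.

```python
from typing import List


def conflict_test(src_mask: int, src_vector: List[int],
                  mask_bits: int = 16, elem_bits: int = 32) -> int:
    """Software model of the claimed steps; returns the destination mask value.

    Names, widths, and the conflict encoding are illustrative assumptions.
    """
    if len(src_vector) > mask_bits:
        raise ValueError("one mask bit is needed per vector element")
    if elem_bits < mask_bits:
        raise ValueError("element width must be at least the mask width")

    mask = src_mask & ((1 << mask_bits) - 1)   # zero-extended mask (claim 2)
    temp = [mask] * len(src_vector)            # broadcast to a temporary (claim 3)
    elem_limit = (1 << elem_bits) - 1

    conflict_check = 0
    for i, (elem, m) in enumerate(zip(src_vector, temp)):
        anded = (elem & elem_limit) & m        # AND mask with each data element
        if anded != 0:                         # ASSUMPTION: nonzero means conflict
            conflict_check |= 1 << i           # one result bit per element

    # AND the conflict check result with the source mask; the caller stores
    # the returned value in the destination writemask register.
    return conflict_check & mask


if __name__ == "__main__":
    vec = [0b0000, 0b0001, 0b0000, 0b0100]
    print(bin(conflict_test(0b1111, vec)))  # 0b1010
    print(bin(conflict_test(0b0111, vec)))  # 0b10
```

In the second call, element 3 still produces a nonzero AND, but bit 3 of the source mask is clear, so the final AND removes it from the destination mask.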
Dependency mechanisms, e.g. register scoreboarding · CPC title
LOAD or STORE instructions; Clear instruction · CPC title
Compare instructions, e.g. Greater-Than, Equal-To, MINMAX · CPC title
Arithmetic instructions · CPC title
having multiple operands in a single register · CPC title