Processors, methods, and systems to implement partial register accesses with masked full register accesses

US9934032B2 · US · B2

Patent metadata
Publication number: US-9934032-B2
Application number: US-201615331940-A
Country: US
Kind code: B2
Filing date: Oct 24, 2016
Priority date: Mar 30, 2013
Publication date: Apr 3, 2018
Grant date: Apr 3, 2018

Abstract

Official abstract text for this publication.

A method includes receiving a packed data instruction indicating a first narrower source packed data operand and a narrower destination operand. The instruction is mapped to a masked packed data operation indicating a first wider source packed data operand that is wider than and includes the first narrower source operand, and indicating a wider destination operand that is wider than and includes the narrower destination operand. A packed data operation mask is generated that includes a mask element for each corresponding result data element of a packed data result to be stored by the masked packed data operation. All mask elements that correspond to result data elements to be stored by the masked operation that would not be stored by the packed data instruction are masked out. The masked operation is performed using the packed data operation mask. The packed data result is stored in the wider destination operand.
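The flow in the abstract can be sketched in software. The following is a minimal illustration only, not the patented hardware: the 32-bit lane width, the 16-lane "wide" register, the 4-lane "narrow" operand, and all function names are assumptions chosen for the example. Masked-out lanes here keep the old destination value, which is one of the behaviors described in the claims.

```python
from typing import Callable, List

LANE_BITS = 32  # assumed element width; the abstract does not fix one
LANE_MASK = (1 << LANE_BITS) - 1


def make_widened_mask(narrow_lanes: int, wide_lanes: int) -> List[bool]:
    """Build one mask element per wide result lane.

    Lanes the narrower instruction would have written are unmasked (True);
    every lane beyond that is masked out (False).
    """
    if not 0 <= narrow_lanes <= wide_lanes:
        raise ValueError("narrow operand must fit inside the wide operand")
    return [lane < narrow_lanes for lane in range(wide_lanes)]


def masked_packed_op(
    op: Callable[[int, int], int],
    src1: List[int],
    src2: List[int],
    dest: List[int],
    mask: List[bool],
) -> List[int]:
    """Run `op` across the full width, keeping the old value in masked-out lanes."""
    if not (len(src1) == len(src2) == len(dest) == len(mask)):
        raise ValueError("all operands and the mask must have the same lane count")
    return [
        op(a, b) if unmasked else old
        for a, b, old, unmasked in zip(src1, src2, dest, mask)
    ]


def add32(a: int, b: int) -> int:
    """Wrapping per-lane add (hypothetical stand-in for the packed operation)."""
    return (a + b) & LANE_MASK


def narrow_add(src1: List[int], src2: List[int], dest: List[int],
               narrow_lanes: int) -> List[int]:
    """Map a narrow packed add onto a masked full-width packed add."""
    mask = make_widened_mask(narrow_lanes, len(dest))
    return masked_packed_op(add32, src1, src2, dest, mask)


if __name__ == "__main__":
    wide_src1 = list(range(16))   # 16 lanes: the assumed wide register
    wide_src2 = [100] * 16
    wide_dest = [0xDEAD] * 16     # old destination contents
    # Only the low 4 lanes (the assumed narrow operand) are updated.
    print(narrow_add(wide_src1, wide_src2, wide_dest, narrow_lanes=4))
```

The point of the sketch is that the full register is read and written in one operation, while the mask confines the visible effect to the lanes the narrower instruction named.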

First claim

Opening claim text (preview).

What is claimed is:

1. A processor comprising: a plurality of packed data operation mask registers, wherein at least one instruction of an instruction set of the processor has a field to specify one of the plurality of the packed data operation mask registers; a decode unit to decode a packed data instruction that is to indicate at least a first narrower source packed data operand and a narrower destination operand; and an execution unit coupled with the plurality of the packed data operation mask registers and coupled with the decode unit, the execution unit in response to the decode of the packed data instruction to perform a masked packed data operation that is to involve at least a first wider source packed data operand that is to be wider than the first narrower source packed data operand, that is to involve a wider destination operand that is to be wider than the narrower destination operand, and that is to involve a packed data operation mask, the packed data operation mask to be stored in a packed data operation mask register of the plurality of packed data operation mask registers, the packed data operation mask to include a mask element for each corresponding result data element of a packed data result that is to be stored by the masked packed data operation, wherein all mask elements that correspond to result data elements to be stored by the masked packed data operation that do not correspond to the packed data instruction are to be masked out, the execution unit to store the packed data result in the wider destination operand.

2. The processor of claim 1, wherein the execution unit is to write an entire width of a register that is to correspond to the wider destination operand, and wherein the narrower destination operand is to represent only a portion of the width of the register.

3. The processor of claim 1, wherein the decode unit is to decode the packed data instruction that is also to indicate a second narrower source packed data operand, and wherein the masked packed data operation is also to involve a second wider source packed data operand that is to be wider than the second narrower source packed data operand.

4. The processor of claim 1, wherein the decode unit is to decode the packed data instruction that is not to indicate a packed data operation mask.

5. The processor of claim 1, wherein the decode unit is to decode the packed data instruction that is to indicate a packed data operation mask having fewer mask elements than the packed data operation mask that is to be used by the execution unit to perform the masked packed data operation.

6. The processor of claim 1, wherein the execution unit is to store the packed data result in which a value of each result data element that corresponds to a masked out mask element is to be unchanged, and in which a value of each result data element that corresponds to an unmasked mask element is to be updated by an operation associated with the packed data instruction.

7. The processor of claim 1, wherein said all mask elements that correspond to the result data elements to be stored by the masked packed data operation that do not correspond to the packed data instruction are to have a value of a corresponding data element of the at least one wider source packed data operand.

8. A method in a processor comprising: receiving a packed data instruction indicating at least a first narrower source packed data operand and a narrower destination operand; mapping the packed data instruction to a masked packed data operation involving at least a first wider source packed data operand that is wider than the first narrower source packed data operand, and involving a wider destination operand that is wider than the narrower destination operand; accessing a packed data operation mask from a set of packed data operation mask registers of the processor that are capable of being specified by instructions of an instruction set of the processor, the packed data operation mask including a mask element for each corresponding result data element of a packed data result to be stored by the masked packed data operation, wherein all mask elements that correspond to result data elements to be stored by the masked packed data operation that do not correspond to the packed data instruction are masked out; performing the masked packed data operation using the packed data operation mask; and storing the packed data result in the wider destination operand.

9. The method of claim 8, wherein storing the packed data result comprises writing an entire width of a register that corresponds to the wider destination operand, and wherein the narrower destination operand represents only a portion of the width of the register.

10. The method of claim 8, wherein receiving comprises receiving the packed data instruction that also indicates a second narrower source packed data operand, and wherein mapping includes mapping the packed data instruction to the masked packed data operation that also involves a second wider source packed data operand that is wider than the second narrower source packed data operand.

11. The method of claim 8, wherein receiving comprises receiving the packed data instruction that does not indicate a packed data operation mask.

12. The method of claim 8, wherein receiving comprises receiving the packed data instruction that indicates a second packed data operation mask that has a lesser number of mask elements than the accessed packed data operation mask.

13. The method of claim 8, wherein the first narrower source packed data operand is aliased on the first wider source packed data operand in a register.

14. The method of claim 8, wherein storing comprises storing the packed data result in which a value of each result data element that corresponds to a masked out mask element is unchanged, and in which a value of each result data element that corresponds to an unmasked mask element is updated by an operation associated with the packed data instruction.

15. A processor comprising: a plurality of packed data operation mask registers; a decode unit to decode a packed data instruction that is to indicate at least a first narrower source packed data operand, and that has a field to specify one of the packed data operation mask registers as a storage location for a narrower packed data operation mask; and an execution unit coupled with the plurality of the packed data operation mask registers and coupled with the decode unit, the execution unit in response to the decode of the packed data instruction to perform a masked packed data operation that is to involve at least a first wider source packed data operand that is to be wider than the first narrower source packed data operand, and that is to involve a wider packed data operation mask that is to be wider than the narrower packed data operation mask, the wider packed data operation mask to include a mask element for each corresponding result data element of a packed data result that is to be stored by the masked packed data operation, wherein all mask elements that correspond to result data elements beyond a width of a packed data result that would be generated by the packed data instruction are to be masked out.

16. The processor of claim 15, wherein the masked out result data elements are to have a value of a corresponding data element of the at least one wider source packed data operand.
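Claims 5, 12, and 15 describe an instruction that already names a narrower mask, which is then used as part of a wider mask whose extra elements are masked out; claims 7 and 16 describe masked-out result elements taking the value of the corresponding wider source element. A rough software sketch of that combination follows. The integer bit-mask encoding (bit i controls lane i), the lane counts, and the function names are assumptions made for illustration and are not taken from the patent.

```python
from typing import Callable, List


def widen_mask(narrow_mask: int, narrow_lanes: int, wide_lanes: int) -> int:
    """Extend a narrow mask to the wide lane count.

    Bits for the narrow lanes are kept as given; every bit at or above
    `narrow_lanes` is cleared, so those wide lanes are masked out even if
    the mask register held stale bits there.
    """
    if not 0 < narrow_lanes <= wide_lanes:
        raise ValueError("narrow lane count must be in 1..wide_lanes")
    return narrow_mask & ((1 << narrow_lanes) - 1)


def masked_unary_op(
    op: Callable[[int], int],
    wide_src: List[int],
    wide_mask: int,
) -> List[int]:
    """Apply `op` to unmasked lanes; masked-out lanes pass the source through."""
    return [
        op(value) if (wide_mask >> lane) & 1 else value
        for lane, value in enumerate(wide_src)
    ]


if __name__ == "__main__":
    # Narrow instruction: 4 lanes, mask 0b1011. The upper bits (0xF0) stand
    # in for stale register contents that must not enable wide lanes.
    wide_mask = widen_mask(0xFB, narrow_lanes=4, wide_lanes=16)
    source = list(range(1, 17))
    print(masked_unary_op(lambda x: x * 10, source, wide_mask))
```

Clearing the upper bits rather than trusting them is the design choice that matters here: the wide operation touches every lane, so any stray set bit above the narrow width would alter data the narrower instruction never referenced.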

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE

  • Decoding the operand specifier, e.g. specifier format

  • using a mask

  • Bit or string instructions

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9934032B2 cover?
A method includes receiving a packed data instruction indicating a first narrower source packed data operand and a narrower destination operand. The instruction is mapped to a masked packed data operation indicating a first wider source packed data operand that is wider than and includes the first narrower source operand, and indicating a wider destination operand that is wider than and include…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30036. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 3, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).