Vector friendly instruction format and execution thereof

US12086594B2 · US · B2

Patent metadata
Field: Value
Publication number: US-12086594-B2
Application number: US-202318239106-A
Country: US
Kind code: B2
Filing date: Aug 28, 2023
Priority date: Apr 1, 2011
Publication date: Sep 10, 2024
Grant date: Sep 10, 2024

Abstract

Official abstract text for this publication.

A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data element width field, wherein the first instruction format supports different versions of base operations and different augmentation operations through placement of different values in the base operation field, the modifier field, the alpha field, the beta field, and the data element width field, and wherein only one of the different values may be placed in each of the base operation field, the modifier field, the alpha field, the beta field, and the data element width field on each occurrence of an instruction in the first instruction format in instruction streams.

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: a processor to execute an instruction set, wherein the instruction set includes a first instruction format, wherein the first instruction format includes a first plurality of templates, wherein the first instruction format has a plurality of fields including a base operation field, a data element width field, and a write mask field, wherein the first instruction format supports, through different values in the base operation field, specification of different vector operations, wherein each of the vector operations is to generate a destination vector operand including a plurality of data elements at different data element positions, wherein the first instruction format supports, through different values in the data element width field, specification of different data element widths, wherein the base operation field, the data element width field, and the write mask field may each store only one value on each occurrence of an instruction in the first instruction format in instruction streams, the processor including, a decode unit to decode the occurrences of the instructions in the first plurality of templates, including to: distinguish, for each of the occurrences, which one of the data element widths to use based on a value in the data element width field; and distinguish, for each of the occurrences, which of the data element positions of the destination vector operand are or are not to include corresponding data elements resulting from the vector operation of the occurrence based on the value in the write mask field and the data element width for the occurrence, wherein different values that may be stored in the write mask field distinguish different write mask registers, of a set of write mask registers, that are to store configurable write masks, and wherein the data element width for the occurrence distinguishes which of the data element positions of the destination vector operand correspond with which bits of the configurable write masks.

2. The apparatus of claim 1, wherein a write mask register of the set of write mask registers cannot be used as a write mask by the occurrences of the instructions.

3. The apparatus of claim 1, wherein data elements in data element positions of the destination vector operand that do not result from the vector operation are to be preserved.

4. The apparatus of claim 1, wherein a single bit of a configurable write mask is to be used for each of the data element positions of the destination vector operand.

5. The apparatus of claim 1, wherein the first instruction format supports through different values in the data element width field specification of a 32-bit data element width and a 64-bit data element width.

6. The apparatus of claim 1, wherein the decode unit is to distinguish, for each of the occurrences, whether to use either one of merging masking and zeroing masking, based on a value in a field.

7. The apparatus of claim 1, wherein a single bit of a configurable write mask is to be used for each of the data element positions of the destination vector operand, and wherein the first instruction format supports through different values in the data element width field specification of a 32-bit data element width and a 64-bit data element width.

8. The apparatus of claim 7, wherein a write mask register of the set of write mask registers cannot be used as a write mask by the occurrences of the instructions, and wherein data elements in data element positions of the destination vector operand that do not result from the vector operation are to be preserved.

9. An apparatus comprising: a processor to execute an instruction set, wherein the instruction set includes a first instruction format, wherein the first instruction format includes a first plurality of templates, wherein the first instruction format has a plurality of fields including a base operation field, a data element width field, and a write mask field, wherein the first instruction format supports, through different values in the base operation field, specification of different vector operations, wherein each of the vector operations is to generate a destination vector operand including a plurality of data elements at different data element positions, wherein the first instruction format supports, through different values in the data element width field, specification of different data element widths, wherein the base operation field, the data element width field, and the write mask field may each store only one value on each occurrence of an instruction in the first instruction format in instruction streams, the processor including, a decode unit to decode the occurrences of the instructions in the first plurality of templates, including to: distinguish, for each of the occurrences, which one of the data element widths to use based on a value in the data element width field; and distinguish, for each of the occurrences, the data elements resulting from the vector operation of the occurrence to be reflected in the destination vector operand's corresponding data element positions based on the write mask field's content and the data element width for that occurrence, wherein different values that may be stored in the write mask field distinguish different write mask registers, of a set of write mask registers, that are to store configurable write masks, and wherein the data element width for the occurrence distinguishes which of the data element positions of the destination vector operand correspond with which bits of the configurable write masks.

10. The apparatus of claim 9, wherein a write mask register of the set of write mask registers cannot be used as a write mask by the occurrences of the instructions.

11. The apparatus of claim 9, wherein data elements in data element positions of the destination vector operand that do not result from the vector operation of the occurrence are to be preserved.

12. The apparatus of claim 9, wherein a single bit of a configurable write mask is to be used for each of the data element positions of the destination vector operand.

13. The apparatus of claim 9, wherein the first instruction format supports through different values in the data element width field specification of a 32-bit data element width and a 64-bit data element width.

14. The apparatus of claim 9, wherein the decode unit is to distinguish, for each of the occurrences, whether to use either one of merging masking and zeroing masking, based on a value in a field.

15. The apparatus of claim 9, wherein a write mask register of the set of write mask registers cannot be used as a write mask by the occurrences of the instructions, and wherein data elements in data element positions of the destination vector operand that do not result from the vector operation of the occurrence are to be preserved.

16. The apparatus of claim 15, wherein a single bit of a configurable write mask is to be used for each of the data element positions of the destination vector operand, and wherein the first instruction format supports through different values in the data element width field specification of a 32-bit data element width and a 64-bit data element width.

17. An apparatus comprising: a processor configured to execute an instruction set, wherein the instruction set includes a first instruction format, wherein the first instruction format includes a first plurality of templates that each include a plurality of fields including a base operation field, a data element width (W) field, a vector length field, a write mask control field, an
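
Illustrative sketch (not part of the patent text). The claims describe a write mask in which one mask bit governs each data element position, the data element width determines how many positions (and therefore how many mask bits) are in play, and unselected positions are either preserved (merging masking) or zeroed (zeroing masking). The Python model below is a minimal sketch of that behavior under stated assumptions: the function name masked_write, the vector_bits parameter, and the convention that mask bit i maps to element position i starting from the low-order bit are all hypothetical and are not taken from the claim text.

```python
def masked_write(dest, result, mask, elem_bits, vector_bits=512, zeroing=False):
    """Model a write-masked vector result (hypothetical helper).

    dest      -- current destination elements, lowest position first
    result    -- elements produced by the vector operation
    mask      -- integer write mask; bit i is ASSUMED to govern position i
    elem_bits -- data element width, 32 or 64 (the widths named in claim 5)
    zeroing   -- False: merging masking (keep old value);
                 True: zeroing masking (write 0)
    """
    if elem_bits not in (32, 64):
        raise ValueError("this sketch only models 32-bit and 64-bit elements")

    # The element width fixes how many positions exist, and therefore
    # how many low-order mask bits are meaningful for this occurrence.
    positions = vector_bits // elem_bits
    if len(dest) != positions or len(result) != positions:
        raise ValueError("operand length does not match vector/element width")

    out = []
    for i in range(positions):
        if (mask >> i) & 1:
            out.append(result[i])   # position selected: take the new element
        elif zeroing:
            out.append(0)           # zeroing masking
        else:
            out.append(dest[i])     # merging masking: preserve old element
    return out


# A 128-bit vector is used here only to keep the example short.
old = [10, 20, 30, 40]
new = [1, 2, 3, 4]
print(masked_write(old, new, 0b0101, 32, vector_bits=128))                # [1, 20, 3, 40]
print(masked_write(old, new, 0b0101, 32, vector_bits=128, zeroing=True))  # [1, 0, 3, 0]
# With 64-bit elements the same mask value covers only two positions.
print(masked_write([10, 20], [1, 2], 0b0101, 64, vector_bits=128))        # [1, 20]
```

The last call shows the point made at the end of claims 1 and 9: the same mask value selects different element positions depending on the data element width, because the width determines which mask bits correspond to which positions.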

Assignees

Intel Corp

Classifications

  • G06F9/30014 · Primary CPC classification

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12086594B2 cover?
A vector friendly instruction format and execution thereof. According to one embodiment of the invention, a processor is configured to execute an instruction set. The instruction set includes a vector friendly instruction format. The vector friendly instruction format has a plurality of fields including a base operation field, a modifier field, an augmentation operation field, and a data elemen…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30014. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 10, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).