Hardware apparatuses and methods relating to elemental register accesses

US2016188334A1 · US · A1

Patent metadata
Publication number: US-2016188334-A1
Application number: US-201414582784-A
Country: US
Kind code: A1
Filing date: Dec 24, 2014
Priority date: Dec 24, 2014
Publication date: Jun 30, 2016
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatuses relating to a vector instruction with a register operand with an elemental offset are described. In one embodiment, a hardware processor includes a decode unit to decode a vector instruction with a register operand with an elemental offset to access a first number of elements in a register specified by the register operand, wherein the first number is a total number of elements in the register minus the elemental offset, access a second number of elements in a next logical register, wherein the second number is the elemental offset, and combine the first number of elements and the second number of elements as a data vector, and an execution unit to execute the vector instruction on the data vector.
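The access pattern in the abstract can be sketched in a few lines of Python. This is only an illustrative model, not the patented hardware: the names `read_with_offset` and `register_file`, the use of lists as registers, and the four-element register size are all assumptions made for the example. It also assumes that the first group is the trailing elements of the named register (starting at the offset) and the second group is the leading elements of the next logical register, which is consistent with the concatenate-and-shift variant described in the claims.

```python
from typing import List


def read_with_offset(register_file: List[List[int]], reg: int, offset: int) -> List[int]:
    """Build a data vector from register `reg` with an elemental offset.

    Hypothetical model: each register is a list of elements, and the
    "next logical register" is assumed to be index reg + 1.
    """
    total = len(register_file[reg])
    if not 0 <= offset < total:
        raise ValueError("offset must be in [0, number of elements)")
    if offset > 0 and reg + 1 >= len(register_file):
        raise IndexError("no next logical register available")

    # First number of elements: total minus the elemental offset.
    first = register_file[reg][offset:]
    # Second number of elements: equal to the elemental offset.
    second = register_file[reg + 1][:offset] if offset > 0 else []
    # Combine both groups into one data vector of the original length.
    return first + second


# Two 4-element registers, v0 = [0, 1, 2, 3] and v1 = [4, 5, 6, 7].
register_file = [[0, 1, 2, 3], [4, 5, 6, 7]]
print(read_with_offset(register_file, 0, 1))  # [1, 2, 3, 4]
```

With an offset of 1, three elements come from the first register and one from the next, so the resulting vector has the same number of elements as a single register.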

First claim

Opening claim text (preview).

What is claimed is:

1. A hardware processor comprising: a decode unit to decode a vector instruction with a register operand with an elemental offset to: access a first number of elements in a register specified by the register operand, wherein the first number is a total number of elements in the register minus the elemental offset; access a second number of elements in a next register, wherein the second number is the elemental offset; and combine the first number of elements and the second number of elements as a data vector; and an execution unit to execute the vector instruction on the data vector.
2. The hardware processor of claim 1, wherein the next register is a next logical register from the register specified by the register operand.
3. The hardware processor of claim 1, wherein the next register is specified by a second register operand of the vector instruction.
4. The hardware processor of claim 1, comprising a circuit to: concatenate the elements in the register and the elements in the next register to form a concatenated vector; and shift the concatenated vector by the elemental offset to form the data vector.
5. The hardware processor of claim 4, wherein the circuit comprises a multiplier to multiply the elemental offset by a size of each element in the register to determine a number of bits to shift the concatenated vector to form the data vector.
6. The hardware processor of claim 1, comprising: a banked register file that includes the elements in the register and the elements in the next register; and bank logic to combine the first number of elements and the second number of elements from the banked register file as the data vector.
7. The hardware processor of claim 1, wherein the register, the next register, and the data vector have a same total number of elements and a same size of each element.
8. The hardware processor of claim 1, wherein the decode unit is to output the data vector to the execution unit without an output of the data vector back into the decode unit.
9. A method comprising: decoding, with a decode unit, a vector instruction with a register operand with an elemental offset to: access a first number of elements in a register specified by the register operand, wherein the first number is a total number of elements in the register minus the elemental offset; access a second number of elements in a next register, wherein the second number is the elemental offset; and combine the first number of elements and the second number of elements as a data vector; and executing the vector instruction on the data vector with an execution unit.
10. The method of claim 9, wherein the next register is a next logical register from the register specified by the register operand.
11. The method of claim 9, wherein the next register is specified by a second register operand of the vector instruction.
12. The method of claim 9, comprising: concatenating the elements in the register and the elements in the next register to form a concatenated vector; and shifting the concatenated vector by the elemental offset to form the data vector.
13. The method of claim 12, wherein the shifting comprises multiplying the elemental offset by a size of each element in the register to determine a number of bits to shift the concatenated vector to form the data vector.
14. The method of claim 9, comprising: providing a banked register file that includes the elements in the register and the elements in the next register; and combining the first number of elements and the second number of elements from the banked register file as the data vector.
15. The method of claim 9, wherein the register, the next register, and the data vector have a same total number of elements and a same size of each element.
16. The method of claim 9, further comprising outputting the data vector to the execution unit from the decode unit without outputting the data vector back into the decode unit.
17. An apparatus comprising: a set of one or more processors; and a set of one or more data storage devices that stores code, that when executed by the set of processors causes the set of one or more processors to perform the following: decoding, with a decode unit, a vector instruction with a register operand with an elemental offset to: access a first number of elements in a register specified by the register operand, wherein the first number is a total number of elements in the register minus the elemental offset; access a second number of elements in a next register, wherein the second number is the elemental offset; and combine the first number of elements and the second number of elements as a data vector; and executing the vector instruction on the data vector with an execution unit.
18. The apparatus of claim 17, wherein the next register is a next logical register from the register specified by the register operand.
19. The apparatus of claim 17, wherein the next register is specified by a second register operand of the vector instruction.
20. The apparatus of claim 17, wherein the set of data storage devices further stores code, that when executed by the set of processors causes the set of processors to perform the following: concatenating the elements in the register and the elements in the next register to form a concatenated vector; and shifting the concatenated vector by the elemental offset to form the data vector.
21. The apparatus of claim 20, wherein the set of data storage devices further stores code, that when executed by the set of processors causes the set of processors to perform the following: wherein the shifting comprises multiplying the elemental offset by a size of each element in the register to determine a number of bits to shift the concatenated vector to form the data vector.
22. The apparatus of claim 17, wherein the set of data storage devices further stores code, that when executed by the set of processors causes the set of processors to perform the following: providing a banked register file that includes the elements in the register and the elements in the next register; and combining the first number of elements and the second number of elements from the banked register file as the data vector.
23. The apparatus of claim 17, wherein the register, the next register, and the data vector have a same total number of elements and a same size of each element.
24. The apparatus of claim 17, wherein the set of data storage devices further stores code, that when executed by the set of processors causes the set of processors to perform the following: further comprising outputting the data vector to the execution unit from the decode unit without outputting the data vector back into the decode unit.
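The concatenate-and-shift variant in claims 4, 5, 12, 13, 20, and 21 can be illustrated with plain integer arithmetic. The sketch below is a software model under stated assumptions, not the claimed circuit: the function name `concat_shift`, the representation of a register as a Python integer, and the convention that element 0 occupies the least significant bits are all hypothetical choices made for the example.

```python
def concat_shift(reg_value: int, next_value: int, offset: int,
                 element_bits: int, num_elements: int) -> int:
    """Form a data vector by concatenating two registers and shifting.

    Assumption: element 0 sits in the least significant bits, so the
    next register is placed above the named register and the shift
    is to the right.
    """
    if not 0 <= offset < num_elements:
        raise ValueError("offset must be in [0, num_elements)")

    width = element_bits * num_elements          # bits in one register
    mask = (1 << width) - 1

    # Concatenate: next register in the upper half, named register below.
    concatenated = ((next_value & mask) << width) | (reg_value & mask)

    # Multiply the elemental offset by the element size to get a bit count.
    shift_bits = offset * element_bits

    # Shift, then keep one register's worth of bits as the data vector.
    return (concatenated >> shift_bits) & mask


# Four 8-bit elements per register: v0 holds 0,1,2,3 and v1 holds 4,5,6,7.
v0 = 0x03020100
v1 = 0x07060504
print(hex(concat_shift(v0, v1, offset=1, element_bits=8, num_elements=4)))  # 0x4030201
```

Here an offset of 1 with 8-bit elements gives an 8-bit shift, and the result holds elements 1, 2, 3 of the first register followed by element 0 of the next, the same size as either input register.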

Assignees

Intel Corp

Inventors

Classifications

  • of variable length instructions · CPC title

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • Register stacks; shift registers · CPC title

  • using a mask · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016188334A1 cover?
Methods and apparatuses relating to a vector instruction with a register operand with an elemental offset are described. In one embodiment, a hardware processor includes a decode unit to decode a vector instruction with a register operand with an elemental offset to access a first number of elements in a register specified by the register operand, wherein the first number is a total number of e…
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/30036. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 30, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).