Implementing 128-bit SIMD operations on a 64-bit datapath

US2016140079A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2016140079-A1
Application number  US-201514940585-A
Country             US
Kind code           A1
Filing date         Nov 13, 2015
Priority date       Nov 14, 2014
Publication date    May 19, 2016
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of implementing a processor architecture and corresponding system includes operands of a first size and a datapath of a second size. The second size is different from the first size. Given a first array of registers and a second array of registers, each register of the first and second arrays being of the second size, selecting a first register and corresponding second register from the first array and the second array, respectively, to perform operations of the first size. This allows a user, who is interfacing with the hardware processor through software, to provide data of the datapath bit-width instead of the register bit-width. Advantageously, the user is agnostic to the size of the registers.
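To make the register arrangement in the abstract concrete, here is a minimal sketch of a 128-bit register file stored as two arrays of 64-bit registers. The class name `SplitRegisterFile`, the count of 32 registers, and the choice that the first array holds the low-order bits are all assumptions made for illustration; the abstract only says that corresponding registers in the two arrays are selected together.

```python
MASK64 = (1 << 64) - 1


class SplitRegisterFile:
    """128-bit architectural registers stored as two arrays of 64-bit registers."""

    def __init__(self, num_regs=32):
        # Assumption: low[i] holds bits 63..0 and high[i] holds bits 127..64
        # of architectural register i. The register count is hypothetical.
        self.low = [0] * num_regs
        self.high = [0] * num_regs

    def write128(self, idx, value):
        # Split the 128-bit value across the corresponding registers.
        self.low[idx] = value & MASK64
        self.high[idx] = (value >> 64) & MASK64

    def read128(self, idx):
        # Select the corresponding register in each array and recombine.
        return (self.high[idx] << 64) | self.low[idx]


rf = SplitRegisterFile()
rf.write128(3, 0x0123456789ABCDEF_FEDCBA9876543210)
```

Software in this model only ever sees `write128` and `read128`; how the bits are split across the two arrays stays hidden, which is one way to read the abstract's statement that the user is agnostic to the register size.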

First claim

Opening claim text (preview).

What is claimed is:

1. A method of implementing a processor architecture including operands of a first size and a datapath of a second size, the second size being different from the first size, the method comprising: given a first array of registers and a second array of registers, each register of the first and second array being of the second size, selecting a first register and corresponding second register from the first array and the second array, respectively, to perform operations of the first size.

2. The method of claim 1, wherein each register of the first array corresponds with a particular register of the second array.

3. The method of claim 2, wherein the first register of the first array, corresponding with the second register of the second array, stores a plurality of bits of a different significance than a plurality of bits stored in the second register.

4. The method of claim 1, wherein selecting the first register and the second register includes selecting the first register in response to receiving a first non-null select value and selecting the second register in response to receiving a second non-null select value.

5. The method of claim 1, wherein selecting the first register and the second register includes selecting the first register in response to receiving a first non-null select value and not selecting the second register in the absence of receiving a second non-null select value.

6. The method of claim 1, further comprising: performing a first operation on data of the first register and performing a second operation on data in the corresponding second register at a functional unit, the first operation and second operation related to a same instruction.

7. The method of claim 1, further comprising: after beginning an operation on data of the first register: issuing a stall cycle; and reading data of the corresponding second register during the stall cycle.

8. The method of claim 1, further comprising: returning a result of the first size, the result stored partially in a first destination register and partially in a corresponding second destination register in the first array and second array, respectively.

9. The method of claim 1, further comprising: returning a result of the second size stored in a destination register in either the first array or second array.

10. The method of claim 6, wherein performing operations of the first size includes reading bits of a first operand in the second register in a same clock cycle as bits of a second operand in a first register when the operation is any one of a pairwise instruction and an across-vector instruction, and further includes reading bits of the first operand in a third register in a next clock cycle as bits of the second operand in a fourth register when the instruction is a pairwise instruction.

11. The method of claim 6, wherein performing operations of the first size includes reading bits of a plurality of registers of the second array in a first clock cycle prior to reading bits of a plurality of registers of the first array in a second clock cycle.

12. A system for implementing a processor architecture including operands of a first size and a datapath of a second size, the second size being different from the first size, the system comprising: a first array of registers, each register of the first array being of the second size; a second array of registers, each register of the second array being of the second size; a selection module configured to select a first register and corresponding second register from the first array and the second array, respectively, to perform operations of the first size.

13. The system of claim 12, wherein the first array of registers and second array of registers correspond such that each register of the first array corresponds with a particular register of the second array.

14. The system of claim 13, wherein the first register of the first array, corresponding with the second register of the second array, stores a plurality of bits of a different significance than a plurality of bits stored in the second register.

15. The system of claim 12, wherein the selection module is further configured to select the first register in response to receiving a first non-null select value and select the second register in response to receiving a second non-null select value.

16. The system of claim 12, wherein the selection module is further configured to select the first register in response to receiving a first non-null select value and to not select the second register in the absence of receiving a second non-null select value.

17. The system of claim 12, further comprising: a functional unit configured to perform a first operation on data of the first register and perform a second operation on data in the corresponding second register at a functional unit, the first operation and second operation related to a same instruction.

18. The system of claim 12, further comprising: an issue unit configured to, after beginning an operation on data of the first register: issue a stall cycle and read data of the corresponding second register during the stall cycle.

19. The system of claim 12, further comprising: an output module configured to return a result of the first size stored partially in a first destination register and partially in a corresponding second destination register in the first array and second array, respectively.

20. The system of claim 12, further comprising an output module configured to return a result of the second size stored in a destination register in either the first array or second array.

21. The system of claim 17, wherein the functional unit is further configured to read bits of the second register in a same clock cycle as bits of a first register based on the operation being based on at least one of a the group of a pairwise instruction and an across-vector instruction, and to read bits of the first operand in a third register in a next clock cycle as bits of the second operand in a fourth register when the instruction is a pairwise instruction.

22. The system of claim 17, wherein the functional unit is further configured to read bits of a plurality of registers of the second array in a first clock cycle prior to reading bits of a plurality of registers of the first array in a second clock cycle.

23. A non-transitory computer-readable medium having computer-readable program codes embedded thereon including instructions for causing a processor to execute a method for implementing a processor architecture including operands of a first size and a datapath of a second size, the second size being different from the first size, that, when executed by one or more processors, cause the one or more processors to: given a first array of registers and a second array of registers, each register of the first and second arrays being of the second size, select a first register and corresponding second register from the first array and second array, respectively, to perform operations of the first size.

24. The non-transitory computer-readable medium of claim 23, wherein the first array of registers and the second array of registers correspond such that each register of the first array corresponds with a particular register of the second array.

25. The non-transitory computer-readable medium of
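Claims 5 through 9 describe how one instruction can be carried out in one or two trips through the narrower datapath. The sketch below is one possible software model of that behavior, not the patented hardware design. The function names `add_lanes_64` and `execute_add`, the 32-bit lane width, the `select_high` flag standing in for the second select value, and the use of a pass count to represent the stall cycle are all assumptions made for illustration.

```python
MASK32 = (1 << 32) - 1


def add_lanes_64(a, b):
    """Model of the 64-bit datapath: add two 32-bit lanes, wrapping per lane."""
    lo = ((a & MASK32) + (b & MASK32)) & MASK32
    hi = (((a >> 32) & MASK32) + ((b >> 32) & MASK32)) & MASK32
    return (hi << 32) | lo


def execute_add(low, high, dst, src1, src2, select_high=True):
    """Run one vector add as one or two passes through the 64-bit datapath.

    low and high are lists modeling the two register arrays.
    select_high=False models the case where no select value arrives for the
    second array, so only the first array is read and written.
    Returns the number of passes (a stand-in for cycles, stall included).
    """
    # First pass: operate on the registers selected from the first array.
    low[dst] = add_lanes_64(low[src1], low[src2])
    passes = 1
    if select_high:
        # Second pass: read the corresponding registers of the second array
        # and store the other half of the result there.
        high[dst] = add_lanes_64(high[src1], high[src2])
        passes += 1
    return passes


low = [0] * 4
high = [0] * 4
low[1], high[1] = 0xFFFFFFFF_00000001, 0x00000002_00000003
low[2], high[2] = 0x00000001_00000001, 0x00000010_00000020

full_passes = execute_add(low, high, 0, 1, 2)                      # 128-bit add
half_passes = execute_add(low, high, 3, 1, 2, select_high=False)   # 64-bit add
```

In this model the 128-bit result of the first call is split across `low[0]` and `high[0]`, while the second call touches only `low[3]`, loosely mirroring the distinction between a result of the first size and a result of the second size.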

Assignees

Cavium Inc

Inventors

Classifications

  • with variable precision

  • comprising data of variable length

  • Organisation of register space, e.g. banked or distributed register file

  • single instruction multiple data [SIMD] multiprocessors

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016140079A1 cover?
A method of implementing a processor architecture and corresponding system includes operands of a first size and a datapath of a second size. The second size is different from the first size. Given a first array of registers and a second array of registers, each register of the first and second arrays being of the second size, selecting a first register and corresponding second register from th…
Who is the assignee on this patent?
Cavium Inc
What technology area does this patent fall under?
Primary CPC classification G06F15/8007. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 19, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).