Processors Supporting Endian Agnostic SIMD Instructions and Methods

US2017123792A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2017123792-A1
Application number: US-201514930740-A
Country: US
Kind code: A1
Filing date: Nov 3, 2015
Priority date: Nov 3, 2015
Publication date: May 4, 2017
Grant date: (not listed)

Abstract

Official abstract text for this publication.

A processor includes a register and a load store unit (LSU). The LSU loads data into the register from a memory. When in little endian mode, bytes from sequentially increasing memory addresses are loaded in order of corresponding sequentially increasing byte memory addresses from a first end (right end) of the register to a second end (left end) of the register. When in big endian mode, bytes from sequentially increasing memory addresses are loaded in order of corresponding sequentially increasing memory addresses from the second end (left end) of the register to the first end (right end) of the register. Therefore, regardless of operating in little or big endian mode, the data in the register has its most significant byte on its left side and its least significant byte on its right side, which simplifies the execution of SIMD instructions because the data is aligned the same for both endian modes.
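The load behavior described in the abstract can be sketched in a few lines. This is only an illustrative model, not the patented hardware: the register is represented as a Python integer whose least significant byte is the "right end", and the function name `load_register` and its parameters are hypothetical.

```python
import struct


def load_register(memory: bytes, addr: int, width: int, big_endian: bool) -> int:
    """Model a load of `width` bytes starting at `addr` into a register.

    The register is modeled as an integer: its least significant byte is
    the right end, its most significant byte is the left end.
    """
    chunk = memory[addr:addr + width]
    if len(chunk) != width:
        raise ValueError("load runs past the end of memory")
    # Little endian mode: the lowest address lands at the right end and
    # increasing addresses fill toward the left end.
    # Big endian mode: the lowest address lands at the left end and
    # increasing addresses fill toward the right end.
    return int.from_bytes(chunk, "big" if big_endian else "little")


# The same two 32-bit words, laid out in memory by a little endian
# and by a big endian system respectively.
le_memory = struct.pack("<2I", 0x11223344, 0xAABBCCDD)
be_memory = struct.pack(">2I", 0x11223344, 0xAABBCCDD)

le_register = load_register(le_memory, 0, 8, big_endian=False)
be_register = load_register(be_memory, 0, 8, big_endian=True)
```

In this model each 32-bit word reads most-significant-byte-first in the register in both modes, so no per-element byte swap is needed before arithmetic. What differs is the position of the elements: the first word in memory sits at the right end after a little endian load and at the left end after a big endian load.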

First claim

Opening claim text (preview).

What is claimed is:

1. A processor system comprising: a register; and a load store unit (LSU) configured to load data into the register from a memory, wherein when in little endian mode bytes from sequentially increasing memory addresses are loaded in order of corresponding sequentially increasing byte memory addresses from a first end of the register to a second end of the register, and wherein when in big endian mode bytes from sequentially increasing memory addresses are loaded in order of corresponding sequentially increasing memory addresses from the second end of the register to the first end of the register.

2. The processor system of claim 1 wherein bytes are stored within the register as data elements, wherein the data elements are sized according to one of the group of: byte, half-word, word, double-word, and quad-word.

3. The processor system of claim 2 wherein the register is a first source register and further comprising: an arithmetic logic unit (ALU) configured to execute an instruction using data from the first source register and a second source register, wherein when performing one or more operations from the group of: addition, subtraction, and multiplication, bits within individual data elements that are input to the ALU from the first source register and the second source register are not reordered regardless of whether the system is operating in big endian mode or little endian mode.

4. The processor system of claim 3 wherein the instruction is a single instruction multiple data (SIMD) instruction.

5. The processor system of claim 3 further comprising: input reordering logic configured to rearrange an order of bits from the first source register before data from the first source register is input to the ALU based, at least in part, on whether the processor system is operating in big endian mode or little endian mode.

6. The processor system of claim 5 wherein the SIMD instruction represents four instructions operating on four data element word pairs, wherein word data element pair i[3] and i[7] is processed together, word data element pair i[2] and i[6] is processed together, word data element pair i[1] and i[5] is processed together, and word data element pair i[0] and i[4] is processed together.

7. The processor system of claim 6 wherein the data element pairs are processed in parallel in multiple ALUs.

8. The processor system of claim 2 further comprising: a single load instruction configured to cause the LSU to load data with data elements from memory to the register without regard to a size of the data elements.

9. The processor system of claim 3 further comprising: output reordering logic configured to align bytes of an output value calculated by the ALU.

10. The processor system of claim 1 wherein the LSU is further configured to return bytes stored in the register to original memory byte addresses from which the bytes stored in the register were loaded.

11. The processor system of claim 1 further comprising: search logic configured to search in little endian mode for a byte value starting at the first end of the register and searching byte by byte for the byte value until reaching the second end of the register, and wherein the search logic is configured to search in big endian mode for a byte value starting at the second end of the register and searching byte by byte for the byte value until reaching the first end of the register.

12. A processor system, comprising: a load store unit (LSU) configured to execute load instructions and store instructions to access data in a memory comprising multiple distinct data elements, wherein the load instructions and store instructions do not differentiate as to a size of the multiple distinct data elements; and a register file configured to receive data in response to load instructions and to provide data for storing to memory in response to store instructions, wherein contents of a register differ in dependence on whether the register was loaded in either a big endian or a little endian mode.

13. The processor system of claim 12, further comprising: an execution unit configured to perform Single Instruction Multiple Data (SIMD) operations on one or more source registers and configured to store a result of the operation in one or more destination registers, wherein the execution unit receives an indication of endian mode to identify a location within the one or more source registers where a particular data element is located.

14. The processor system of claim 12, wherein there is one load instruction and one store instruction for data to be used in an SIMD operation, regardless of intended element size of the operation to be performed.

15. The processor system of claim 14, wherein the intended element size is one of the group of: byte, half-word, word, double-word, and quad-word.

16. The processor system of claim 12, wherein the load store unit populates a destination register for a load instruction with a first appearing data element at a most significant byte portion of the destination register when operating in big endian mode.

17. The processor system of claim 12, wherein the load store unit populates a destination register for a load instruction with a first appearing data element at a least significant byte portion of the destination register for little endian mode.

18. The processor system of claim 12, wherein the processor system executes a search instruction logically starting at one end of a source register for both big endian mode and little endian mode.

19. A processor system, comprising: a load store unit (LSU); a single load instruction to cause the LSU to load data from memory to a register, wherein in little endian mode the byte of the first memory address is loaded into the least significant byte (LSB) of the register with bytes of consecutively increasing addresses loaded next to each other in the register with the byte at the largest memory address loaded in the most significant byte (MSB) of the register, wherein in big endian mode the byte of the first memory address is loaded into the MSB of the register with bytes of consecutively increasing addresses loaded next to each other in the register with the byte at the largest memory address loaded in the LSB of the register; and a single store instruction to cause the LSU in little endian mode to store data from the register to memory at a starting memory address with the LSB of the register loaded to the lowest starting memory address with consecutive bytes loaded to consecutively increasing memory addresses until the MSB of the register is loaded to the largest last memory address that is addressed by the single store instruction, and wherein the single store instruction is configured to cause the LSU in big endian mode to store data from the register to memory at a starting memory address with the MSB of the register loaded to the lowest starting memory address with consecutive bytes loaded to consecutively increasing memory addresses until the LSB of the register is loaded to the largest last memory address that is addressed by the single store instruction.

20. The processor system of claim 20 further comprising: an execution pipeline; and an execution unit configured to perform Single Instruction Multiple Data (SIMD) operations on one or more source registers loaded by the single load instruction and configured to store a result of the operation in one or more destination registers.
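The store, element-location, and search behavior recited in the claims can be illustrated with a small software model. This is a sketch under stated assumptions rather than the claimed hardware: the register is modeled as a Python integer whose least significant byte is the "first end" (right end), and every function name here (`load_register`, `store_register`, `element`, `search_byte`) is hypothetical.

```python
def load_register(memory: bytes, addr: int, width: int, big_endian: bool) -> int:
    """Load `width` bytes; the lowest address goes to the right end in
    little endian mode and to the left end in big endian mode."""
    return int.from_bytes(memory[addr:addr + width], "big" if big_endian else "little")


def store_register(value: int, width: int, big_endian: bool) -> bytes:
    """Inverse of the load: each byte returns to the address it came from."""
    return value.to_bytes(width, "big" if big_endian else "little")


def element(value: int, width: int, elem_size: int, index: int, big_endian: bool) -> int:
    """Return the data element that appeared `index`-th in memory.

    The endian mode only selects where the element sits in the register;
    the bits inside the element are never reordered.
    """
    count = width // elem_size
    # Element position counted from the right end of the register.
    position = (count - 1 - index) if big_endian else index
    mask = (1 << (8 * elem_size)) - 1
    return (value >> (8 * elem_size * position)) & mask


def search_byte(value: int, width: int, target: int, big_endian: bool) -> int:
    """Scan byte by byte in memory order and return the offset of the
    first match, or -1 if the byte value is absent.

    Little endian mode starts at the right end and moves left; big endian
    mode starts at the left end and moves right.
    """
    for offset in range(width):
        position = (width - 1 - offset) if big_endian else offset
        if (value >> (8 * position)) & 0xFF == target:
            return offset
    return -1


# A 16-byte memory image holding the byte values 0x10 through 0x1F.
memory = bytes(range(0x10, 0x20))
little = load_register(memory, 0, 16, big_endian=False)
big = load_register(memory, 0, 16, big_endian=True)
```

In this model a load followed by a store in the same mode reproduces the original memory image, and `search_byte` reports the same memory offset in either mode because it always begins at the end of the register that holds the lowest address.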

Assignees

  • Imagination Tech Ltd

Inventors

Classifications

  • Arithmetic instructions · CPC title

  • Operand accessing · CPC title

  • Register arrangements · CPC title

  • G06F9/3012 (Primary)

    Organisation of register space, e.g. banked or distributed register file · CPC title

  • LOAD or STORE instructions; Clear instruction · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017123792A1 cover?
A processor includes a register and a load store unit (LSU). The LSU loads data into the register from a memory. When in little endian mode, bytes from sequentially increasing memory addresses are loaded in order of corresponding sequentially increasing byte memory addresses from a first end (right end) of the register to a second end (left end) of the register. When in big endian mode, bytes f…
Who is the assignee on this patent?
Imagination Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/3012. Mapped technology areas include Physics.
When was this patent published?
Publication date May 4, 2017 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).