Low energy accelerator processor architecture with short parallel instruction word and non-orthogonal register data file

US9952865B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9952865-B2
Application number: US-201514678944-A
Country: US
Kind code: B2
Filing date: Apr 4, 2015
Priority date: Apr 4, 2015
Publication date: Apr 24, 2018
Grant date: Apr 24, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit is coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor is configured to execute instruction words received on the system bus and has a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, wherein each of the execution units is configured to perform operations responsive to retrieved instruction words; and a non-orthogonal data register file comprising a set of data registers coupled to the plurality of execution units, wherein the registers coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.
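The idea of a non-orthogonal register file can be sketched as a per-unit access map: instead of every execution unit reading and writing every register, each unit is wired to only some of them. The sketch below is illustrative only. The register names `R0` to `R7`, the register count, and every unit-to-register assignment are assumptions made for this example; the page does not list the actual wiring. The assignments were picked so that one unit reaches all registers while the others reach only subsets.

```python
# Hypothetical register names and wiring; not taken from the patent text.
REGISTERS = {f"R{i}" for i in range(8)}

# unit name -> (registers it may read, registers it may write)
ACCESS = {
    "butterfly_adder_alu": (REGISTERS, REGISTERS),            # reaches every register
    "load_store": ({"R0", "R1"}, {"R2", "R3", "R4", "R5"}),   # reads fewer than it writes
    "multiply": ({"R2", "R3"}, {"R6", "R7"}),
    "load_coefficient": (set(), {"R6"}),                      # write-only in this sketch
}


def can_read(unit, reg):
    """True if the unit is wired to read the given register."""
    return reg in ACCESS[unit][0]


def can_write(unit, reg):
    """True if the unit is wired to write the given register."""
    return reg in ACCESS[unit][1]


def connection_count(access):
    """Count read and write connections between units and registers."""
    return sum(len(readable) + len(writable) for readable, writable in access.values())


# A fully orthogonal file would connect every unit to every register both ways.
orthogonal_connections = 2 * len(REGISTERS) * len(ACCESS)

print(connection_count(ACCESS), "connections vs", orthogonal_connections, "if orthogonal")
```

In this toy map the restricted wiring needs fewer unit-to-register connections than a fully orthogonal file of the same size. Whether and how that translates into energy savings in the actual design is not stated on this page.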

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit, comprising: a system bus for transferring data between memory devices, processors, and peripheral devices having a data width N, where N is a positive integer; a central processor unit coupled to the system bus; a low energy accelerator processor coupled to the system bus and having a plurality of execution units; a register file comprising a plurality of registers in a non-orthogonal arrangement with the low energy accelerator processor; a first execution unit of the plurality of execution units configured to read data from and write data to each of the registers and execute a Fast Fourier Transform; and a second execution unit of the plurality of execution units configured to read data from only a first subset of the registers and write data to only a second subset of the registers, wherein the first subset comprises a fewer number of registers than the second subset.

2. The integrated circuit of claim 1, wherein the low energy accelerator processor includes a floating point flag register to indicate if at least one of the first and second execution units operates on floating point or fixed point data.

3. The integrated circuit of claim 1, wherein the first execution unit is a butterfly/adder ALU unit configured to perform Finite Impulse Response filtering.

4. The integrated circuit of claim 1, comprising a third execution unit of the plurality of execution units configured to read data from only a third subset of the registers and write data to only a fourth subset of the registers.

5. The integrated circuit of claim 4, wherein the third subset comprises the same number of registers as the fourth subset.

6. The integrated circuit of claim 1, wherein the second execution unit is a load store unit of the low energy accelerator processor, the second execution unit being configured to use at least four of the registers as destination registers.

7. The integrated circuit of claim 1, wherein the low energy accelerator processor is configured to execute the Fast Fourier Transform in response to a 32-bit instruction word.

8. The integrated circuit of claim 1, wherein the data width N is 32 bits.

9. The integrated circuit of claim 1, wherein the register file is included with the low energy accelerator processor and directly accessed by only the plurality of execution units.

10. The integrated circuit of claim 1, wherein the low energy accelerator processor comprises a loop count register, a loop start register, and a loop end register configured to control a repetitive loop operation.

11. The integrated circuit of claim 1, wherein none of the registers of the first subset are also part of the second subset.

12. The integrated circuit of claim 11, comprising a third execution unit of the plurality of execution units configured to read data from only a third subset of the registers and write data to only a fourth subset of the registers, wherein at least some of the registers of the third subset are included in the second subset, and wherein none of the registers of the fourth subset overlap with registers of the first subset, the second subset, and the third subset.

13. The integrated circuit of claim 1, comprising a fourth execution unit configured to write data to only a fifth subset of the registers and further configured to not read data from any of the registers.

14. A data processor, comprising: a system bus coupled to at least one memory and having a data width of N, where N is a positive integer; a central processor unit coupled to the system bus; a low energy accelerator processor having a plurality of execution units, coupled to the system bus and configured to execute instruction words having a length of less than or equal to N; a register file comprising a plurality of registers in a non-orthogonal arrangement with the low energy accelerator processor; a first execution unit of the plurality of execution units configured to read data from and write data to each of the registers and execute a Fast Fourier Transform; and a second execution unit of the plurality of execution units configured to read data from only a first number of the registers and write data to only a second number of the registers, wherein each of the first and second numbers is less than the plurality, and wherein the first number is less than the second number.

15. The data processor of claim 14, comprising a third execution unit of the plurality of execution units configured to read data from only a third number of the registers and write data to only a fourth number of the registers, wherein each of the third and fourth numbers is less than the plurality, and wherein the third number is equal to the fourth number.

16. The data processor of claim 14, wherein the first execution unit is a butterfly/adder ALU unit configured to execute the Fast Fourier Transform or Finite Impulse Response filtering in response to an instruction word.

17. The data processor of claim 14, wherein the plurality of execution units comprises a load store execution unit, a load coefficient execution unit, a multiply execution unit, and a butterfly/adder ALU execution unit.

18. The data processor of claim 14, wherein the register file is included with the low energy accelerator processor and directly accessed by only the plurality of execution units.

19. The data processor of claim 14, wherein the low energy accelerator processor comprises a loop count register, a loop start register, and a loop end register configured to control a repetitive loop operation.

20. The data processor of claim 14, wherein the low energy accelerator processor comprises a floating point flag register to indicate if at least one of the first and second execution units operates on floating point or fixed point data.
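Claims 10 and 19 recite a loop count register, a loop start register, and a loop end register that control a repetitive loop. The claims do not spell out how the three registers interact, so the sketch below assumes a common hardware-loop scheme: when the program counter reaches the loop end address and iterations remain, the count is decremented and execution jumps back to the loop start. The class name `LoopRegisters`, the functions `next_pc` and `trace`, and the address values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class LoopRegisters:
    count: int  # total iterations of the loop body still to run
    start: int  # address of the first instruction in the loop body
    end: int    # address of the last instruction in the loop body


def next_pc(pc, loop):
    """Return the next program counter under the assumed loop semantics."""
    if pc == loop.end and loop.count > 1:
        loop.count -= 1      # one iteration finished, more remain
        return loop.start    # branch back without a separate branch instruction
    return pc + 1            # fall through to the next sequential address


def trace(loop, first_pc, last_pc):
    """List the addresses visited when running from first_pc through last_pc."""
    pc, visited = first_pc, []
    while pc <= last_pc:
        visited.append(pc)
        pc = next_pc(pc, loop)
    return visited


# Loop body at addresses 1..2, run three times, inside a program spanning 0..3.
print(trace(LoopRegisters(count=3, start=1, end=2), first_pc=0, last_pc=3))
```

With these example values the body at addresses 1 and 2 is visited three times before execution continues at address 3. The actual register semantics in the patented design may differ.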

Assignees

Texas Instruments Inc

Inventors

Classifications

  • using switching circuits, e.g. switching matrix, connection or expansion network (G06F13/4009 takes precedence) · CPC title

  • Cross-Sectional Technologies · mapped topic

  • G06F9/3001 · Primary

    Arithmetic instructions · CPC title

  • Cross-Sectional Technologies · mapped topic

  • G06F9/3013 · Primary

    according to data content, e.g. floating-point registers, address registers · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9952865B2 cover?
An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit is coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor is configured to execute instruction…
Who is the assignee on this patent?
Texas Instruments Inc
What technology area does this patent fall under?
Primary CPC classification G06F9/3001. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 24, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).