Load store circuit with dedicated single or dual bit shift circuit and opcodes for low power accelerator processor
US-2017060586-A1 · Mar 2, 2017 · US
US9952865B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9952865-B2 |
| Application number | US-201514678944-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 4, 2015 |
| Priority date | Apr 4, 2015 |
| Publication date | Apr 24, 2018 |
| Grant date | Apr 24, 2018 |
Official abstract text for this publication.
An apparatus for a low energy accelerator processor architecture is disclosed. An example arrangement is an integrated circuit that includes a system bus having a data width N, where N is a positive integer; a central processor unit is coupled to the system bus and configured to execute instructions retrieved from a memory; a low energy accelerator processor is configured to execute instruction words received on the system bus and has a plurality of execution units including a load store unit, a load coefficient unit, a multiply unit, and a butterfly/adder ALU unit, wherein each of the execution units is configured to perform operations responsive to retrieved instruction words; and a non-orthogonal data register file comprising a set of data registers coupled to the plurality of execution units, wherein the registers coupled to selected ones of the plurality of execution units. Additional methods and apparatus are disclosed.
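The abstract describes a non-orthogonal data register file whose registers are coupled only to selected execution units. The following is a minimal sketch of that idea in Python, not the patented circuit. The register count, the class and function names, and the specific register assignments are all hypothetical. The unit names are taken from the abstract, and the relative subset sizes loosely follow the claims (one unit with full access, one unit that reads fewer registers than it writes).

```python
from dataclasses import dataclass

# Hypothetical register count; the source does not state one here.
NUM_REGS = 8


@dataclass(frozen=True)
class UnitPorts:
    """Registers one execution unit may read from and write to."""
    readable: frozenset
    writable: frozenset


# Illustrative port map only: every register assignment below is an assumption.
PORTS = {
    # Full access to every register.
    "butterfly_adder_alu": UnitPorts(frozenset(range(NUM_REGS)),
                                     frozenset(range(NUM_REGS))),
    # Reads from fewer registers than it writes to.
    "load_store": UnitPorts(frozenset({0, 1}), frozenset({2, 3, 4, 5})),
    # Reads and writes equally sized, disjoint subsets.
    "multiply": UnitPorts(frozenset({2, 3}), frozenset({6, 7})),
    # Write-only: no read ports at all.
    "load_coefficient": UnitPorts(frozenset(), frozenset({7})),
}


class RegisterFile:
    """Register file that rejects accesses a unit has no port for."""

    def __init__(self):
        self.regs = [0] * NUM_REGS

    def read(self, unit, index):
        if index not in PORTS[unit].readable:
            raise PermissionError(f"{unit} has no read port on R{index}")
        return self.regs[index]

    def write(self, unit, index, value):
        if index not in PORTS[unit].writable:
            raise PermissionError(f"{unit} has no write port on R{index}")
        self.regs[index] = value
```

In this sketch, `RegisterFile().write("load_store", 2, 42)` succeeds, while `read("load_store", 2)` raises `PermissionError` because the load store unit has no read port on that register. In hardware, the missing ports would simply not be wired, which is presumably where the area and energy savings come from.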
Opening claim text (preview).
What is claimed is:
1. An integrated circuit, comprising: a system bus for transferring data between memory devices, processors, and peripheral devices having a data width N, where N is a positive integer; a central processor unit coupled to the system bus; a low energy accelerator processor coupled to the system bus and having a plurality of execution units; a register file comprising a plurality of registers in a non-orthogonal arrangement with the low energy accelerator processor; a first execution unit of the plurality of execution units configured to read data from and write data to each of the registers and execute a Fast Fourier Transform; and a second execution unit of the plurality of execution units configured to read data from only a first subset of the registers and write data to only a second subset of the registers, wherein the first subset comprises a fewer number of registers than the second subset.
2. The integrated circuit of claim 1, wherein the low energy accelerator processor includes a floating point flag register to indicate if at least one of the first and second execution units operates on floating point or fixed point data.
3. The integrated circuit of claim 1, wherein the first execution unit is a butterfly/adder ALU unit configured to perform Finite Impulse Response filtering.
4. The integrated circuit of claim 1, comprising a third execution unit of the plurality of execution units configured to read data from only a third subset of the registers and write data to only a fourth subset of the registers.
5. The integrated circuit of claim 4, wherein the third subset comprises the same number of registers as the fourth subset.
6. The integrated circuit of claim 1, wherein the second execution unit is a load store unit of the low energy accelerator processor, the second execution unit being configured to use at least four of the registers as destination registers.
7. The integrated circuit of claim 1, wherein the low energy accelerator processor is configured to execute the Fast Fourier Transform in response to a 32-bit instruction word.
8. The integrated circuit of claim 1, wherein the data width N is 32 bits.
9. The integrated circuit of claim 1, wherein the register file is included with the low energy accelerator processor and directly accessed by only the plurality of execution units.
10. The integrated circuit of claim 1, wherein the low energy accelerator processor comprises a loop count register, a loop start register, and a loop end register configured to control a repetitive loop operation.
11. The integrated circuit of claim 1, wherein none of the registers of the first subset are also part of the second subset.
12. The integrated circuit of claim 11, comprising a third execution unit of the plurality of execution units configured to read data from only a third subset of the registers and write data to only a fourth subset of the registers, wherein at least some of the registers of the third subset are included in the second subset, and wherein none of the registers of the fourth subset overlap with registers of the first subset, the second subset, and the third subset.
13. The integrated circuit of claim 1, comprising a fourth execution unit configured to write data to only a fifth subset of the registers and further configured to not read data from any of the registers.
14. A data processor, comprising: a system bus coupled to at least one memory and having a data width of N, where N is a positive integer; a central processor unit coupled to the system bus; a low energy accelerator processor having a plurality of execution units, coupled to the system bus and configured to execute instruction words having a length of less than or equal to N; a register file comprising a plurality of registers in a non-orthogonal arrangement with the low energy accelerator processor; a first execution unit of the plurality of execution units configured to read data from and write data to each of the registers and execute a Fast Fourier Transform; and a second execution unit of the plurality of execution units configured to read data from only a first number of the registers and write data to only a second number of the registers, wherein each of the first and second numbers is less than the plurality, and wherein the first number is less than the second number.
15. The data processor of claim 14, comprising a third execution unit of the plurality of execution units configured to read data from only a third number of the registers and write data to only a fourth number of the registers, wherein each of the third and fourth numbers is less than the plurality, and wherein the third number is equal to the fourth number.
16. The data processor of claim 14, wherein the first execution unit is a butterfly/adder ALU unit configured to execute the Fast Fourier Transform or Finite Impulse Response filtering in response to an instruction word.
17. The data processor of claim 14, wherein the plurality of execution units comprises a load store execution unit, a load coefficient execution unit, a multiply execution unit, and a butterfly/adder ALU execution unit.
18. The data processor of claim 14, wherein the register file is included with the low energy accelerator processor and directly accessed by only the plurality of execution units.
19. The data processor of claim 14, wherein the low energy accelerator processor comprises a loop count register, a loop start register, and a loop end register configured to control a repetitive loop operation.
20. The data processor of claim 14, wherein the low energy accelerator processor comprises a floating point flag register to indicate if at least one of the first and second execution units operates on floating point or fixed point data.
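Claims 10 and 19 recite a loop count register, a loop start register, and a loop end register that control a repetitive loop operation. The claims do not spell out the exact semantics, so the sketch below is one plausible reading, assumed here for illustration: the loop body spans the addresses from loop start to loop end inclusive and runs loop count times in total, with no explicit branch instruction. The function name and the program representation are hypothetical.

```python
def run_with_hardware_loop(program, loop_start, loop_end, loop_count):
    """Return the sequence of program addresses executed.

    Assumed semantics (not stated in the source): the loop body covers
    addresses loop_start..loop_end inclusive and runs loop_count times.
    A count of 0 or 1 runs the body once in this sketch.
    """
    trace = []
    pc = 0                  # program counter
    remaining = loop_count  # models the loop count register
    while pc < len(program):
        trace.append(pc)
        if pc == loop_end and remaining > 1:
            # End of the body with iterations left: decrement the count
            # and jump back to the start address without a branch opcode.
            remaining -= 1
            pc = loop_start
        else:
            pc += 1
    return trace


# Five instructions; addresses 1..2 form the loop body, repeated 3 times.
print(run_with_hardware_loop(["i0", "i1", "i2", "i3", "i4"], 1, 2, 3))
# prints [0, 1, 2, 1, 2, 1, 2, 3, 4]
```

Keeping the loop state in dedicated registers lets the sequencer repeat a block, such as a filter tap or a butterfly stage, without spending instruction words on compare and branch operations.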
using switching circuits, e.g. switching matrix, connection or expansion network (G06F13/4009 takes precedence) · CPC title
Cross-Sectional Technologies · mapped topic
Arithmetic instructions · CPC title
Cross-Sectional Technologies · mapped topic
according to data content, e.g. floating-point registers, address registers · CPC title