Two dimensional shift array for image processor
US-2016316107-A1 · Oct 27, 2016 · US
US9830150B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9830150-B2 |
| Application number | US-201514960334-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 4, 2015 |
| Priority date | Dec 4, 2015 |
| Publication date | Nov 28, 2017 |
| Grant date | Nov 28, 2017 |
Official abstract text for this publication.
An apparatus is described that includes an execution unit having a multiply add computation unit, a first ALU logic unit and a second ALU logic unit. The ALU unit is to perform first, second, third and fourth instructions. The first instruction is a multiply add instruction. The second instruction is to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction. The third instruction is to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction. The fourth instruction is to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.
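The four instruction types in the abstract can be illustrated with a small behavioral model. This is a minimal sketch, not the patented circuit: the function names, the 16-bit width, the three-operation ALU set, and the choice of restoring division with one quotient bit per ALU per iteration are all assumptions made for illustration.

```python
MASK16 = 0xFFFF  # assumed datapath width; the abstract does not state one


def alu(op, a, b):
    """One ALU logic unit with a hypothetical, minimal operation set."""
    if op == "add":
        return (a + b) & MASK16
    if op == "sub":
        return (a - b) & MASK16
    if op == "and":
        return a & b
    raise ValueError(f"unsupported op: {op}")


def multiply_add(a, b, c):
    """First instruction: multiply add, truncated to the assumed width."""
    return (a * b + c) & MASK16


def parallel(op1, a1, b1, op2, a2, b2):
    """Second instruction: both ALUs work at once on independent operands
    and produce two different output resultants."""
    return alu(op1, a1, b1), alu(op2, a2, b2)


def sequential(op1, a, b, op2, c):
    """Third instruction: the second ALU operates on the first ALU's output."""
    return alu(op2, alu(op1, a, b), c)


def iterative_divide(dividend, divisor, bits=16):
    """Fourth instruction, modeled as restoring division (an assumption).

    Each loop iteration yields two quotient digits: the first from the
    first ALU's step, the second from the second ALU's step.
    `bits` must be even.
    """
    if divisor == 0:
        raise ZeroDivisionError("divisor is zero")
    remainder, quotient = 0, 0
    for i in range(bits - 1, -1, -2):
        for j in (i, i - 1):  # first ALU step, then second ALU step
            remainder = (remainder << 1) | ((dividend >> j) & 1)
            digit = 1 if remainder >= divisor else 0
            if digit:
                remainder -= divisor
            quotient = (quotient << 1) | digit
    return quotient, remainder
```

With these definitions, `iterative_divide(1000, 7)` returns the quotient and remainder as a pair, and a 16-bit divide finishes in eight iterations instead of sixteen because each iteration resolves two digits.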
Opening claim text (preview).
The invention claimed is:
1. An apparatus, comprising: an execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the execution unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.
2. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.
3. The apparatus of claim 2 wherein the second and fifth instructions operate according to different data widths.
4. The apparatus of claim 3 wherein one of the widths is 8 bits and another of the widths is 16 bits.
5. The apparatus of claim 1 wherein the first and second ALU logic units are coupled with a carry term signal line.
6. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a FUSED instruction.
7. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a double wide instruction in which the first and second ALU logic units respectively generate different halves of a resultant.
8. The apparatus of claim 1 wherein the second, third and fourth instructions do not consume more time than the first instruction and the cycle time of the execution unit is commensurate with the multiply add instruction.
9. An apparatus, comprising: an image processor comprising an array of execution lanes, each execution lane comprising an execution unit, said execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the ALU unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second resultant digit values.
10. The apparatus of claim 9 wherein the image processor further comprises a two-dimensional shift register array structure, where, array locations of the two dimensional shift register array locally couple to respective execution lanes of the array of execution lanes.
11. The apparatus of claim 9 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.
12. The apparatus of claim 11 wherein the second and fifth instructions operate according to different data widths.
13. The apparatus of claim 12 wherein one of the widths is 8 bits and another of the widths is 16 bits.
14. The apparatus of claim 11 wherein the first and second ALU logic units are coupled with a carry term signal line.
15. The apparatus of claim 10 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a FUSED instruction.
16. The apparatus of claim 10 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a double wide instruction in which the first and second ALU logic units respectively generate different halves of a resultant.
17. The apparatus of claim 9 wherein the second, third and fourth instructions consume less time than the first instruction and the cycle time of the execution unit is commensurate with the time consumed by the first instruction.
18. The apparatus of claim 9 wherein the image processor is within a computing system.
19. An apparatus, comprising: a circuit design synthesis tool compatible description of an execution unit, said execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the ALU unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.
20. The apparatus of claim 19 wherein the description further describes an image processor comprising an array of execution lanes one of the execution lanes comprising the execution unit.
21. The apparatus of claim 20 wherein the image processor further comprises a two-dimensional shift register array circuit, where, array locations of the two dimensional shift register array circuit locally couple to respective ALU units of the array of execution lanes.
22. The apparatus of claim 9 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.
23. A method, comprising: performing the following with an execution unit of an image processor: executing a first instruction, said first instruction being a multiply add instruction; executing a second instruction including performing parallel ALU operations with first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; executing a third instruction including performing sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; executing a fourth instruction including performing an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine a first and second division resultant digit values.
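The double wide instruction and the carry term signal line recited in the dependent claims can be pictured as two narrower adders chained through a carry. The sketch below assumes 16-bit halves forming a 32-bit addition; the claims do not state the widths of a double wide resultant, and the names `add16_with_carry` and `double_wide_add` are hypothetical.

```python
HALF_MASK = 0xFFFF  # assumed width of each ALU logic unit
HALF_BITS = 16


def add16_with_carry(a, b, carry_in=0):
    """One ALU logic unit: 16-bit add returning (sum, carry_out)."""
    total = (a & HALF_MASK) + (b & HALF_MASK) + carry_in
    return total & HALF_MASK, total >> HALF_BITS


def double_wide_add(x, y):
    """Double wide add: the first ALU produces the low half, the second
    ALU produces the high half, and the carry term links the two."""
    low, carry = add16_with_carry(x, y)
    high, _ = add16_with_carry(x >> HALF_BITS, y >> HALF_BITS, carry)
    return (high << HALF_BITS) | low
```

In this model, adding `0x0001FFFF` and `0x00000001` overflows the low half, so the carry raises the high half by one. Hardware could speculate on the carry or pipeline the two halves; the sketch only shows the data dependency between them.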
Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations {(G06F7/49, G06F7/491 take precedence)} · CPC title
with variable precision · CPC title
comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title
Arithmetic instructions · CPC title
controlled by a single instruction for multiple data lanes [SIMD] · CPC title