Multi-functional execution lane for image processor

US9830150B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9830150-B2
Application number  US-201514960334-A
Country             US
Kind code           B2
Filing date         Dec 4, 2015
Priority date       Dec 4, 2015
Publication date    Nov 28, 2017
Grant date          Nov 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus is described that includes an execution unit having a multiply add computation unit, a first ALU logic unit and a second ALU logic unit. The ALU unit is to perform first, second, third and fourth instructions. The first instruction is a multiply add instruction. The second instruction is to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction. The third instruction is to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction. The fourth instruction is to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.
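The four instruction types in the abstract can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the patented circuit: the 16-bit ALU width, the function names, and the choice of restoring division producing one quotient bit per ALU per iteration are all hypothetical, since the abstract does not specify them.

```python
import operator

MASK16 = 0xFFFF  # assumed ALU width of 16 bits (not stated in the abstract)

def alu(op, a, b):
    """One ALU logic unit: apply a binary operation and wrap to 16 bits."""
    return op(a, b) & MASK16

def madd(a, b, c):
    """First instruction: multiply add, (a * b) + c, wrapped to 32 bits."""
    return (a * b + c) & 0xFFFFFFFF

def parallel(op1, a, b, op2, c, d):
    """Second instruction: both ALUs work side by side, giving two results."""
    return alu(op1, a, b), alu(op2, c, d)

def sequential(op1, a, b, op2, c):
    """Third instruction: the second ALU consumes the first ALU's output."""
    return alu(op2, alu(op1, a, b), c)

def divide_step(remainder, dividend_bits, divisor):
    """One divide iteration: each ALU does a trial subtract for one digit."""
    digits = []
    for bit in dividend_bits:  # first entry models ALU 1, second models ALU 2
        remainder = (remainder << 1) | bit
        if remainder >= divisor:
            remainder -= divisor
            digits.append(1)
        else:
            digits.append(0)
    return remainder, digits

def divide(dividend, divisor, width=16):
    """Fourth instruction: iterate, producing two quotient bits per pass."""
    quotient, remainder = 0, 0
    for shift in range(width - 2, -1, -2):
        bits = [(dividend >> (shift + 1)) & 1, (dividend >> shift) & 1]
        remainder, (q1, q0) = divide_step(remainder, bits, divisor)
        quotient = (quotient << 2) | (q1 << 1) | q0
    return quotient, remainder

print(madd(3, 4, 5))                                      # 17
print(parallel(operator.add, 1, 2, operator.sub, 5, 3))   # (3, 2)
print(sequential(operator.add, 1, 2, operator.mul, 4))    # 12
print(divide(100, 7))                                     # (14, 2)
```

In this model the sequential form chains the two ALUs, while the divide form reuses the same chaining to resolve two quotient digits in a single pass, so a 16-bit divide takes 8 iterations rather than 16.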

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus, comprising: an execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the execution unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.

2. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.

3. The apparatus of claim 2 wherein the second and fifth instructions operate according to different data widths.

4. The apparatus of claim 3 wherein one of the widths is 8 bits and another of the widths is 16 bits.

5. The apparatus of claim 1 wherein the first and second ALU logic units are coupled with a carry term signal line.

6. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a FUSED instruction.

7. The apparatus of claim 1 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a double wide instruction in which the first and second ALU logic units respectively generate different halves of a resultant.

8. The apparatus of claim 1 wherein the second, third and fourth instructions do not consume more time than the first instruction and the cycle time of the execution unit is commensurate with the multiply add instruction.

9. An apparatus, comprising: an image processor comprising an array of execution lanes, each execution lane comprising an execution unit, said execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the ALU unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second resultant digit values.

10. The apparatus of claim 9 wherein the image processor further comprises a two-dimensional shift register array structure, where, array locations of the two dimensional shift register array locally couple to respective execution lanes of the array of execution lanes.

11. The apparatus of claim 9 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.

12. The apparatus of claim 11 wherein the second and fifth instructions operate according to different data widths.

13. The apparatus of claim 12 wherein one of the widths is 8 bits and another of the widths is 16 bits.

14. The apparatus of claim 11 wherein the first and second ALU logic units are coupled with a carry term signal line.

15. The apparatus of claim 10 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a FUSED instruction.

16. The apparatus of claim 10 wherein the execution unit is further to perform a fifth instruction, said fifth instruction being a double wide instruction in which the first and second ALU logic units respectively generate different halves of a resultant.

17. The apparatus of claim 9 wherein the second, third and fourth instructions consume less time than the first instruction and the cycle time of the execution unit is commensurate with the time consumed by the first instruction.

18. The apparatus of claim 9 wherein the image processor is within a computing system.

19. An apparatus, comprising: a circuit design synthesis tool compatible description of an execution unit, said execution unit comprising a multiply add computation unit, a first ALU logic unit and a second ALU logic unit, the ALU unit to perform: a first instruction, said first instruction being a multiply add instruction; a second instruction, said second instruction to perform parallel ALU operations with the first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; a third instruction, said third instruction to perform sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; a fourth instruction, said fourth instruction to perform an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine first and second division resultant digit values.

20. The apparatus of claim 19 wherein the description further describes an image processor comprising an array of execution lanes one of the execution lanes comprising the execution unit.

21. The apparatus of claim 20 wherein the image processor further comprises a two-dimensional shift register array circuit, where, array locations of the two dimensional shift register array circuit locally couple to respective ALU units of the array of execution lanes.

22. The apparatus of claim 9 wherein the execution unit is further to perform a fifth instruction, said fifth instruction to also perform parallel ALU operations with the first and second ALU logic unit operating simultaneously.

23. A method, comprising: performing the following with an execution unit of an image processor: executing a first instruction, said first instruction being a multiply add instruction; executing a second instruction including performing parallel ALU operations with first and second ALU logic units operating simultaneously to produce different respective output resultants of the second instruction; executing a third instruction including performing sequential ALU operations with one of the ALU logic units operating from an output of the other of the ALU logic units to determine an output resultant of the third instruction; executing a fourth instruction including performing an iterative divide operation in which the first ALU logic unit and the second ALU logic unit operate to determine a first and second division resultant digit values.
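The claims mention a carry term signal line between the two ALU logic units and a double wide instruction in which each unit generates one half of the resultant. One plausible reading, sketched here purely as an illustration, is that the first unit adds the low halves and passes its carry out to the second unit, which adds the high halves. The 16-bit half width and the function names are assumptions and do not come from the claim text.

```python
def add_with_carry(a, b, carry_in, width):
    """One ALU logic unit: add two width-bit values, return (sum, carry_out)."""
    total = a + b + carry_in
    mask = (1 << width) - 1
    return total & mask, total >> width

def double_wide_add(a, b, half_width=16):
    """Double wide add: two ALUs each produce one half of the result."""
    mask = (1 << half_width) - 1
    # First ALU: low halves, no incoming carry.
    low, carry = add_with_carry(a & mask, b & mask, 0, half_width)
    # Second ALU: high halves, fed by the carry term from the first ALU.
    high, _ = add_with_carry((a >> half_width) & mask,
                             (b >> half_width) & mask,
                             carry, half_width)
    return (high << half_width) | low

print(hex(double_wide_add(0x0001FFFF, 0x00000001)))  # 0x20000
```

Under the same assumption, calling `add_with_carry` with `width=8` versus `width=16` gives a rough picture of two parallel instructions that differ only in data width.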

Assignees

Google Inc, Google LLC

Inventors

Classifications

  • G06F7/57 · Primary

    Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations {(G06F7/49, G06F7/491 take precedence)} · CPC title

  • with variable precision · CPC title

  • comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title

  • G06F9/3001 · Primary

    Arithmetic instructions · CPC title

  • controlled by a single instruction for multiple data lanes [SIMD] · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9830150B2 cover?
An apparatus is described that includes an execution unit having a multiply add computation unit, a first ALU logic unit and a second ALU logic unit. The ALU unit is to perform first, second, third and fourth instructions. The first instruction is a multiply add instruction. The second instruction is to perform parallel ALU operations with the first and second ALU logic units operating simultan…
Who is the assignee on this patent?
Google Inc, Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F7/57. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 28, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).