Neural network processor

US9747546B2 · US · B2

Patent metadata
Publication number: US-9747546-B2
Application number: US-201514844524-A
Country: US
Kind code: B2
Filing date: Sep 3, 2015
Priority date: May 21, 2015
Publication date: Aug 29, 2017
Grant date: Aug 29, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A circuit for performing neural network computations for a neural network comprising a plurality of neural network layers, the circuit comprising: a matrix computation unit configured to, for each of the plurality of neural network layers: receive a plurality of weight inputs and a plurality of activation inputs for the neural network layer, and generate a plurality of accumulated values based on the plurality of weight inputs and the plurality of activation inputs; and a vector computation unit communicatively coupled to the matrix computation unit and configured to, for each of the plurality of neural network layers: apply an activation function to each accumulated value generated by the matrix computation unit to generate a plurality of activated values for the neural network layer.
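The division of labor in the abstract can be sketched in a few lines of Python. This is only an illustrative software model, not the patented circuit: the function names (`matrix_unit`, `vector_unit`, `run_layers`), the row-per-output weight layout, and the choice of ReLU as the activation function are all assumptions made for the example.

```python
from typing import Callable, List

Matrix = List[List[float]]


def matrix_unit(weights: Matrix, activations: List[float]) -> List[float]:
    """Model of the matrix computation unit (hypothetical layout):
    one accumulated value per weight row, i.e. a dot product."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def vector_unit(accumulated: List[float],
                fn: Callable[[float], float]) -> List[float]:
    """Model of the vector computation unit: apply the activation
    function to each accumulated value."""
    return [fn(v) for v in accumulated]


def relu(x: float) -> float:
    # ReLU is an assumed activation; the abstract does not name one.
    return max(0.0, x)


def run_layers(layers: List[Matrix], inputs: List[float],
               fn: Callable[[float], float] = relu) -> List[float]:
    """For each layer: accumulate in the matrix unit, then activate in
    the vector unit. Activated values feed the next layer."""
    values = inputs
    for weights in layers:
        values = vector_unit(matrix_unit(weights, values), fn)
    return values


if __name__ == "__main__":
    layers = [
        [[1.0, -1.0], [0.5, 0.5]],  # layer 1: 2 inputs -> 2 outputs
        [[2.0, 1.0]],               # layer 2: 2 inputs -> 1 output
    ]
    print(run_layers(layers, [2.0, 3.0]))
```

The point of the split is that the matrix unit only ever produces accumulated sums, while every nonlinearity is handled by the separate vector stage.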

First claim

Opening claim text (preview).

What is claimed is:

1. A system for performing neural network computations for a neural network having a plurality of neural network layers, the system comprising: a hardware circuit comprising at least a first circuit portion that includes a matrix computation unit comprising M×N cells, wherein M and N are positive integers that are greater than one, and wherein each cell of the M×N cells of the first circuit portion includes respective circuitry configured to: obtain, from an adjacent cell along a first dimension of the matrix computation unit, a respective weight input the respective weight input being a weight input for a neural network layer of the plurality of neural network layers; obtain, from an adjacent cell along a second dimension of the matrix computation unit, a respective activation input for the neural network layer; determine a respective multiplication product based on the respective weight input and the respective activation input; determine a respective accumulated value based at least on the respective multiplication product; provide, to another adjacent cell along the first dimension of the matrix computation unit, the respective accumulated value for determining an output for the neural network layer; and provide, to an adjacent cell along the second dimension of the matrix computation unit, the respective activation input for the neural network layer.

2. The system of claim 1, wherein, for each of M cells along the first dimension, the adjacent cell along the second dimension obtains the respective activation input from a respective value loader.

3. The system of claim 2, wherein, for each of N cells along the second dimension, the adjacent cell along the first dimension obtains the respective weight input from a weight fetcher interface.

4. The system of claim 1, wherein the respective weight input is obtained from the adjacent cell along the first dimension and the respective activation input is obtained from the adjacent cell along the second dimension periodically over a predetermined number of clock cycles.

5. The system of claim 1, wherein for each cell of (M−1)×N cells of the M×N cells, the respective circuitry is further configured to obtain a respective second accumulated value shifted from another cell of the M×N cells, wherein determining the respective accumulated value comprises determining the respective accumulated value based at least on the respective multiplication product and the respective second accumulated value.

6. The system of claim 1, wherein determining the respective multiplication product comprises determining the respective multiplication product based on a control signal.

7. The system of claim 1, wherein, for each of N cells along the second dimension, the another adjacent cell long the first dimension provides one or more values to a respective accumulator unit.

8. The system of claim 1, wherein providing the respective accumulated value comprises providing the respective accumulated value based on a control signal.

9. The system of claim 1, wherein providing the respective activation input comprises providing the respective activation input based on a control signal.

10. The system of claim 1, further comprising: a first memory configured to provide activation inputs for the plurality of neural network layers; and a second memory configured to provide weight inputs for the plurality of neural network layers.

11. The system of claim 10, wherein the hardware circuit further comprises a second circuit portion that includes: vector computation circuitry configured to: determine an activation vector based on one or more accumulated values received from the matrix computation unit; and provide the activation vector to the first memory.

12. The system of claim 11, further comprising: sequencer circuitry configured to provide one or more control signals to the first memory, the second memory, the vector computation circuitry, or the matrix computation unit to control a dataflow of the system.

13. A method for performing neural network computations for a neural network having a plurality of neural network layers, the method comprising: for each cell of M×N cells of a matrix computation unit that is disposed within at least a first circuit portion of a hardware circuit comprising the neural network: obtaining, from an adjacent cell along a first dimension of the matrix computation unit, a respective weight input for a neural network layer of the plurality of neural network layers; obtaining, from an adjacent cell along a second dimension of the matrix computation unit, a respective activation input for the neural network layer; determining a respective multiplication product based on the respective weight input and the respective activation input; determining a respective accumulated value based at least on the respective multiplication product; providing, to another adjacent cell along the first dimension of the matrix computation unit, the respective accumulated value for determining an output for the neural network layer, wherein M and N are positive integers that are greater than one; and providing, to an adjacent cell along the second dimension of the matrix computation unit, the respective activation input for the neural network layer.

14. The method of claim 13, wherein, for each of M cells along the first dimension, the adjacent cell along the second dimension obtains the respective activation input from a respective value loader.

15. The method of claim 14, wherein, for each of the N cells along the second dimension, the adjacent cell along the first dimension obtains the respective weight input from a weight fetcher interface.

16. The method of claim 1, further comprising: for each cell of (M−1)×N cells of the M×N cells, obtaining a respective second accumulated value shifted from another cell of the M×N cells, wherein determining the respective accumulated value comprises determining the respective accumulated value based at least on the respective multiplication product and the respective second accumulated value.

17. The method of claim 13, wherein, for each of N cells along the second dimension, the another adjacent cell long the first dimension provides one or more values to a respective accumulator unit.

18. A matrix computation unit configured to be disposed within at least a first circuit portion of a hardware circuit, the matrix computation unit for performing neural network computations for a neural network having a plurality of neural network layers, the matrix computation unit comprising M×N cells, wherein M and N are positive integers that are greater than one, and wherein each cell of the M×N cells of the first circuit portion includes respective circuitry configured to: obtain, from an adjacent cell along a first dimension of the matrix computation unit, a respective weight input for a neural network layer of the plurality of neural network layers; obtain, from an adjacent cell along a second dimension of the matrix computation unit, a respective activation input for the neural network layer; determine a respective multiplication product based on the respective weight input and the respective activation input; determine a respective accumulated value based at least on the respective multiplication product; provide, to another adjacent cell along the second dimension of the matrix computation unit, the respective accumulated value for determining an output for the neural network layer; and prov
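The per-cell behavior recited in claims 1 and 5 can be illustrated with a small cycle-by-cycle simulation. This is a minimal sketch under stated assumptions, not the patent's implementation: it treats the first dimension as rows (weights and accumulated values move down a column) and the second dimension as columns (activations move left to right), preloads the weights by shifting them down before any activations arrive, and staggers the activation for row `i` by `i` cycles. The names `load_weights` and `systolic_matvec` are hypothetical.

```python
from typing import List

Matrix = List[List[float]]


def load_weights(weights: Matrix) -> Matrix:
    """Shift weight rows in from the top edge, one row per cycle, so each
    cell receives its weight from the adjacent cell above (assumed scheme)."""
    m, n = len(weights), len(weights[0])
    held = [[0.0] * n for _ in range(m)]
    for row in reversed(weights):          # last row is fed first
        held = [list(row)] + held[:-1]     # every row moves down one cell
    return held


def systolic_matvec(weights: Matrix, x: List[float]) -> List[float]:
    """Simulate an M x N array; returns out[j] = sum_i weights[i][j] * x[i]."""
    m, n = len(weights), len(weights[0])
    w = load_weights(weights)
    act = [[0.0] * n for _ in range(m)]    # activation register per cell
    psum = [[0.0] * n for _ in range(m)]   # accumulated-value register per cell
    out = [0.0] * n
    for t in range(m + n - 1):
        new_act = [[0.0] * n for _ in range(m)]
        new_psum = [[0.0] * n for _ in range(m)]
        for i in range(m):
            for j in range(n):
                # Activation comes from the left neighbor, or from the
                # row's value loader at the edge (skewed by i cycles).
                if j == 0:
                    a_in = x[i] if t == i else 0.0
                else:
                    a_in = act[i][j - 1]
                # Accumulated value is shifted in from the cell above;
                # the top row has no upstream cell (the (M-1) x N case).
                p_in = psum[i - 1][j] if i > 0 else 0.0
                new_act[i][j] = a_in                      # pass activation on
                new_psum[i][j] = p_in + w[i][j] * a_in    # multiply-accumulate
        act, psum = new_act, new_psum
        j_done = t - (m - 1)               # column whose sum just completed
        if 0 <= j_done < n:
            out[j_done] = psum[m - 1][j_done]
    return out


if __name__ == "__main__":
    print(systolic_matvec([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], [10.0, 1.0]))
```

In this model each cell only ever talks to its immediate neighbors, which is the property the claims emphasize; the full result for column `j` emerges from the bottom cell after `M + j` cycles rather than requiring any global wiring.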

Assignees

Google Inc

Inventors

Classifications

  • Inference or reasoning models · CPC title

  • Systolic arrays · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9747546B2 cover?
A circuit for performing neural network computations for a neural network comprising a plurality of neural network layers, the circuit comprising: a matrix computation unit configured to, for each of the plurality of neural network layers: receive a plurality of weight inputs and a plurality of activation inputs for the neural network layer, and generate a plurality of accumulated values based …
Who is the assignee on this patent?
Google Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 29, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).