Neural network processor

US11049016B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11049016-B2
Application number  US-202016824411-A
Country             US
Kind code           B2
Filing date         Mar 19, 2020
Priority date       May 21, 2015
Publication date    Jun 29, 2021
Grant date          Jun 29, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A circuit for performing neural network computations for a neural network comprising a plurality of neural network layers, the circuit comprising: a matrix computation unit configured to, for each of the plurality of neural network layers: receive a plurality of weight inputs and a plurality of activation inputs for the neural network layer, and generate a plurality of accumulated values based on the plurality of weight inputs and the plurality of activation inputs; and a vector computation unit communicatively coupled to the matrix computation unit and configured to, for each of the plurality of neural network layers: apply an activation function to each accumulated value generated by the matrix computation unit to generate a plurality of activated values for the neural network layer.
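
The division of labor in the abstract can be illustrated with a minimal software sketch. This is not the patented circuit: the function names `matrix_unit`, `vector_unit`, and `run_layers` are hypothetical, and ReLU is chosen here only as an example activation function, since the abstract does not name one.

```python
def matrix_unit(weights, activations):
    """Return one accumulated value per weight row: sum of weight * activation."""
    return [sum(w * a for w, a in zip(row, activations)) for row in weights]


def vector_unit(accumulated, activation_fn):
    """Apply the activation function to each accumulated value."""
    return [activation_fn(value) for value in accumulated]


def run_layers(layers, inputs, activation_fn):
    """Feed each layer's activated values to the next layer as activation inputs."""
    values = inputs
    for weights in layers:
        values = vector_unit(matrix_unit(weights, values), activation_fn)
    return values


def relu(x):
    # Assumed activation function, used for illustration only.
    return max(0.0, x)


# Two hypothetical layers: 2 inputs -> 2 values -> 1 value.
layers = [
    [[1.0, -2.0], [0.5, 0.5]],
    [[1.0, 1.0]],
]
print(run_layers(layers, [2.0, 1.0], relu))  # [1.5]
```

The sketch only mirrors the data flow: accumulated values come from the matrix step, and activated values come from the vector step, once per layer.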

First claim

Opening claim text (preview).

What is claimed is:

1. A circuit for performing neural network computations for a neural network comprising a plurality of neural network layers, the circuit comprising: a matrix computation unit configured to, for each of the plurality of neural network layers: receive a plurality of weight inputs and a plurality of activation inputs for the neural network layer, and generate a plurality of accumulated values based on the plurality of weight inputs and the plurality of activation inputs, wherein the matrix computation unit is configured as a two-dimensional array comprising a plurality of cells, wherein the plurality of weight inputs is shifted through cells along a first dimension of the two-dimensional array, and wherein the plurality of activation inputs is shifted through cells along a second dimension of the two-dimensional array; and a vector computation unit configured to, for each of the plurality of neural network layers: apply an activation function to each of the plurality of accumulated values for the neural network layer generated by the matrix computation unit to generate a plurality of activated values for the neural network layer.

2. The circuit of claim 1, wherein the first dimension is different from the second dimension.

3. The circuit of claim 1, wherein the circuit is configured to: receive instructions from a host; convert the instructions to control signals; and provide the control signals to the vector computation unit and the matrix computation unit to perform the neural network computations.

4. The circuit of claim 3 wherein the circuit is configured to: use the timing of clock signals to send the control signals to one or more components of the circuit; and regulate dataflow in the circuit based on the timing of the clock signals.

5. The circuit of claim 1, wherein the matrix computation unit is configured to: using respective circuitry in at least two distinct cells of the plurality of cells, process the plurality of weight inputs and the plurality of activation inputs to generate the plurality of accumulated values for the neural network layer.

6. The circuit of claim 1, wherein the matrix computation unit comprises: multiplication circuitry in each cell of the plurality of cells of the two-dimensional array, the multiplication circuitry being configured to multiply an activation input and a weight input to generate a product; and an accumulator unit coupled to a cell along a dimension of the two-dimensional array, the accumulator unit being configured to accumulate values that are based on products generated by multiplication circuitry in one or more cells along the dimension of the two-dimensional array.

7. The circuit of claim 6, wherein the matrix computation unit comprises: a weight register in each cell of the plurality of cells of the two-dimensional array, the weight register being configured to store a weight input of the plurality of weight inputs.

8. The circuit of claim 6, wherein the matrix computation unit is configured to perform a plurality of multiply-accumulate operations at least by: using multiplication circuitry in the matrix computation unit to perform a plurality of multiplication operations between the plurality of activation inputs and the plurality of weight inputs; and using the accumulator unit to generate the plurality of accumulated values based on the multiplication operations between the plurality of activation inputs and the plurality of weight inputs.

9. The circuit of claim 1, wherein the vector computation unit comprises: respective computation units that are each configured to perform arithmetic operations; and each of the respective computation units are configured to be controlled at the circuit using control signals received by the vector computation unit.

10. The circuit of claim 9, wherein the vector computation unit comprises: an activation unit configured to apply the activation function to an accumulated value of the plurality of accumulated values for the neural network layer to generate an activated value of the plurality of activated values for the neural network layer.

11. The circuit of claim 9, wherein the vector computation unit comprises: a normalization unit configured to normalize the plurality of activated values for the neural network layer to generate a plurality of normalized values from the plurality of activated values.

12. The circuit of claim 1, wherein the activation function is a non-linear function.

13. A method for performing neural network computations for a neural network comprising a plurality of neural network layers using a circuit comprising a matrix computation unit and a vector computation unit, wherein the method comprises, for each of the plurality of neural network layers: receiving, by the matrix computation unit, a plurality of weight inputs and a plurality of activation inputs for the neural network layer, wherein the matrix computation unit is configured as a two-dimensional array comprising a plurality of cells; shifting the plurality of weight inputs through cells along a first dimension of the two-dimensional array; shifting the plurality of activation inputs through cells along a second dimension of the two-dimensional array; generating a plurality of accumulated values based on the plurality of weight inputs that are shifted through cells along the first dimension and the plurality of activation inputs that are shifted through cells along the second dimension; and applying, by the vector computation unit, an activation function to each of the plurality of accumulated values generated by the matrix computation unit to generate a plurality of activated values for the neural network layer.

14. The method of claim 13, wherein the first dimension is different from the second dimension.

15. The method of claim 13, further comprising: receiving instructions from a host; converting the instructions to control signals; and providing the control signals to the vector computation unit and the matrix computation unit to perform the neural network computations.

16. The method of claim 15, further comprising: sending the control signals to one or more components of the circuit using the timing of clock signals; and regulating dataflow in the circuit based on the timing of the clock signals.

17. The method of claim 13, further comprising: processing, using respective circuitry in at least two distinct cells of the plurality of cells, the plurality of weight inputs and the plurality of activation inputs to generate the plurality of accumulated values for the neural network layer.

18. The method of claim 13, wherein the matrix computation unit comprises multiplication circuitry in each cell of the plurality of cells and an accumulator unit coupled to a cell along a dimension of the two-dimensional array, and the method comprises: multiplying, using the multiplication circuitry, an activation input and a weight input to generate a product; and accumulating, using the accumulator unit, values that are based on products generated by multiplication circuitry in one or more cells along the dimension of the two-dimensional array.

19. The method of claim 18, wherein the matrix computation unit comprises a weight register in each cell of the plurality of cells and the method comprises: storing a weight input of the plurality of weight inputs in at least one weight register in a cell of the matrix computation unit.

20. The method of claim 18, wherein
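
The two-dimensional array recited in claims 1 and 13 can be pictured with a small cycle-by-cycle simulation. This is a rough sketch under stated assumptions, not the claimed hardware: the claims only say that weights shift along one dimension and activations along another, so the weight-stationary layout, the staggered activation timing, the downward flow of partial sums, and all names (`load_weights`, `systolic_matvec`) are illustrative choices.

```python
def load_weights(weights):
    """Shift weight rows in from the top, one row per cycle, down the columns."""
    rows, cols = len(weights), len(weights[0])
    regs = [[0.0] * cols for _ in range(rows)]
    for incoming in reversed(weights):
        for r in range(rows - 1, 0, -1):
            regs[r] = regs[r - 1][:]      # each cell passes its weight downward
        regs[0] = list(incoming)
    return regs                           # regs[r][c] now holds weights[r][c]


def systolic_matvec(weights, x):
    """Compute y[c] = sum over r of x[r] * weights[r][c], one cycle at a time."""
    regs = load_weights(weights)
    rows, cols = len(regs), len(regs[0])
    act = [[0.0] * cols for _ in range(rows)]    # activation held in each cell
    psum = [[0.0] * cols for _ in range(rows)]   # partial sum leaving each cell
    out = [0.0] * cols
    for t in range(rows + cols - 1):
        new_act = [[0.0] * cols for _ in range(rows)]
        new_psum = [[0.0] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                if c == 0:
                    # Activation x[r] enters row r from the left at cycle r.
                    a_in = x[r] if t == r else 0.0
                else:
                    a_in = act[r][c - 1]         # shifted in from the left neighbor
                p_in = psum[r - 1][c] if r > 0 else 0.0
                new_act[r][c] = a_in
                new_psum[r][c] = p_in + a_in * regs[r][c]   # multiply-accumulate
        act, psum = new_act, new_psum
        for c in range(cols):
            if t == rows - 1 + c:                # column c finishes at this cycle
                out[c] = psum[rows - 1][c]
    return out


weights = [[1, 2], [3, 4], [5, 6]]               # 3 rows of cells, 2 columns
accumulated = systolic_matvec(weights, [1, 2, 3])
activated = [max(0.0, v) for v in accumulated]   # assumed ReLU in the vector step
print(accumulated, activated)                    # [22.0, 28.0] [22.0, 28.0]
```

In this sketch each cell performs one multiplication per cycle and the bottom of each column plays the role of the accumulator, which loosely corresponds to the multiplication circuitry, weight register, and accumulator unit of claims 6 and 7.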

Assignees

Google LLC

Inventors

Classifications

  • G06N3/08 · Primary

    Learning methods · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • Inference or reasoning models · CPC title

  • Systolic arrays · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11049016B2 cover?
A circuit for performing neural network computations for a neural network comprising a plurality of neural network layers, the circuit comprising: a matrix computation unit configured to, for each of the plurality of neural network layers: receive a plurality of weight inputs and a plurality of activation inputs for the neural network layer, and generate a plurality of accumulated values based …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 29, 2021 (B2). Legal status and post-grant events are not shown on this page.