What technology area does this patent fall under?

Primary CPC classification G06N3/063. Mapped technology areas include Physics.

When was this patent published?

Publication date Jan 8, 2019 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural network compute tile

US10175980B2 · US · B2

Patent metadata
Field	Value
Publication number	US-10175980-B2
Application number	US-201615335769-A
Country	US
Kind code	B2
Filing date	Oct 27, 2016
Priority date	Oct 27, 2016
Publication date	Jan 8, 2019
Grant date	Jan 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing unit is disclosed, comprising a first memory bank for storing input activations and a second memory bank for storing parameters used in performing computations. The computing unit includes at least one cell comprising at least one multiply accumulate (“MAC”) operator that receives parameters from the second memory bank and performs computations. The computing unit further includes a first traversal unit that provides a control signal to the first memory bank to cause an input activation to be provided to a data bus accessible by the MAC operator. The computing unit performs one or more computations associated with at least one element of a data array, the one or more computations being performed by the MAC operator and comprising, in part, a multiply operation of the input activation received from the data bus and a parameter received from the second memory bank.
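The data flow in the abstract can be pictured with a small software model. This is only an illustrative sketch, not the patented hardware: the class names `TraversalUnit` and `ComputeUnit`, the sequential addressing scheme, and the sample values are all assumptions made for this example. It shows a traversal unit choosing which activation the first memory bank puts on the data bus, and a MAC step multiplying that activation by a parameter from the second memory bank.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraversalUnit:
    """Hypothetical address generator that walks a memory bank slot by slot."""
    length: int
    position: int = 0

    def next_address(self) -> int:
        # The "control signal" is modeled as the next address to read.
        address = self.position
        self.position = (self.position + 1) % self.length
        return address


@dataclass
class ComputeUnit:
    """Toy model: two memory banks, one traversal unit, one MAC operator."""
    activations: List[float]  # first memory bank: input activations
    parameters: List[float]   # second memory bank: parameters (weights)
    accumulator: float = 0.0

    def run(self) -> float:
        traversal = TraversalUnit(length=len(self.activations))
        for parameter in self.parameters:
            address = traversal.next_address()
            data_bus = self.activations[address]      # activation placed on the bus
            self.accumulator += data_bus * parameter  # multiply, then accumulate
        return self.accumulator


unit = ComputeUnit(activations=[1.0, 2.0, 3.0], parameters=[0.5, -1.0, 2.0])
result = unit.run()
print(result)  # 4.5
```

In this model the accumulated value is simply the dot product of the two banks; the real unit described in the patent operates on elements of a data array in hardware.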

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: multiple sets of hardware computing units for accelerating inference computations for a plurality of layers of a neural network, wherein each set of hardware computing units comprises: a first computing unit configured to: receive instructions for performing inference computations for a first layer of the plurality of layers of the neural network, layer inputs for the first layer, and a respective set of weights for the first layer; and perform at least a subset of the inference computations for the first layer based on execution of a first loop nest to access the layer inputs for the first layer and the respective set of weights for the first layer; and a second computing unit configured to: receive instructions for performing inference computations for a second layer of the plurality of layers of the neural network, layer inputs for the second layer, and the respective set of weights for the second layer; and perform at least a subset of the inference computations for the second layer based on execution of a second loop nest to access the layer inputs for the second layer and the respective set of weights for the second layer.

2. The system of claim 1, wherein execution of the first loop nest comprises: executing, by a processor of the first computing unit, a plurality of nested loops included in the first loop nest; and accessing, memory of the first computing unit, to retrieve data corresponding to elements of a tensor, wherein the data includes at least one of: the layer inputs for the first layer or weights for the first layer.

3. The system of claim 2, wherein the first computing unit comprises: at least one traversal unit that uses the first loop nest to access the elements of the tensor; wherein a structure of the first loop nest indicates a manner in which the at least one traversal unit traverses dimensions of the tensor.

4. The system of claim 1, wherein execution of the second loop nest comprises: executing, by a processor of the second computing unit, a plurality of nested loops included in the second loop nest; accessing, memory of the second computing unit, to retrieve data corresponding to elements of a tensor, wherein the data includes at least one of: the layer inputs for the second layer or weights for the second layer.

5. The system of claim 4, wherein the second computing unit comprises: at least one traversal unit that uses the second loop nest to access particular elements of the tensor; wherein a structure of the second loop nest indicates a manner in which the at least one traversal unit traverses dimensions of the tensor.

6. The system of claim 1, wherein: the first layer is a neural network layer of a first layer type; and the second layer is a neural network layer of a second layer type that is different than the first layer type.

7. The system of claim 1, further comprising: a data communications instruction bus configured to: receive one or more instructions from an external source; provide, to the first computing unit, the instructions for performing the subset of inference computations for the first layer; and provide, to the second computing unit, the instructions for performing the subset of inference computations for the second layer.

8. The system of claim 7, further comprising: a data communications ring bus configured to: receive multiple inputs and multiple weights from an external source; provide, to the first computing unit, the layer inputs for the first layer, and the respective set of weights for the first layer; and provide, to the second computing unit, the layer inputs for the second layer, and the respective set of weights for the second layer.

9. A method of accelerating inference computations for a plurality of layers of a neural network using a system comprising multiple sets of hardware computing units, the method comprising: receiving, by a first computing unit, instructions for performing inference computations for a first layer of the plurality of layers of the neural network, layer inputs for the first layer, and a respective set of weights for the first layer; performing, by the first computing unit, at least a subset of the inference computations for the first layer based on execution of a first loop nest to access the layer inputs for the first layer and the respective set of weights for the first layer; receiving, by a second computing unit, instructions for performing inference computations for a second layer of the plurality of layers of the neural network, layer inputs for the second layer, and a respective set of weights for the second layer; and performing, by the second computing unit, at least a subset of the inference computations for the second layer based on execution of a second loop nest to access the layer inputs for the second layer and the respective set of weights for the second layer.

10. The method of claim 9, wherein execution of the first loop nest comprises: executing, by a processor of the first computing unit, a plurality of nested loops included in the first loop nest; and accessing, memory of the first computing unit, to retrieve data corresponding to elements of a tensor, wherein the data includes at least one of: the layer inputs for the first layer or weights for the first layer.

11. The method of claim 10, wherein the first computing unit comprises: at least one traversal unit that uses the first loop nest to access the elements of the tensor; wherein a structure of the first loop nest indicates a manner in which the at least one traversal unit traverses one or more dimensions of the tensor.

12. The method of claim 9, wherein execution of the second loop nest comprises: executing, by a processor of the second computing unit, a plurality of nested loops included in the second loop nest; accessing, memory of the second computing unit, to retrieve data corresponding to elements of a tensor, wherein the data includes at least one of: the layer inputs for the second layer or weights for the second layer.

13. The method of claim 12, wherein the second computing unit comprises: at least one traversal unit that uses the second loop nest to access particular elements of the tensor; wherein a structure of the second loop nest indicates a manner in which the at least one traversal unit traverses dimensions of the tensor.

14. The method of claim 9, wherein: the first layer is a neural network layer of a first layer type; and the second layer is a neural network layer of a second layer type that is different than the first layer type.

15. The method of claim 9, further comprising: receiving, at an instruction bus, one or more instructions from an external source, wherein the instruction bus is configured to provide data communications to the multiple hardware computing units; providing, to the first computing unit and by the instruction bus, the instructions for performing the subset of inference computations for the first layer; and providing, to the second computing unit and by the instruction bus, the instructions for performing the subset of inference computations for the second layer.

16. The method of claim 15, further comprising: receiving, at a ring bus, multiple inputs and multiple weights from an external source, wherein the ring bus is configured to provide data communications to the multiple hardware computing units; providing, to the first computing unit and by the ring bus, the layer inputs for the first layer, and the respective set of weights for the first layer; and
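The claims repeatedly refer to a "loop nest" whose structure determines how a traversal unit walks the dimensions of a tensor. A rough software analogy, under stated assumptions, is sketched here. The helper names (`loop_nest`, `flat_address`, `fully_connected_unit`, `conv1d_unit`), the row-major layout, the choice of a fully connected layer and a 1-D convolution as the two layer types, and feeding the first unit's output into the second are all hypothetical choices for illustration; the claims do not specify them.

```python
from itertools import product
from typing import Iterator, List, Sequence, Tuple


def loop_nest(bounds: Sequence[int]) -> Iterator[Tuple[int, ...]]:
    """Yield index tuples in the order a nest of for-loops would visit them."""
    return product(*(range(b) for b in bounds))


def flat_address(index: Sequence[int], strides: Sequence[int]) -> int:
    """Map a multi-dimensional index to an offset in flat memory."""
    return sum(i * s for i, s in zip(index, strides))


def fully_connected_unit(inputs: List[float], weights: List[float],
                         n_out: int) -> List[float]:
    """First unit: loop nest over (output, input) of a row-major weight matrix."""
    n_in = len(inputs)
    outputs = [0.0] * n_out
    for o, i in loop_nest((n_out, n_in)):
        w = weights[flat_address((o, i), (n_in, 1))]
        outputs[o] += w * inputs[i]
    return outputs


def conv1d_unit(inputs: List[float], weights: List[float]) -> List[float]:
    """Second unit: loop nest over (output position, kernel tap), no kernel flip."""
    k = len(weights)
    n_out = len(inputs) - k + 1
    outputs = [0.0] * n_out
    for o, t in loop_nest((n_out, k)):
        outputs[o] += weights[t] * inputs[o + t]
    return outputs


# Layer 1: 3x2 weight matrix stored flat in row-major order.
layer1_out = fully_connected_unit([1.0, 2.0],
                                  [1.0, 0.0, 0.0, 1.0, 1.0, 1.0], n_out=3)
# Layer 2: a different layer type, with its own loop nest and weights.
layer2_out = conv1d_unit(layer1_out, [1.0, -1.0])
print(layer1_out, layer2_out)  # [1.0, 2.0, 3.0] [-1.0, -1.0]
```

Changing the bounds or strides passed to the loop nest changes the traversal order and layout without touching the multiply-accumulate body, which is the intuition behind letting the loop nest's structure describe the tensor traversal.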

Assignees

Google LLC

Inventors

Classifications

G06N3/045 · Combinations of networks
G06N3/063 (Primary) · using electronic means
G06F9/30065 · Loop control instructions; iterative instructions, e.g. LOOP, REPEAT
G06F13/28 (Primary) · using burst mode transfer, e.g. direct memory access {DMA}, cycle steal (G06F13/32 takes precedence)
G06F9/3001 (Primary) · Arithmetic instructions

Patent family

Related publications grouped by family.

View patent family 59296600

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10175980B2 cover? A computing unit is disclosed, comprising a first memory bank for storing input activations and a second memory bank for storing parameters used in performing computations. The computing unit includes at least one cell comprising at least one multiply accumulate (“MAC”) operator that receives parameters from the second memory bank and performs computations. The computing unit further includes a…
Who is the assignee on this patent? Google LLC