Neural network compute tile
US-9710265-B1 · Jul 18, 2017 · US
US9818059B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9818059-B1 |
| Application number | US-201715465774-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 22, 2017 |
| Priority date | Oct 27, 2016 |
| Publication date | Nov 14, 2017 |
| Grant date | Nov 14, 2017 |
Official abstract text for this publication.
A computer-implemented method includes receiving, by a computing device, input activations and determining, by a controller of the computing device, whether each of the input activations has either a zero value or a non-zero value. The method further includes storing, in a memory bank of the computing device, at least one of the input activations. Storing the at least one input activation includes generating an index comprising one or more memory address locations that have input activation values that are non-zero values. The method still further includes providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array. The activations are provided, at least in part, from a memory address location associated with the index.
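The flow in the abstract can be sketched in software as a minimal illustration, not the patented hardware design. It detects non-zero activations, builds an index of their memory addresses, and places only indexed activations on the bus. The function names (`build_nonzero_index`, `provide_activations`, `sparse_dot`) and the use of a Python list as the "memory bank" are assumptions made for illustration; none of them appear in the patent text.

```python
from typing import Iterator, List, Tuple


def build_nonzero_index(memory_bank: List[float]) -> List[int]:
    """Return the addresses in the memory bank that hold non-zero activations."""
    return [addr for addr, value in enumerate(memory_bank) if value != 0]


def provide_activations(
    memory_bank: List[float], index: List[int]
) -> Iterator[Tuple[int, float]]:
    """Yield (address, activation) pairs for indexed addresses only.

    This models placing one activation at a time onto a shared data bus.
    """
    for addr in index:
        yield addr, memory_bank[addr]


def sparse_dot(memory_bank: List[float], weights: List[float]) -> Tuple[float, int]:
    """Dot product that consumes only indexed (non-zero) activations.

    Returns the accumulated sum and the number of multiplies performed.
    """
    index = build_nonzero_index(memory_bank)
    acc = 0.0
    macs = 0
    for addr, act in provide_activations(memory_bank, index):
        acc += act * weights[addr]
        macs += 1
    return acc, macs


# Hypothetical example data: four of the six activations are zero.
activations = [0.0, 2.0, 0.0, 0.0, 1.5, 0.0]
weights = [0.5, 1.0, -1.0, 2.0, 4.0, 3.0]
result, macs = sparse_dot(activations, weights)
print(result, macs)  # 8.0 2
```

In this toy case the index is `[1, 4]`, so two multiplies are performed instead of six while the result is unchanged.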
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method for performing computations for a neural network having a plurality of neural network layers, the method comprising: receiving, by a computing device, a plurality of input activations for processing by a neural network layer of the plurality of neural network layers; determining, by a controller of the computing device, whether each of the plurality of input activations, for processing by the neural network layer, has a zero value or a non-zero value; storing, in a memory bank of the computing device, at least one of the plurality of input activations; generating, by the controller, an index identifying only memory address locations in the memory bank that store non-zero input activation values for processing by the neural network layer; and providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array of the computing device, wherein the at least one input activation is provided, at least in part, from a memory address location of the index.
2. The method of claim 1, wherein the index is generated based on a bitmap comprising a plurality of bits and, wherein each bit of the bitmap indicates at least one of a non-zero input activation value or a zero input activation value.
3. The method of claim 1, further including, providing a first input activation that has a non-zero value to perform, by at least one unit of the computational array, a neural network inference computation using the non-zero input activation value, and subsequently providing a second input activation that has a zero value, and preventing, in at least one unit of the computational array, a neural network inference computation that would otherwise be performed using the zero input activation value.
4. The method of claim 3, wherein preventing the neural network inference computation that would otherwise be performed using the zero input activation value occurs in response to the controller determining that the second input activation is provided from a memory address location that is not identified in the index.
5. The method of claim 3, further including, detecting, by the controller, that the second input activation is provided from a memory address location that is not identified in the index, and, in response to the detecting, providing a control signal to at least one unit of the computational array to prevent a multiply operation associated with the zero input activation value.
6. The method of claim 1, wherein the method further comprises: mapping, by the controller and to a first unit of the computational array, a first portion of a tensor computation that uses a first input activation; and mapping, by the controller and to a second unit of the computational array that differs from the first unit, a second portion of the tensor computation that also uses the first input activation.
7. The method of claim 1, further comprising, sequentially providing a single input activation onto the data bus, the single input activation being accessed and selected from memory address locations in the memory bank that are identified using the index.
8. The method of claim 1, wherein providing the at least one input activation onto the data bus comprises, not providing input activations that have a zero value.
9. One or more non-transitory machine-readable storage devices storing instructions for performing computations for a neural network having a plurality of neural network layers, where the instructions are executable by one or more processing devices to cause performance of operations comprising: receiving, by a computing device, a plurality of input activations for processing by a neural network layer of the plurality of neural network layers; determining, by a controller of the computing device, whether each of the plurality of input activations, for processing by the neural network layer, has a zero value or a non-zero value; storing, in a memory bank of the computing device, at least one of the plurality of input activations; generating, by the controller, an index identifying only memory address locations in the memory bank that store non-zero input activation values for processing by the neural network layer; and providing, by the controller and from the memory bank, at least one input activation onto a data bus that is accessible by one or more units of a computational array of the computing device, wherein the at least one input activation is provided, at least in part, from a memory address location of the index.
10. The machine-readable storage devices of claim 9, wherein the index is generated based on a bitmap comprising a plurality of bits and, wherein each bit of the bitmap indicates at least one of a non-zero input activation value or a zero input activation value.
11. The machine-readable storage devices of claim 9, where the operations further comprise: providing a first input activation that has a non-zero value to perform, by at least one unit of the computational array, a neural network inference computation using the non-zero input activation value, and subsequently providing a second input activation that has a zero value, and preventing, in at least one unit of the computational array, a neural network inference computation that would otherwise be performed using the zero input activation value.
12. The machine-readable storage devices of claim 11, wherein preventing the neural network inference computation that would otherwise be performed using the zero input activation value occurs in response to the controller determining that the second input activation is provided from a memory address location that is not identified in the index.
13. The machine-readable storage devices of claim 11, further including, detecting, by the controller, that the second input activation is provided from a memory address location that is not associated with the index, and, in response to detecting, providing a control signal to at least one unit of the computational array to prevent a multiply operation associated with the zero input activation value.
14. The machine-readable storage devices of claim 9, wherein the operations further comprise: mapping, by the controller and to a first unit of the computational array, a first portion of a tensor computation that uses a first input activation; and mapping, by the controller and to a second unit of the computational array that differs from the first unit, a second portion of the tensor computation that also uses the first input activation.
15. An electronic system comprising: a controller located in a computing device, the controller including one or more processing devices; and one or more non-transitory machine-readable storage devices for storing instructions that are executable by the one or more processing devices to cause performance of operations comprising: receiving, by the computing device, a plurality of input activations for processing by a neural network layer of the plurality of neural network layers; determining, by the controller, whether each of the plurality of input activations, for processing by the neural network layer, has a zero value or a non-zero value; storing, in a memory bank of the computing device, at least one of the plurality of input activations; generating, by the controller, an index identifying only memory address locations in the memory bank that store non-zero input activation values for processing by the neural network layer; and providing, by the controller and from the memory bank,
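As a rough software analogy for the bitmap-based index (claims 2 and 10) and the control signal that suppresses a multiply for a zero activation (claims 5 and 13), the following sketch may help. Everything in it is an illustrative assumption rather than the claimed hardware: the `MacUnit` class, the bit ordering (bit 0 corresponds to address 0), and the choice to model the control signal as a boolean `skip` argument. It also broadcasts each activation to two units, loosely mirroring claim 6.

```python
from typing import List


def bitmap_from_activations(activations: List[float]) -> int:
    """Set bit `addr` when the activation at that address is non-zero.

    Assumption: bit 0 maps to address 0; the claims do not fix a bit order.
    """
    bitmap = 0
    for addr, value in enumerate(activations):
        if value != 0:
            bitmap |= 1 << addr
    return bitmap


def index_from_bitmap(bitmap: int, size: int) -> List[int]:
    """Derive the list of non-zero addresses from the bitmap."""
    return [addr for addr in range(size) if (bitmap >> addr) & 1]


class MacUnit:
    """Hypothetical multiply-accumulate unit holding its own weights."""

    def __init__(self, weights: List[float]) -> None:
        self.weights = weights
        self.acc = 0.0
        self.multiplies = 0

    def step(self, addr: int, activation: float, skip: bool) -> None:
        # `skip` stands in for the control signal that prevents the multiply.
        if skip:
            return
        self.acc += activation * self.weights[addr]
        self.multiplies += 1


def run_layer(activations: List[float], units: List[MacUnit]) -> None:
    """Present every address in order; gate multiplies for unindexed ones."""
    bitmap = bitmap_from_activations(activations)
    indexed = set(index_from_bitmap(bitmap, len(activations)))
    for addr, act in enumerate(activations):
        skip = addr not in indexed  # address not identified in the index
        for unit in units:  # the same activation reaches every unit
            unit.step(addr, act, skip)


# Hypothetical example data.
acts = [0.0, 3.0, 0.0, 2.0]
unit_a = MacUnit([1.0, 1.0, 1.0, 1.0])
unit_b = MacUnit([2.0, 0.0, 5.0, -1.0])
run_layer(acts, [unit_a, unit_b])
print(unit_a.acc, unit_b.acc, unit_a.multiplies)  # 5.0 -2.0 2
```

Here the bitmap is `0b1010`, the derived index is `[1, 3]`, and each unit performs two multiplies rather than four.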
Combinations of networks · CPC title
Arithmetic instructions · CPC title
Operand accessing · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title