Method and apparatus for distributed and cooperative computation in artificial neural networks
US-2017277658-A1 · Sep 28, 2017 · US
US11238347B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11238347-B2 |
| Application number | US-201816146632-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 28, 2018 |
| Priority date | Sep 28, 2018 |
| Publication date | Feb 1, 2022 |
| Grant date | Feb 1, 2022 |
Official abstract text for this publication.
Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight tensor is distributed across the neural cores. Each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor. Each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network. The partial sums for each element of the output activation tensor are computed in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum.
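The partial-sum scheme in the abstract can be illustrated with a minimal sketch. It assumes a fully connected layer (a matrix-vector product) whose input dimension is split evenly across a handful of simulated cores; the function names `core_partial_sums` and `distributed_layer`, the splitting rule, and the in-process loop standing in for the network are all illustrative assumptions, not details taken from the patent.

```python
def core_partial_sums(weight_block, input_block):
    """One simulated core: multiply its portion of the input activations by
    its portion of the weights and sum the products, yielding one partial
    sum per output element."""
    return [sum(w * x for w, x in zip(row, input_block)) for row in weight_block]


def distributed_layer(weights, inputs, num_cores):
    """Split the input dimension across `num_cores` hypothetical cores and
    accumulate their partial sums. The accumulation loop stands in for the
    network that connects the cores."""
    n_in = len(inputs)
    chunk = (n_in + num_cores - 1) // num_cores  # ceiling division
    outputs = [0] * len(weights)
    for core in range(num_cores):
        lo, hi = core * chunk, min((core + 1) * chunk, n_in)
        w_block = [row[lo:hi] for row in weights]   # this core's weight portion
        partial = core_partial_sums(w_block, inputs[lo:hi])
        outputs = [acc + p for acc, p in zip(outputs, partial)]
    return outputs


weights = [[1, 2, 3, 4], [0, -1, 0, 1], [2, 0, 2, 0]]
inputs = [1, 2, 3, 4]

# Single-core reference result for comparison.
reference = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
result = distributed_layer(weights, inputs, num_cores=2)
print(result == reference)
```

Each output element is the sum of the partial sums produced by every core for that element, so the distributed result matches the single-core reference however the input dimension is divided.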
Opening claim text (preview).
What is claimed is:
1. A system comprising: an array of neural cores adapted to compute, in parallel, an output activation tensor of a neural network layer; a network operatively connected to each of the neural cores, wherein: the output activation tensor is distributed across the array of neural cores; an input activation tensor is distributed across the array of neural cores; a weight tensor is distributed across the array of neural cores; each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor; each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network; the partial sums for each element of the output activation tensor are computed in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum.
2. The system of claim 1, wherein each neural core is configured to compute the at least one output activation from the partial sums.
3. The system of claim 1, wherein the network interconnects adjacent neural cores within the array.
4. The system of claim 1, wherein the network interconnects neighborhoods of neural cores within the array.
5. The system of claim 1, wherein the network interconnects all neural cores within the array.
6. The system of claim 1, wherein the path through the weight tensor is configurable in each core.
7. The system of claim 1, wherein the path through the weight tensor is continuous.
8. The system of claim 1, wherein the path through the weight tensor is discontinuous.
9. The system of claim 1, wherein the path through the weight tensor comprises a space-filling curve.
10. The system of claim 1, wherein the path through the weight tensor terminates at its radial center.
11. The system of claim 1, wherein no segment of the path through the weight tensor is directed away from its radial center.
12. The system of claim 1, wherein the path through the weight tensor comprises a serpentine path.
13. The system of claim 1, wherein the path through the weight tensor comprises a spiral path.
14. The system of claim 1, wherein the path through the weight tensor comprises a pinwheel path.
15. The system of claim 1, wherein the path through the weight tensor comprises a horizontal-vertical path.
16. The system of claim 1, wherein each neural core is adapted to execute microcode to compute and communicate partial sums.
17. The system of claim 16, wherein each neural core is loaded with the same microcode.
18. The system of claim 2, wherein the each neural core is further adapted to communicate the at least one output activation via the network.
19. The system of claim 1, wherein the path through the weight tensor is two-dimensional.
20. The system of claim 1, wherein the path through the weight tensor is three-dimensional.
21. A method comprising: by each neural core of an array of neural cores, applying a weight tensor to a plurality of input activations to compute partial sums in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum; communicating partial sums to at least one adjacent neural core within the array via a network.
22. The method of claim 21, further comprising: computing at least one output activation of a neural network layer from the partial sums.
23. The method of claim 21, wherein the network interconnects adjacent neural cores within the array.
24. The method of claim 21, wherein the network connects neighborhoods of neural cores within the array.
25. The method of claim 21, wherein the network connects all neural cores within the array.
Combinations of networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
using electronic means · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons · CPC title