Data distribution in an array of neural network cores

US11238347B2 · US · B2

Patent metadata
Publication number: US-11238347-B2
Application number: US-201816146632-A
Country: US
Kind code: B2
Filing date: Sep 28, 2018
Priority date: Sep 28, 2018
Publication date: Feb 1, 2022
Grant date: Feb 1, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight tensor is distributed across the neural cores. Each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor. Each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network. The partial sums for each element of the output activation tensor are computed in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum.
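The accumulation described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: it assumes a fully connected layer whose input vector and weight matrix are split by input index across cores, and it models the inter-core network as a plain running sum. The names `split`, `core_partial_sums`, and `layer_output` are hypothetical and do not appear in the patent.

```python
def split(seq, parts):
    """Split seq into contiguous chunks, at most `parts` of them."""
    size = -(-len(seq) // parts)  # ceiling division
    return [seq[i:i + size] for i in range(0, len(seq), size)]


def core_partial_sums(x_part, w_part):
    """One core: multiply its input slice with its weight slice.

    w_part[j][i] is the weight from local input i to output element j.
    Returns one partial sum per output element.
    """
    return [sum(w * x for w, x in zip(row, x_part)) for row in w_part]


def layer_output(x, w, num_cores):
    """Accumulate the per-core partial sums into each output element."""
    x_parts = split(x, num_cores)
    # Slice every weight row the same way the input vector is sliced.
    w_row_parts = [split(row, num_cores) for row in w]
    out = [0] * len(w)
    for core, x_part in enumerate(x_parts):
        w_part = [row_parts[core] for row_parts in w_row_parts]
        partial = core_partial_sums(x_part, w_part)
        # Stand-in for accumulation over the network (assumption).
        out = [o + p for o, p in zip(out, partial)]
    return out


x = [1, 2, 3, 4]
w = [[1, 0, 1, 0],
     [2, 2, 2, 2]]
print(layer_output(x, w, num_cores=2))  # [4, 20]
```

The result does not depend on how many cores share the work, which is the property that lets each output element be assembled from partial sums held on different cores.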

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: an array of neural cores adapted to compute, in parallel, an output activation tensor of a neural network layer; a network operatively connected to each of the neural cores, wherein: the output activation tensor is distributed across the array of neural cores; an input activation tensor is distributed across the array of neural cores; a weight tensor is distributed across the array of neural cores; each neural core's computation comprises multiplying elements of a portion of the input activation tensor at that core with elements of a portion of the weight tensor at that core, and storing the summed products in a partial sum corresponding to an element of the output activation tensor; each element of the output activation tensor is computed by accumulating all of the partial sums corresponding to that element via the network; the partial sums for each element of the output activation tensor are computed in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum.

2. The system of claim 1, wherein each neural core is configured to compute the at least one output activation from the partial sums.

3. The system of claim 1, wherein the network interconnects adjacent neural cores within the array.

4. The system of claim 1, wherein the network interconnects neighborhoods of neural cores within the array.

5. The system of claim 1, wherein the network interconnects all neural cores within the array.

6. The system of claim 1, wherein the path through the weight tensor is configurable in each core.

7. The system of claim 1, wherein the path through the weight tensor is continuous.

8. The system of claim 1, wherein the path through the weight tensor is discontinuous.

9. The system of claim 1, wherein the path through the weight tensor comprises a space-filling curve.

10. The system of claim 1, wherein the path through the weight tensor terminates at its radial center.

11. The system of claim 1, wherein no segment of the path through the weight tensor is directed away from its radial center.

12. The system of claim 1, wherein the path through the weight tensor comprises a serpentine path.

13. The system of claim 1, wherein the path through the weight tensor comprises a spiral path.

14. The system of claim 1, wherein the path through the weight tensor comprises a pinwheel path.

15. The system of claim 1, wherein the path through the weight tensor comprises a horizontal-vertical path.

16. The system of claim 1, wherein each neural core is adapted to execute microcode to compute and communicate partial sums.

17. The system of claim 16, wherein each neural core is loaded with the same microcode.

18. The system of claim 2, wherein the each neural core is further adapted to communicate the at least one output activation via the network.

19. The system of claim 1, wherein the path through the weight tensor is two-dimensional.

20. The system of claim 1, wherein the path through the weight tensor is three-dimensional.

21. A method comprising: by each neural core of an array of neural cores, applying a weight tensor to a plurality of input activations to compute partial sums in a sequence of steps whose order is described by tracing a path through the weight tensor that visits every weight tensor element that contributes to any partial sum; communicating partial sums to at least one adjacent neural core within the array via a network.

22. The method of claim 21, further comprising: computing at least one output activation of a neural network layer from the partial sums.

23. The method of claim 21, wherein the network interconnects adjacent neural cores within the array.

24. The method of claim 21, wherein the network connects neighborhoods of neural cores within the array.

25. The method of claim 21, wherein the network connects all neural cores within the array.
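Several dependent claims name specific path shapes through the weight tensor (serpentine, spiral, terminating at the radial center). The sketch below shows one possible reading of two of those shapes for a small two-dimensional weight tensor. It is an illustration only: the function names are hypothetical, the patent text quoted here does not define the paths at this level of detail, and the inter-core communication of partial sums is not modeled.

```python
def serpentine_path(rows, cols):
    """Visit every (row, col) once, reversing direction on each row."""
    path = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        path.extend((r, c) for c in cs)
    return path


def inward_spiral_path(rows, cols):
    """Visit every (row, col) once, spiraling clockwise toward the center."""
    top, bottom, left, right = 0, rows - 1, 0, cols - 1
    path = []
    while top <= bottom and left <= right:
        path.extend((top, c) for c in range(left, right + 1))
        path.extend((r, right) for r in range(top + 1, bottom + 1))
        if top < bottom:
            path.extend((bottom, c) for c in range(right - 1, left - 1, -1))
        if left < right:
            path.extend((r, left) for r in range(bottom - 1, top, -1))
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return path


def apply_along_path(window, weights, path):
    """Accumulate weight * input products in the order given by path."""
    total = 0
    for r, c in path:
        total += weights[r][c] * window[r][c]
    return total


window = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
weights = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
for make_path in (serpentine_path, inward_spiral_path):
    path = make_path(3, 3)
    print(make_path.__name__, path[-1], apply_along_path(window, weights, path))
```

Both orderings visit all nine weight elements, so they produce the same sum; what differs is the sequence of steps, which in the claimed system determines when each partial sum is computed and communicated. For a 3x3 tensor the inward spiral ends at the center element `(1, 1)`.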

Assignees

Inventors

Classifications

  • Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11238347B2 cover?
Parallel processing among arrays of physical neural cores is provided. An array of neural cores is adapted to compute, in parallel, an output activation tensor of a neural network layer. A network is operatively connected to each of the neural cores. The output activation tensor is distributed across the neural cores. An input activation tensor is distributed across the neural cores. A weight t…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 1, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).