Multi-memory on-chip computational network
US-2019180170-A1 · Jun 13, 2019 · US
US11847553B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11847553-B2 |
| Application number | US-201816008949-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 14, 2018 |
| Priority date | Jun 14, 2018 |
| Publication date | Dec 19, 2023 |
| Grant date | Dec 19, 2023 |
Official abstract text for this publication.
Neural network processing hardware using parallel computational architectures with reconfigurable core-level and vector-level parallelism is provided. In various embodiments, a neural network model memory is adapted to store a neural network model comprising a plurality of layers. Each layer has at least one dimension and comprises a plurality of synaptic weights. A plurality of neural cores is provided. Each neural core includes a computation unit and an activation memory. The computation unit is adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations. The computation unit has a plurality of vector units. The activation memory is adapted to store the input activations and the output activations. The system is adapted to partition the plurality of cores into a plurality of partitions based on dimensions of the layer and the vector units.
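The partitioning rule summarized in the abstract can be illustrated with a minimal sketch. The names `PartitionPlan` and `plan_partition` are hypothetical, and the arithmetic for how many sub-cores or cooperating cores result is an assumption made for illustration only. The source states only that the decision rests on comparing the layer's feature dimension size with the number of vector units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionPlan:
    subdivide: bool          # True when one core is split into sub-cores
    subcores_per_core: int   # assumed: vector-unit groups inside one core
    cores_per_feature: int   # assumed: cores sharing one feature dimension


def plan_partition(feature_size: int, vector_units: int) -> PartitionPlan:
    """Compare a layer's feature size with the vector-unit count of a core.

    Illustrative only: the sub-core and core counts are assumptions,
    not figures taken from the patent text.
    """
    if feature_size <= 0 or vector_units <= 0:
        raise ValueError("sizes must be positive")
    if feature_size < vector_units:
        # Fewer features than vector units: subdivide the core so the
        # spare vector units can serve other work (assumed floor split).
        return PartitionPlan(True, vector_units // feature_size, 1)
    # Enough features to fill a core: spread them over whole cores
    # (assumed ceiling division).
    return PartitionPlan(False, 1, -(-feature_size // vector_units))


if __name__ == "__main__":
    print(plan_partition(8, 32))
    print(plan_partition(64, 32))
```

With these assumptions, a layer with 8 features on 32-wide cores yields a subdivided core, while a layer with 64 features spans two undivided cores.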
Opening claim text (preview).
What is claimed is:
1. A system comprising: a neural network model memory adapted to store a neural network model comprising a plurality of layers, each layer having at least one dimension and comprising a plurality of synaptic weights; a plurality of neural cores, each neural core comprising a computation unit, the computation unit adapted to apply a plurality of synaptic weights to a plurality of input activations to produce a plurality of output activations, the computation unit having a plurality of vector units, and an activation memory adapted to store the input activations and the output activations; wherein the system is adapted to partition the plurality of cores into a plurality of partitions based on a comparison of at least a portion of dimensions of a layer and a quantity of the vector units, wherein the comparison includes a comparison of a size of the feature dimensions for the layer with the quantity of the vector units, and wherein partitioning the plurality of cores comprises at least one of the plurality of cores being subdivided when the size of the feature dimensions of the layer is less than the quantity of the vector units.
2. The system of claim 1, further comprising: at least one controller operatively coupled to the neural network model memory and to the plurality of cores, the at least one controller being adapted to, for each layer of the neural network model configure the plurality of cores to implement the layer, and provide input activations for the layer to the plurality of cores.
3. The system of claim 2, further comprising a network on a chip (NoC) coupled to the plurality of cores.
4. The system of claim 3, wherein input activations are provided to the plurality of cores via the NoC.
5. The system of claim 3, wherein configuring the plurality of cores comprises distributing parameters to the plurality of cores via the NoC.
6. The system of claim 5, wherein configuring the plurality of cores further comprises distributing instructions to the plurality of cores via the NoC.
7. The system of claim 1, wherein the plurality of partitions for each layer is further determined based on spatial dimensions of the input activations for that layer.
8. The system of claim 1, wherein the plurality of partitions for each layer is further determined based on spatial dimensions and a size of the feature dimensions of the input activations for that layer.
9. The system of claim 1, wherein the plurality of partitions for each layer is further determined based on spatial dimensions of the output activations for that layer.
10. The system of claim 1, wherein the plurality of partitions for each layer is further determined based on spatial dimensions and feature dimensions of the output activations for that layer.
11. The system of claim 1, wherein the plurality of partitions for each layer is further determined based on one or more of spatial dimensions of the input activations, feature dimensions of the input activations, spatial dimensions of the output activations, or feature dimensions of the output activations for that layer.
12. The system of claim 11, wherein the plurality of partitions for each layer is further determined by a dimension of the plurality of cores.
13. The system of claim 1, wherein the cores within each of the plurality of partitions are configured to compute partial sums.
14. The system of claim 13, wherein the partial sums are aggregated to compute a result for an associated layer.
15. The system of claim 14, wherein the partial sums are transmitted via a network on a chip (NoC) for aggregation.
16. The system of claim 2, wherein the at least one controller is further adapted to, upon computation of output activations of a layer, redistribute the output activations among the plurality of cores.
17. The system of claim 16, wherein the redistribution is via a network.
18. The system of claim 16, wherein the redistribution is determined based on one or more of spatial dimensions of the input activations, feature dimensions of the input activations, spatial dimensions of the output activations, or feature dimensions of the output activations for that layer.
19. A method comprising: reading a neural network model comprising a plurality of layers, each layer having at least one dimension and comprising a plurality of synaptic weights; for each layer of the neural network model comparing at least a portion of dimensions of a layer and a quantity of vector units, wherein the comparison includes a comparison of a size of the feature dimensions with the quantity of the vector units; partitioning a plurality of cores into a plurality of partitions based on the comparison, wherein partitioning the plurality of cores comprises at least one of the plurality of cores being subdivided when the size of the feature dimensions of the layer is less than the quantity of the vector units, configuring the plurality of cores to implement the layer, providing to the plurality of cores input activations for the layer, and applying the synaptic weights associated with the layer to the input activations to produce a plurality of output activations.
20. The method of claim 19, further comprising: computing partial sums within each partition; transmitting the partial sums among cores within each partition; aggregating the partial sums to compute the output activations.
21. The method of claim 19, wherein configuring the plurality of cores comprises distributing parameters to the plurality of cores via a network.
22. The method of claim 19, wherein configuring the plurality of cores comprises distributing instructions to the plurality of cores via a network.
23. The method of claim 19, wherein the plurality of partitions for each layer is further determined based on one or more of spatial dimensions of the input activations, feature dimensions of the input activations, spatial dimensions of the output activations, or feature dimensions of the output activations for that layer.
24. The system of claim 23, wherein the plurality of partitions for each layer is further determined by a dimension of the plurality of cores.
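Claims 13 to 15 and 20 describe cores within a partition computing partial sums that are then aggregated into the layer result. The following is a minimal sketch of that idea for a dense weight matrix, assuming the input-feature dimension is split evenly across cores. The function names `partial_sums` and `aggregate` are hypothetical, the split strategy is an assumption, and plain Python lists stand in for the activation memory and the network transport.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def partial_sums(weights: Matrix, activations: Vector, num_cores: int) -> List[Vector]:
    """Each simulated core applies the weights to its slice of the inputs.

    weights is out_features x in_features; the in_features axis is cut
    into contiguous chunks, one per core (an assumed split strategy).
    """
    n_in = len(activations)
    chunk = -(-n_in // num_cores)  # ceiling division
    partials = []
    for core in range(num_cores):
        lo, hi = core * chunk, min((core + 1) * chunk, n_in)
        # One partial output vector per core; empty slices contribute zeros.
        partials.append(
            [sum(row[i] * activations[i] for i in range(lo, hi)) for row in weights]
        )
    return partials


def aggregate(partials: List[Vector]) -> Vector:
    """Sum the per-core partial vectors element-wise into the layer output."""
    return [sum(column) for column in zip(*partials)]


if __name__ == "__main__":
    w = [[1, 2, 3, 4], [0, 1, 0, 1]]
    x = [1, 1, 1, 1]
    print(aggregate(partial_sums(w, x, num_cores=2)))
```

In hardware the aggregation step would move the partial vectors between cores, for example over the NoC named in claim 15; here it is a plain element-wise sum.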
Convolutional networks [CNN, ConvNet] · CPC title
using electronic means · CPC title
Array of vector units · CPC title
Architecture, e.g. interconnection topology · CPC title
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title