Sparse convolutional neural network accelerator
US-2018046900-A1 · Feb 15, 2018 · US
US12073306B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12073306-B2 |
| Application number | US-202117551967-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2021 |
| Priority date | Dec 15, 2020 |
| Publication date | Aug 27, 2024 |
| Grant date | Aug 27, 2024 |
Official abstract text for this publication.
Systems and methods are disclosed for a centrosymmetric convolutional neural network (CSCNN), an algorithm/hardware co-design framework for CNN compression and acceleration that mitigates the effects of computational irregularity and effectively exploits computational reuse and sparsity for increased performance and energy efficiency.
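To make the term "centrosymmetric" concrete, the sketch below shows one way a 2-D filter could be made centrosymmetric, meaning it equals its own 180-degree rotation, so that the weight at position (r, s) matches the weight at (R-1-r, S-1-s). The abstract does not say how the constraint is imposed; averaging a filter with its rotation is an assumption chosen for illustration, and the function names `make_centrosymmetric` and `unique_weight_count` are hypothetical.

```python
import numpy as np

def make_centrosymmetric(kernel):
    """Average a 2-D kernel with its 180-degree rotation.

    Assumption for illustration only: the source does not state how the
    centrosymmetric constraint is obtained.
    """
    return 0.5 * (kernel + kernel[::-1, ::-1])

def unique_weight_count(R, S):
    """Number of weights that must be stored for an R x S centrosymmetric kernel.

    Each weight is shared with its mirror position; an odd-sized kernel has
    one self-mirrored centre weight.
    """
    return (R * S + 1) // 2

rng = np.random.default_rng(0)
k = make_centrosymmetric(rng.normal(size=(3, 3)))
print(np.allclose(k, k[::-1, ::-1]))   # the kernel equals its own rotation
print(unique_weight_count(3, 3))       # 5 of the 9 weights are unique
```

Because mirrored positions share one value, roughly half of the weights need to be stored, and each product with such a weight can serve two output positions. That is the kind of computational reuse the abstract refers to.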
Opening claim text (preview).
The invention claimed is:

1. A centrosymmetric convolutional neural network computer system comprising: an input activation position buffer, an input activation value buffer, a weight value buffer; a weight position buffer; a multiplier array that receives a first vector from the input activation value buffer and a second vector from the weight value buffer and calculates a third vector of products; and a coordinate computation unit that: receives first associated positions of the first vector from the input activation position buffer and second associated positions of the second vector from the weight position buffer, calculates a fourth vector of associated positions; transforms the second associated positions into third associated positions, wherein the third associated positions are centrosymmetric; combines the second associated positions and the third associated positions into a fifth vector of associated positions.

2. The convolutional neural network computer system of claim 1, wherein the third vector is stored at a first accumulator buffer and a second accumulator buffer.

3. The convolutional neural network computer system of claim 2, wherein the third vector is stored at an associated position in the fourth vector of associated positions at the first accumulator buffer.

4. The convolutional neural network computer system of claim 2, wherein the third vector is stored at an associated position in the fifth vector of associated positions at the second accumulator buffer.

5. The convolutional neural network computer system of claim 1, wherein the multiplier array performs a full Cartesian product of the first vector and the second vector to generate multiplier outputs.

6. The convolutional neural network computer system of claim 5, wherein the multiplier outputs are routed to an accumulator bank using a crossbar switch.

7. The convolutional neural network computer system of claim 1, wherein the first vector and the second vector are non-zero values.

8. The convolutional neural network computer system of claim 1, wherein the multiplier array uses input-stationary computation order.

9. A computer-implemented method of operating a centrosymmetric convolutional neural network comprising: receiving a first vector from an input activation value buffer and a second vector from a weight value buffer at a multiplier array; calculating a third vector of products at the multiplier array; receiving first associated positions of the first vector from the input activation position buffer and second associated positions of the second vector from a weight position buffer at a coordinate computation unit; calculating a fourth vector of associated positions at the coordinate computation unit; transforming the second associated positions into third associated positions at the coordinate computation unit, wherein the third associated positions are centrosymmetric; and combining the second associated positions and the third associated positions into a fifth vector of associated positions at the coordinate computation unit.

10. The computer-implemented method of claim 9, wherein the third vector is stored at a first accumulator buffer and a second accumulator buffer.

11. The computer-implemented method of claim 10, wherein the third vector is stored at an associated position in the fourth vector of associated positions at the first accumulator buffer.

12. The computer-implemented method of claim 11, wherein the third vector is stored at an associated position in the fifth vector of associated positions at the second accumulator buffer.

13. The computer-implemented method of claim 9, further comprising performing a full Cartesian product of the first vector and the second vector to generate multiplier outputs at the multiplier array.

14. The computer-implemented method of claim 13, wherein the multiplier outputs are routed to an accumulator bank using a crossbar switch.

15. The computer-implemented method of claim 9, wherein the first vector and the second vector are non-zero values.

16. The computer-implemented method of claim 9, wherein the multiplier array uses input-stationary computation order.
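The dataflow recited in claims 1 to 7 can be sketched in software as follows. This is one possible reading of the claim language, not the patented hardware: it assumes a single-channel, stride-1, unpadded 2-D correlation, assumes that only the raster-order first half of a centrosymmetric kernel is stored, and models the "fourth" and "fifth" vectors as output coordinates derived from the stored and the mirrored weight positions respectively. The names `stored_half` and `cscnn_conv` are hypothetical, and the crossbar routing of claim 6 is reduced to plain array indexing.

```python
import numpy as np

def stored_half(kernel):
    """Non-zero values and positions of the raster-order first half of a kernel."""
    R, S = kernel.shape
    vals, pos = [], []
    for idx in range((R * S + 1) // 2):          # includes the centre if R*S is odd
        r, s = divmod(idx, S)
        if kernel[r, s] != 0:
            vals.append(float(kernel[r, s]))
            pos.append((r, s))
    return vals, pos

def cscnn_conv(act_vals, act_pos, w_vals, w_pos, in_shape, kernel_shape):
    """Sparse correlation that reuses each product for a weight and its mirror."""
    H, W = in_shape
    R, S = kernel_shape
    out_h, out_w = H - R + 1, W - S + 1
    acc_first = np.zeros((out_h, out_w))         # first accumulator buffer
    acc_second = np.zeros((out_h, out_w))        # second accumulator buffer
    # "Third associated positions": centrosymmetric mirrors of the weight positions.
    mirrored = [(R - 1 - r, S - 1 - s) for r, s in w_pos]
    # Full Cartesian product of non-zero activations and stored non-zero weights.
    for a, (y, x) in zip(act_vals, act_pos):
        for w, (r, s), (mr, ms) in zip(w_vals, w_pos, mirrored):
            product = a * w                      # computed once, used up to twice
            oy, ox = y - r, x - s                # output coordinate, stored weight
            if 0 <= oy < out_h and 0 <= ox < out_w:
                acc_first[oy, ox] += product
            if (mr, ms) == (r, s):               # centre weight mirrors to itself
                continue
            oy, ox = y - mr, x - ms              # output coordinate, mirrored weight
            if 0 <= oy < out_h and 0 <= ox < out_w:
                acc_second[oy, ox] += product
    return acc_first + acc_second

rng = np.random.default_rng(0)
base = rng.normal(size=(3, 3))
kernel = base + base[::-1, ::-1]                 # centrosymmetric by construction
image = rng.normal(size=(6, 6)) * (rng.random((6, 6)) < 0.4)
act_pos = [tuple(p) for p in np.argwhere(image != 0)]
act_vals = [image[p] for p in act_pos]
w_vals, w_pos = stored_half(kernel)
out = cscnn_conv(act_vals, act_pos, w_vals, w_pos, image.shape, kernel.shape)
```

In this reading, each multiplication feeds two accumulations, so a 3 x 3 kernel needs at most five stored weights instead of nine. Summing the two accumulator buffers at the end is a simplification made for this sketch.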
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title