Systems and methods for compression and acceleration of convolutional neural networks
US-12073306-B2 · Aug 27, 2024 · US
US12443571B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12443571-B2 |
| Application number | US-202117323490-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 18, 2021 |
| Priority date | Apr 9, 2021 |
| Publication date | Oct 14, 2025 |
| Grant date | Oct 14, 2025 |
Official abstract text for this publication.
Apparatuses, systems, and techniques to transform data sets, such as matrices representing layers of neural networks, to increase sparsity and/or other characteristics of said data sets to improve performance in computations, such as neural network computations. In at least one embodiment, one or more subsets of data in one or more sets of data are rearranged as part of a process to increase sparsity in said one or more sets of data to satisfy one or more structural sparsity constraints.
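The idea in the abstract can be illustrated with a minimal sketch. It assumes a structural sparsity constraint of the form "at most `keep` non-zero values in every group of `group` consecutive values"; the abstract does not fix a particular pattern, so the defaults of 2 and 4 and the names `prune_groups`, `conforms`, and `order` are hypothetical. The sketch shows why rearranging elements before zeroing values can matter: the same constraint is satisfied either way, but the rearranged row retains more of the large values.

```python
def prune_groups(row, keep=2, group=4):
    """Zero all but the `keep` largest-magnitude values in each group.

    The group-wise pattern is an assumed example of a structural
    sparsity constraint, not one taken from the patent text.
    """
    out = [0] * len(row)
    for start in range(0, len(row), group):
        idx = range(start, min(start + group, len(row)))
        for k in sorted(idx, key=lambda k: abs(row[k]), reverse=True)[:keep]:
            out[k] = row[k]
    return out


def conforms(row, keep=2, group=4):
    """Check that every group holds at most `keep` non-zero values."""
    return all(
        sum(1 for v in row[s:s + group] if v != 0) <= keep
        for s in range(0, len(row), group)
    )


row = [9, 8, 7, 6, 1, 2, 3, 4]

# Hypothetical rearrangement: exchange positions 2,3 with positions 4,5
# so that the large values are spread across both groups.
order = [0, 1, 4, 5, 2, 3, 6, 7]
rearranged = [row[k] for k in order]

direct = prune_groups(row)          # pruned without rearranging
after = prune_groups(rearranged)    # pruned after rearranging

print(sum(direct), sum(after))
```

Both pruned rows conform to the assumed constraint; the difference lies in which values survive.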
Opening claim text (preview).
What is claimed is:
1. One or more processors, comprising: circuitry to cause an increase in sparsity of one or more sets of data by at least rearranging elements in the one or more sets of data, such that the one or more sets of data comprise one or more subsets of data conforming to one or more structural sparsity constraints.
2. The one or more processors of claim 1, wherein the one or more structural sparsity constraints impose a limitation on the one or more sets of data whereby the one or more subsets of data comprise at least a first quantity of non-zero data values and a second quantity of zero data values.
3. The one or more processors of claim 1, wherein the circuitry rearranges elements in the one or more sets of data by swapping a first subset of the one or more subsets of data having a first position in the one or more sets of data with a second subset of the one or more subsets of data having a second position in the one or more sets of data.
4. The one or more processors of claim 1, wherein: the one or more sets of data comprise numerical values accessible using at least a first index and a second index; and the one or more sets of data comprise a first subset of data associated with the first index and a second subset of data associated with the second index; and the circuitry causes the increase in sparsity by at least exchanging the numerical values of the first subset of data associated with the first index with the second subset of data associated with a second matrix.
5. The one or more processors of claim 1, wherein the one or more sets of data comprise numerical values corresponding to weight parameters associated with one or more neural networks.
6. The one or more processors of claim 1, wherein each data value in the one or more sets of data is associated with a first index value and a second index value, the first index value and the second index value indicating a position of each data value in the one or more sets of data.
7. The one or more processors of claim 1, wherein the circuitry causes the increase in sparsity of the one or more sets of data using a deep learning framework to determine a set of transforms to swap two or more subsets of data in the one or more sets of data, the deep learning framework further setting one or more data values in the two or more subsets of data to zero.
8. The one or more processors of claim 1, wherein rearranging the elements in the one or more sets of data is performed by permuting the elements in the one or more sets of data.
9. The one or more processors of claim 1, wherein rearranging the elements in the one or more sets of data is performed by pruning the elements in the one or more sets of data.
10. A system comprising: one or more processors; and memory including instructions that, when executed by the one or more processors, cause computer system to at least: cause an increase in sparsity of one or more sets of data by at least rearranging elements in the one or more sets of data, such that the one or more sets of data comprise one or more subsets of data conforming to one or more structural sparsity constraints.
11. The system of claim 10, wherein the instructions further include instructions that, when executed by the one or more processors, cause a deep learning framework to transform the one or more sets of data based, at least in part, on the one or more structural sparsity constraints.
12. The system of claim 11, wherein the one or more structural sparsity constraints comprise at least one structural sparsity constraint wherein a subset of the one or more sets of data comprises at least a first quantity of non-zero data values and a second quantity of zero data values.
13. The system of claim 11, wherein the deep learning framework transforms the one or more sets of data by exchanging at least a first subset of the one or more sets of data associated with a first position value with at least a second subset of the one or more sets of data associated with a second position value.
14. The system of claim 11, wherein the deep learning framework determines a set of transforms to rearrange the elements in the one or more sets of data by: randomly selecting a first subset of the one or more sets of data associated with a first position value and a second subset of the one or more sets of data associated with a second position value; exchanging the first subset and the second subset; setting one or more data values in the first subset and the second subset to a zero value; calculating a metric associated with a neural network corresponding to the one or more sets of data; and as a result of the metric being greater than another metric, adding a transform comprising the first position value and the second position value to the set of transforms.
15. The system of claim 10, wherein the one or more sets of data are associated with one or more layers of a neural network and the one or more sets of data values comprise only non-zero numerical values.
16. The system of claim 10, wherein the instructions further include instructions that, when executed by the one or more processors, cause the increase in sparsity by setting one or more data values in a subset of the one or more sets of data to a zero value, the one or more data values in the subset being numerical values representing one or more weight values associated with a neural network and the subset being determined based, at least in part, on one of the one or more structural sparsity constraints.
17. The system of claim 10, wherein the one or more processors are parallel processing units, the parallel processing units comprising one or more sparse tensor cores to accelerate one or more computations on the one or more sets of data based, at least in part, on the one or more structural sparsity constraints on the one or more sets of data.
18. The system of claim 10, wherein rearranging the elements in the one or more sets of data is performed by permuting the elements in the one or more sets of data.
19. The system of claim 10, wherein rearranging the elements in the one or more sets of data is performed by pruning the elements in the one or more sets of data.
20. A method comprising: causing an increase in sparsity of one or more sets of data by at least rearranging elements in the one or more sets of data, such that the one or more sets of data comprise one or more subsets of data conforming to one or more structural sparsity constraints.
21. The method of claim 20, further comprising causing the increase in sparsity of the one or more sets of data using a deep learning framework, wherein the deep learning framework at least rearranges the elements in the one or more sets of data.
22. The method of claim 21, wherein the one or more structural sparsity constraints comprise at least a limitation on the one or more sets of data, the limitation requiring the one or more subsets of data to comprise at least a first quantity of non-zero data values and a second quantity of zero data values.
23. The method of claim 21, wherein the deep learning framework rearranges the elements in the one or more sets of data by swapping a first subset of the one or more subsets of data having a first position in the one or more sets of data with a second subset of the one or more subsets of data having a second position in the one or more sets of data.
24. The method of claim 21, wherein
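The search procedure recited in claim 14 (randomly select two positions, exchange the subsets, zero some values, compute a metric, and record the transform if the metric improves) can be sketched as follows. This is an illustrative reading, not the patented implementation. Several choices are assumptions: subsets are taken to be matrix columns, the constraint is taken to be "at most `n` non-zero values per group of `m`", and the metric is the total magnitude retained after zeroing, which stands in for the neural-network metric that the claim leaves unspecified. All function names are hypothetical.

```python
import random


def swap_columns(matrix, i, j):
    """Return a copy of `matrix` with columns i and j exchanged."""
    out = [list(row) for row in matrix]
    for row in out:
        row[i], row[j] = row[j], row[i]
    return out


def prune(matrix, n=2, m=4):
    """Per row, zero all but the n largest-magnitude values in each group of m."""
    pruned = []
    for row in matrix:
        out = [0] * len(row)
        for start in range(0, len(row), m):
            idx = range(start, min(start + m, len(row)))
            for k in sorted(idx, key=lambda k: abs(row[k]), reverse=True)[:n]:
                out[k] = row[k]
        pruned.append(out)
    return pruned


def retained_magnitude(matrix):
    """Assumed stand-in metric: sum of absolute values that survive pruning."""
    return sum(abs(v) for row in matrix for v in row)


def search_transforms(matrix, steps=200, seed=0):
    """Randomly try column exchanges; keep those that raise the metric."""
    rng = random.Random(seed)
    current = [list(row) for row in matrix]
    best = retained_magnitude(prune(current))
    transforms = []
    for _ in range(steps):
        i, j = rng.sample(range(len(current[0])), 2)   # two random positions
        candidate = swap_columns(current, i, j)        # exchange the subsets
        metric = retained_magnitude(prune(candidate))  # zero values, then score
        if metric > best:                              # metric exceeds the prior one
            current, best = candidate, metric
            transforms.append((i, j))                  # record the transform
    return transforms, current, best


weights = [[9, 8, 7, 6, 1, 2, 3, 4],
           [5, 5, 5, 5, 1, 1, 1, 1]]
baseline = retained_magnitude(prune(weights))
transforms, permuted, best = search_transforms(weights)
print(baseline, best, transforms)
```

Replaying the recorded `(i, j)` pairs in order on the original matrix reproduces `permuted`, which is why storing only the position values is enough to describe the rearrangement.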
Architecture, e.g. interconnection topology · CPC title
Learning methods · CPC title
using electronic means · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
modifying the architecture, e.g. adding, deleting or silencing nodes or connections · CPC title