Compression of kernel data for neural network operations
US-2019340488-A1 · Nov 7, 2019 · US
US11928581B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11928581-B2 |
| Application number | US-201816132015-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 14, 2018 |
| Priority date | Sep 14, 2018 |
| Publication date | Mar 12, 2024 |
| Grant date | Mar 12, 2024 |
Official abstract text for this publication.
A method of compressing kernels comprising detecting a plurality of replicated kernels. The plurality of replicated kernels comprise kernels. The method also comprises generating a composite kernel from the replicated kernels. The composite kernel comprises kernel data and meta data indicative of the rotations applied to the composite kernel data. The method also comprises storing a composite kernel.
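The abstract's idea of storing one composite kernel plus rotation meta data can be illustrated with a minimal sketch. It assumes square 2-D kernels held as NumPy arrays and exact 90-degree rotations; the function names (`find_rotation`, `compress`, `decompress`), the dictionary layout, and the tolerance parameter are hypothetical and are not taken from the patent.

```python
import numpy as np

def find_rotation(base, candidate, tol=1e-6):
    """Return k in {0, 1, 2, 3} if rotating `base` by k quarter-turns
    reproduces `candidate` (within tol); otherwise return None."""
    for k in range(4):
        rotated = np.rot90(base, k)
        if rotated.shape == candidate.shape and np.allclose(rotated, candidate, atol=tol):
            return k
    return None

def compress(kernels, tol=1e-6):
    """Group kernels that are rotated copies of an earlier kernel.

    Each composite keeps one copy of the kernel data plus meta data:
    a list of (original_index, quarter_turns) pairs.
    """
    composites = []
    for index, kernel in enumerate(kernels):
        for comp in composites:
            k = find_rotation(comp["data"], kernel, tol)
            if k is not None:
                comp["meta"].append((index, k))
                break
        else:
            # No existing composite matches: start a new one.
            composites.append({"data": kernel, "meta": [(index, 0)]})
    return composites

def decompress(composites):
    """Rebuild the kernel list by applying the stored rotations."""
    total = sum(len(comp["meta"]) for comp in composites)
    kernels = [None] * total
    for comp in composites:
        for index, k in comp["meta"]:
            kernels[index] = np.rot90(comp["data"], k)
    return kernels
```

In this sketch, four kernels of which three are rotations of one another reduce to two stored arrays, and `decompress` restores them in their original order, which mirrors the extract, interrogate, and rotate steps recited in claim 12.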
Opening claim text (preview).
What is claimed is:
1. A method of compressing kernels of a neural network trained for a particular purpose; the method comprising: detecting a plurality of replicated kernels, wherein the plurality of replicated kernels exhibit cyclic rotation; generating a plurality of composite kernels from the plurality of replicated kernels, the composite kernels comprising kernel data and meta data, wherein generating the plurality of composite kernels comprises determining differences between first replicated kernels and second replicated kernels of the plurality of replicated kernels, wherein at least one of the differences exceeds a predetermined threshold that is indicative of a maximum between the plurality of replicated kernels, and at least one of the differences is less than the predetermined threshold, wherein at least one of the generated plurality of composite kernels is the second replicated kernel for the at least one of the differences that exceeds the predetermined threshold, and at least one other of the plurality of composite kernels is set to an average of the plurality of replicated kernels for the at least one of the differences that is less than the predetermined threshold; and storing the plurality of composite kernels.
2. The method of compressing kernels according to claim 1, wherein the replicated kernels exhibit 90-degree cyclic rotation.
3. The method of compressing kernels according to claim 1, wherein a first of the replicated kernels is a mirror of a second of the replicated kernels.
4. The method of compressing kernels according to claim 1, wherein the meta data is indicative of the cyclic rotation of the plurality of replicated kernels.
5. The method of compressing kernels according to claim 1, wherein the step of generating the plurality of composite kernels comprises producing an average kernel based upon the replicated kernels.
6. The method of compressing kernels according to claim 1, wherein at least one of the plurality of composite kernels comprises a first kernel of the plurality of replicated kernels.
7. The method of compressing kernels according to claim 6, wherein for each of the plurality of replicated kernels, the step of generating the plurality of composite kernels comprises: aligning a second kernel of the plurality of replicated kernels with the first kernel of the plurality of replicated kernels; determining a delta kernel, wherein the delta kernels is indicative of the difference between the first kernel and the aligned second kernel; and setting at least one of the composite kernels to the delta kernel.
8. The method of compressing kernels according to claim 7, wherein the step of generating the plurality of composite kernels further comprises the step of compressing the delta kernel.
9. The method of compressing kernels according to claim 1, wherein the step of detecting a plurality of replicated kernels occurs during a training phase of a convolutional neural network.
10. The method of compressing kernels according to claim 9, further comprising a step of retraining the convolutional neural network using the plurality of composite kernels.
11. The method of compressing kernels according to claim 1, wherein the step of detecting a plurality of replicated kernels occurs prior to a training phase of a convolutional neural network.
12. A method of implementing a convolutional neural network using compressed kernels, the method comprising the steps of: extracting a kernel from the compressed kernels, wherein the kernel comprises kernel data and meta data; interrogating the meta data to determine any cyclic rotations; applying the cyclic rotations to the kernel data to produce one or more rotated kernels; and implementing the convolutional neural network using the one or more rotated kernels, wherein the compressed kernels are produced by a method according to claim 1.
13. A system for compressing kernels, the system comprising: a detection module for detecting a plurality of replicated kernels, wherein the plurality of replicated kernels exhibit cyclic rotation; a generation module for generating composite kernels from the plurality of replicated kernels, wherein generating the composite kernels comprises determining differences between first replicated kernels and second replicated kernels of the plurality of replicated kernels, wherein at least one of the differences exceeds a predetermined threshold and at least one of the differences is less than the predetermined threshold, wherein at least one of the generated composite kernels is the second replicated kernel for at least one of the differences that exceeds the predetermined threshold, and at least one other of the generated composite kernels is set to an average of the plurality of replicated kernels for the at least one of the differences that is less than the predetermined threshold, wherein the predetermined threshold is indicative of a maximum difference between the plurality of replicated kernels; and storage for storing at least one of the composite kernels.
14. The system of compressing kernels according to claim 13, wherein the replicated kernels exhibit 90-degree cyclic rotation.
15. The system of compressing kernels according to claim 13, wherein a first of the replicated kernels is a mirror of a second of the replicated kernels.
16. The system for compressing kernels according to claim 13, wherein the detection module is a driver of a processing unit.
17. A non-transitory computer-readable storage medium comprising computer-executable instructions stored thereon which, when executed by at least one processor, cause the at least one processor to compress kernels the instructions comprising: detecting a plurality of replicated kernels, wherein the plurality of replicated kernels exhibit cyclic rotation; generating composite kernels from the plurality of replicated kernels, the composite kernels comprising kernel data and meta data, wherein generating the composite kernels comprises determining differences between first replicated kernels and second replicated kernels of the plurality of replicated kernels, wherein at least one of the differences exceeds a predetermined threshold and at least one of the differences is less than the predetermined threshold, wherein at least one of the generated composite kernels is the second replicated kernel for at least one of the differences that exceeds the predetermined threshold, and at least one other of the generated composite kernels is set to an average of the plurality of replicated kernels for the at least one of the differences that is less than the predetermined threshold, wherein the predetermined threshold is indicative of a maximum difference between the plurality of replicated kernels; and storing at least one of the composite kernels.
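One possible reading of the threshold step in claim 1 and the delta kernel in claim 7 is sketched below. It is an illustration under stated assumptions, not the patented implementation: the claims do not say how a "difference" is measured, so the maximum absolute element-wise difference is assumed here, and the rotation of each kernel relative to the first is assumed to be known already. The names `align`, `generate_composites`, and `delta_kernel` are hypothetical.

```python
import numpy as np

def align(kernel, quarter_turns):
    """Undo `quarter_turns` 90-degree rotations so the kernel lines up
    with the reference kernel."""
    return np.rot90(kernel, -quarter_turns)

def generate_composites(group, rotations, threshold):
    """Build composites for a group of approximately replicated kernels.

    group[0] is the reference; rotations[i] is the assumed quarter-turn
    count of group[i] relative to it. Kernels whose aligned difference is
    below the threshold are merged into an average; the others are stored
    unchanged. (Assumption: a difference exactly equal to the threshold is
    treated as not below it, since the claim leaves that case open.)
    """
    reference = group[0]
    close, close_rotations, separate = [reference], [0], []
    for kernel, k in zip(group[1:], rotations[1:]):
        aligned = align(kernel, k)
        difference = np.max(np.abs(aligned - reference))  # assumed metric
        if difference < threshold:
            close.append(aligned)
            close_rotations.append(k)
        else:
            separate.append({"data": kernel, "rotations": [0]})
    average = np.mean(close, axis=0)
    return [{"data": average, "rotations": close_rotations}] + separate

def delta_kernel(first, second, quarter_turns):
    """Difference between the aligned second kernel and the first kernel."""
    return align(second, quarter_turns) - first
```

Averaging is lossy, which is presumably why claim 10 mentions retraining with the composite kernels; the delta variant keeps the residual so the second kernel can be recovered from the first kernel plus the delta.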
Quantised networks; Sparse networks; Compressed networks · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Learning methods · CPC title
Architecture, e.g. interconnection topology · CPC title