Decomposing convolution operation in neural networks
US-2016019456-A1 · Jan 21, 2016 · US
US11620766B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11620766-B2 |
| Application number | US-202117344639-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 10, 2021 |
| Priority date | Apr 8, 2017 |
| Publication date | Apr 4, 2023 |
| Grant date | Apr 4, 2023 |
Official abstract text for this publication.
In an example, an apparatus comprises logic, at least partially including hardware logic, to implement a lossy compression algorithm which utilizes a data transform and quantization process to compress data in a convolutional neural network (CNN) layer.
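The abstract's "data transform and quantization" pipeline can be illustrated with a small sketch. The publication does not say which transform or quantizer is used, so the orthonormal DCT, the uniform step size, and the names `dct`, `idct`, `compress`, and `decompress` below are all illustrative assumptions, not the patented implementation.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence (assumed transform, for illustration)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def compress(weights, step):
    """Transform the weights, then quantize each coefficient to an integer level."""
    return [round(c / step) for c in dct(weights)]

def decompress(levels, step):
    """Rescale the integer levels and apply the inverse transform."""
    return idct([q * step for q in levels])

# Hypothetical weights from one CNN layer, flattened to 1-D.
weights = [0.50, 0.48, 0.51, 0.47, 0.10, 0.12, 0.09, 0.11]
step = 0.05  # assumed quantization step; larger means more loss

levels = compress(weights, step)
restored = decompress(levels, step)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The compression is lossy because rounding discards up to half a step per coefficient. The integer `levels` are what a later entropy or delta coder would store.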
Opening claim text (preview).
The invention claimed is:

1. An apparatus, comprising: a hardware processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determine a scalar associated with each of the one or more independent rows of the matrix; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.

2. The apparatus of claim 1, the hardware processor to: compress at least a portion of the encoded weight data in a frequency domain.

3. The apparatus of claim 2, the hardware processor to: quantize the at least a portion of the encoded weight data in the frequency domain.

4. The apparatus of claim 2, the hardware processor to: compress the at least a portion of the encoded weight data via K-means compression.

5. The apparatus of claim 2, the hardware processor to: apply an inversed transform to the neural network layer.

6. The apparatus of claim 1, further comprising: an instruction cache to receive a stream of instructions; an instruction unit to execute the stream of instructions; a general-purpose graphics processing compute block comprising a plurality of graphics processing cores; and a shared memory communicatively coupled to the plurality of graphics processing cores.

7. A method, comprising: applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determining a scalar associated with each of the one or more independent rows of the matrix; encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; implementing a delta compression algorithm to compress the encoded weight data; storing the encoded weight data in the shared memory; and loading the matrix into the neural network using hardware when the rank is beneath a threshold.

8. The method of claim 7, further comprising: compressing at least a portion of the encoded weight data in a frequency domain.

9. The method of claim 8, further comprising: quantizing the at least a portion of the encoded weight data in the frequency domain.

10. The method of claim 8, further comprising: compressing the at least a portion of the encoded weight data via K-means compression.

11. The method of claim 8, further comprising: applying an inversed transform to the neural network layer.

12. A non-transitory computer-readable medium comprising instructions which, when executed by a processor, configure the processor to perform operations, comprising: applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determining a scalar associated with each of the one or more independent rows of the matrix; encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; implementing a delta compression algorithm to compress the encoded weight data; storing the encoded weight data in the shared memory; and loading the matrix into the neural network using hardware when the rank is beneath a threshold.

13. The non-transitory computer-readable medium of claim 12, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: compressing at least a portion of the encoded weight data in a frequency domain.

14. The non-transitory computer-readable medium of claim 13, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: quantizing the at least a portion of the encoded weight data in the frequency domain.

15. The non-transitory computer-readable medium of claim 13 further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: compressing the at least a portion of the encoded weight data via K-means compression.

16. The non-transitory computer-readable medium of claim 13, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: applying an inversed transform to the neural network layer.
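Two of the claimed steps, encoding rows with an associated scalar and applying delta compression, can be sketched as follows. This is one possible reading of the claim language, not the method the patent discloses: here a row that is a scalar multiple of an earlier row is stored as a (base index, scalar) pair, and delta compression is taken to mean storing successive differences. All function names and the sample data are hypothetical, and the singular value decomposition and interpolation steps are not shown.

```python
def row_scalar(row, base, tol=1e-9):
    """Return s if row == s * base within tol, otherwise None."""
    pivot = next((j for j, b in enumerate(base) if abs(b) > tol), None)
    if pivot is None:
        return None  # an all-zero base cannot represent other rows
    s = row[pivot] / base[pivot]
    if all(abs(r - s * b) <= tol for r, b in zip(row, base)):
        return s
    return None

def encode_rows(matrix):
    """Keep one base row per direction; encode every row as (base index, scalar)."""
    bases, encoded = [], []
    for row in matrix:
        for idx, base in enumerate(bases):
            s = row_scalar(row, base)
            if s is not None:
                encoded.append((idx, s))
                break
        else:
            # No existing base matches, so this row becomes a new base.
            bases.append(list(row))
            encoded.append((len(bases) - 1, 1.0))
    return bases, encoded

def decode_rows(bases, encoded):
    """Rebuild the full matrix from base rows and (index, scalar) pairs."""
    return [[s * b for b in bases[idx]] for idx, s in encoded]

def delta_encode(values):
    """Store the first value, then the difference between successive values."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Invert delta_encode() with a running sum."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out

# Hypothetical weight matrix: rows 1 and 3 are multiples of row 0.
matrix = [
    [1.0, 2.0, 3.0],
    [2.0, 4.0, 6.0],
    [0.0, 1.0, 0.0],
    [-0.5, -1.0, -1.5],
]
bases, encoded = encode_rows(matrix)

# Hypothetical integer-quantized weights for the delta step.
quantized = [100, 102, 101, 105, 105]
deltas = delta_encode(quantized)
```

With this sample, four rows reduce to two stored base rows plus four (index, scalar) pairs. Delta encoding pays off when neighboring values are close, since the differences are small and cheap to store.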
Quantised networks; Sparse networks; Compressed networks · CPC title
Distributed learning, e.g. federated learning · CPC title
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title