Low rank matrix compression

US11620766B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11620766-B2
Application number: US-202117344639-A
Country: US
Kind code: B2
Filing date: Jun 10, 2021
Priority date: Apr 8, 2017
Publication date: Apr 4, 2023
Grant date: Apr 4, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In an example, an apparatus comprises logic, at least partially including hardware logic, to implement a lossy compression algorithm which utilizes a data transform and quantization process to compress data in a convolutional neural network (CNN) layer.
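
The abstract names a data transform followed by quantization but does not say which transform or quantizer is used. As a minimal sketch only, the example below assumes an orthonormal discrete cosine transform (DCT) and a uniform quantizer with a hypothetical step size of 0.05; the function names and the sample weights are illustrative and do not come from the patent.

```python
import math


def dct(values):
    """Orthonormal DCT-II of a 1-D list of floats (assumed transform)."""
    n = len(values)
    out = []
    for k in range(n):
        total = sum(
            x * math.cos(math.pi * (i + 0.5) * k / n)
            for i, x in enumerate(values)
        )
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * total)
    return out


def idct(coeffs):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        total += sum(
            math.sqrt(2.0 / n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
            if k > 0
        )
        out.append(total)
    return out


def compress(weights, step=0.05):
    """Transform the weights, then round each coefficient to a multiple of step."""
    return [round(c / step) for c in dct(weights)]


def decompress(codes, step=0.05):
    """Scale the integer codes back up and apply the inverse transform."""
    return idct([q * step for q in codes])


# Hypothetical weights from one small slice of a layer.
weights = [0.50, 0.52, 0.49, 0.51, 0.48, 0.50, 0.53, 0.47]
codes = compress(weights)
restored = decompress(codes)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(codes)
print(max_error)
```

The compression is lossy because rounding discards the fractional part of each scaled coefficient. A larger step gives smaller integer codes and a larger reconstruction error.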

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus, comprising: a hardware processor to: apply a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; apply a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterize one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determine a scalar associated with each of the one or more independent rows of the matrix; encode a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; apply delta compression to compress the encoded weight data; store the encoded weight data in the shared memory; and load the matrix into the neural network using hardware when the rank is beneath a threshold.

2. The apparatus of claim 1, the hardware processor to: compress at least a portion of the encoded weight data in a frequency domain.

3. The apparatus of claim 2, the hardware processor to: quantize the at least a portion of the encoded weight data in the frequency domain.

4. The apparatus of claim 2, the hardware processor to: compress the at least a portion of the encoded weight data via K-means compression.

5. The apparatus of claim 2, the hardware processor to: apply an inversed transform to the neural network layer.

6. The apparatus of claim 1, further comprising: an instruction cache to receive a stream of instructions; an instruction unit to execute the stream of instructions; a general-purpose graphics processing compute block comprising a plurality of graphics processing cores; and a shared memory communicatively coupled to the plurality of graphics processing cores.

7. A method, comprising: applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determining a scalar associated with each of the one or more independent rows of the matrix; encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; implementing a delta compression algorithm to compress the encoded weight data; storing the encoded weight data in the shared memory; and loading the matrix into the neural network using hardware when the rank is beneath a threshold.

8. The method of claim 7, further comprising: compressing at least a portion of the encoded weight data in a frequency domain.

9. The method of claim 8, further comprising: quantizing the at least a portion of the encoded weight data in the frequency domain.

10. The method of claim 8, further comprising: compressing the at least a portion of the encoded weight data via K-means compression.

11. The method of claim 8, further comprising: applying an inversed transform to the neural network layer.

12. A non-transitory computer-readable medium comprising instructions which, when executed by a processor, configure the processor to perform operations, comprising: applying a matrix interpolation operation to one or more linearly dependent rows of a matrix comprising weights of a neural network; applying a singular value decomposition algorithm to convert one or more weights of one or more linearly dependent rows of the matrix to a low rank; characterizing one or more rows of the matrix comprising weights of a neural network for which a rank of the one or more rows of the matrix is less than a threshold value as independent rows of the matrix; determining a scalar associated with each of the one or more independent rows of the matrix; encoding a plurality of the one or more independent rows with the scalar associated with the row to generate encoded weight data; implementing a delta compression algorithm to compress the encoded weight data; storing the encoded weight data in the shared memory; and loading the matrix into the neural network using hardware when the rank is beneath a threshold.

13. The non-transitory computer-readable medium of claim 12, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: compressing at least a portion of the encoded weight data in a frequency domain.

14. The non-transitory computer-readable medium of claim 13, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: quantizing the at least a portion of the encoded weight data in the frequency domain.

15. The non-transitory computer-readable medium of claim 13, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: compressing the at least a portion of the encoded weight data via K-means compression.

16. The non-transitory computer-readable medium of claim 13, further comprising instructions which, when executed by the processor, configure the processor to perform operations, comprising: applying an inversed transform to the neural network layer.
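
The independent claims combine a low-rank representation, a per-row scalar, and delta compression. The sketch below is one possible reading of how those pieces could fit together, not the patented implementation. It assumes a weight matrix that is exactly rank 1 (every row is a scalar multiple of one shared row), uses power iteration as a stand-in for a full singular value decomposition, and uses a hypothetical quantization step of 0.01. All function names and sample values are illustrative.

```python
import math


def leading_singular_triplet(matrix, iterations=50):
    """Power iteration for the largest singular value and its two vectors.

    Assumes the matrix is nonzero and the starting vector is not
    orthogonal to the leading right singular vector.
    """
    rows, cols = len(matrix), len(matrix[0])
    v = [1.0 / math.sqrt(cols)] * cols
    sigma, u = 0.0, [0.0] * rows
    for _ in range(iterations):
        u = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm_u = math.sqrt(sum(x * x for x in u))
        u = [x / norm_u for x in u]
        v = [sum(matrix[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        sigma = math.sqrt(sum(x * x for x in v))
        v = [x / sigma for x in v]
    return sigma, u, v


def delta_encode(values):
    """Keep the first value, then store each difference from its predecessor."""
    return values[:1] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Rebuild the original sequence with a running sum."""
    out, running = [], 0
    for d in deltas:
        running += d
        out.append(running)
    return out


# Hypothetical rank-1 weight matrix: row i is (i + 1) times the first row.
weights = [[1.0, 2.0, 3.0, 4.0],
           [2.0, 4.0, 6.0, 8.0],
           [3.0, 6.0, 9.0, 12.0],
           [4.0, 8.0, 12.0, 16.0]]

step = 0.01  # assumed quantization step for the per-row scalars
sigma, u, base_row = leading_singular_triplet(weights)
row_scalars = [round(sigma * x / step) for x in u]  # one integer per row
encoded = delta_encode(row_scalars)

# Decode: undo the delta coding, rescale, and multiply by the shared row.
decoded_scalars = [q * step for q in delta_decode(encoded)]
restored = [[s * b for b in base_row] for s in decoded_scalars]
max_error = max(
    abs(restored[i][j] - weights[i][j])
    for i in range(len(weights))
    for j in range(len(weights[0]))
)
print(encoded)
print(max_error)
```

In this reading, the 16 original weights are replaced by one shared row of 4 values plus 4 delta-coded integers. Real layers are rarely exactly rank 1, so a practical scheme would keep more singular components and accept some approximation error.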

Assignees

Intel Corp

Inventors

Classifications

CPC titles assigned to this publication.

  • Quantised networks; Sparse networks; Compressed networks

  • Distributed learning, e.g. federated learning

  • characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11620766B2 cover?
In an example, an apparatus comprises logic, at least partially including hardware logic, to implement a lossy compression algorithm which utilizes a data transform and quantization process to compress data in a convolutional neural network (CNN) layer.
Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06T9/002. Mapped technology areas include Physics.
When was this patent published?
Publication date Apr 4, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).