Vector quantization decoding hardware unit for real-time dynamic decompression for parameters of neural networks

US11880759B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11880759-B2
Application number: US-202318172979-A
Country: US
Kind code: B2
Filing date: Feb 22, 2023
Priority date: Feb 18, 2020
Publication date: Jan 23, 2024
Grant date: Jan 23, 2024

Abstract

Official abstract text for this publication.

Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of a convolutional neural network.
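The decompression step in the abstract can be illustrated in software. The following is a minimal sketch, assuming the lookup table is a plain list of equal-length code vectors and the encoded stream is a list of integer indices. The class and method names (`LookupTableDecoder`, `store_codebook`, `decode`) are hypothetical and do not come from the patent, which describes a hardware unit rather than Python code.

```python
from typing import List, Sequence


class LookupTableDecoder:
    """Toy software model of a codebook lookup (all names are hypothetical)."""

    def __init__(self) -> None:
        # Assumption: the lookup table is modeled as a list of code vectors.
        self.lookup_table: List[List[float]] = []

    def store_codebook(self, codebook: Sequence[Sequence[float]]) -> None:
        """Copy the code vectors into the lookup table."""
        if not codebook:
            raise ValueError("codebook must contain at least one code vector")
        width = len(codebook[0])
        if width == 0 or any(len(vec) != width for vec in codebook):
            raise ValueError("code vectors must be non-empty and equal length")
        self.lookup_table = [list(vec) for vec in codebook]

    def decode(self, index_stream: Sequence[int]) -> List[float]:
        """Replace each index with its code vector and concatenate the results."""
        decompressed: List[float] = []
        for index in index_stream:
            if not 0 <= index < len(self.lookup_table):
                raise IndexError(f"index {index} is outside the codebook")
            decompressed.extend(self.lookup_table[index])
        return decompressed
```

In this sketch each index costs one table read, which is why a lookup of this kind can plausibly keep pace with streaming kernel data; the actual hardware timing is not modeled here.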

First claim

Opening claim text (preview).

The invention claimed is:
1. A method, comprising: training a convolutional neural network with a machine learning process; generating, with the machine learning process, kernel data for a convolutional layer of the convolutional neural network; generating encoded kernel data including index data and codebook data by performing a vector quantization process on the kernel data with an encoder external to the convolutional neural network; providing, during operation of the convolutional neural network after the machine learning process, the encoded kernel data to a decompression unit of the convolutional neural network; storing a vector quantization codebook in a lookup table of the decompression unit; generating decompressed kernel data with the decompression unit by retrieving code vectors from the lookup table with the index data; and providing the decompressed kernel data to the convolutional layer.
2. The method of claim 1, further comprising: receiving feature data at a convolutional accelerator of the convolutional neural network; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
3. The method of claim 2, further comprising generating prediction data with the convolutional neural network based on the feature data and the decompressed kernel data.
4. The method of claim 3, wherein the feature data is generated from image data from an image sensor.
5. The method of claim 1, wherein the decompression unit includes an index stream buffer, wherein the encoded kernel data includes index data for retrieving code vectors from the lookup table.
6. The method of claim 4, comprising: receiving, with the decompression unit, the codebook data, wherein the encoded kernel data includes the codebook data; storing, with the decompression unit, the codebook data in the lookup table; receiving, with the decompression unit, the index data; and retrieving, with the decompression unit, the code vectors from the codebook data with the index data.
7. The method of claim 1, further comprising storing, with the decompression unit, codebook data for multiple convolution accelerators simultaneously.
8. The method of claim 3, wherein the feature data is image data from an image sensor.
9. The method of claim 8, wherein the prediction data identifies features in the image data.
10. The method of claim 1, comprising implementing the convolutional neural network with multiple convolutional accelerators defining multiple convolution layers of the convolutional neural network.
11. A convolutional neural network processing system, comprising: an input layer configured to receive input data; a decompressor unit configured to receive encoded kernel data encoded with a vector quantization process and to generate decompressed kernel data based on the encoded kernel data, wherein the decompressor unit includes a lookup table configured to store codebook data associated with the encoded kernel data; a first convolutional accelerator and a second convolutional accelerator each configured to receive the decompressed kernel data, to receive feature data based on the input data, and to perform a convolution operations on the feature data and the decompressed kernel data, wherein the first convolutional accelerator defines a first convolutional layer of the convolutional neural network, wherein the second convolutional accelerator defines a second convolutional layer of the convolutional neural network, wherein the first and second convolutional layers are trained with a machine learning process that generates kernel data, wherein the encoded kernel data is generated from the kernel data; and a fully connected layer configured to receive convolved data from the convolutional accelerator and to generate prediction data based on the convolved data.
12. The system of claim 11, wherein the decompressor unit includes an index stream buffer, wherein the encoded kernel data includes index data for retrieving code vectors from the lookup table.
13. The system of claim 12, wherein the decompressor unit generates the decompressed kernel data by retrieving code vectors from the lookup table based on the index data.
14. The system of claim 13, wherein the encoded kernel data includes the codebook data, wherein the decompressor unit is configured to receive the codebook data, store the codebook data in the lookup table, receive the index data, and retrieve the code vectors from the codebook data with the index data.
15. The system of claim 12, wherein the codebook data and the index data are generated with the vector quantization process.
16. The system of claim 11, further comprising multiple convolution accelerators, wherein the decompressor unit is configured to store codebook data for the multiple convolution accelerators simultaneously.
17. A method, comprising: training a convolutional neural network with a machine learning process; generating, with the machine learning process, kernel data for a convolutional layer of the convolutional neural network; generating encoded kernel data including index data and codebook data by performing a vector quantization process on the kernel data with an encoder external to the convolutional neural network; receiving the encoded kernel data with a decompression unit of the convolutional neural network, wherein the encoded kernel data includes index data for a vector quantization codebook; storing the vector quantization codebook in a lookup table of the decompression unit; generating decompressed kernel data with the decompression unit by retrieving code vectors from the lookup table with the index data; receiving feature data at a convolutional accelerator of the convolutional neural network; receiving the decompressed kernel data with the convolution accelerator from the decompression unit; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
18. The method of claim 17, wherein the convolutional neural network is implemented on a system on chip, wherein receiving the encoded kernel data includes receiving the encoded kernel data from a source external to the system on chip.
19. The method of claim 17 wherein the encoded kernel data includes second codebook data for a second convolutional accelerator of the convolutional neural network, the method comprising: storing the second vector quantization codebook data in the lookup table of the decompression unit; receiving second index data in the encoded kernel data; generating second decompressed kernel data by retrieving code vectors from the second vector quantization codebook with the second index data; and providing the second decompressed kernel data to the second convolutional accelerator.
20. The method of claim 17, further comprising: receiving feature data at the convolutional accelerator; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
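The flow recited in claims 1 and 17 (encode kernel data externally, decode by codebook lookup, then convolve with feature data) can be sketched end to end. This is an illustrative sketch under stated assumptions, not the patented implementation: the codebook is taken as already given (the claims do not say how it is built), nearest-vector search uses squared Euclidean distance, and the convolution is a one-dimensional sliding dot product standing in for a convolutional accelerator. All function names are hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def vq_encode(kernel: Sequence[float],
              codebook: Sequence[Sequence[float]]) -> List[int]:
    """Encoder side: map each kernel sub-vector to its nearest code vector.

    Assumption: the codebook already exists; building it is out of scope here.
    """
    dim = len(codebook[0])
    if len(kernel) % dim != 0:
        raise ValueError("kernel length must be a multiple of the vector size")
    indices: List[int] = []
    for start in range(0, len(kernel), dim):
        sub_vector = kernel[start:start + dim]
        nearest = min(range(len(codebook)),
                      key=lambda i: squared_distance(sub_vector, codebook[i]))
        indices.append(nearest)
    return indices


def vq_decode(indices: Sequence[int],
              codebook: Sequence[Sequence[float]]) -> List[float]:
    """Decoder side: retrieve code vectors by index and concatenate them."""
    decoded: List[float] = []
    for index in indices:
        decoded.extend(codebook[index])
    return decoded


def convolve_1d(features: Sequence[float],
                kernel: Sequence[float]) -> List[float]:
    """Sliding dot product over positions where the kernel fully overlaps."""
    span = len(kernel)
    return [sum(features[pos + k] * kernel[k] for k in range(span))
            for pos in range(len(features) - span + 1)]


# Example values chosen for illustration only.
codebook = [[0.0, 0.0], [1.0, -1.0], [0.5, 0.5]]
trained_kernel = [0.9, -1.1, 0.1, -0.1, 0.4, 0.6]

index_data = vq_encode(trained_kernel, codebook)      # sent with the codebook
decompressed_kernel = vq_decode(index_data, codebook)  # rebuilt at run time
convolved = convolve_1d([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                        decompressed_kernel)
```

The decoded kernel is an approximation of the trained kernel, not a copy: vector quantization is lossy, so each sub-vector is replaced by the closest entry the codebook offers.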

Assignees

St Microelectronics Srl, St Microelectronics Int Nv

Inventors

Classifications

  • Supervised learning · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • G06N3/045 (Primary): Combinations of networks · CPC title

  • Tablespace storage structures; Management thereof · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11880759B2 cover?
Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of a convolutional neural network.
Who is the assignee on this patent?
St Microelectronics Srl, St Microelectronics Int Nv
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 23, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).