Device and method to process data in parallel
US-2017011006-A1 · Jan 12, 2017 · US
US11880759B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11880759-B2 |
| Application number | US-202318172979-A |
| Country | US |
| Kind code | B2 |
| Filing date | Feb 22, 2023 |
| Priority date | Feb 18, 2020 |
| Publication date | Jan 23, 2024 |
| Grant date | Jan 23, 2024 |
Official abstract text for this publication.
Embodiments of an electronic device include an integrated circuit, a reconfigurable stream switch formed in the integrated circuit along with a plurality of convolution accelerators and a decompression unit coupled to the reconfigurable stream switch. The decompression unit decompresses encoded kernel data in real time during operation of convolutional neural network.
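The decompression step described in the abstract can be pictured as a table lookup: each index in the encoded stream selects one code vector from a stored codebook, and the selected vectors are concatenated to rebuild the kernel. The following is a minimal software sketch of that idea only. The function name `decompress_kernel`, the code vector length, and the sample values are hypothetical and do not come from the patent, which describes a hardware unit rather than Python code.

```python
from typing import List, Sequence


def decompress_kernel(
    codebook: Sequence[Sequence[float]], indices: Sequence[int]
) -> List[float]:
    """Rebuild a flat kernel by concatenating the code vector chosen by each index."""
    kernel: List[float] = []
    for idx in indices:
        # One lookup-table read per index; raises IndexError on an invalid index.
        kernel.extend(codebook[idx])
    return kernel


# Hypothetical codebook: four code vectors of length 2 (values are illustrative).
codebook = [[0.0, 0.0], [0.5, -0.5], [1.0, 1.0], [-1.0, 0.25]]

# Hypothetical index stream: four indices stand in for eight kernel weights.
indices = [2, 0, 1, 1]

kernel = decompress_kernel(codebook, indices)
print(kernel)
```

In this toy case four small indices replace eight weights, which is where the storage and bandwidth saving would come from; the codebook itself still has to be delivered and stored once.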
Claim text.
The invention claimed is:
1. A method, comprising: training a convolutional neural network with a machine learning process; generating, with the machine learning process, kernel data for a convolutional layer of the convolutional neural network; generating encoded kernel data including index data and codebook data by performing a vector quantization process on the kernel data with an encoder external to the convolutional neural network; providing, during operation of the convolutional neural network after the machine learning process, the encoded kernel data to a decompression unit of the convolutional neural network; storing a vector quantization codebook in a lookup table of the decompression unit; generating decompressed kernel data with the decompression unit by retrieving code vectors from the lookup table with the index data; and providing the decompressed kernel data to the convolutional layer.
2. The method of claim 1, further comprising: receiving feature data at a convolutional accelerator of the convolutional neural network; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
3. The method of claim 2, further comprising generating prediction data with the convolutional neural network based on the feature data and the decompressed kernel data.
4. The method of claim 3, wherein the feature data is generated from image data from an image sensor.
5. The method of claim 1, wherein the decompression unit includes an index stream buffer, wherein the encoded kernel data includes index data for retrieving code vectors from the lookup table.
6. The method of claim 4, comprising: receiving, with the decompression unit, the codebook data, wherein the encoded kernel data includes the codebook data; storing, with the decompression unit, the codebook data in the lookup table; receiving, with the decompression unit, the index data; and retrieving, with the decompression unit, the code vectors from the codebook data with the index data.
7. The method of claim 1, further comprising storing, with the decompression unit, codebook data for multiple convolution accelerators simultaneously.
8. The method of claim 3, wherein the feature data is image data from an image sensor.
9. The method of claim 8, wherein the prediction data identifies features in the image data.
10. The method of claim 1, comprising implementing the convolutional neural network with multiple convolutional accelerators defining multiple convolution layers of the convolutional neural network.
11. A convolutional neural network processing system, comprising: an input layer configured to receive input data; a decompressor unit configured to receive encoded kernel data encoded with a vector quantization process and to generate decompressed kernel data based on the encoded kernel data, wherein the decompressor unit includes a lookup table configured to store codebook data associated with the encoded kernel data; a first convolutional accelerator and a second convolutional accelerator each configured to receive the decompressed kernel data, to receive feature data based on the input data, and to perform a convolution operations on the feature data and the decompressed kernel data, wherein the first convolutional accelerator defines a first convolutional layer of the convolutional neural network, wherein the second convolutional accelerator defines a second convolutional layer of the convolutional neural network, wherein the first and second convolutional layers are trained with a machine learning process that generates kernel data, wherein the encoded kernel data is generated from the kernel data; and a fully connected layer configured to receive convolved data from the convolutional accelerator and to generate prediction data based on the convolved data.
12. The system of claim 11, wherein the decompressor unit includes an index stream buffer, wherein the encoded kernel data includes index data for retrieving code vectors from the lookup table.
13. The system of claim 12, wherein the decompressor unit generates the decompressed kernel data by retrieving code vectors from the lookup table based on the index data.
14. The system of claim 13, wherein the encoded kernel data includes the codebook data, wherein the decompressor unit is configured to receive the codebook data, store the codebook data in the lookup table, receive the index data, and retrieve the code vectors from the codebook data with the index data.
15. The system of claim 12, wherein the codebook data and the index data are generated with the vector quantization process.
16. The system of claim 11, further comprising multiple convolution accelerators, wherein the decompressor unit is configured to store codebook data for the multiple convolution accelerators simultaneously.
17. A method, comprising: training a convolutional neural network with a machine learning process; generating, with the machine learning process, kernel data for a convolutional layer of the convolutional neural network; generating encoded kernel data including index data and codebook data by performing a vector quantization process on the kernel data with an encoder external to the convolutional neural network; receiving the encoded kernel data with a decompression unit of the convolutional neural network, wherein the encoded kernel data includes index data for a vector quantization codebook; storing the vector quantization codebook in a lookup table of the decompression unit; generating decompressed kernel data with the decompression unit by retrieving code vectors from the lookup table with the index data; receiving feature data at a convolutional accelerator of the convolutional neural network; receiving the decompressed kernel data with the convolution accelerator from the decompression unit; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
18. The method of claim 17, wherein the convolutional neural network is implemented on a system on chip, wherein receiving the encoded kernel data includes receiving the encoded kernel data from a source external to the system on chip.
19. The method of claim 17 wherein the encoded kernel data includes second codebook data for a second convolutional accelerator of the convolutional neural network, the method comprising: storing the second vector quantization codebook data in the lookup table of the decompression unit; receiving second index data in the encoded kernel data; generating second decompressed kernel data by retrieving code vectors from the second vector quantization codebook with the second index data; and providing the second decompressed kernel data to the second convolutional accelerator.
20. The method of claim 17, further comprising: receiving feature data at the convolutional accelerator; and performing convolution operations on the decompressed kernel data and the feature data with the convolutional accelerator.
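The sequence recited in the independent method claims (encode trained kernel data offline, store the codebook in a lookup table, rebuild the kernel from indices, then convolve with feature data) can be sketched in software as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation. The claims do not name a particular vector quantization algorithm, so Lloyd's k-means iterations are used here as an assumption, and all function names (`vq_encode`, `vq_decode`, `conv1d_valid`), the 1-D data layout, and the sample values are hypothetical.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(codebook: List[Vector], v: Vector) -> int:
    """Index of the code vector closest to v (first one wins on ties)."""
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i], v))


def vq_encode(
    kernel: List[float], dim: int, k: int, iters: int = 10
) -> Tuple[List[Vector], List[int]]:
    """Assumed offline encoder: Lloyd's (k-means) iterations over kernel sub-vectors.

    Assumes len(kernel) is a multiple of dim.
    """
    vectors = [tuple(kernel[i:i + dim]) for i in range(0, len(kernel), dim)]
    # Seed the codebook with the first k distinct sub-vectors (a simple, assumed choice).
    codebook = list(dict.fromkeys(vectors))[:k]
    for _ in range(iters):
        indices = [nearest(codebook, v) for v in vectors]
        for c in range(len(codebook)):
            members = [v for v, i in zip(vectors, indices) if i == c]
            if members:  # leave an unused code vector unchanged
                codebook[c] = tuple(sum(col) / len(members) for col in zip(*members))
    indices = [nearest(codebook, v) for v in vectors]
    return codebook, indices


def vq_decode(codebook: List[Vector], indices: List[int]) -> List[float]:
    """Decompression: look up each index and concatenate the code vectors."""
    return [x for i in indices for x in codebook[i]]


def conv1d_valid(features: List[float], kernel: List[float]) -> List[float]:
    """Sliding dot product without kernel flip or padding (CNN-style convolution)."""
    n = len(features) - len(kernel) + 1
    return [
        sum(features[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(n)
    ]


# Hypothetical "trained" kernel with only two distinct sub-vectors.
trained_kernel = [1.0, 2.0, 1.0, 2.0, -1.0, 0.0, -1.0, 0.0]

codebook, indices = vq_encode(trained_kernel, dim=2, k=2)  # encoder side
restored = vq_decode(codebook, indices)                    # decompression side
features = [1.0] * 10                                      # stand-in feature data
out = conv1d_valid(features, restored)
print(indices, out)
```

Because this toy kernel contains only two distinct sub-vectors, a two-entry codebook reproduces it exactly. In general vector quantization is lossy, and the codebook size and code vector length trade accuracy against compressed size.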
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
Combinations of networks · CPC title
Tablespace storage structures; Management thereof · CPC title