US11900253B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11900253-B2 |
| Application number | US-202218050939-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 28, 2022 |
| Priority date | Dec 20, 2018 |
| Publication date | Feb 13, 2024 |
| Grant date | Feb 13, 2024 |
Official abstract text for this publication.
Systems, apparatuses, and methods for converting data to a tiling format when implementing convolutional neural networks are disclosed. A system includes at least a memory, a cache, a processor, and a plurality of compute units. The memory stores a first buffer and a second buffer in a linear format, where the first buffer stores convolutional filter data and the second buffer stores image data. The processor converts the first and second buffers from the linear format to third and fourth buffers, respectively, in a tiling format. The plurality of compute units load the tiling-formatted data from the third and fourth buffers in memory to the cache and then perform a convolutional filter operation on the tiling-formatted data. The system generates a classification of a first dataset based on a result of the convolutional filter operation.
Claim text.
What is claimed is:

1. An apparatus comprising: a processor comprising circuitry, wherein in response to a request to perform a convolutional filter operation, the processor is configured to: read convolutional data stored in a linear format from sequential locations in a memory device; and convert the convolutional data from the linear format to a tiling format by writing the read convolutional data to memory locations according to a stride greater than one.
2. The apparatus as recited in claim 1, wherein the stride is based on a number of input channels and a number of convolutional filters.
3. The apparatus as recited in claim 1, wherein the stride is equal to a sum of a number of input channels and a number of convolutional filters.
4. The apparatus as recited in claim 1, wherein the stride is equal to a number of pixel channels.
5. The apparatus as recited in claim 1, wherein the convolutional data comprises a plurality of convolutional filters.
6. The apparatus as recited in claim 5, wherein each convolutional filter of the plurality of convolutional filters has three rows and three columns.
7. The apparatus as recited in claim 1, further comprising a second processor configured to generate a classification of a first dataset based on the data in the tiling format.
8. A method comprising: receiving a request to perform a convolutional filter operation; and in response to the request: reading convolutional data stored in a linear format from sequential locations in a memory device; and converting the convolutional data from the linear format to a tiling format by writing the read convolutional data to memory locations according to a stride greater than one.
9. The method as recited in claim 8, wherein the stride is based on a number of input channels and a number of convolutional filters.
10. The method as recited in claim 8, wherein the stride is equal to a sum of a number of input channels and a number of convolutional filters.
11. The method as recited in claim 8, wherein the stride is equal to a number of pixel channels.
12. The method as recited in claim 8, wherein the convolutional data comprises a plurality of convolutional filters.
13. The method as recited in claim 12, wherein each convolutional filter of the plurality of convolutional filters has three rows and three columns.
14. The method as recited in claim 8, further comprising a second processor configured to generate a classification of a first dataset based on the data in the tiling format.
15. A non-transitory storage medium comprising program instructions, wherein the program instructions are executable to: receive a request to perform a convolutional filter operation; and in response to the request: read convolutional data stored in a linear format from sequential locations in a memory device; and convert the convolutional data from the linear format to a tiling format by writing the read convolutional data to memory locations according to a stride greater than one.
16. The non-transitory storage medium as recited in claim 15, wherein the stride is based on a number of input channels and a number of convolutional filters.
17. The non-transitory storage medium as recited in claim 15, wherein the stride is equal to a sum of a number of input channels and a number of convolutional filters.
18. The non-transitory storage medium as recited in claim 15, wherein the stride is equal to a number of pixel channels.
19. The non-transitory storage medium as recited in claim 15, wherein the convolutional data comprises a plurality of convolutional filters.
20. The non-transitory storage medium as recited in claim 19, wherein each convolutional filter of the plurality of convolutional filters has three rows and three columns.
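The independent claims describe reading data from sequential locations and writing it out at a stride greater than one. The following Python sketch is one possible illustration of that access pattern, not the patented implementation. The function name `linear_to_tiled`, the flat-list buffer, and the wrap-around rule that selects the next starting offset are all assumptions made for this example; the claims do not specify them.

```python
def linear_to_tiled(linear, stride):
    """Illustrative only: sequential reads, strided writes.

    `linear` is a flat list standing in for a buffer in linear format.
    `stride` is the distance between consecutive writes (hypothetical
    parameter; the claims only require it to be greater than one).
    """
    if stride <= 1:
        raise ValueError("stride must be greater than one")
    if len(linear) % stride != 0:
        raise ValueError("buffer length must be a multiple of stride")

    # Number of strided writes that fit before wrapping to the next offset.
    run = len(linear) // stride
    tiled = [None] * len(linear)

    for src, value in enumerate(linear):    # read sequential locations
        group, pos = divmod(src, run)       # which pass, and position within it
        tiled[pos * stride + group] = value  # write `stride` elements apart
    return tiled


# Twelve elements, written four locations apart.
print(linear_to_tiled(list(range(12)), 4))
# prints [0, 3, 6, 9, 1, 4, 7, 10, 2, 5, 8, 11]
```

Under these assumptions the conversion is equivalent to transposing a matrix with `stride` rows: elements that were `run` locations apart in the linear buffer become neighbours in the output. How the stride is chosen (for example from the number of input channels and filters, as in the dependent claims) is left open here.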
Convolutional networks [CNN, ConvNet] · CPC title
Learning methods · CPC title
Validation; Performance evaluation; Active pattern learning techniques · CPC title
Architecture, e.g. interconnection topology · CPC title
Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters · CPC title