Memory storage format for supporting machine learning acceleration

US12165237B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12165237-B2
Application number  US-202217946753-A
Country             US
Kind code           B2
Filing date         Sep 16, 2022
Priority date       Sep 16, 2022
Publication date    Dec 10, 2024
Grant date          Dec 10, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device is described. The method includes receiving an image in a first layer storage format of a neural network. The method also includes assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format. The method further includes storing the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels. The method also includes accelerating inference video processing of the image according to the assigned addresses for the image pixels corresponding to the blocked ML storage acceleration format.
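The address-assignment step in the abstract can be pictured with a small sketch. This is only one possible reading, not the patented method: the function names `blocked_address` and `store_blocked`, the block dimensions `block_w` and `block_h`, and the choice to store each channel of a block as a contiguous plane are all assumptions made for illustration. The abstract itself does not give an address formula.

```python
def blocked_address(x, y, c, width, block_w=4, block_h=2, channels=3):
    """Return a linear element offset for pixel (x, y), channel c.

    Hypothetical layout: the image is tiled into block_w x block_h blocks,
    blocks are numbered row by row, and each block holds all channels of
    its pixels, one channel plane after another.
    """
    blocks_per_row = -(-width // block_w)   # ceiling division; last block may be padded
    bx, by = x // block_w, y // block_h     # which block the pixel falls in
    ix, iy = x % block_w, y % block_h       # position inside that block
    block_elems = block_w * block_h * channels
    block_base = (by * blocks_per_row + bx) * block_elems
    return block_base + c * (block_w * block_h) + iy * block_w + ix


def store_blocked(image, block_w=4, block_h=2, channels=3):
    """Copy a row-major image (list of rows of channel tuples) into a flat
    buffer, placing every element at its assigned blocked address."""
    height, width = len(image), len(image[0])
    blocks_per_row = -(-width // block_w)
    blocks_per_col = -(-height // block_h)
    memory = [0] * (blocks_per_row * blocks_per_col * block_w * block_h * channels)
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            for c in range(channels):
                addr = blocked_address(x, y, c, width, block_w, block_h, channels)
                memory[addr] = pixel[c]
    return memory
```

Under these assumptions, all three channels of a small spatial neighbourhood sit next to each other in memory, which is the kind of locality a blocked format is meant to provide.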

First claim

Opening claim text (preview).

What is claimed is:

1. A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device, comprising: receiving an image in a first layer storage format of a neural network; assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format; splitting the image into a plurality of stripes according to an image width and an image height, in which a stripe height of each of the stripes is less than the image height; splitting each of the stripes into memory blocks having a memory block size according to a variable stride size to form the blocked ML storage acceleration format; storing the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels; and accelerating inference video processing of the image according to the assigned addresses for the image pixels of the image corresponding to the blocked ML storage acceleration format.

2. The method of claim 1, in which the assigning of addresses comprises: computing the assigned addresses to layout the image pixels within the memory blocks, in which each of the image pixels for each channel in the image are assigned to the memory blocks.

3. The method of claim 1, in which storing the image comprises arranging image pixels in the memory blocks according to a spatial axis or a channel axis of the memory blocks.

4. The method of claim 3, further comprising: storing the image pixels in the memory blocks in a spatial domain; and then storing the image pixels in a channel domain.

5. The method of claim 4, further comprising: storing an initial group of the image pixels in an initial memory block of the memory blocks in a first channel of the initial memory block; storing a next group of the image pixels in the initial memory block of the memory blocks in a second channel of the initial memory block; storing a subsequent group of the image pixels in the initial memory block of the memory blocks in a third channel of the initial memory block; and repeating storing for each memory block of the memory blocks and for each consecutive group of the image pixels.

6. The method of claim 3, further comprising: storing the image pixels in the memory blocks in a channel domain of the memory blocks; and then storing the image pixels in a spatial domain of the memory blocks.

7. The method of claim 6, further comprising: storing a selected image pixel in an initial memory block of the memory blocks in a first channel of the initial memory block; storing a next image pixel in the initial memory block of the memory blocks for a second channel of the initial memory block; storing a subsequent image pixel in the initial memory block of the memory blocks in a third channel of the initial memory block; and repeating the storing of the selected image pixel, the storing of the next image pixel, and the storing of the subsequent image pixel for each memory block of the memory blocks and for each consecutive selected, next, and subsequent ones of the image pixels.

8. The method of claim 1, in which accelerating inference video processing comprises simultaneously processing each of the three channels of the first layer storage format in the blocked ML storage acceleration format through matrix units of a neural signal processor (NSP) of the computing device.

9. The method of claim 1, in which a precision of the first layer storage format of the neural network comprises 16-bit floating point (FP16) or quantized eight-bit integer (INT8).

10. A non-transitory computer-readable medium having program code recorded thereon for a memory storage format to accelerate machine learning (ML) on a computing device, the program code being executed by a processor and comprising: program code to receive an image in a first layer storage format of a neural network; program code to assign addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storage acceleration format; program code to split the image into a plurality of stripes according to an image width and an image height, in which a stripe height of each of the stripes is less than the image height; program code to split each of the stripes into memory blocks having a memory block size according to a variable stride size to form the blocked ML storage acceleration format; program code to store the image pixels in the blocked ML storage acceleration format according to the assigned addresses of the image pixels; and program code to accelerate inference video processing of the image according to the assigned addresses for the image pixels of the image corresponding to the blocked ML storage acceleration format.

11. The non-transitory computer-readable medium of claim 10, in which the program code to assign addresses comprises: program code to compute the assigned addresses to layout the image pixels within the memory blocks, in which each of the image pixels for each channel in the image are assigned to the memory blocks.

12. The non-transitory computer-readable medium of claim 10, in which the program code to store the image pixels comprises program code to arrange the image pixels in the memory blocks according to a spatial axis or a channel axis of the memory blocks.

13. The non-transitory computer-readable medium of claim 12, further comprising: program code to store the image pixels in the memory blocks in a spatial domain; and then program code to store the image pixels in a channel domain.

14. The non-transitory computer-readable medium of claim 13, further comprising: program code to store an initial group of the image pixels in an initial memory block of the memory blocks in a first channel of the initial memory block; program code to store a next group of the image pixels in the initial memory block of the memory blocks in a second channel of the initial memory block; program code to store a subsequent group of the image pixels in the initial memory block of the memory blocks in a third channel of the initial memory block; and program code to repeat program code to store for each memory block of the memory blocks and for each consecutive group of the image pixels.

15. The non-transitory computer-readable medium of claim 12, further comprising: program code to store the image pixels in the memory blocks in a channel domain of the memory blocks; and then program code to store the image pixels in a spatial domain of the memory blocks.

16. The non-transitory computer-readable medium of claim 15, further comprising: program code to store a selected image pixel in an initial memory block of the memory blocks in a first channel of the initial memory block; program code to store a next image pixel in the initial memory block of the memory blocks for a second channel of the initial memory block; program code to store a subsequent image pixel in the initial memory block of the memory blocks in a third channel of the initial memory block; and program code to repeat the program code to store the selected image pixel, the program code to store the next image pixel, and the program code to store the subsequent image pixel for each memory block of the memory blocks and for each consecutive selected, next, and subsequent ones of the image pixels.

17. The non-transitory computer-readable medium of claim 10, in which the program code to accelerate inference video processing compr…
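The stripe and block splitting in claim 1 and the two storage orders in claims 4 to 7 can be sketched as follows. This is an illustrative interpretation under stated assumptions, not the implementation described in the patent: `split_into_blocks`, `pack_block`, the parameter names `stripe_h` and `stride`, and the order labels are hypothetical, and the "variable stride size" is simplified here to a single block width chosen by the caller.

```python
def split_into_blocks(width, height, stripe_h, stride):
    """Split an image into horizontal stripes, then split each stripe into
    blocks `stride` pixels wide. Returns a list of (x0, y0, w, h) tuples."""
    if stripe_h >= height:
        # Claim 1 requires the stripe height to be less than the image height.
        raise ValueError("stripe_h must be less than the image height")
    blocks = []
    for y0 in range(0, height, stripe_h):
        h = min(stripe_h, height - y0)      # last stripe may be shorter
        for x0 in range(0, width, stride):
            w = min(stride, width - x0)     # last block may be narrower
            blocks.append((x0, y0, w, h))
    return blocks


def pack_block(image, block, order, channels=3):
    """Flatten one block of a row-major image (rows of channel tuples).

    "spatial_then_channel": all first-channel values of the block, then all
    second-channel values, then the third (one reading of claims 4 and 5).
    "channel_then_spatial": the channels of one pixel, then the channels of
    the next pixel, and so on (one reading of claims 6 and 7).
    """
    x0, y0, w, h = block
    pixels = [image[y][x] for y in range(y0, y0 + h) for x in range(x0, x0 + w)]
    if order == "spatial_then_channel":
        return [p[c] for c in range(channels) for p in pixels]
    if order == "channel_then_spatial":
        return [p[c] for p in pixels for c in range(channels)]
    raise ValueError(f"unknown order: {order}")
```

For a two-pixel block with values (0, 100, 200) and (1, 101, 201), the first order yields 0, 1, 100, 101, 200, 201 and the second yields 0, 100, 200, 1, 101, 201. Which order suits a given accelerator depends on how its matrix units consume data, which the claims preview does not specify.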

Assignees

Qualcomm Inc

Inventors

Classifications

  • Allocation control and policies · CPC title

  • in block erasable memory, e.g. flash memory · CPC title

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • Artificial neural networks [ANN] · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12165237B2 cover?
A processor-implemented method for a memory storage format to accelerate machine learning (ML) on a computing device is described. The method includes receiving an image in a first layer storage format of a neural network. The method also includes assigning addresses to image pixels of each of three channels of the first layer storage format for accessing the image pixels in a blocked ML storag…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 10, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).