Convolution matrix multiply with callback for deep tiling for deep convolutional neural networks

US2016239706A1 · US · A1

Patent metadata
Publication number: US-2016239706-A1
Application number: US-201514845243-A
Country: US
Kind code: A1
Filing date: Sep 3, 2015
Priority date: Feb 13, 2015
Publication date: Aug 18, 2016
Grant date:

Abstract

Official abstract text for this publication.

A method of address translation of images and filters to virtual matrices to perform a convolution by matrix multiplication includes receiving an image and a filter. Each image and filter has a memory address. The method also includes mapping the memory addresses to virtual matrix addresses based on a calculated linearized image and a calculated linearized filter. The method further includes converting data in the virtual matrix to a predefined internal format. The method still further includes convolving the image by matrix multiplication of the data in the predefined internal format based on the virtual matrix addresses.
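The address translation described in the abstract can be illustrated with a minimal sketch. It assumes a single-channel image stored row-major in a flat list, stride 1, no padding, and the correlation-style convolution common in CNNs. The function names (`virtual_to_image_offset`, `conv_by_virtual_matmul`, `conv_direct`) are hypothetical and do not come from the patent, and the conversion to a "predefined internal format" is omitted. The point shown is that the linearized image matrix never has to be materialized: each virtual (row, column) address is translated to an image memory offset on demand.

```python
def virtual_to_image_offset(row, col, width, k):
    """Translate a virtual matrix address to a flat image offset.

    row selects an output pixel; col selects a position inside the
    k x k filter window (assumed layout, stride 1, no padding).
    """
    out_w = width - k + 1
    oy, ox = divmod(row, out_w)   # output pixel coordinates
    ky, kx = divmod(col, k)       # position inside the filter window
    return (oy + ky) * width + (ox + kx)


def conv_by_virtual_matmul(image, height, width, filt, k):
    """Multiply the virtual linearized image by the linearized filter."""
    rows = (height - k + 1) * (width - k + 1)
    out = []
    for row in range(rows):
        acc = 0
        for col in range(k * k):
            # The virtual matrix entry is read straight from image memory.
            acc += image[virtual_to_image_offset(row, col, width, k)] * filt[col]
        out.append(acc)
    return out


def conv_direct(image, height, width, filt, k):
    """Reference sliding-window convolution used only as a cross-check."""
    out = []
    for oy in range(height - k + 1):
        for ox in range(width - k + 1):
            acc = 0
            for ky in range(k):
                for kx in range(k):
                    acc += image[(oy + ky) * width + (ox + kx)] * filt[ky * k + kx]
            out.append(acc)
    return out


image = [1, 2, 3,
         4, 5, 6,
         7, 8, 9]          # 3 x 3 image, row-major
filt = [1, 0,
        0, 1]              # 2 x 2 filter, already linearized
result = conv_by_virtual_matmul(image, 3, 3, filt, 2)
```

In this sketch the virtual matrix has one row per output pixel and one column per filter tap, so the matrix product with the linearized filter gives the same values as the sliding-window loop.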

First claim

Opening claim text (preview).

What is claimed is:

1. A method of address translation of images and filters to virtual matrices to perform a convolution by matrix multiplication, comprising: receiving an image and a filter, each having a memory address; mapping the memory addresses to virtual matrix addresses based at least in part on a calculated linearized image and a calculated linearized filter; converting data in the virtual matrix to a predefined internal format; and convolving the image by matrix multiplication of the data in the predefined internal format based at least in part on the virtual matrix addresses.

2. The method of claim 1, further comprising declaring as completed a portion of the convolved image in a cache before completing the convolution.

3. The method of claim 2, further comprising: processing each portion of the convolved image from the cache by a plurality of layers of a DCN to create outputs for each portion; aggregating the outputs of each portion into an aggregated output; and processing the aggregated output by a plurality of remaining layers.

4. An apparatus for translating images and filters to virtual matrices to perform a convolution by matrix multiplication, the apparatus comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to receive an image and a filter, each having a memory address; to map the memory addresses to virtual matrix addresses based at least in part on a calculated linearized image and a calculated linearized filter; to convert data in the virtual matrix to a predefined internal format; and to convolve the image by matrix multiplication of the data in the predefined internal format based at least in part on the virtual matrix addresses.

5. The apparatus of claim 4, in which the at least one processor is further configured to declare as completed a portion of the convolved image in a cache before completing the convolution.

6. The apparatus of claim 5, in which the at least one processor is further configured: to process each portion of the convolved image from the cache by a plurality of layers of a DCN to create outputs for each portion; to aggregate the outputs of each portion into an aggregated output; and to process the aggregated output by a plurality of remaining layers.

7. A method of processing an input source by a deep convolutional network (DCN), comprising: processing one portion at a time of the input source by a plurality of layers of the DCN to create outputs for each portion; aggregating the outputs of each portion into an aggregated output; and processing the aggregated output by a plurality of remaining layers.

8. The method of claim 7, in which the portions comprise tiles.

9. The method of claim 7, in which the input source comprises an image.

10. The method of claim 7, further comprising storing the output for each portion in a cache memory.

11. The method of claim 7, further comprising selecting a size of each portion to fit within a predetermined memory size so that the output for each portion fits within the predetermined memory size.

12. An apparatus for processing an input source by a deep convolutional network (DCN), the apparatus comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to process one portion at a time of the input source by a plurality of layers of the DCN to create outputs for each portion; to aggregate the outputs of each portion into an aggregated output; and to process the aggregated output by a plurality of remaining layers.

13. The apparatus of claim 12, in which the portions comprise tiles.

14. The apparatus of claim 12, in which the input source comprises an image.

15. The apparatus of claim 12, further comprising storing the output for each portion in a cache memory.

16. The apparatus of claim 12, in which the at least one processor is further configured to select a size of each portion to fit within a predetermined memory size so that the output for each portion fits within the predetermined memory size.

17. A method of processing an input source by a deep convolutional network (DCN), comprising: receiving an image and a filter, each having a memory address; translating a portion of the image and a portion of the filter to virtual matrices; convolving the virtual matrices by matrix multiplication based at least in part on a virtual matrix address to generate a convolved image; and processing the convolved image by a plurality of layers of a DCN to create outputs for each portion.

18. The method of claim 17, further comprising: mapping the memory address to the virtual matrix address based at least in part on a calculated linearized image and a calculated linearized filter; converting data in the virtual matrix to a predefined internal format; and convolving the image and the filter by matrix multiplication of the data in the internal format based at least in part on the virtual matrix addresses.

19. The method of claim 17, further comprising: aggregating the outputs of each portion into an aggregated output; and processing the aggregated output by a plurality of remaining layers.

20. An apparatus for processing an input source by a deep convolutional network (DCN), the apparatus comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to receive an image and a filter, each having a memory address; to translate a portion of the image and a portion of the filter to virtual matrices; to convolve the virtual matrices by matrix multiplication based at least in part on a virtual matrix address to generate a convolved image; and to process the convolved image by a plurality of layers of a DCN to create outputs for each portion.

21. The apparatus of claim 20, in which the at least one processor is further configured: to map the memory address to the virtual matrix address based at least in part on a calculated linearized image and a calculated linearized filter; to convert data in the virtual matrix to a predefined internal format; and to convolve the image and the filter by matrix multiplication of the data in the internal format based at least in part on the virtual matrix addresses.

22. The apparatus of claim 20, in which the at least one processor is further configured: to aggregate the outputs of each portion into an aggregated output; and to process the aggregated output by a plurality of remaining layers.
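The tiling flow recited in claims 7 through 11 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the tiles are bands of image rows, the early layers are elementwise stand-ins (`double`, `relu`) so that no overlap between neighboring tiles is needed, and the remaining layer is a simple global sum. All function names and the byte-budget arithmetic are hypothetical. Real convolutional layers would generally need extra border rows around each tile.

```python
def tile_rows_for_budget(width, bytes_per_value, budget_bytes):
    """Pick the largest row count whose per-tile output fits the budget."""
    rows = budget_bytes // (width * bytes_per_value)
    if rows < 1:
        raise ValueError("budget too small for a single row")
    return rows


def run_tiled(image, early_layers, remaining_layers, rows_per_tile):
    """Run early layers one tile at a time, aggregate, then finish."""
    outputs = []
    for start in range(0, len(image), rows_per_tile):
        tile = image[start:start + rows_per_tile]
        for layer in early_layers:
            tile = layer(tile)
        outputs.append(tile)  # stands in for a per-portion cache entry
    # Aggregate the per-tile outputs back into one feature map.
    aggregated = [row for tile in outputs for row in tile]
    for layer in remaining_layers:
        aggregated = layer(aggregated)
    return aggregated


# Hypothetical elementwise layers used only for this sketch.
def double(rows):
    return [[2 * v for v in row] for row in rows]


def relu(rows):
    return [[max(0, v) for v in row] for row in rows]


def total(rows):
    return sum(sum(row) for row in rows)


image = [[1, -2],
         [3, -4],
         [-5, 6]]
rows_per_tile = tile_rows_for_budget(width=2, bytes_per_value=4, budget_bytes=16)
tiled = run_tiled(image, [double, relu], [total], rows_per_tile)
untiled = total(relu(double(image)))
```

Because the early layers here act on each value independently, the tiled and untiled results agree exactly. The tile size only changes how much intermediate data is held at once.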

Assignees

Qualcomm Inc

Inventors

Classifications

  • Preprocessing · CPC title

  • G06V10/454 · Primary

    Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN] · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016239706A1 cover?
A method of address translation of images and filters to virtual matrices to perform a convolution by matrix multiplication includes receiving an image and a filter. Each image and filter has a memory address. The method also includes mapping the memory addresses to virtual matrix addresses based on a calculated linearized image and a calculated linearized filter. The method further includes co…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06V10/454. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 18, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).