Systems and methods for handling padding regions in convolution operations

US11501147B1 · US · B1

Patent metadata
Field · Value
Publication number · US-11501147-B1
Application number · US-202016777606-A
Country · US
Kind code · B1
Filing date · Jan 30, 2020
Priority date · Jan 30, 2020
Publication date · Nov 15, 2022
Grant date · Nov 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A disclosed computer-implemented method may include maintaining, within a local memory device (LMD) included in a hardware accelerator (1) a filter matrix corresponding to a filter location included in each of a set of filters of a convolutional layer of an artificial neural network (ANN), and (2) a set of activation vectors corresponding to an active region of an activation volume input into the convolutional layer. The method may also include determining that the active region of the activation volume is contiguous with a padding region associated with at least a portion of the activation volume. The method may further include directing a matrix multiplication unit (MMU) included in the hardware accelerator to execute a matrix multiplication operation (MMO) using the filter matrix and an activation matrix that may include (1) the set of activation vectors, and (2) at least one padding vector corresponding to the padding region.
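The matrix multiplication described in the abstract can be illustrated with a minimal sketch. It assumes zero-valued padding, a filter matrix with one row per filter and one column per channel, and an activation matrix whose columns are activation or padding vectors. The helper names `matmul` and `build_activation_matrix` and all numeric values are hypothetical and are not taken from the patent; the real hardware data layout may differ.

```python
def matmul(a, b):
    """Multiply a (m x k) by b (k x n); both are lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def build_activation_matrix(active_vectors, pad_left, pad_right, pad_value=0):
    """Return a channels x positions matrix.

    Each column is either a padding vector or an activation vector
    from the active region (assumed layout, for illustration only).
    """
    channels = len(active_vectors[0])
    pad_vec = [pad_value] * channels
    columns = [pad_vec] * pad_left + list(active_vectors) + [pad_vec] * pad_right
    # Transpose so that each position becomes one column.
    return [[col[c] for col in columns] for c in range(channels)]


# Two filters, three channels: the weights of ONE filter location,
# gathered across every filter in the layer.
filter_matrix = [[1, 0, 2],
                 [0, 1, -1]]

# Active region: two positions, three channel values each.
active_vectors = [[1, 2, 3], [4, 5, 6]]

# The active region touches a padding region on its left edge,
# so one padding vector is placed ahead of the activation vectors.
activation_matrix = build_activation_matrix(active_vectors, pad_left=1, pad_right=0)

result = matmul(filter_matrix, activation_matrix)
print(result)
```

With zero padding, the first column of `result` is zero for every filter, which is why a padding position can also be handled by writing the padding value directly to the output instead of computing a dot product.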

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: maintaining, within a local memory device (LMD) included in a hardware accelerator: a filter matrix corresponding to a filter location included in each of a set of filters of a convolutional layer of an artificial neural network (ANN); and a set of activation vectors corresponding to an active region of an activation volume input into the convolutional layer; determining that the active region of the activation volume is contiguous with a padding region associated with at least a portion of the activation volume; and directing a matrix multiplication unit (MMU) included in the hardware accelerator to execute a matrix multiplication operation (MMO) using the filter matrix and an activation matrix comprising: the set of activation vectors; and at least one padding vector corresponding to the padding region.

2. The computer-implemented method of claim 1, wherein: the LMD comprises: a set of multiplier registers associated with the MMU; and a set of multiplicand registers associated with the MMU; maintaining the filter matrix within the LMD comprises loading, from a data store, the filter matrix to the set of multiplier registers; and maintaining the set of activation vectors within the LMD comprises loading, from the data store, the set of activation vectors to the set of multiplicand registers.

3. The computer-implemented method of claim 2, wherein directing the MMU to execute the MMO using the filter matrix and the activation matrix comprises directing the hardware accelerator to include a padding value in a multiplicand register included in the set of multiplicand registers corresponding to the padding region.

4. The computer-implemented method of claim 2, wherein: the hardware accelerator further comprises a set of output activation registers associated with the MMU; and directing the MMU to execute the MMO using the filter matrix and the activation matrix comprises: for each multiplicand register that includes an activation vector included in the active region of the activation volume: directing the MMU to execute a dot product operation using a filter vector included in the filter matrix and the activation vector; and storing a result of the dot product operation in the set of output activation registers; and for each multiplicand register that corresponds to the padding region, storing a padding value in the set of output activation registers.

5. The computer-implemented method of claim 1, wherein directing the MMU to execute the MMO using the filter matrix and the activation matrix comprises directing the MMU to execute the MMO using the filter matrix as a multiplier matrix and the activation matrix as a multiplicand matrix.

6. The computer-implemented method of claim 5, wherein: the filter matrix comprises a set of filter vectors corresponding to a filter location included in each of a set of filters of the convolutional layer of the artificial neural network; and each activation vector in the set of activation vectors comprises a set of channel values corresponding to a location within the activation volume; and the active region comprises at least a portion of a row of activation vectors included in the activation volume.

7. The computer-implemented method of claim 6, wherein: the multiplier matrix comprises: a multiplier matrix height dimension; and a multiplier matrix width dimension; and the multiplicand matrix comprises: a multiplicand matrix height dimension comprising the multiplier matrix width dimension; and a multiplicand matrix width dimension.

8. The computer-implemented method of claim 7, wherein: the activation matrix comprises a number of activation vectors no greater than the multiplier matrix height dimension; and each filter vector included in the set of filter vectors comprises a predetermined number of filter weight values, wherein: the predetermined number of filter weight values is at most the multiplier matrix width dimension; and each filter weight value included in the filter vector corresponds to a different channel included in a set of channels associated with each of the set of filters.

9. The computer-implemented method of claim 1, further comprising: replacing: the filter matrix with an additional filter matrix corresponding to an additional filter location; and at least one activation vector included in the set of activation vectors with an additional activation vector included in the activation volume; and directing the MMU to execute an additional MMO using the additional filter matrix and the activation matrix.

10. The computer-implemented method of claim 9, wherein: the hardware accelerator further comprises a set of output activation registers associated with the MMU; and directing the MMU to execute the MMO using the filter matrix and the activation matrix further comprises: generating a primary result matrix by directing the MMU to execute the MMO using the filter matrix as a multiplier matrix and the activation matrix as a multiplicand matrix; and storing the primary result matrix within the set of output activation registers.

11. The computer-implemented method of claim 10, wherein directing the MMU to execute the additional MMO using the additional filter matrix and the activation matrix further comprises: producing a secondary result matrix by directing the MMU to execute the additional MMO using the additional filter matrix as the multiplier matrix and the activation matrix as the multiplicand matrix; accumulating the secondary result matrix and the primary result matrix; and storing a result of accumulating the secondary result matrix and the primary result matrix within the set of output activation registers.

12. The computer-implemented method of claim 11, wherein the computer-implemented method further comprises determining, based on the result of accumulating the secondary result matrix and the primary result matrix, a set of output activation values for the convolutional layer of the ANN.

13. The computer-implemented method of claim 1, wherein directing the MMU to execute an MMO comprises directing the MMU to execute a generalized matrix multiplication (GEMM) operation.

14. The computer-implemented method of claim 1, wherein the activation volume comprises a digital image comprising: at least one row of activation values; at least one column of activation values; and at least one channel of activation values.

15. A system comprising: a hardware accelerator comprising: a matrix multiplication unit (MMU); and a local memory device (LMD); a maintaining module, stored in memory, that maintains, within the LMD: a filter matrix corresponding to a filter location included in each of a set of filters of a convolutional layer of an artificial neural network (ANN); and a set of activation vectors corresponding to an active region of an activation volume input into the convolutional layer; a determining module, stored in memory, that determines that the active region of the activation volume is contiguous with a padding region associated with at least a portion of the activation volume; and a directing module, stored in memory, that directs the MMU to execute a matrix multiplication operation (MMO) using the filter matrix and an activation matrix comprising: the set of activation vectors; and at least one padding vector corresponding to the padding region; and at least one physical processor that executes the maintaining module, the determining module, and the directing module.
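The accumulation described in claims 9 through 12 can be sketched as follows: one matrix multiplication per filter location, with each partial result added into a set of output accumulators. This is an illustrative sketch only. It assumes a single row of the activation volume, zero padding, stride 1, and the cross-correlation convention common in neural network libraries; the function names and sample values are hypothetical and do not come from the patent.

```python
def matmul(a, b):
    """Multiply a (m x k) by b (k x n); both are lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def conv_row_by_accumulation(row, filters, pad):
    """Convolve one row by accumulating one matmul per filter location.

    row:     list of activation vectors (width x channels)
    filters: filters[f][k][c] = weight of filter f, location k, channel c
    pad:     number of zero padding vectors added on each side
    """
    channels = len(row[0])
    taps = len(filters[0])
    pad_vec = [0] * channels
    padded = [pad_vec] * pad + row + [pad_vec] * pad
    out_width = len(padded) - taps + 1
    # Stand-in for the output activation registers.
    acc = [[0] * out_width for _ in filters]
    for k in range(taps):
        # Filter matrix: location k of every filter (filters x channels).
        filter_matrix = [f[k] for f in filters]
        # Moving to the next location shifts the window by one vector,
        # so one vector leaves the activation matrix and one enters.
        window = padded[k:k + out_width]
        activation_matrix = [[v[c] for v in window] for c in range(channels)]
        partial = matmul(filter_matrix, activation_matrix)
        for f in range(len(filters)):
            for x in range(out_width):
                acc[f][x] += partial[f][x]
    return acc


def conv_row_direct(row, filters, pad):
    """Reference result computed directly from the convolution sum."""
    channels = len(row[0])
    taps = len(filters[0])
    pad_vec = [0] * channels
    padded = [pad_vec] * pad + row + [pad_vec] * pad
    out_width = len(padded) - taps + 1
    return [[sum(f[k][c] * padded[x + k][c]
                 for k in range(taps) for c in range(channels))
             for x in range(out_width)]
            for f in filters]


row = [[1, 2], [3, 4], [5, 6]]        # 3 positions, 2 channels
filters = [
    [[1, 0], [0, 1], [1, 1]],          # filter 0: 3 locations x 2 channels
    [[1, 1], [0, 0], [-1, 0]],         # filter 1
]
accumulated = conv_row_by_accumulation(row, filters, pad=1)
direct = conv_row_direct(row, filters, pad=1)
print(accumulated)
```

Both functions should agree, which illustrates that summing the per-location result matrices yields the layer's output activations for that row.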

Assignees

Inventors

Classifications

  • G06N3/063 · Primary

    using electronic means · CPC title

  • G06N3/0464 · Primary

    Convolutional networks [CNN, ConvNet] · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Learning methods · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11501147B1 cover?
A disclosed computer-implemented method may include maintaining, within a local memory device (LMD) included in a hardware accelerator (1) a filter matrix corresponding to a filter location included in each of a set of filters of a convolutional layer of an artificial neural network (ANN), and (2) a set of activation vectors corresponding to an active region of an activation volume input into t…
Who is the assignee on this patent?
Meta Platforms Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 15, 2022 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).