Neural network comprising matrix multiplication

US2022253716A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2022253716-A1
Application number  US-202217568325-A
Country             US
Kind code           A1
Filing date         Jan 4, 2022
Priority date       Jan 4, 2021
Publication date    Aug 11, 2022
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and data processing system implement a neural network containing at least one matrix multiplication operation. The matrix multiplication operation is mapped to a graph of neural network operations including at least one transformation and at least one convolution. The at least one convolution is implemented in fixed-function hardware of a neural network accelerator.
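The mapping described in the abstract can be illustrated with a small NumPy sketch. It uses the tensor layouts recited in claim 7 (X as [1, 1, P, Q], Y as [1, 1, Q, R]) and treats the matrix multiplication as a 1x1 convolution. The function names `conv2d_nchw` and `matmul_via_conv`, the naive software convolution standing in for the fixed-function hardware, and the final transpose back to [1, 1, P, R] are all assumptions made for illustration; they are not taken from the patent.

```python
import numpy as np

def conv2d_nchw(data, weights):
    """Naive stride-1, unpadded 2-D convolution (a software stand-in for hardware).

    data:    [N, C_in, H, W]
    weights: [C_out, C_in, KH, KW]
    returns: [N, C_out, H-KH+1, W-KW+1]
    """
    n, c_in, h, w = data.shape
    c_out, c_in_w, kh, kw = weights.shape
    assert c_in == c_in_w, "input channel counts must match"
    out = np.zeros((n, c_out, h - kh + 1, w - kw + 1),
                   dtype=np.result_type(data, weights))
    for i in range(h - kh + 1):
        for j in range(w - kw + 1):
            patch = data[:, :, i:i + kh, j:j + kw]          # [N, C_in, KH, KW]
            out[:, :, i, j] = np.einsum("nchw,ochw->no", patch, weights)
    return out

def matmul_via_conv(x, y):
    """Evaluate x @ y for x: [1, 1, P, Q] and y: [1, 1, Q, R] as a 1x1 convolution."""
    # Transformation of X: [1, 1, P, Q] -> [1, Q, 1, P]
    # (Q becomes the input-channel dimension, P is traversed by the convolution).
    x_r = np.transpose(x, (0, 3, 1, 2))
    # Transformation of Y: [1, 1, Q, R] -> [R, Q, 1, 1]
    # (R becomes the output-channel dimension, Q the input-channel dimension).
    y_r = np.transpose(y, (3, 2, 0, 1))
    out = conv2d_nchw(x_r, y_r)                             # [1, R, 1, P]
    # Assumed final transformation back to the matmul layout [1, 1, P, R].
    return np.transpose(out, (0, 2, 3, 1))

# Usage: compare against NumPy's own matrix multiplication.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 1, 5, 3))   # P=5, Q=3
y = rng.standard_normal((1, 1, 3, 4))   # Q=3, R=4
result = matmul_via_conv(x, y)
print(result.shape, np.allclose(result, x @ y))
```

Placing Q in the channel dimension means each output element is a sum over input channels, which is the same reduction a matrix multiplication performs over its shared dimension.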

First claim

Opening claim text (preview).

What is claimed is:

1. A method of implementing, using a neural network accelerator comprising fixed-function hardware, a neural network comprising a plurality of layers, wherein at least one of the layers comprises a matrix multiplication operation defined in two or more dimensions between a first tensor X having dimensions [..., P, ..., Q, ...] and a second tensor Y having dimensions [..., Q, ..., R, ...], the method comprising: mapping the matrix multiplication operation to a graph of neural network operations including at least one transformation and at least one convolution operation; and evaluating the graph of neural network operations to thereby evaluate the matrix multiplication operation, wherein the at least one convolution operation is evaluated in the fixed-function hardware.

2. The method of claim 1, wherein: the first tensor X or a tensor derived from it is treated as input data for the at least one convolution operation, and the second tensor Y or a tensor derived from it is treated as coefficient data for the at least one convolution operation.

3. The method of claim 1, wherein the at least one transformation reconfigures the second tensor Y to arrange the dimension with size R in the output channel dimension before the at least one convolution operation is evaluated.

4. The method of claim 1, wherein the at least one transformation reconfigures both tensors to arrange the dimension with size Q in the input channel dimension before the at least one convolution operation is evaluated.

5. The method of claim 1, wherein the at least one transformation reconfigures the first tensor X to arrange the dimension with size P in a dimension that is traversed by the at least one convolution operation.

6. The method of claim 1, wherein the hardware is configured to evaluate the at least one convolution operation by processing in parallel several sets of one or more input data elements selected along a first dimension traversed by the convolution operation, and wherein the at least one transformation reconfigures the first tensor X to arrange the dimension with size P in the first dimension.

7. The method of claim 1, wherein: the first tensor X has dimensions [1, 1, P, Q] and the second tensor Y has dimensions [1, 1, Q, R]; the at least one transformation reconfigures the first tensor X to form a reconfigured first tensor having dimensions [1, Q, 1, P]; the at least one transformation reconfigures the second tensor Y to form a reconfigured second tensor having dimensions [R, Q, 1, 1]; and the reconfigured first tensor and reconfigured second tensor are input to the at least one convolution.

8. The method of claim 1, wherein: the first tensor X has dimensions [M, N, P, Q] and the second tensor Y has dimensions [M′, N′, Q, R], where B=(max(M, M′) max(N, N′))>1; the at least one transformation splits and/or replicates, and reconfigures, the first tensor X to form B reconfigured first tensors each having dimensions [1, Q, 1, P], wherein if M′>M=1 or N′>N=1 the at least one transformation comprises replicating the first tensor in the respective dimension, and if M′=M>1 or N′=N>1 the at least one transformation comprises splitting the first tensor in the respective dimension; the at least one transformation splits and/or replicates, and reconfigures, the second tensor Y to form B reconfigured second tensors having dimensions [R, Q, 1, 1], wherein if M>M′=1 or N>N′=1 the at least one transformation comprises replicating the second tensor in the respective dimension, and if M′=M>1 or N′=N>1 the at least one transformation comprises splitting the second tensor in the respective dimension; and the at least one convolution comprises B convolutions applied to respective pairs of the first reconfigured tensors and second reconfigured tensors.

9. The method of claim 8, wherein if either (i) M′=1 and M>1, or (ii) N′=1 and N>1, broadcasting is performed such that the second tensor Y is reused across several convolutions.

10. The method of claim 1, wherein: the first tensor X has dimensions [M, N, P, Q] and the second tensor Y has dimensions [M′, N′, Q, R]; the at least one transformation reconfigures the first tensor X to form a reconfigured first tensor having dimensions [1, BQ, 1, P]; the at least one transformation reconfigures the second tensor Y to form a reconfigured second tensor having dimensions [BR, Q, 1, 1]; and the at least one convolution comprises a grouped convolution, with B groups each with Q input channels and R output channels, applied to the reconfigured first tensor and reconfigured second tensor, wherein B=(max(M, M′) max(N, N′)), and wherein: if M′>M=1 and/or N′>N=1 the reconfiguration of the first tensor comprises replicating the first tensor M′ times and/or N′ times in the respective dimensions; and if M>M′=1 and/or N>N′=1 the reconfiguration of the second tensor comprises replicating the second tensor M times and/or N times in the respective dimensions.

11. The method of claim 1, wherein the first tensor X has dimensions [M, N, P, 1] and the second tensor Y has dimensions [M′, N′, 1, R].

12. The method of claim 1, further comprising, before mapping the matrix multiplication operation to the graph of neural network operations, analysing the matrix multiplication operation, and determining, based on a result of the analysing, how to implement the matrix multiplication operation, comprising determining that the matrix multiplication operation should be implemented using the at least one transformation and the at least one convolution operation, and rejecting at least one alternative method for implementing the matrix multiplication operation.

13. The method of claim 12, wherein the determining how to implement the matrix multiplication operation is based on one or more of: a size of the first tensor in one or more dimensions; a size of the second tensor in one or more dimensions; a memory-access bandwidth required to implement the matrix multiplication operation using the selected method; a memory size required to implement the matrix multiplication operation using the selected method; a number of hardware passes through the fixed-function hardware that will be required to implement the matrix multiplication operation using the selected method; an execution time on the fixed-function hardware that will be required to implement the matrix multiplication operation using the selected method; a power consumption required to implement the matrix multiplication operation using the selected method; and a capability of the fixed-function hardware.

14. A data processing system for implementing a neural network comprising a plurality of layers, wherein at least one of the layers comprises a matrix multiplication operation defined in two or more dimensions between a first tensor X having dimensions [..., P, ..., Q, ...] and a second tensor Y having dimensions [..., R, ...], the data processing system comprising: a mapping unit, configured to map the matrix multiplication operation to a graph of neural network operations including at least one transformation and at least one convolution operation; and a neural network accelerator comprising fixed-function hardware, wherein the neural network accelerator is configured to evaluate the graph of neural network operations to thereby evaluate the matrix multiplication operation, wherein the at least one convolution operation is evaluated in the fixed-function hardware.

15. The data processing system of claim 14, wherein the fixed-function
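The batched case in claim 10 can be sketched in the same spirit: X of shape [M, N, P, Q] and Y of shape [M′, N′, Q, R] are replicated along any size-1 batch dimension, reconfigured to [1, BQ, 1, P] and [BR, Q, 1, 1], and passed to one grouped convolution with B groups. The sketch below is a rough NumPy illustration under stated assumptions: `grouped_conv1x1` and `batched_matmul_via_grouped_conv` are hypothetical names, `np.broadcast_to` stands in for the replication step, an `einsum` stands in for the hardware convolution, and the final reshape back to [max(M, M′), max(N, N′), P, R] is an assumed output transformation.

```python
import numpy as np

def grouped_conv1x1(data, weights, groups):
    """Grouped 1x1 convolution (software stand-in for the accelerator).

    data:    [1, groups*C_in, 1, W]
    weights: [groups*C_out, C_in, 1, 1]
    returns: [1, groups*C_out, 1, W]
    """
    _, c_total, _, w = data.shape
    c_out_total, c_in, _, _ = weights.shape
    assert c_total == groups * c_in and c_out_total % groups == 0
    c_out = c_out_total // groups
    d = data.reshape(groups, c_in, w)
    k = weights.reshape(groups, c_out, c_in)
    # Each group only sees its own C_in input channels.
    out = np.einsum("goc,gcw->gow", k, d)
    return out.reshape(1, groups * c_out, 1, w)

def batched_matmul_via_grouped_conv(x, y):
    """x: [M, N, P, Q], y: [M', N', Q, R] -> [max(M,M'), max(N,N'), P, R]."""
    m, n, p, q = x.shape
    m2, n2, q2, r = y.shape
    assert q == q2, "shared dimension Q must match"
    mb, nb = max(m, m2), max(n, n2)
    b = mb * nb                                   # B = max(M, M') * max(N, N')
    # Replication along size-1 batch dimensions (raises if sizes are incompatible).
    x_b = np.broadcast_to(x, (mb, nb, p, q))
    y_b = np.broadcast_to(y, (mb, nb, q, r))
    # Reconfigure X to [1, B*Q, 1, P] and Y to [B*R, Q, 1, 1].
    x_r = x_b.reshape(b, p, q).transpose(0, 2, 1).reshape(1, b * q, 1, p)
    y_r = y_b.reshape(b, q, r).transpose(0, 2, 1).reshape(b * r, q, 1, 1)
    out = grouped_conv1x1(x_r, y_r, groups=b)     # [1, B*R, 1, P]
    # Assumed output transformation back to the batched matmul layout.
    return out.reshape(mb, nb, r, p).transpose(0, 1, 3, 2)

# Usage: X is replicated along N, Y along M.
rng = np.random.default_rng(0)
x = rng.standard_normal((2, 1, 3, 4))   # M=2, N=1, P=3, Q=4
y = rng.standard_normal((1, 5, 4, 6))   # M'=1, N'=5, Q=4, R=6
batched = batched_matmul_via_grouped_conv(x, y)
print(batched.shape, np.allclose(batched, np.matmul(x, y)))
```

The same function also covers the outer-product shapes of claim 11 (Q = 1), since each group then has a single input channel.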

Assignees

Imagination Tech Ltd

Inventors

Classifications

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/10 · Primary

    Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Imagination Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 11, 2022 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).