Neural network comprising matrix multiplication

US12488253B2 · US · B2

Patent metadata
Publication number: US-12488253-B2
Application number: US-202217568325-A
Country: US
Kind code: B2
Filing date: Jan 4, 2022
Priority date: Jan 4, 2021
Publication date: Dec 2, 2025
Grant date: Dec 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and data processing system implement a neural network containing at least one matrix multiplication operation. The matrix multiplication operation is mapped to a graph of neural network operations including at least one transformation and at least one convolution. The at least one convolution is implemented in fixed-function hardware of a neural network accelerator.
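The mapping summarised in the abstract can be illustrated with a small pure-Python sketch. It assumes the simplest two-dimensional case, with the layout recited later in the claims: X of size [P, Q] becomes a convolution input of shape [1, Q, 1, P], and Y of size [Q, R] becomes 1x1 convolution weights of shape [R, Q, 1, 1]. The function names (`to_conv_input`, `to_conv_weights`, `conv2d`, `matmul_via_conv`) are hypothetical and chosen for illustration only; the software loop merely stands in for the accelerator's fixed-function convolution hardware and is not the patentee's implementation.

```python
def matmul_reference(X, Y):
    """Ordinary matrix product of X [P][Q] and Y [Q][R], used as a reference."""
    P, Q, R = len(X), len(Y), len(Y[0])
    return [[sum(X[p][q] * Y[q][r] for q in range(Q)) for r in range(R)]
            for p in range(P)]


def to_conv_input(X):
    """Transformation: X [P][Q] -> input tensor [1][Q][1][P] (batch, channel, height, width)."""
    P, Q = len(X), len(X[0])
    return [[[[X[p][q] for p in range(P)]] for q in range(Q)]]


def to_conv_weights(Y):
    """Transformation: Y [Q][R] -> weights [R][Q][1][1] (out channel, in channel, kh, kw)."""
    Q, R = len(Y), len(Y[0])
    return [[[[Y[q][r]]] for q in range(Q)] for r in range(R)]


def conv2d(inp, w):
    """Stride-1, unpadded 2D convolution (cross-correlation form) on nested lists."""
    N, C, H, W = len(inp), len(inp[0]), len(inp[0][0]), len(inp[0][0][0])
    O, KH, KW = len(w), len(w[0][0]), len(w[0][0][0])
    out = [[[[0 for _ in range(W - KW + 1)] for _ in range(H - KH + 1)]
            for _ in range(O)] for _ in range(N)]
    for n in range(N):
        for o in range(O):
            for i in range(H - KH + 1):
                for j in range(W - KW + 1):
                    acc = 0
                    for c in range(C):
                        for kh in range(KH):
                            for kw in range(KW):
                                acc += inp[n][c][i + kh][j + kw] * w[o][c][kh][kw]
                    out[n][o][i][j] = acc
    return out


def matmul_via_conv(X, Y):
    """Evaluate X @ Y as: transform -> 1x1 convolution -> transform back."""
    out = conv2d(to_conv_input(X), to_conv_weights(Y))  # shape [1][R][1][P]
    R, P = len(out[0]), len(out[0][0][0])
    # Move the output channel dimension (R) back to the last axis: [P][R].
    return [[out[0][r][0][p] for r in range(R)] for p in range(P)]


if __name__ == "__main__":
    X = [[1, 2, 3], [4, 5, 6]]          # P=2, Q=3
    Y = [[7, 8], [9, 10], [11, 12]]     # Q=3, R=2
    print(matmul_via_conv(X, Y))
```

Here the shared dimension Q plays the role of the input channels, R the output channels, and P the width traversed by the kernel, so each output position is the dot product of one row of X with one column of Y.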

First claim

Opening claim text (preview).

What is claimed is:

1. A method of implementing, using a neural network accelerator comprising fixed-function hardware, a neural network comprising a plurality of layers, wherein at least one of the layers comprises a matrix multiplication operation defined in two or more dimensions between a first tensor X having dimensions [..., P, ..., Q, ...] and a second tensor Y having dimensions [..., Q, ..., R, ...], the method comprising: mapping the matrix multiplication operation to a graph of neural network operations including at least one transformation and at least one convolution operation; evaluating the graph of neural network operations to thereby evaluate the matrix multiplication operation, wherein the at least one convolution operation is evaluated in the fixed-function hardware; and implementing, in said fixed-function hardware, said neural network to include a matrix multiplication layer configured to perform a matrix multiplication operation in dependence on the evaluation of the graph, whereby said neural network accelerator is adapted to multiply the same set of weights simultaneously by multiple sets of input data elements in parallel at multiple processing elements.

2. The method of claim 1, wherein: the first tensor X or a tensor derived from it is treated as input data for the at least one convolution operation, and the second tensor Y or a tensor derived from it is treated as coefficient data for the at least one convolution operation.

3. The method of claim 1, wherein the at least one transformation reconfigures the second tensor Y to arrange the dimension with size R in the output channel dimension before the at least one convolution operation is evaluated.

4. The method of claim 1, wherein the at least one transformation reconfigures both tensors to arrange the dimension with size Q in the input channel dimension before the at least one convolution operation is evaluated.

5. The method of claim 1, wherein the at least one transformation reconfigures the first tensor X to arrange the dimension with size P in a dimension that is traversed by the at least one convolution operation.

6. The method of claim 1, wherein the hardware is configured to evaluate the at least one convolution operation by processing in parallel several sets of one or more input data elements selected along a first dimension traversed by the convolution operation, and wherein the at least one transformation reconfigures the first tensor X to arrange the dimension with size P in the first dimension.

7. The method of claim 1, wherein: the first tensor X has dimensions [1, 1, P, Q] and the second tensor Y has dimensions [1, 1, Q, R]; the at least one transformation reconfigures the first tensor X to form a reconfigured first tensor having dimensions [1, Q, 1, P]; the at least one transformation reconfigures the second tensor Y to form a reconfigured second tensor having dimensions [R, Q, 1, 1]; and the reconfigured first tensor and reconfigured second tensor are input to the at least one convolution.

8. The method of claim 1, wherein: the first tensor X has dimensions [M, N, P, Q] and the second tensor Y has dimensions [M′, N′, Q, R], where B=(max(M, M′) max(N, N′))>1; the at least one transformation splits and/or replicates, and reconfigures, the first tensor X to form B reconfigured first tensors each having dimensions [1, Q, 1, P], wherein if M′>M=1 or N′>N=1 the at least one transformation comprises replicating the first tensor in the respective dimension, and if M′=M>1 or N′=N>1 the at least one transformation comprises splitting the first tensor in the respective dimension; the at least one transformation splits and/or replicates, and reconfigures, the second tensor Y to form B reconfigured second tensors having dimensions [R, Q, 1, 1], wherein if M>M′=1 or N>N′=1 the at least one transformation comprises replicating the second tensor in the respective dimension, and if M′=M>1 or N′=N>1 the at least one transformation comprises splitting the second tensor in the respective dimension; and the at least one convolution comprises B convolutions applied to respective pairs of the first reconfigured tensors and second reconfigured tensors.

9. The method of claim 8, wherein if either (i) M′=1 and M>1, or (ii) N′=1 and N>1, broadcasting is performed such that the second tensor Y is reused across several convolutions.

10. The method of claim 1, wherein: the first tensor X has dimensions [M, N, P, Q] and the second tensor Y has dimensions [M′, N′, Q, R]; the at least one transformation reconfigures the first tensor X to form a reconfigured first tensor having dimensions [1, BQ, 1, P]; the at least one transformation reconfigures the second tensor Y to form a reconfigured second tensor having dimensions [BR, Q, 1, 1]; and the at least one convolution comprises a grouped convolution, with B groups each with Q input channels and R output channels, applied to the reconfigured first tensor and reconfigured second tensor, wherein B=(max(M, M′) max(N, N′)), and wherein: if M′>M=1 and/or N′>N=1 the reconfiguration of the first tensor comprises replicating the first tensor M′ times and/or N′ times in the respective dimensions; and if M>M′=1 and/or N>N′=1 the reconfiguration of the second tensor comprises replicating the second tensor M times and/or N times in the respective dimensions.

11. The method of claim 1, wherein the first tensor X has dimensions [M, N, P, 1] and the second tensor Y has dimensions [M′, N′, 1, R].

12. The method of claim 1, further comprising, before mapping the matrix multiplication operation to the graph of neural network operations, analysing the matrix multiplication operation, and determining, based on a result of the analysing, how to implement the matrix multiplication operation, comprising determining that the matrix multiplication operation should be implemented using the at least one transformation and the at least one convolution operation, and rejecting at least one alternative method for implementing the matrix multiplication operation.

13. The method of claim 12, wherein the determining how to implement the matrix multiplication operation is based on one or more of: a size of the first tensor in one or more dimensions; a size of the second tensor in one or more dimensions; a memory-access bandwidth required to implement the matrix multiplication operation using the selected method; a memory size required to implement the matrix multiplication operation using the selected method; a number of hardware passes through the fixed-function hardware that will be required to implement the matrix multiplication operation using the selected method; an execution time on the fixed function hardware that will be required to implement the matrix multiplication operation using the selected method; a power consumption required to implement the matrix multiplication operation using the selected method; and a capability of the fixed-function hardware.

14. The method of claim 1, wherein the layer comprising the matrix multiplication operation is a classification layer, for classifying an input to the neural network into one of a number of categories.

15. The method of claim 1, wherein the neural network is configured for use in one of: a natural language processing application; and an image processing application, and/or wherein the neural network comprises an attention-based neural network.

16. A non-transitory computer readable storage medium having stored thereon computer readable
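The batched case with broadcasting (claims 8 to 10) can be sketched in the same spirit. The example below is a minimal pure-Python illustration of the grouped-convolution variant: a size-1 batch dimension is replicated, the B matrices of X are stacked along the channel axis to give B·Q input channels, the B matrices of Y are stacked to give B·R output channels, and a grouped 1x1 convolution with B groups produces all B products at once. The helper names (`broadcast_pairs`, `grouped_conv1x1`, `batched_matmul_via_grouped_conv`) are hypothetical, and the singleton batch, height and kernel dimensions are omitted from the nested lists for brevity; this is an assumption-laden sketch, not the claimed hardware implementation.

```python
def broadcast_pairs(X, Y):
    """Pair up the [P][Q] and [Q][R] matrices of X [M][N][P][Q] and Y [M'][N'][Q][R].

    A batch dimension of size 1 is replicated (broadcast) against the other
    tensor, giving B = max(M, M') * max(N, N') pairs.
    """
    M, N = len(X), len(X[0])
    M2, N2 = len(Y), len(Y[0])
    assert M == M2 or 1 in (M, M2), "first batch dimension not broadcastable"
    assert N == N2 or 1 in (N, N2), "second batch dimension not broadcastable"
    pairs = []
    for m in range(max(M, M2)):
        for n in range(max(N, N2)):
            x = X[m if M > 1 else 0][n if N > 1 else 0]
            y = Y[m if M2 > 1 else 0][n if N2 > 1 else 0]
            pairs.append((x, y))
    return pairs


def grouped_conv1x1(inp, w, groups):
    """Grouped 1x1 convolution.

    inp: [C_in][W]                 (stands for a [1, C_in, 1, W] tensor)
    w:   [C_out][C_in // groups]   (stands for a [C_out, C_in // groups, 1, 1] tensor)
    """
    c_in, width, c_out = len(inp), len(inp[0]), len(w)
    q = c_in // groups    # input channels per group
    r = c_out // groups   # output channels per group
    out = [[0] * width for _ in range(c_out)]
    for o in range(c_out):
        g = o // r        # group that output channel o belongs to
        for k in range(q):
            for p in range(width):
                out[o][p] += w[o][k] * inp[g * q + k][p]
    return out


def batched_matmul_via_grouped_conv(X, Y):
    """Return the B matrix products as a list of [P][R] matrices."""
    pairs = broadcast_pairs(X, Y)
    B = len(pairs)
    P, Q, R = len(pairs[0][0]), len(pairs[0][1]), len(pairs[0][1][0])
    # Reconfigured first tensor: B*Q channels, width P.
    inp = [[x[p][q] for p in range(P)] for x, _ in pairs for q in range(Q)]
    # Reconfigured second tensor: B*R output channels, Q input channels each.
    w = [[y[q][r] for q in range(Q)] for _, y in pairs for r in range(R)]
    out = grouped_conv1x1(inp, w, groups=B)  # [B*R][P]
    return [[[out[b * R + r][p] for r in range(R)] for p in range(P)]
            for b in range(B)]


if __name__ == "__main__":
    X = [[[[1, 2], [3, 4]]], [[[5, 6], [7, 8]]]]  # M=2, N=1, P=2, Q=2
    Y = [[[[1], [10]]]]                           # M'=1, N'=1, Q=2, R=1
    print(batched_matmul_via_grouped_conv(X, Y))
```

In this sketch Y is physically replicated once per group. Running B separate convolutions that share a single copy of Y, as in the broadcasting case of claim 9, would avoid that duplication at the cost of more convolution operations.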

Assignees

Inventors

Classifications

  • Recurrent networks, e.g. Hopfield networks · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • G06N3/063 · Primary

    using electronic means · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12488253B2 cover?
A method and data processing system implement a neural network containing at least one matrix multiplication operation. The matrix multiplication operation is mapped to a graph of neural network operations including at least one transformation and at least one convolution. The at least one convolution is implemented in fixed-function hardware of a neural network accelerator.
Who is the assignee on this patent?
Imagination Tech Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 2, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.