What technology area does this patent fall under?

Its primary CPC classification is G06F17/16 (matrix or vector computation). Mapped technology areas include Physics.

When was this patent published?

Published on Jan 19, 2023 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Neural network comprising matrix multiplication

US2023021204A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2023021204-A1
Application number	US-202217851306-A
Country	US
Kind code	A1
Filing date	Jun 28, 2022
Priority date	Jun 29, 2021
Publication date	Jan 19, 2023
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method and data processing system for implementing a neural network containing at least one matrix multiplication operation. The matrix multiplication operation is mapped to a graph of neural network operations including at least one element-wise operation. The at least one element-wise operation is implemented in fixed-function hardware of a neural network accelerator.
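As a rough illustration of the mapping described in the abstract, the sketch below evaluates a 2-D matrix product using only element-wise multiplications and grouped sums. It loosely follows the steps recited in claims 3 and 4 (reconfigure Y, split it into R slices, multiply, concatenate, sum within groups). It is not the patent's implementation: plain Python lists stand in for tensors, leading batch dimensions are omitted, and every function name is hypothetical.

```python
# Illustrative sketch only: nested lists stand in for tensors, and the
# function names are hypothetical, not taken from the patent.

def transpose(y):
    """Reconfigure Y from [Q, R] to [R, Q]."""
    return [list(col) for col in zip(*y)]


def elementwise_mul_broadcast(x, row):
    """Multiply every row of X ([P, Q]) element-wise by one [1, Q] slice."""
    return [[a * b for a, b in zip(x_row, row)] for x_row in x]


def matmul_via_elementwise(x, y):
    """Evaluate X @ Y with element-wise multiplies followed by grouped sums."""
    q = len(y)                    # shared inner dimension Q
    slices = transpose(y)         # R constituent slices, each of length Q
    # One element-wise multiplication per slice: R results of shape [P, Q].
    products = [elementwise_mul_broadcast(x, s) for s in slices]
    # Concatenate along the last axis: each row holds R groups of Q products.
    concatenated = [
        [v for prod in products for v in prod[p]] for p in range(len(x))
    ]                             # shape [P, R*Q]
    # Sum within each group of Q values to obtain the [P, R] result.
    return [
        [sum(row[g * q:(g + 1) * q]) for g in range(len(slices))]
        for row in concatenated
    ]


x = [[1, 2, 3],
     [4, 5, 6]]       # shape [P=2, Q=3]
y = [[7, 8],
     [9, 10],
     [11, 12]]        # shape [Q=3, R=2]
result = matmul_via_elementwise(x, y)
print(result)
```

For these sample inputs the result equals the ordinary matrix product, [[58, 64], [139, 154]]. On an accelerator, the grouped sum in the last step could plausibly be realized as a grouped convolution with a tensor of ones, as claim 7 describes.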

First claim

Opening claim text (preview).

What is claimed is:
1. A method of implementing, using a neural network accelerator comprising fixed-function hardware, a neural network comprising a plurality of layers, wherein at least one of the layers comprises a matrix multiplication operation defined in two or more dimensions between a first tensor X having dimensions [..., Q, ...] and a second tensor Y having dimensions [..., R, ...], the method comprising: mapping the matrix multiplication operation to a graph of neural network operations including at least one element-wise operation; and evaluating the graph of neural network operations to thereby evaluate the matrix multiplication operation; wherein the at least one element-wise operation is evaluated in the fixed-function hardware.
2. The method of claim 1, wherein the graph of neural network operations further comprises at least one transformation, applied to the first tensor X and/or the second tensor Y.
3. The method of claim 2, wherein the at least one transformation comprises: reconfiguring the second tensor Y to form a third tensor having dimensions [..., R, Q]; and splitting the third tensor into R constituent tensors each of dimensions [..., 1, Q], wherein the at least one element-wise operation comprises an element-wise multiplication between the first tensor and each of the R constituent tensors.
4. The method of claim 3, wherein: the at least one transformation further comprises concatenating and reconfiguring the results of the element-wise multiplications to arrange them in R groups of size Q, the graph of neural network operations further comprises summing within each of the R groups over one dimension; and the at least one transformation further comprises reconfiguring the result of the summing into a tensor having dimensions [..., P, R].
5. The method of claim 4, wherein: the at least one transformation comprises concatenating and reconfiguring the results of the element-wise multiplications to arrange them in R groups of size Q over the channel dimension; and the summing comprises summing within each of the R groups over the channel dimension.
6. The method of claim 4, wherein the summing comprises at least one convolution with a tensor of ones.
7. The method of claim 4, wherein the summing comprises a grouped convolution with a tensor of ones such that the grouped convolution has R groups, each with Q input channels and 1 output channel.
8. The method of claim 1, wherein the first tensor X has dimensions [M, N, P, 1] and the second tensor Y has dimensions [M′, N′, 1, R].
9. The method of claim 1, wherein: the first tensor X has dimensions [M, N, P, 1] and the second tensor Y has dimensions [M′, N′, 1, R]; and the element-wise operation comprises an element-wise multiplication of the first tensor, or a tensor derived from it, with the second tensor, or a tensor derived from it.
10. The method of claim 9, wherein the element-wise multiplication is performed using broadcasting over two dimensions.
11. The method of claim 9, wherein the element-wise multiplication is performed using broadcasting over one dimension and repeating one of the tensors over the other dimension.
12. The method of claim 9, wherein the element-wise multiplication comprises repeating one of the tensors over one dimension and repeating the other of the tensors over the other dimension.
13. The method of claim 1, wherein the at least one transformation is performed at least in part using a memory manipulation module configured to manipulate data stored in a memory; and/or wherein the repeating of a tensor is performed at least in part by one of: the memory manipulation module; and an element-wise operations unit of the neural network accelerator.
14. The method of claim 1, further comprising, before mapping the matrix multiplication operation to the graph of neural network operations: analysing the matrix multiplication operation; and determining, based on a result of the analysing, how to implement the matrix multiplication operation, comprising determining that the matrix multiplication operation should be implemented using the at least one element-wise operation, and rejecting at least one alternative method for implementing the matrix multiplication operation.
15. The method of claim 14, wherein the determining how to implement the matrix multiplication operation is based on one or more of: a size of the first tensor in one or more dimensions; a size of the second tensor in one or more dimensions; a memory-access bandwidth required to implement the matrix multiplication operation using the selected method; a memory size required to implement the matrix multiplication operation using the selected method; a number of hardware passes through the fixed-function hardware that will be required to implement the matrix multiplication operation using the selected method; an execution time on the fixed function hardware that will be required to implement the matrix multiplication operation using the selected method; a power consumption required to implement the matrix multiplication operation using the selected method; and a capability of the fixed-function hardware.
16. A data processing system for implementing a neural network comprising a plurality of layers, wherein at least one of the layers comprises a matrix multiplication operation defined in two or more dimensions between a first tensor X having dimensions [..., Q, ...] and a second tensor Y having dimensions [..., R, ...], the data processing system comprising: a mapping unit, configured to map the matrix multiplication operation to a graph of neural network operations including at least one element-wise operation; and a neural network accelerator comprising fixed-function hardware; wherein the neural network accelerator is configured to evaluate the graph of neural network operations to thereby evaluate the matrix multiplication operation; and wherein the at least one element-wise operation is evaluated in the fixed-function hardware.
17. The data processing system of claim 16, wherein the graph of neural network operations further comprises at least one transformation, applied to the first tensor X and/or the second tensor Y; wherein the data processing system comprises a memory manipulation module for manipulating data stored in a memory; and wherein the data processing system is configured to perform the at least one transformation using memory manipulation module.
18. The data processing system of claim 17, wherein the memory manipulation module comprises: an internal buffer; a memory reading block, configured to read data from the memory and write the data to the internal buffer; a memory writing block, configured to read the data from the internal buffer and write the data to the memory; and a control channel between the memory reading block and the memory writing block, wherein the memory reading block and the memory writing block are configured to communicate via the control channel to maintain synchronisation between them when writing the data to the internal buffer and reading the data from the internal buffer, respectively.
19. The method of claim 1, wherein the layer comprising the matrix multiplication operation is a classification layer for classifying an input to the neural network into one of a number of categories.
20. A non-transitory computer readable storage medium having stored
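Claims 8 to 12 address the case where X has a trailing dimension of 1 and Y has a 1 in the second-to-last position, so that the product is effectively an outer product. A minimal sketch of the variant in claim 12, where each tensor is repeated over one dimension before a plain element-wise multiplication, might look as follows. The leading dimensions M and N are dropped for brevity, nested lists stand in for tensors, and all function names are hypothetical.

```python
# Illustrative sketch only: a 2-D stand-in for the [M, N, P, 1] x [M', N', 1, R]
# case, with hypothetical helper names.

def repeat_columns(x, r):
    """Repeat a [P, 1] column r times along the last axis, giving [P, R]."""
    return [row * r for row in x]


def repeat_rows(y, p):
    """Repeat a [1, R] row p times along the first axis, giving [P, R]."""
    return [list(y[0]) for _ in range(p)]


def outer_product_via_elementwise(x, y):
    """Evaluate X @ Y for X of shape [P, 1] and Y of shape [1, R]."""
    p, r = len(x), len(y[0])
    x_rep = repeat_columns(x, r)   # X repeated over the R dimension
    y_rep = repeat_rows(y, p)      # Y repeated over the P dimension
    # Both operands now share the shape [P, R], so one element-wise
    # multiplication yields the matrix product.
    return [
        [a * b for a, b in zip(x_row, y_row)]
        for x_row, y_row in zip(x_rep, y_rep)
    ]


x = [[1], [2], [3]]   # shape [P=3, 1]
y = [[4, 5]]          # shape [1, R=2]
outer = outer_product_via_elementwise(x, y)
print(outer)
```

The variants in claims 10 and 11 would replace one or both explicit repeats with broadcasting inside the element-wise multiplication, where the hardware supports it.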

Assignees

Imagination Tech Ltd

Inventors

Classifications

G06F17/16 · Primary
Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78) · CPC title
G06N3/045
Combinations of networks · CPC title
G06N3/063 · Primary
using electronic means · CPC title
G06N3/04 · Primary
Architecture, e.g. interconnection topology · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title

Patent family

Related publications grouped by family.

View patent family 77179677

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2023021204A1 cover? A method and data processing system for implementing a neural network containing at least one matrix multiplication operation. The matrix multiplication operation is mapped to a graph of neural network operations including at least one element-wise operation. The at least one element-wise operation is implemented in fixed-function hardware of a neural network accelerator.
Who is the assignee on this patent? Imagination Tech Ltd

Related patents

Converting quasi-affine expressions to matrix operations

Reconfigurable neural network processing based on subgraph recognition

Reformatting Matrices to Improve Computing Efficiency

Neural network processor
