Matrix multiplier

US11334648B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11334648-B2
Application number  US-202016915915-A
Country             US
Kind code           B2
Filing date         Jun 29, 2020
Priority date       Dec 29, 2017
Publication date    May 17, 2022
Grant date          May 17, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a bus; and the controller is configured to control, according to a preset program or instruction, a first matrix and a second matrix to be divided into blocks, and control the operation circuit to perform a multiplication operation on corresponding blocks in the first memory and the second memory based on block division results of the controller. The matrix multiplier may be configured to perform a multiplication operation on two matrices.
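
The block division described in the abstract rests on the standard block-matrix product identity. As a sketch, assume the first matrix A is divided into S×R sub-blocks A_sr and the second matrix B into R×T sub-blocks B_rt (notation borrowed from the claims). The name C for the product and its blocks C_st is introduced here for illustration only and does not appear in the text on this page:

```latex
C = A B, \qquad
C_{st} = \sum_{r=1}^{R} A_{sr} B_{rt},
\qquad s = 1, \dots, S, \quad t = 1, \dots, T
```

Each term pairs blocks that share the same index r, which is why the multiplication is performed on "corresponding blocks" of the two memories.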

First claim

Opening claim text (preview).

What is claimed is:

1. A matrix multiplier, comprising: a first memory, configured to store a first matrix, wherein the first matrix is an M*K matrix; a second memory, configured to store a second matrix, wherein the second matrix is a K*N matrix; an operation circuit connected to the first memory and the second memory, wherein the operation circuit comprises operation units of X rows and Y columns, and each operation unit comprises a vector multiplication circuit and an addition circuit, wherein the vector multiplication circuit is configured to receive row vector data sent by the first memory and column vector data sent by the second memory, and multiply the two vectors; and the addition circuit is configured to add results obtained by multiplying the two vectors, and accumulate computation results of a same operation unit, to obtain an operation result of each operation unit; and a controller connected to the operation circuit, wherein the controller is configured to perform the following actions: dividing the first matrix into blocks in a unit of a sub-block whose size is X*L, to obtain S×R sub-blocks of a same size, wherein a sub-block in a row s and a column r of the S×R sub-blocks is denoted as A_sr, s=(1, 2, 3, . . . , and S), and r=(1, 2, 3, . . . , and R); and dividing the second matrix into blocks in a unit of a sub-block whose size is L*Y, to obtain R×T sub-blocks of a same size, wherein a sub-block in a row r and a column t in the R×T sub-blocks is denoted as B_rt, r=(1, 2, 3, . . . , and R), and t=(1, 2, 3, . . . , and T); wherein the controller is further configured to perform the following action: inputting a row x in X row vectors of any sub-block A_sr and a column y in Y column vectors of a corresponding sub-block B_rt into an operation unit in a row x and a column y in the operation units of X rows and Y columns, so as to perform an operation, wherein x=(1, 2, 3, . . . , and X), y=(1, 2, 3, . . . , and Y), and r in the any sub-block A_sr and r in the corresponding sub-block B_rt have an equal value.

2. The matrix multiplier according to claim 1, wherein the controller is specifically configured to perform the following action: inputting the row x in the X row vectors of the any sub-block A_sr and the column y in the Y column vectors of the corresponding sub-block B_rt into the operation unit in a row x and a column y in the operation units of X rows and Y columns in parallel in a same clock cycle, so as to perform the operation.

3. The matrix multiplier according to claim 1, wherein the controller is further configured to control row vectors of the any sub-block A_sr to successively enter, in ascending order of x row numbers, a row x corresponding to the operation units of X rows and Y columns, wherein a difference between moments at which adjacent row vectors enter operation units in a same column and different rows is one clock cycle; and the controller is further configured to simultaneously control column vectors of the corresponding sub-block B_rt to successively enter, in ascending order of y column numbers, a column y corresponding to the operation units of X rows and Y columns, wherein a difference between moments at which adjacent column vectors enter operation units in a same row and different columns is one clock cycle.

4. The matrix multiplier according to claim 1, wherein the controller is further configured to control: values of s and r to remain unchanged and a value of t to be changed in at least two consecutive sub-block multiplication computation cycles, so that the first memory reuses a same sub-block A_sr within the at least two consecutive sub-block multiplication computation cycles, wherein the sub-block multiplication computation cycle is a time used by the operation units of X rows and Y columns to complete a matrix multiplication operation on one sub-block A_sr and a corresponding sub-block B_rt.

5. The matrix multiplier according to claim 1, wherein the matrix multiplier further comprises a third memory connected to the operation circuit; and the controller is configured to control the operation units of X rows and Y columns to store operation results of the vector multiplication circuit and the addition circuit into the third memory.

6. The matrix multiplier according to claim 5, wherein the matrix multiplier further comprises: a fourth memory connected to the first memory and the second memory, and a fifth memory connected to the third memory; and the controller is further configured to control: before performing a multiplication operation on the first matrix and the second matrix, data sources of the first matrix and the second matrix to be moved from the fourth memory to the first memory and the second memory respectively, and the computation results to be moved from the third memory to the fifth memory.

7. The matrix multiplier according to claim 1, wherein the vector multiplication circuit comprises L multipliers, and the addition circuit comprises an adder tree with an input quantity being L+1.

8. The matrix multiplier according to claim 1, wherein the first memory, the second memory, the operation circuit, and the controller are connected by using a bus interface unit.

9. The matrix multiplier according to claim 1, wherein $S = \begin{cases} M/X, & M \% X = 0 \\ \left[\frac{M}{X}\right] + 1, & M \% X \neq 0 \end{cases}$, and $R = \{ K/L,$ …
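
The claimed structure can be modelled in software to make the data flow easier to follow. The following is a minimal Python sketch, not the patented circuit. The function names (`num_blocks`, `pad`, `operation_unit`, `block_matmul`) are hypothetical. Zero-padding of edge blocks and applying the claim 9 formula for S to R and T as well are assumptions, since the claim preview on this page is cut off before those details. The sketch models each operation unit as L multipliers feeding an adder with L+1 inputs (claim 7) and keeps s and r fixed while t varies so that one sub-block A_sr is reused (claim 4).

```python
def num_blocks(dim, unit):
    """Block count in the form shown in claim 9: dim/unit when it divides
    evenly, otherwise [dim/unit] + 1. Using it for R and T is an assumption."""
    return dim // unit if dim % unit == 0 else dim // unit + 1


def pad(matrix, rows, cols):
    """Zero-pad a list-of-lists matrix to rows x cols (padding is an assumption)."""
    padded = [row + [0] * (cols - len(row)) for row in matrix]
    return padded + [[0] * cols for _ in range(rows - len(matrix))]


def operation_unit(row_vec, col_vec, acc):
    """One operation unit: L multiplications, then an addition over L+1
    inputs (the L products plus the running accumulation)."""
    products = [a * b for a, b in zip(row_vec, col_vec)]
    return sum(products + [acc])


def block_matmul(A, B, X, L, Y):
    """Multiply A (M x K) by B (K x N) using X x Y operation units."""
    M, K, N = len(A), len(A[0]), len(B[0])
    S, R, T = num_blocks(M, X), num_blocks(K, L), num_blocks(N, Y)
    Ap, Bp = pad(A, S * X, R * L), pad(B, R * L, T * Y)
    # C holds the partial sums accumulated across r for each (s, t) block.
    C = [[0] * (T * Y) for _ in range(S * X)]
    for s in range(S):
        for r in range(R):
            # The X row vectors (length L) of sub-block A_sr.
            a_rows = [Ap[s * X + x][r * L:(r + 1) * L] for x in range(X)]
            for t in range(T):  # A_sr is reused while only t changes
                # The Y column vectors (length L) of sub-block B_rt.
                b_cols = [[Bp[r * L + l][t * Y + y] for l in range(L)]
                          for y in range(Y)]
                for x in range(X):
                    for y in range(Y):
                        i, j = s * X + x, t * Y + y
                        C[i][j] = operation_unit(a_rows[x], b_cols[y], C[i][j])
    return [row[:N] for row in C[:M]]  # drop the zero padding


A = [[1, 2, 3], [4, 5, 6]]        # M=2, K=3
B = [[7, 8], [9, 10], [11, 12]]   # K=3, N=2
print(block_matmul(A, B, X=2, L=2, Y=2))  # [[58, 64], [139, 154]]
```

In this example K=3 is not a multiple of L=2, so R is 2 and the second block along K is padded with zeros, which leaves the result unchanged. The loops over x and y run sequentially here, whereas the claims describe the X×Y operation units working in parallel.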

Assignees

  • Huawei Tech Co Ltd

Inventors

Classifications

  • Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

  • in parallel-parallel fashion, i.e. both operands being entered in parallel (G06F7/533 takes precedence) · CPC title

  • G06F17/16 · Primary

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11334648B2 cover?
Embodiments of the present invention disclose a matrix multiplier, and relate to the field of data computing technologies, so as to divide two matrices into blocks for computation. The matrix multiplier includes: a first memory, a second memory, an operation circuit, and a controller, where the operation circuit, the first memory, and the second memory may perform data communication by using a …
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date May 17, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).