Matrix processing method and apparatus, and logic circuit

US11734386B2 · US · B2

Patent metadata
Publication number: US-11734386-B2
Application number: US-202117560472-A
Country: US
Kind code: B2
Filing date: Aug 23, 2021 is not stated; Filing date: Dec 23, 2021
Priority date: Aug 6, 2018
Publication date: Aug 22, 2023
Grant date: Aug 22, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A matrix processing method performed by a graphics processing unit (GPU) includes: determining a plurality of non-zero elements in a to-be-processed matrix at a processor in the GPU; generating a distribution matrix of the to-be-processed matrix at the processor, where the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-processed matrix; obtaining a target matrix from another matrix by using the distribution matrix at a logic circuit in the processor, where the target matrix comprises a plurality of target elements from the another matrix; and performing matrix processing on the plurality of non-zero elements and the target matrix to obtain an operation result at the processor.
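The steps in the abstract can be illustrated with a minimal sketch. It is not the patented GPU or logic-circuit implementation: it assumes the distribution matrix is a 0/1 mask (the abstract only says it holds "identities" indicating positions), assumes the matrix processing is a multiply-add, and uses hypothetical function names (`distribution_matrix`, `gather_targets`, `sparse_multiply_add`) chosen here for illustration.

```python
# Illustrative sketch only: plain Python standing in for the GPU processor
# and logic circuit. The 0/1 mask encoding and all names are assumptions.

def distribution_matrix(matrix):
    """Return a 0/1 mask with 1 wherever `matrix` holds a non-zero element."""
    return [[1 if v != 0 else 0 for v in row] for row in matrix]


def nonzero_elements(matrix):
    """Collect the non-zero elements of `matrix` in row-major order."""
    return [v for row in matrix for v in row if v != 0]


def gather_targets(other, dist):
    """Pick the elements of `other` at positions flagged 1 in `dist`."""
    return [o
            for o_row, d_row in zip(other, dist)
            for o, d in zip(o_row, d_row)
            if d == 1]


def sparse_multiply_add(matrix, other):
    """Multiply-add that only touches positions where `matrix` is non-zero."""
    dist = distribution_matrix(matrix)
    nonzeros = nonzero_elements(matrix)
    targets = gather_targets(other, dist)
    return sum(a * b for a, b in zip(nonzeros, targets))


# A sparse 3x3 "kernel" and a dense 3x3 "image patch" (made-up values).
kernel = [[0, 2, 0],
          [1, 0, 0],
          [0, 0, 3]]
patch = [[5, 6, 7],
         [8, 9, 10],
         [11, 12, 13]]

result = sparse_multiply_add(kernel, patch)  # 2*6 + 1*8 + 3*13
```

Only three multiplications are performed here instead of nine, which is the kind of saving the gather step is meant to enable when the to-be-processed matrix is sparse.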

First claim

Opening claim text (preview).

What is claimed is:

1. A matrix processing method performed by a graphics processing unit (GPU), comprising: determining, at a processor in the GPU, a plurality of non-zero elements in a to-be-processed matrix; generating, at the processor, a distribution matrix of the to-be-processed matrix, wherein the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-processed matrix; obtaining, at a logic circuit in the processor, a target matrix from another matrix by using the distribution matrix, wherein the target matrix comprises a plurality of target elements from the another matrix, and a position of each of the plurality of target elements in the another matrix corresponds to a position of a non-zero element in the to-be-processed matrix; and performing, at the processor, matrix processing on the plurality of non-zero elements and the target matrix to obtain an operation result.

2. The method according to claim 1, wherein the to-be-processed matrix is a multi-dimensional matrix.

3. The method according to claim 1, wherein the matrix processing comprises a multiply-add operation.

4. The method according to claim 1, wherein the to-be-processed matrix is an image convolution kernel.

5. A graphics processing unit (GPU) for matrix processing, comprising: a processor including at least a logic circuit, wherein the processor is configured to invoke programs stored in a memory coupled to the processor, to perform: determining a plurality of non-zero elements in a to-be-processed matrix; generating a distribution matrix of the to-be-processed matrix, wherein the distribution matrix comprises identities for indicating first positions of the plurality of non-zero elements in the to-be-processed matrix; and perform matrix processing on the plurality of non-zero elements and a target matrix to obtain an operation result; and the logic circuit is further configured to: obtain the target matrix from another matrix by using the distribution matrix, wherein the target matrix comprises a plurality of target elements from the another matrix, and a position of each of the plurality of target elements in the another matrix corresponds to a position of a non-zero element in the to-be-processed matrix.

6. The GPU according to claim 5, wherein the to-be-processed matrix is a multi-dimensional matrix.

7. The GPU according to claim 5, wherein the matrix processing comprises a multiply-add operation.

8. The GPU according to claim 5, wherein the to-be-processed matrix is an image convolution kernel.

9. A matrix processing method performed by a graphics processing unit (GPU), comprising: determining, using at least one processor, a quantity of non-zero elements in a to-be-processed matrix, wherein the to-be-processed matrix is a one-dimensional matrix; generating, using the processor, a distribution matrix of the to-be-processed matrix, wherein the distribution matrix is used to indicate a position of a non-zero element in the to-be-processed matrix; and combining, using the processor, the quantity of non-zero elements, values of all non-zero elements in the to-be-processed matrix arranged sequentially, and the distribution matrix, to obtain a compressed matrix of the to-be-processed matrix.

10. The method according to claim 9, wherein the distribution matrix is a one-dimensional matrix, and all elements in the to-be-processed matrix have a one-to-one correspondence with elements in the distribution matrix that are in same positions as the elements in the to-be-processed matrix; and the generating a distribution matrix of the to-be-processed matrix comprises: sequentially scanning the elements in the to-be-processed matrix; and when a scanned element is non-zero, setting a value of an element, corresponding to the scanned element, in the distribution matrix to 1; or when a value of the scanned element is 0, setting a value of the element, corresponding to the scanned element, in the distribution matrix to 0.

11. The method according to claim 9, wherein there are N elements in the to-be-processed matrix and M non-zero elements in the to-be-processed matrix, and correspondingly, there are N elements in the distribution matrix, M elements whose values are 1 in the distribution matrix, and (M+N+1) elements in the compressed matrix, wherein N is a positive integer, M is a non-negative integer, and M is less than or equal to N.

12. The method according to claim 9, wherein the to-be-processed matrix comprises a first to-be-processed matrix and a second to-be-processed matrix, a quantity of elements in the first to-be-processed matrix is the same as a quantity of elements in the second to-be-processed matrix, and correspondingly, the distribution matrix comprises a first distribution matrix and a second distribution matrix; and the method further comprises: obtaining a target value based on the first distribution matrix, the second distribution matrix, non-zero elements in the first to-be-processed matrix, and non-zero elements in the second to-be-processed matrix, wherein the target value is the same as a result of summing products of each element in the first to-be-processed matrix with an element in the second to-be-processed matrix that is in a same position as the element in the first to-be-processed matrix.

13. The method according to claim 12, further comprising: generating a first non-zero element matrix constructed by sequentially obtaining the non-zero elements in the first to-be-processed matrix, and a second non-zero element matrix constructed by sequentially obtaining the non-zero elements in the second to-be-processed matrix, and wherein the obtaining a target value based on the first distribution matrix, the second distribution matrix, non-zero elements in the first to-be-processed matrix, and non-zero elements in the second to-be-processed matrix comprises: constructing a first mask matrix by sequentially obtaining first target elements from the second distribution matrix according to the first distribution matrix, wherein the first target elements are obtained from the same positions in the second distribution matrix as positions of elements whose values are 1 in the first distribution matrix; constructing a first reduced matrix by sequentially obtaining first valid elements from the first non-zero element matrix according to the first mask matrix, wherein the first valid elements are obtained from the same positions in the first non-zero element matrix as positions of elements whose values are 1 in the first mask matrix; constructing a second mask matrix by sequentially obtaining second target elements from the first distribution matrix according to the second distribution matrix, wherein the second target elements are obtained from the same positions in the first distribution matrix as positions of elements whose values are 1 in the second distribution matrix; constructing a second reduced matrix by sequentially obtaining second valid elements from the second non-zero element matrix according to the second mask matrix, wherein the second valid elements are obtained from the same positions in the second non-zero element matrix as positions of elements whose values are 1 in the second mask matrix; and obtaining the target value by summing products of each element in the first reduced matrix with an element in the second reduced matrix that is in a same position as the element in the first reduced matrix.

14. A matrix processing apparatus, comprising: a graphics processing unit (GPU) comprising a processor and a memory, wherein the memory is configured to store a p
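The one-dimensional compression of claims 9 to 11 and the mask-based target value of claims 12 and 13 can be sketched as follows. This is a hedged illustration in plain Python, not the claimed GPU implementation: the concatenation order inside `compress` (count, then non-zero values, then distribution matrix) follows the order in which claim 9 lists the parts but is otherwise an assumption, and the helper names `compress`, `select`, and `sparse_dot` are hypothetical.

```python
# Illustrative sketch only; names and the exact layout of the compressed
# matrix are assumptions, not taken from the patent text.

def distribution(vec):
    """Claim 10: scan sequentially, emit 1 for non-zero and 0 for zero."""
    return [1 if v != 0 else 0 for v in vec]


def select(values, flags):
    """Sequentially keep values[i] wherever flags[i] is 1."""
    return [v for v, f in zip(values, flags) if f == 1]


def compress(vec):
    """Claims 9 and 11: combine count, non-zero values, and distribution.

    For N elements with M non-zeros the result has M + N + 1 elements.
    """
    dist = distribution(vec)
    nonzeros = select(vec, dist)
    return [len(nonzeros)] + nonzeros + dist


def sparse_dot(a, b):
    """Claims 12 and 13: dot product computed from masks and non-zeros."""
    dist_a, dist_b = distribution(a), distribution(b)
    nz_a, nz_b = select(a, dist_a), select(b, dist_b)
    # First mask: entries of dist_b at positions where dist_a is 1.
    mask_a = select(dist_b, dist_a)
    # Second mask: entries of dist_a at positions where dist_b is 1.
    mask_b = select(dist_a, dist_b)
    # Reduced matrices keep only non-zeros whose partner is also non-zero.
    reduced_a = select(nz_a, mask_a)
    reduced_b = select(nz_b, mask_b)
    return sum(x * y for x, y in zip(reduced_a, reduced_b))


a = [0, 3, 0, 4, 5]   # N = 5, M = 3 (made-up values)
b = [2, 0, 0, 6, 7]

compressed_a = compress(a)   # [3, 3, 4, 5, 0, 1, 0, 1, 1]
target_value = sparse_dot(a, b)   # 4*6 + 5*7
```

Both reduced matrices end up with the same length, because each keeps exactly the positions where both inputs are non-zero, so the final element-wise products line up without any index bookkeeping.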

Assignees

Huawei Tech Co Ltd

Inventors

Classifications

  • G06F17/16 · Primary

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

  • Adding; Subtracting {(G06F7/405 takes precedence)} · CPC title

  • characterised by logic function, e.g. AND, OR, NOR, NOT circuits (H03K19/003 - H03K19/01 take precedence) · CPC title

  • Compression (speech analysis-synthesis for redundancy reduction G10L19/00; for image communication H04N); Expansion; Suppression of unnecessary data, e.g. redundancy reduction · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11734386B2 cover?
A matrix processing method performed by a graphics processing unit (GPU) includes: determining a plurality of non-zero elements in a to-be-processed matrix at a processor in the GPU; generating a distribution matrix of the to-be-processed matrix at the processor, where the distribution matrix comprises identities for indicating positions of the plurality of non-zero elements in the to-be-proces…
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 22, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).