Matrix multiplication on a systolic array

US10769238B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10769238-B2
Application number  US-201916576144-A
Country             US
Kind code           B2
Filing date         Sep 19, 2019
Priority date       Mar 16, 2017
Publication date    Sep 8, 2020
Grant date          Sep 8, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processing element that comprises a first input data bit of the first data matrix and a first activation bit of a second data matrix. The method can also include determining, by the system, at the first processing element, a first partial sum of a third data matrix. Further, the method can include streaming, by the system, the first partial sum of the third data matrix from the first processing element.

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a memory that stores computer executable components; and a processor that executes the computer executable components stored in the memory, wherein the computer executable components comprise: a load manager component that populates respective first registers of all processing elements of a systolic array structure with respective input data bits of a first data matrix, wherein the load manager component further inputs a first activation bit of a second data matrix into a first processing element of the processing elements, and the respective input data bits of the first data matrix are maintained in the respective first registers while a matrix multiplication of the first data matrix and the second data matrix is completed.

2. The system of claim 1, further comprising: a computation component that determines, during the matrix multiplication, at the first processing element, a first partial sum of a third data matrix based on a first product of the first activation bit and a first input data bit of the first data matrix, and a first initial value of the third data matrix.

3. A computer-implemented method, comprising: populating, by a system operatively coupled to a processor, respective first registers of all processing elements of a systolic array structure with respective input data bits of a first data matrix; inputting, by the system, a first activation bit of a second data matrix into a first processing element of the processing elements; and maintaining, by the system, respective input data bits of the first data matrix in the respective first registers while a matrix multiplication of the first data matrix and the second data matrix is completed.

4. The computer-implemented method of claim 3, further comprising: determining, by the system during the matrix multiplication, at the first processing element, a first partial sum of a third data matrix based on a first product of the first activation bit and a first input data bit of the first data matrix, and a first initial value of the third data matrix.

5. The computer-implemented method of claim 4, further comprising: streaming, by the system during the matrix multiplication, the first partial sum of the third data matrix along a first dimension to a second processing element of the processing elements.

6. The computer-implemented method of claim 5, further comprising: determining, by the system during the matrix multiplication, at the second processing element, a second partial sum of the third data matrix based on a second sum of the first partial sum and a second product determined based on a second activation bit of the second data matrix and a second input data bit of the first data matrix stored in the first register of the second processing element.

7. The computer-implemented method of claim 6, further comprising: streaming, by the system during the matrix multiplication, the second partial sum of the third data matrix from the second processing element and along the first dimension.

8. The computer-implemented method of claim 5, further comprising: determining, by the system during the matrix multiplication, during the matrix multiplication, at the second processing element, a second partial sum of the third data matrix based on a second sum of a second product and a second initial value of the third data matrix, wherein the second product is determined based on the first activation bit and a second input data bit of the first data matrix stored in the first register of the second processing element.

9. The computer-implemented method of claim 8, further comprising: streaming, by the system during the matrix multiplication, the second partial sum of the third data matrix from the second processing element and along the first dimension.

10. A computer program product for facilitating matrix multiplication on a systolic array structure, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a processing component to cause the processing component to: populate respective first registers of all processing elements of a systolic array structure with respective input data bits of a first data matrix; input a first activation bit of a second data matrix into a first processing element of the processing elements; and maintain respective input data bits of the first data matrix in the respective first registers while a matrix multiplication of the first data matrix and the second data matrix is completed.

11. The computer program product of claim 10, wherein the program instructions further cause the processing component to: determine, during the matrix multiplication, at the first processing element, a first partial sum of a third data matrix based on a first product of the first activation bit and a first input data bit of the first data matrix, and a first initial value of the third data matrix.

12. The computer program product of claim 11, wherein the program instructions further cause the processing component to: stream, during the matrix multiplication, the first partial sum of the third data matrix along a first dimension to a second processing element of the processing elements.

13. The computer program product of claim 12, wherein the program instructions further cause the processing component to: determine, during the matrix multiplication, at the second processing element, a second partial sum of the third data matrix based on a second sum of the first partial sum and a second product determined based on a second activation bit of the second data matrix and a second input data bit of the first data matrix stored in the first register of the second processing element.

14. The computer program product of claim 13, wherein the program instructions further cause the processing component to: stream, during the matrix multiplication, the second partial sum of the third data matrix from the second processing element and along the first dimension.

15. The computer program product of claim 12, wherein the program instructions further cause the processing component to: determine, during the matrix multiplication, during the matrix multiplication, at the second processing element, a second partial sum of the third data matrix based on a second sum of a second product and a second initial value of the third data matrix, wherein the second product is determined based on the first activation bit and a second input data bit of the first data matrix stored in the first register of the second processing element; and stream, during the matrix multiplication, the second partial sum of the third data matrix from the second processing element and along the first dimension.
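
Illustrative sketch (not part of the patent text). The claims describe a dataflow in which the first data matrix stays fixed in the processing-element registers, values of the second data matrix are fed in, and partial sums of the third data matrix are streamed from one processing element to the next. The Python simulation below is one possible reading of that dataflow, not the patented implementation. Everything in it is an assumption made for illustration: the function name systolic_matmul, the use of integers rather than single bits, the choice to pass activations along array rows and partial sums along array columns, and the one-cycle skew between neighbouring elements.

```python
from typing import List, Optional

Matrix = List[List[int]]


def systolic_matmul(weights: Matrix, activations: Matrix,
                    initial: Optional[Matrix] = None) -> Matrix:
    """Simulate activations @ weights (+ initial) on a K x N array.

    weights     : K x N "first data matrix", held fixed in the PE registers.
    activations : M x K "second data matrix", fed in one value per row per cycle.
    initial     : M x N starting values of the "third data matrix" (default 0).
    """
    K, N = len(weights), len(weights[0])
    M = len(activations)
    if initial is None:
        initial = [[0] * N for _ in range(M)]

    # Values each PE handed to its neighbours at the end of the previous cycle.
    act_reg = [[0] * N for _ in range(K)]
    psum_reg = [[0] * N for _ in range(K)]
    result = [[0] * N for _ in range(M)]

    # The last partial sum leaves the array in cycle (M-1) + (K-1) + (N-1).
    for t in range(M + K + N - 2):
        new_act = [[0] * N for _ in range(K)]
        new_psum = [[0] * N for _ in range(K)]
        for k in range(K):
            for n in range(N):
                if n == 0:
                    # Array edge: activation row m enters PE row k at cycle m + k.
                    m = t - k
                    act_in = activations[m][k] if 0 <= m < M else 0
                else:
                    act_in = act_reg[k][n - 1]
                if k == 0:
                    # Array edge: the initial value for output (m, n) enters at cycle m + n.
                    m = t - n
                    psum_in = initial[m][n] if 0 <= m < M else 0
                else:
                    psum_in = psum_reg[k - 1][n]
                # The stored weight never moves; only activations and sums stream.
                new_act[k][n] = act_in
                new_psum[k][n] = psum_in + act_in * weights[k][n]
        act_reg, psum_reg = new_act, new_psum
        # Finished sums stream out of the last PE row.
        for n in range(N):
            m = t - (K - 1) - n
            if 0 <= m < M:
                result[m][n] = psum_reg[K - 1][n]
    return result


if __name__ == "__main__":
    W = [[1, 2], [3, 4], [5, 6]]   # 3 x 2, stays in the registers
    A = [[1, 0, 2], [0, 1, 1]]     # 2 x 3, streamed through
    print(systolic_matmul(W, A))   # [[11, 14], [8, 10]]
```

In this sketch the element at position (0, n) computes an initial value plus one product, as in claims 2, 4 and 11, and each later element in the same column adds its own product to the partial sum it receives, as in claims 6 and 13. A real array would perform all element updates in parallel each cycle; the nested loops only emulate that.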

Assignees

IBM

Inventors

Classifications

  • G06F17/16 · Primary

    Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10769238B2 cover?
Techniques facilitating matrix multiplication on a systolic array are provided. A computer-implemented method can comprise populating, by a system operatively coupled to a processor, respective first registers of one or more processing elements of a systolic array structure with respective input data bits of a first data matrix. The one or more processing elements can comprise a first processin…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 8, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).