Sparse systolic array design

US11669489B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11669489-B2
Application number  US-202117490830-A
Country             US
Kind code           B2
Filing date         Sep 30, 2021
Priority date       Sep 30, 2021
Publication date    Jun 6, 2023
Grant date          Jun 6, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A systolic array can be configured to skip distributed operands that have zero-values, resulting in improved resource efficiency. A skip module is introduced to receive operands from memory, identify whether they have a zero value or not, and, if they are nonzero, generate an operand vector including an index before sending the operand vector to a processing element.
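The behaviour described in the abstract can be pictured with a small software model. The following is a minimal sketch and not the patented hardware: the names `OperandVector`, `skip_zeros`, and `broadcast_to_row` are hypothetical, and the "operand vector" is assumed here to be just a pair of the operand's position in the sequence and its value.

```python
from typing import Iterable, List, NamedTuple


class OperandVector(NamedTuple):
    """Hypothetical container: a nonzero operand plus its position in the sequence."""
    index: int
    value: float


def skip_zeros(operands: Iterable[float]) -> List[OperandVector]:
    """Emit one operand vector per nonzero operand; zero-value operands are skipped."""
    return [OperandVector(i, v) for i, v in enumerate(operands) if v != 0]


def broadcast_to_row(vectors: List[OperandVector], row_width: int) -> List[List[OperandVector]]:
    """Model 'send to each processing element in the row' as one copy of the stream per element."""
    return [list(vectors) for _ in range(row_width)]


stream = skip_zeros([3.0, 0.0, 0.0, 5.0])
print(stream)
# [OperandVector(index=0, value=3.0), OperandVector(index=3, value=5.0)]
```

In this model, four operands read from memory turn into two operand vectors, so a row of processing elements would spend two steps on them instead of four. The index travels with the value so that each element can still tell which position the operand came from.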

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a first row of processing elements; a memory; and a skip module, the skip module configured to: receive a sequence of operands from the memory, the sequence including at least a first operand and a second operand; generate a first operand vector based on an identification that the first operand is a nonzero operand; skip the second operand based on an identification that the second operand is a zero-value operand; and send the first operand vector to each processing element included in the first row of processing elements.

2. The system of claim 1, wherein a first processing element included in the first row of processing elements is configured to: receive the first operand vector from the skip module; identify the first operand from the first operand vector; receive a third operand; and perform an operation using the first operand and the third operand.

3. The system of claim 2, wherein the operation includes a multiply-accumulate (MAC) operation.

4. The system of claim 3, wherein the first processing element is a 3-way MAC unit.

5. The system of claim 2, further comprising a first register, wherein the first processing element is further configured to store a result of the operation in the first register.

6. The system of claim 5, further comprising: a second register; an operand register; and a second row of processing elements, wherein a second processing element included in the second row of processing elements is configured to: receive a value from the first register; store the value in the second register; receive a second operand vector; identify a fourth operand from the second operand vector; receive a fifth operand from the operand register; and perform a second operation using the fourth operand and the fifth operand.

7. The system of claim 5, wherein the second processing element is further configured to add a second result of the second operation to the value stored in the second register.

8. The system of claim 2, wherein: the performing the operation requires a first number of cycles; the processing elements are configured to execute via a second number of threads; and the second number is greater than the first number.

9. The system of claim 1, wherein sequence of operands includes a third operand and wherein the skip module is further configured to: generate a second operand vector based on an identification that the third operand is a nonzero operand; determine a first index for the first operand vector based on a first position of the first operand within the sequence, wherein the first operand vector includes the first index; and determine a second index for the second operand vector based on a second position of the third operand within the sequence, wherein the second operand vector includes the second index.

10. The system of claim 9, wherein a processing element included in the first row of processing elements is configured to: receive the second operand vector from the skip module; identify the third operand from the second operand vector; identify the second index from the second operand vector; retrieve a fourth operand from an operand register based on the second index; and perform an operation using the third operand and the fourth operand.

11. The system of claim 1, further comprising a second row of processing elements, wherein the skip module is further configured to: identify a first number of nonzero operands for use by the first row of processing elements; identify a second number of nonzero operands for use by the second row of processing elements; and redistribute nonzero operands amongst the first row and the second row based on the first number and the second number.

12. The system of claim 1, wherein the skip module further includes a plurality of multiplexers configured to enable the skip module to select sequential nonzero operands across a range of operands.

13. A skip module apparatus, the skip module apparatus configured to: receive a sequence of operands from a memory, the sequence including at least a first operand a second operand; generate a first operand vector based on an identification that the first operand is a nonzero operand; skip the second operand based on an identification that the second operand is a zero-value operand; and send the first operand vector to each processing element included in a first row of processing elements.

14. The skip module apparatus of claim 13, further configured to: determine a first index for the first operand vector based on a first position of the first operand within the sequence, wherein the first operand vector includes the first index.

15. The skip module apparatus of claim 13, wherein the skip module is further configured to: identify a first number of nonzero operands for use by the first row of processing elements; identify a second number of nonzero operands for use by a second row of processing elements; and redistribute nonzero operands amongst the first row and the second row based on the first number and the second number.

16. A method, comprising: receiving a sequence of operands from a memory, the sequence including at least a first operand and a second operand; generating a first operand vector based on an identification that the first operand is a nonzero operand; skipping the second operand based on an identification that the second operand is a zero-value operand; and sending the first operand vector to each processing element included in a first row of processing elements.

17. The method of claim 16, further comprising: determining a first index for the first operand vector based on a first position of the first operand within the sequence, wherein the first operand vector includes the first index.

18. The method of claim 16, further comprising: identifying a first number of nonzero operands for use by the first row of processing elements; identifying a second number of nonzero operands for use by a second row of processing elements; and redistributing nonzero operands amongst the first row and the second row based on the first number and the second number.

19. The skip module apparatus of claim 13, wherein the sequence of operands includes a third operand and wherein the skip module apparatus is further configured to: generate a second operand vector based on an identification that the third operand is a nonzero operand; and send the second operand vector to each processing elements included in the first row of processing elements.

20. The method of claim 16, wherein the sequence includes a third operand and the method further comprises: generating a second operand vector based on an identification that the third operand is a nonzero operand; and sending the second operand vector to each processing elements included in the first row of processing elements.
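Two of the dependent claims lend themselves to a short illustration: the processing element that uses the index in an operand vector to fetch its second operand and then multiply-accumulates (claims 2, 3, and 10), and the redistribution of nonzero operands between two rows based on their counts (claims 11, 15, and 18). The sketch below is an illustration under assumptions, not the claimed implementation. The names `pe_mac` and `rebalance` are hypothetical, an operand vector is assumed to be an `(index, value)` pair, and the balancing policy (move work until the counts differ by at most one) is one possible choice the claims do not specify. How results for moved operands would be routed back to their original row is not modelled.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding of an operand vector: (index in the sequence, nonzero value).
OperandVector = Tuple[int, float]


def pe_mac(vectors: Sequence[OperandVector],
           operand_register: Sequence[float],
           accumulator: float = 0.0) -> float:
    """Multiply each nonzero operand by the register entry its index selects, and accumulate."""
    for index, value in vectors:
        accumulator += value * operand_register[index]
    return accumulator


def rebalance(row_a: Sequence[OperandVector],
              row_b: Sequence[OperandVector]) -> Tuple[List[OperandVector], List[OperandVector]]:
    """Move operand vectors from the busier row to the lighter one until counts differ by at most one."""
    a, b = list(row_a), list(row_b)
    while len(a) - len(b) > 1:
        b.append(a.pop())
    while len(b) - len(a) > 1:
        a.append(b.pop())
    return a, b


# Sparse sequence [3.0, 0.0, 0.0, 5.0] after zero skipping, paired with a dense operand register.
sparse = [(0, 3.0), (3, 5.0)]
register = [2.0, 7.0, 1.0, 4.0]
print(pe_mac(sparse, register))  # 26.0, the same as the dense dot product
```

The accumulated result matches the dense dot product because every skipped term would have contributed zero. For the balancing step, calling `rebalance` on a row holding four operand vectors and a row holding one leaves them with three and two, so neither row waits long for the other.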

Assignees

IBM

Inventors

Classifications

  • Adding; Subtracting (G06F7/483 - G06F7/491, G06F7/544 - G06F7/556 take precedence) · CPC title

  • Systolic arrays · CPC title

  • Arithmetic instructions · CPC title

  • Multiplying only · CPC title

  • Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F15/8046. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 6, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).