What technology area does this patent fall under?

Primary CPC classification G06F17/16. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Nov 28 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Sparse matrix vector multiplication

US9830302B1 · US · B1

Patent metadata
Field	Value
Publication number	US-9830302-B1
Application number	US-201514685277-A
Country	US
Kind code	B1
Filing date	Apr 13, 2015
Priority date	Apr 16, 2014
Publication date	Nov 28, 2017
Grant date	Nov 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods for multiplying a sparse matrix by a vector using a single instruction multiple data (SIMD) architecture are provided. An example method includes sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows. The sorted rows are split to generate groups of the sorted rows. The number of rows in each group of the sorted rows is equal to the number of rows updated in parallel. The method allows for packing the sorted rows in each of the groups to generate packed rows. Each of the packed rows within the same group has the same length. Per clock cycle, C elements of the packed rows and data for selecting elements of the vector are provided to computational units in the SIMD architecture, where C is the number of computational units.

First claim

Opening claim text (preview).

What is claimed is: 1. A system for multiplying a sparse matrix by a vector using a single instruction multiple data (SIMD) architecture, the system comprising: at least one processor; and a memory communicatively coupled with the at least one processor, the memory storing instructions which when executed by the at least one processor perform a method comprising: sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows; splitting the sorted rows to generate groups of the sorted rows, wherein a number of rows in each group of the sorted rows is equal to a number (R) of rows updated in parallel; packing the sorted rows in each of the groups to generate packed rows, wherein each of the packed rows within the same group has a length; and providing, per clock cycle, C elements of the packed rows to computational units in an SIMD architecture, wherein C is the number of computational units. 2. The system of claim 1 , wherein each of the computational units includes a multiply accumulate (MAC) unit. 3. The system of claim 1 , wherein the rows of the sparse matrix are sorted as one of the following: ascending order of the numbers of the non-zero elements; and descending order of the numbers of the non-zero elements. 4. The system of claim 1 , wherein R is equal to C. 5. The system of claim 1 , wherein R divides C. 6. The system of claim 1 , wherein: packing the rows includes removing zero elements in the rows. 7. The system of claim 6 , wherein packing the rows further includes causing a single row, within one of the groups, having less than the maximum number of non-zero elements for the one group, to have the same length as other of the packed rows in the one group having the maximum number of non-zero elements by zero padding. 8. The system of claim 1 , further comprising prior to sorting the rows: dividing the sparse matrix evenly into blocks of columns, wherein the sorting, splitting, packing, and providing are performed on each of the blocks of the columns separately. 9. The system of claim 1 , further comprising: providing data for selecting elements of the vector to the computational units in the SIMD architecture for allowing them to perform a multiplication of appropriate elements between the sparse matrix and the vector. 10. The system of claim 1 , further comprising, while providing elements of the packed rows: providing row identification information of the packed rows in the sparse matrix to the computational units in the SIMD architecture. 11. A method comprising: sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows; splitting the sorted rows to generate groups of the sorted rows, wherein a number of rows in each group of the sorted rows is equal to a number (R) of rows updated in parallel; packing the sorted rows in each of the groups to generate packed rows, wherein each of the packed rows within the same group has a length; and providing, per clock cycle, C elements of the packed rows to computational units in an SIMD architecture, wherein C is the number of computational units. 12. The method of claim 11 , wherein each of the computational units includes a multiply accumulate (MAC) unit. 13. The method of claim 11 , wherein the rows of the sparse matrix are sorted as one of the following: ascending order of the numbers of the non-zero elements; and descending order of the numbers of the non-zero elements. 14. The method of claim 11 , wherein one of the following conditions is satisfied: R is equal to C; and R divides C. 15. The method of claim 11 , wherein: packing the rows includes removing zero elements in the rows. 16. The method of claim 15 , wherein packing the rows further includes causing a single row, within one of the groups, having less than the maximum number of non-zero elements for the one group, to have the same length as other of the packed rows in the one group having the maximum number of non-zero elements by zero padding. 17. The method of claim 11 , further comprising prior to sorting the rows: dividing the sparse matrix evenly into blocks of columns; and wherein the sorting, splitting, packing, and providing are performed on each of the blocks of the columns separately. 18. The method of claim 11 , further comprising: providing data for selecting elements of the vector to the computational units in the SIMD architecture for allowing them to perform a multiplication of appropriate elements between the sparse matrix and the vector. 19. A non-transitory computer-readable storage medium having embodied thereon instructions, which when executed by at least one processor, perform steps of a method, the method comprising: sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows; splitting the sorted rows to generate groups of the sorted rows, wherein a number of rows in each group of the sorted rows is equal to a number (R) of rows updated in parallel; packing the sorted rows in each of the groups to generate packed rows, wherein each of the packed rows within the same group has a length; and providing, per clock cycle, C elements of the packed rows to computational units in an SIMD architecture, wherein C is the number of computational units.

Assignees

Knowles Electronics Llc

Inventors

Rub Leonardo

Classifications

G06F17/16Primary
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title

Patent family

Related publications grouped by family.

View patent family 60407638

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9830302B1 cover?: Systems and methods for multiplying a sparse matrix by a vector using a single instruction multiple data (SIMD) architecture are provided. An example method includes sorting rows of the sparse matrix by a number of non-zero elements in the rows to generate sorted rows. The sorted rows are split to generate groups of the sorted rows. The number of rows in each group of the sorted rows is equal t…
Who is the assignee on this patent?: Knowles Electronics Llc
What technology area does this patent fall under?: Primary CPC classification G06F17/16. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Nov 28 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).