US9645974B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-9645974-B1 |
| Application number | US-201514644967-A |
| Country | US |
| Kind code | B1 |
| Filing date | Mar 11, 2015 |
| Priority date | Mar 11, 2015 |
| Publication date | May 9, 2017 |
| Grant date | May 9, 2017 |
Official abstract text for this publication.
The present disclosure relates to optimized matrix multiplication using vector multiplication of interleaved matrix values. Two matrices to be multiplied are organized into specially ordered vectors, which are multiplied together to produce a portion of a product matrix.
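The idea in the abstract can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the patent's exact ordering: it interleaves adjacent pairs of rows of the second matrix, builds a repeating-pattern vector from a pair of values in the first matrix, and combines them with a multiply-then-add-adjacent-pairs step to increment one row of the result. The function names (`interleave_pairs`, `multiply_add_adjacent`, `matmul_interleaved`) are hypothetical, and the sketch assumes the shared dimension is even.

```python
def interleave_pairs(u, v):
    """Return [u[0], v[0], u[1], v[1], ...] for two equal-length lists."""
    out = []
    for a, b in zip(u, v):
        out.extend((a, b))
    return out


def multiply_add_adjacent(x, y):
    """Multiply same-index values, then add each adjacent pair of products."""
    products = [a * b for a, b in zip(x, y)]
    return [products[i] + products[i + 1] for i in range(0, len(products), 2)]


def matmul_interleaved(A, B):
    """Multiply A (M x K) by B (K x N); K must be even in this sketch."""
    M, K, N = len(A), len(B), len(B[0])
    if K % 2 != 0 or any(len(row) != K for row in A):
        raise ValueError("shared dimension must be even and match A's row length")
    # One interleaved vector per adjacent pair of rows of B (assumed layout).
    B_pairs = [interleave_pairs(B[k], B[k + 1]) for k in range(0, K, 2)]
    C = [[0] * N for _ in range(M)]
    for i in range(M):
        for p, b_vec in enumerate(B_pairs):
            k = 2 * p
            # Repeating pattern of one pair of values taken from row i of A.
            a_vec = [A[i][k], A[i][k + 1]] * N
            partial = multiply_add_adjacent(a_vec, b_vec)
            # Increment row i of the result by the partial sums.
            for j in range(N):
                C[i][j] += partial[j]
    return C


A = [[1, 2, 3, 4], [5, 6, 7, 8]]
B = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(matmul_interleaved(A, B))  # [[50, 60], [114, 140]]
```

Each call to `multiply_add_adjacent` yields one partial sum per result column, so a whole row of the product is updated per vector operation rather than one element at a time.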
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method executed by one or more processors, the method comprising operations including:
identifying a first matrix and a second matrix to be multiplied to produce a result matrix, wherein the first matrix is defined by rows, each row including a plurality of row values at corresponding row indices, and the second matrix is defined by columns, each column including a plurality of column values at correspond column indices;
creating a first intermediate matrix including the row values from the first matrix ordered such that row values at the same index in adjacent pairs of rows from the first matrix are included at concurrent indices within a same row of the first intermediate matrix, the first intermediate matrix including at least two rows;
creating a second intermediate matrix including the column values from the second matrix ordered such that column values at the same index in adjacent pairs of columns from the second matrix are included at concurrent indices within a same column of the second intermediate matrix, the second intermediate matrix including at least two columns;
for each adjacent pair of rows in the first intermediate matrix: selecting adjacent pairs of columns in the second intermediate matrix, and for each selected adjacent pair of columns, and each pair of column values at the same index in the selected adjacent pair of columns:
initialize a column vector with a repeating pattern of the pair of column values at the same index in the adjacent pair of columns;
multiply the column vector by row vectors from the adjacent pair of rows containing row values at the same index as the pair of column values; and
increment numeric values in a row of the result matrix corresponding to the adjacent pair of rows by a product of multiplying the column vector by the row vectors.

2. The method of claim 1, wherein the one or more processor include a matrix multiplication instruction, and the method does not include executing the matrix multiplication instruction.

3. The method of claim 1, wherein the one or more processor include a matrix multiplication instruction operable to perform the operations and the method further comprises, before identifying the first matrix and the second matrix, executing the matrix multiplication instruction to perform the operations.

4. The method of claim 1, wherein multiplying the column vector by the row vectors includes executing a vector multiplication instruction included in the one or more processors and operable to multiply values at same indices in the column vector and each row vector together to produce a temporary vector, and to add pairs of adjacent values together to produce a product vector.

5. The method of claim 4, wherein multiplying the column vector by the row vectors includes converting each product vector into two larger product vectors, each larger product vector including twice as many bits as the product vector.

6. The method of claim 5, wherein converting each product vector into two larger product vectors includes multiplying each product vector by a vector including corresponding values of one for each value in the product vector.

7. The method of claim 1, wherein the one or more processors are a plurality of processors connected by a communications network, and the plurality of processors are operable to perform multiple instances of initializing the column vector and multiplying the column vector by the row vectors in parallel.

8. A computer-implemented method executed by one or more processors, the method comprising operations including:
identifying a first matrix and a second matrix to be multiplied to produce a result matrix, wherein the first matrix is defined by columns, each column including a plurality of column values at corresponding column indices, and the second matrix is defined by rows, each row including a plurality of row values at correspond row indices;
creating a first intermediate matrix including the column values from the first matrix ordered such that column values at the same index in adjacent pairs of columns from the first matrix are included at concurrent indices within a same column of the first intermediate matrix, the first intermediate matrix including at least two columns;
creating a second intermediate matrix including the row values from the second matrix ordered such that row values at the same index in adjacent pairs of rows from the second matrix are included at concurrent indices within a same row of the second intermediate matrix, the second intermediate matrix including at least two rows;
for each adjacent pair of columns in the first intermediate matrix: selecting adjacent pairs of rows in the second intermediate matrix, and for each selected adjacent pair of rows, and each pair of row values at the same index in the selected adjacent pair of rows:
initialize a row vector with a repeating pattern of the pair of row values at the same index in the adjacent pair of rows;
multiply the row vector by column vectors from the adjacent pair of columns containing column values at the same index as the pair of row values; and
increment numeric values in a column of the result matrix corresponding to the adjacent pair of columns by a product of multiplying the row vector by the column vectors.

9. The method of claim 8, wherein the one or more processor include a matrix multiplication instruction, and the method does not include executing the matrix multiplication instruction.

10. The method of claim 8, wherein the one or more processor include a matrix multiplication instruction operable to perform the operations and the method further comprises, before identifying the first matrix and the second matrix, executing the matrix multiplication instruction to perform the operations.

11. The method of claim 8, wherein multiplying the row vector by the column vectors includes executing a vector multiplication instruction included in the one or more processors and operable to multiply values at same indices in the row vector and each column vector together to produce a temporary vector, and to add pairs of adjacent values together to produce a product vector.

12. The method of claim 11, wherein multiplying the row vector by the column vectors includes converting each product vector into two larger product vectors, each larger product vector including twice as many bits as the product vector.

13. The method of claim 12, wherein converting each product vector into two larger product vectors includes multiplying each product vector by a vector including corresponding values of one for each value in the product vector.

14. The method of claim 8, wherein the one or more processors are a plurality of processors connected by a communications network, and the plurality of processors are operable to perform multiple instances of initializing the row vector and multiplying the row vector by the column vectors in parallel.

15. A system comprising:
memory for storing data; and
one or more processors operable to perform operations comprising:
identifying a first matrix and a second matrix to be multiplied to produce a result matrix, wherein the first matrix is defined by rows, each row including a plurality of row values at corresponding row indices, and the second matrix is defined by columns, each column including a plurality of column values at correspond column indices;
creating a first intermediate matrix including the row values from the first matrix ordered such that row values at the same index in adjacent pairs of rows from the first matrix are included at concurrent ind
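
Claims 4 to 6 (and their counterparts 11 to 13) turn on a vector instruction that multiplies same-index values and then adds adjacent pairs, and on moving the result into wider values. The claims do not name an instruction or any lane widths, so the following is only a rough model: the 16-, 32-, and 64-bit widths and the helper names `wrap_signed` and `multiply_add_pairs` are assumptions chosen for illustration. It does not attempt the split into two larger vectors recited in claim 5; it only shows why lane width matters for the pairwise sums and how multiplying by a vector of ones can carry values into wider lanes.

```python
def wrap_signed(value, bits):
    """Wrap an integer into the signed two's-complement range for `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >= (1 << (bits - 1)) else value


def multiply_add_pairs(x, y, lane_bits):
    """Multiply same-index values, add adjacent pairs, store in `lane_bits`-bit lanes."""
    # The "temporary vector": one product per input position.
    temporary = [a * b for a, b in zip(x, y)]
    # The "product vector": one sum per adjacent pair, wrapped to the lane width.
    return [wrap_signed(temporary[i] + temporary[i + 1], lane_bits)
            for i in range(0, len(temporary), 2)]


x = [200, 200, 3, 4]
y = [100, 100, 5, 6]

# Assumed 16-bit lanes: the first pairwise sum (40000) does not fit and wraps.
print(multiply_add_pairs(x, y, lane_bits=16))  # [-25536, 39]

# Assumed 32-bit lanes: both sums are preserved.
wide = multiply_add_pairs(x, y, lane_bits=32)
print(wide)  # [40000, 39]

# Multiplying by ones leaves each value unchanged, so the same step
# simply adds adjacent values and stores them in wider lanes.
ones = [1] * len(wide)
print(multiply_add_pairs(wide, ones, lane_bits=64))  # [40039]
```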
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title