Shift register with reduced wiring complexity
US-2017251184-A1 · Aug 31, 2017 · US
US9986187B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9986187-B2 |
| Application number | US-201715628527-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 20, 2017 |
| Priority date | Jul 1, 2016 |
| Publication date | May 29, 2018 |
| Grant date | May 29, 2018 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method is described that includes, on an image processor having a two dimensional execution lane array and a two dimensional shift register array, repeatedly shifting first content of multiple rows or columns of the two dimensional shift register array and repeatedly executing at least one instruction between shifts that operates on the shifted first content and/or second content that is resident in respective locations of the two dimensional shift register array that the shifted first content has been shifted into.
Opening claim text (preview).
The invention claimed is: 1. A processor comprising: a two-dimensional array of processing elements; and multiple shift-register planes, each shift-register plane comprising a separate two-dimensional shift-register array, wherein each shift register of each shift-register array is dedicated to one of the processing elements, wherein the processor is configured to execute instructions to perform a matrix multiplication operation on a first matrix stored in a first shift-register plane and a second matrix stored in a second shift-register plane to generate a result of the matrix multiplication operation stored in a third shift-register plane, wherein the first matrix comprises N columns, and wherein the instructions cause each processing element to perform, for N iterations, operations on shift registers dedicated to the processing element, the operations comprising: performing a multiplication operation between a first value stored in a first shift register of the first shift-register plane and a second value stored in a second shift register of the second shift-register plane, performing an addition operation between a result of the multiplication operation and a summation value stored in a third shift register of the third shift-register plane, updating the summation value stored in the third shift register of the third shift-register plane using a result of the addition operation, and based on determining that a current iteration is not a last iteration: shifting data in the first shift-register plane one unit in a first dimension corresponding to rows of the first matrix, and shifting data in the second shift-register plane one unit in a second dimension corresponding to columns of the second matrix. 2. The processor of claim 1 , wherein the first matrix comprises R rows and the second matrix comprises S columns, and wherein the operations further comprise: performing a row-wise rotational shearing transformation on the first matrix including shifting each row i, from row 0 to row R−1, by i units; and performing a column-wise rotational shearing transformation on the second matrix including shifting each column j, from column 0 to column S−1, by j units. 3. The processor of claim 1 , wherein the operations further comprise: loading an initial condition for the matrix multiplication by loading initial values into shift registers of the third shift-register plane. 4. The processor of claim 1 , wherein the operations further comprise computing a two-dimensional discrete Fourier transform (2D DFT) using a first matrix of real values, a first matrix of imaginary values, a second matrix of real values, and a second matrix of imaginary values including: computing a real portion of the 2D DFT including: performing a first matrix multiplication operation between the first matrix of real values and the second matrix of real values, performing a second matrix multiplication operation between the first matrix of imaginary values and the second matrix of imaginary values, and subtracting a result of the second matrix multiplication operation from the first matrix multiplication operation; and computing an imaginary portion of the 2D DFT including: performing a third matrix multiplication operation between the first matrix of real values and the second matrix of imaginary values, performing a fourth matrix multiplication operation between the second matrix of real values and the first matrix of imaginary values, and adding a result of the third matrix multiplication operation and a result of the fourth matrix multiplication operation. 5. The processor of claim 4 , wherein the operations further comprise: moving the computed real portion of the 2D DFT to another shift-register plane or to memory; reloading the first matrix of real values and the first matrix of imaginary values back into respective shift-register planes; and performing a row-wise rotational shearing transformation on the first matrix of real values and the first matrix of imaginary values. 6. A computer program product, encoded on one or more non-transitory computer storage media, comprising instructions that when executed by a processor comprising: a two-dimensional array of processing elements, and multiple shift-register planes, each shift-register plane comprising a separate two-dimensional shift-register array, wherein each shift register of each shift-register array is dedicated to one of the processing elements, cause the processor to perform a matrix multiplication operation on a first matrix stored in a first shift-register plane and a second matrix stored in a second shift-register plane to generate a result of the matrix multiplication operation stored in a third shift-register plane, wherein the first matrix comprises N columns, and wherein the instructions cause each processing element to perform, for N iterations, operations on shift registers dedicated to the processing element, the operations comprising: performing a multiplication operation between a first value stored in a first shift register of the first shift-register plane and a second value stored in a second shift register of the second shift-register plane, performing an addition operation between a result of the multiplication operation and a summation value stored in a third shift register of the third shift-register plane, updating the summation value stored in the third shift register of the third shift-register plane using a result of the addition operation, and based on determining that a current iteration is not a last iteration: shifting data in the first shift-register plane one unit in a first dimension corresponding to rows of the first matrix, and shifting data in the second shift-register plane one unit in a second dimension corresponding to columns of the second matrix. 7. The computer program product of claim 6 , wherein the first matrix comprises R rows and the second matrix comprises S columns, and wherein the operations further comprise: performing a row-wise rotational shearing transformation on the first matrix including shifting each row i, from row 0 to row R−1, by i units; and performing a column-wise rotational shearing transformation on the second matrix including shifting each column j, from column 0 to column S−1, by j units. 8. The computer program product of claim 6 , wherein the operations further comprise: loading an initial condition for the matrix multiplication by loading initial values into shift registers of the third shift-register plane. 9. The computer program product of claim 6 , wherein the operations further comprise computing a two-dimensional discrete Fourier transform (2D DFT) using a first matrix of real values, a first matrix of imaginary values, a second matrix of real values, and a second matrix of imaginary values including: computing a real portion of the 2D DFT including: performing a first matrix multiplication operation between the first matrix of real values and the second matrix of real values, performing a second matrix multiplication operation between the first matrix of imaginary values and the second matrix of imaginary values, and subtracting a result of the second matrix multiplication operation from the first matrix multiplication operation; and computing an imaginary portion of the 2D DFT including: performing a third matrix multiplication operation between the first matrix of real values and the second matrix of imaginary values, performing a fourth matrix multiplication operation between the second matrix of real values and the first matrix of imaginary values, and adding a result of the third matrix multiplication operation and a result of the fourth matrix multiplication operat
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Horizontal readout lines, multiplexers or registers · CPC title
Arithmetic instructions · CPC title
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
Two dimensional arrays, e.g. mesh, torus · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.