Dataflow architecture processor statically reconfigurable to perform n-dimensional affine transformation
US-2024233068-A1 · Jul 11, 2024 · US
US12475524B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12475524-B2 |
| Application number | US-202318095137-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 10, 2023 |
| Priority date | Jan 10, 2023 |
| Publication date | Nov 18, 2025 |
| Grant date | Nov 18, 2025 |
Official abstract text for this publication.
A statically reconfigurable dataflow architecture processor (SRDAP) that performs an N-dimensional affine transform, specified by a matrix, on an input image to produce an output image includes L address pattern memory units (PMUs), each comprising a memory arranged as a vector of L banks, and L corresponding data PMUs. Each data PMU receives a copy of the input image. In parallel, each address PMU writes an L-vector of addresses of input pixels to its vector of L banks and reads a single address of the written L-vector from a predetermined bank corresponding to the PMU number of that address PMU among the L address PMUs, and each data PMU receives the single address from its corresponding address PMU and uses it to read a single input pixel from the data PMU memory. A tree of pattern compute units coalesces the L single input pixels into an L-vector of input pixels.
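The gather step described in the abstract can be modelled in software. The following is a minimal sketch, not the patented hardware: the function name `gather_step`, the 8-pixel image, and the example address vector are all hypothetical, and the parallel PMUs are simulated with an ordinary loop.

```python
def gather_step(addr_vector, image_copies):
    """Model one parallel step: an L-vector of addresses in, an L-vector of pixels out."""
    L = len(addr_vector)
    assert len(image_copies) == L, "one image copy per data PMU"
    single_pixels = []
    for pmu in range(L):  # in hardware these L iterations would run in parallel
        # Address PMU `pmu` writes the whole L-vector across its L banks ...
        banks = list(addr_vector)
        # ... and reads back only the bank that matches its own PMU number.
        single_addr = banks[pmu]
        # The paired data PMU uses that address to read one pixel from its own copy.
        single_pixels.append(image_copies[pmu][single_addr])
    # Stand-in for the PCU tree: the L single pixels form one L-vector.
    return single_pixels


L = 4
flat_image = [10, 11, 12, 13, 14, 15, 16, 17]        # hypothetical flattened input image
image_copies = [list(flat_image) for _ in range(L)]  # each data PMU holds its own copy
out_vector = gather_step([7, 0, 3, 3], image_copies)
print(out_vector)  # [17, 10, 13, 13]
```

One plausible reading of why each data PMU holds a full copy is that the L addresses in a vector are unrelated to one another, so every lane may need any pixel in the same step; the abstract itself does not state this rationale.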
Opening claim text (preview).
The invention claimed is:

1. A statically reconfigurable dataflow architecture processor (SRDAP) to perform an N-dimensional affine transform specified by a matrix on an N-dimensional input image to produce an N-dimensional output image comprising output pixels, each output pixel having a coordinate in each of the N dimensions, wherein N is an integer having a value of at least two, comprising:
a plurality of pattern memory units (PMUs), each comprising a respective memory, including L data PMUs respectively corresponding to L address PMUs, the respective memories of the L address PMUs each arranged as a vector of L banks, wherein L is a positive integer; and
a plurality of pattern compute units (PCUs);
wherein each data PMU of the L data PMUs is statically reconfigurable to receive a copy of the input image and to write the copy of the input image into the respective memory of the data PMU;
wherein, in parallel, each address PMU of the L address PMUs is further statically reconfigurable to:
write an L-vector of addresses of input pixels to the vector of L banks in the respective memory of the address PMU, wherein each address of an input pixel comprises flattened coordinates of the input pixel calculated by application of a respective row of the transform matrix to coordinates of an output pixel; and
read a single address of the written L-vector of addresses from a predetermined bank of the L banks of the respective memory of the address PMU, wherein the predetermined bank corresponds to a PMU number of the address PMU among the L address PMUs;
wherein, in parallel, each data PMU of the L data PMUs is further statically reconfigurable to receive the single address from the address PMU corresponding to the data PMU, and use the single address to read a single input pixel from the respective memory of the data PMU, whereby a set of L single input pixels are produced; and
wherein three or more PCUs of the plurality of PCUs arranged as a tree of PCUs is statically reconfigurable to coalesce the set of L single input pixels read in parallel from the L data PMUs into an L-vector of input pixels.

2. The SRDAP of claim 1, further comprising: configuration stores loadable with configuration data to statically reconfigure the SRDAP.

3. The SRDAP of claim 2, wherein to statically reconfigure the SRDAP comprises loading the configuration stores with the configuration data prior to initiation of production of the output image without re-loading the configuration stores with the configuration data until completion of production of the output image.

4. The SRDAP of claim 1, further comprising: an output PMU of the plurality of PMUs, wherein the output PMU is statically reconfigurable to receive the coalesced L-vector of input pixels from the tree of PCUs to write into the respective memory of the output PMU.

5. The SRDAP of claim 4, wherein the SRDAP is statically reconfigurable to sustain writing a series of coalesced L-vector of input pixels, including the coalesced L-vector of input pixels, from the tree of PCUs to the respective memory of the output PMU at a throughput of at least one coalesced L-vector of input pixels of the series per N clock cycles.

6. The SRDAP of claim 4, wherein to form the output image in the output PMU:
each address PMU of the L address PMUs is further statically reconfigurable to: write a series of L-vectors of addresses of input pixels, including the L-vector of addresses of input pixels, to the vector of L banks in the respective memory of the address PMU; and read a series of single addresses of the written L-vector of addresses, including the single address of the written L-vector of addresses, from the predetermined bank;
each data PMU of the L data PMUs is further statically reconfigurable to receive the series of the single addresses from the address PMU corresponding to the data PMU, and use the series of the single addresses to read a series of single input pixels, including the single input pixel, from the respective memory of the data PMU, whereby a series of sets of L single input pixels, including the set of L single input pixels, are produced;
the tree of PCUs is further statically reconfigurable to coalesce the series of sets of L single input pixels into a series of L-vectors of input pixels, including the L-vector of input pixels; and
the output PMU is further statically reconfigurable to receive the series of L-vectors of input pixels from the tree of PCUs to write into the respective memory of the output PMU.

7. The SRDAP of claim 6, further comprising: one or more switches statically reconfigurable to receive the series of L-vectors of input pixels and to broadcast a copy of each of the L-vectors of input pixels of the series of L-vectors of input pixels to each of the L address PMUs for writing to the vector of L banks.

8. The SRDAP of claim 6, wherein each address PMU of the L address PMUs comprises a counter that provides an address into the respective memory of the address PMU; and wherein the counter is statically reconfigurable with an initial value equal to the PMU number of the address PMU, a stride value equal to L, and a maximum value equal to a size of the output image.

9. The SRDAP of claim 6, wherein the series of sets of L single input pixels comprises a number of sets of single input pixels equal to a quotient of a size of the output image divided by L; and wherein each data PMU of the L data PMUs comprises a counter statically reconfigurable to count a number of times to control the data PMU to read from the respective memory of the data PMU to form the series of single input pixels.

10. The SRDAP of claim 1, further comprising: one or more switches statically reconfigurable to receive the input image and to broadcast the copies of the input image to the L data PMUs.

11. The SRDAP of claim 10, wherein the one or more switches are statically reconfigurable to receive the input image as a series of L-vectors of initial input pixels from a memory external to the SRDAP and to broadcast the copies of the input image to the L data PMUs as the series of the L-vectors of initial input pixels; wherein the series of the L-vectors of initial input pixels comprises a number of the L-vectors of initial input pixels equal to a quotient of a size of the output image divided by L; and wherein each data PMU of the L data PMUs comprises a counter statically reconfigurable to count a number of times to control the data PMU to write an L-vector of initial input pixels of the series of the L-vectors of input pixels to the respective memory of the data PMU.

12. The SRDAP of claim 1, wherein the tree of PCUs comprises:
a first level of L/2 PCUs each configured to receive a respective two of the set of L single input pixels and to coalesce the respective two single input pixels into a respective 2-vector of input pixels;
P intermediate levels of L/(4*J) PCUs each, wherein each intermediate level is denoted J, J is an integer from 1 through P, and P is an integer no less than (log2 L)−2, wherein each PCU of intermediate level J is configured to receive a respective two (2^J)-vectors of input pixels from a previous intermediate level J−1 and to coalesce the respective two (2^J)-vectors of input pixels into a respective (2^(J+1))-vector of input pixels; and
a last level of one PCU configured to receive two L/2-vectors of input pixels from a previous intermediate level P and to coalesce the two L/2-vectors of input pixels into the L-vector of input pixels.

13. The SRDAP of clai
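Three mechanisms recited in the claims above lend themselves to a small software illustration: the flattened input address of claim 1, the strided counter of claim 8, and the pairwise coalescing tree of claim 12. The sketch below is illustrative only. All function names are hypothetical, and several choices are assumptions the claims do not specify: homogeneous coordinates, rounding to the nearest integer, clamping to the image bounds, row-major flattening, an exclusive counter maximum, and L being a power of two.

```python
def flattened_input_address(matrix_rows, out_coord, in_shape):
    """Apply one matrix row per dimension to an output coordinate, then flatten.

    Assumed (not from the claims): homogeneous coordinates, nearest-integer
    rounding, clamping to the image bounds, row-major flattening.
    """
    homogeneous = list(out_coord) + [1]
    address = 0
    for row, extent in zip(matrix_rows, in_shape):
        coord = round(sum(m * c for m, c in zip(row, homogeneous)))
        coord = min(max(coord, 0), extent - 1)  # clamp into the input image
        address = address * extent + coord      # row-major flattening
    return address


def strided_counter(pmu_number, L, output_size):
    """Addresses visited by one address PMU: start at its PMU number, step by L.

    Assumed: the maximum value (the output image size) is exclusive.
    """
    return list(range(pmu_number, output_size, L))


def coalesce_tree(single_pixels):
    """Pairwise-merge L single pixels into one L-vector, level by level."""
    L = len(single_pixels)
    assert L >= 2 and (L & (L - 1)) == 0, "sketch assumes L is a power of two"
    level = [[p] for p in single_pixels]
    pcus_per_level = []
    while len(level) > 1:
        # Each simulated PCU concatenates two neighbouring vectors.
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
        pcus_per_level.append(len(level))
    return level[0], pcus_per_level


# Translation by (1, 2) on a 3x4 image: output (0, 0) maps to input (1, 2).
print(flattened_input_address([[1, 0, 1], [0, 1, 2]], (0, 0), (3, 4)))  # 6
print(strided_counter(1, 4, 12))           # [1, 5, 9]
print(coalesce_tree(list(range(8))))       # ([0, 1, 2, 3, 4, 5, 6, 7], [4, 2, 1])
```

For L = 8 the simulated tree uses 4, 2, and 1 PCUs per level, which agrees with the first level of L/2, one intermediate level of L/(4*J) with J = 1, and a last level of one PCU as recited in claim 12. If the address memory is interleaved so that flat address a falls in bank a mod L (an assumption), every address produced by `strided_counter` for a given PMU lands in the bank matching that PMU's number.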