US2018373902A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2018373902-A1 |
| Application number | US-201616063892-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 21, 2016 |
| Priority date | Jan 21, 2016 |
| Publication date | Dec 27, 2018 |
| Grant date | — |
Official abstract text for this publication.
A circuit includes an engine to compute analog multiplication results between vectors of a sub-matrix. An analog to digital converter (ADC) generates a digital value for the analog multiplication results computed by the engine. A shifter shifts the digital value of analog multiplication results a predetermined number of bits to generate a shifted result. An adder adds the shifted result to the digital value of a second multiplication result to generate a combined multiplication result.
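The shift-and-add step in the abstract can be illustrated with a small Python sketch. It is not the patent's implementation: it assumes each sub-matrix holds one bit-slice of the weights (the abstract only says the results are shifted and added), it models the engine as an ideal integer dot product, and it models the ADC as a simple clamp. The names `analog_dot`, `adc`, `split_weights`, and `shift_and_add` are hypothetical.

```python
def analog_dot(vector, column):
    """Stand-in for one engine: an ideal dot product (no analog noise modelled)."""
    return sum(v * w for v, w in zip(vector, column))


def adc(value, bits):
    """Idealized ADC (assumption): clamp a non-negative result to `bits` bits."""
    return min(int(value), (1 << bits) - 1)


def split_weights(weights, slice_bits):
    """Assumed mapping: split each weight into a high and a low bit-slice."""
    mask = (1 << slice_bits) - 1
    high = [w >> slice_bits for w in weights]
    low = [w & mask for w in weights]
    return high, low


def shift_and_add(high_result, low_result, shift_bits):
    """Shifter plus adder: shift the first result, then add the second."""
    return (high_result << shift_bits) + low_result


# 4-bit weights split into two 2-bit slices, one slice per engine.
weights = [13, 7, 10]
vector = [1, 2, 3]
high, low = split_weights(weights, slice_bits=2)

high_result = adc(analog_dot(vector, high), bits=8)
low_result = adc(analog_dot(vector, low), bits=8)
combined = shift_and_add(high_result, low_result, shift_bits=2)

# The combined result matches the full-precision dot product.
direct = analog_dot(vector, weights)
print(combined, direct)
```

In this sketch the shift amount equals the slice width, so the two partial results line up at the correct bit positions before the addition.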
Opening claim text (preview).
What is claimed is:
1. A circuit, comprising: an engine formed from a memristor array to compute analog multiplication results between vectors of a sub-matrix, the sub-matrix is programmed from a portion of an input matrix; an analog to digital converter (ADC) to generate a digital value for the analog multiplication results computed by the engine; a shifter to shift the digital value of analog multiplication result a predetermined number of bits to generate a shifted result; and an adder to add the shifted result to the digital value of a second multiplication result to generate a combined multiplication result.
2. The circuit of claim 1, wherein the engine configured to perform a matrix dot product operation between the vectors, a matrix cross product operation between the vectors, or a multiply operation between two scalar values.
3. The circuit of claim 1, further comprising a digital to analog converter (DAC) to generate analog representations of the vectors of the sub-matrix.
4. The circuit of claim 3, further comprising a vector buffer to store the vectors to be digitized by the DAC.
5. The circuit of claim 3, further comprising another engine that is configured as a cluster of engines with the engine, with the output of each engine in the cluster combined to form the combined multiplication result.
6. The circuit of claim 5, further comprising a configuration register and a truncation register, the configuration register to dynamically specify a number of DAC bits utilized by the DAC, a number of cell levels in a respective matrix, a number of bits in the ADC output, and a number for shifting the number of bits to generate the shifted result, the truncation register to truncate output from the adder to a predetermined bit width.
7. The circuit of claim 5, wherein the engines communicate across an active h-tree within the cluster of engines and the shift width varies at each level of the h-tree.
8. The circuit of claim 7, further comprising at least one other cluster of engines operating in parallel to the cluster of engines to process another portion of the input matrix, wherein output from each cluster of engines is added to form an overall multiplication result for the input matrix.
9. The circuit of claim 8, further comprising an analog to digital converter (ADC) array that is shared between at least two clusters to generate digital values for analog computation results from the respective clusters.
10. The circuit of claim 9, further comprising a system controller to control the ADC array and to aggregate the computation results from the respective clusters.
11. A circuit, comprising: a first cluster to compute a first intermediate result by multiplying vectors from a first portion of an input matrix; a second cluster to compute a second intermediate result by multiplying vectors from a second portion of the input matrix; an analog to digital converter (ADC) to digitize the first and second intermediate results, respectively; and a controller to combine the digitized results of the first and second intermediate results, respectively, wherein each of the first and second clusters include: a plurality of engines formed from a memristor array to compute analog multiplication results between vectors of a sub-matrix, the sub-matrix is programmed from a portion of the input matrix; a shifter to shift a digital value of a first cluster analog multiplication result a predetermined number of bits to generate a shifted cluster result; and an adder to add the shifted cluster result to a digital value of a second cluster multiplication result to generate a combined multiplication result from the first cluster and the second cluster.
12. The circuit of claim 11, further comprising a digital to analog converter (DAC) to generate analog representations of the vectors of the first and second cluster, respectively.
13. The circuit of claim 12, further comprising a configuration register to dynamically specify a number of DAC bits utilized by the DAC, a number of cell levels in a respective matrix, a number of bits in the ADC output of the ADC array, and a number for shifting the number of bits to generate the shifted cluster result.
14. A method, comprising: computing a first analog multiplication result between vectors of a first sub-matrix, the first sub-matrix is programmed from a portion of an input matrix; computing a second analog multiplication result between vectors of a second sub-matrix, the second sub-matrix is programmed from another portion of the input matrix; generating a digital value for the first and second analog multiplication results, respectively; shifting the digital value of first analog multiplication result a predetermined number of bits to generate a shifted result; and adding the shifted result to the digital value of the second multiplication result to generate a combined multiplication result from the first sub-matrix and the second sub-matrix.
15. The method of claim 14, performing a matrix dot product operation between the vectors, performing a matrix cross product operation between the vectors, or performing a multiply operation between two scalar values.
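Claims 6 and 7 describe a cluster whose engines are combined over an h-tree with a shift width that varies per level, followed by truncation to a fixed bit width. The Python sketch below is one possible reading, not the claimed circuit: it assumes four engines that each hold a 2-bit slice of 8-bit weights, a shift width that doubles at each tree level, and a truncation that keeps the least-significant bits. The claims do not say which bits the truncation register keeps, and the names `slice_weights`, `tree_reduce`, and `truncate` are hypothetical.

```python
def slice_weights(weights, slice_bits, num_slices):
    """Assumed mapping: bit-slices of each weight, most-significant slice first."""
    mask = (1 << slice_bits) - 1
    return [
        [(w >> (slice_bits * i)) & mask for w in weights]
        for i in reversed(range(num_slices))
    ]


def dot(vector, column):
    """Ideal stand-in for one engine's digitized dot-product result."""
    return sum(v * w for v, w in zip(vector, column))


def tree_reduce(results, slice_bits):
    """Pairwise shift-and-add up a binary tree.

    `results` must have a power-of-two length. The shift width doubles at
    each level because each node then covers twice as many weight bits.
    """
    shift = slice_bits
    while len(results) > 1:
        results = [
            (results[i] << shift) + results[i + 1]
            for i in range(0, len(results), 2)
        ]
        shift *= 2
    return results[0]


def truncate(value, width):
    """Assumed truncation: keep only the `width` least-significant bits."""
    return value & ((1 << width) - 1)


weights = [201, 54, 255]  # 8-bit weights
vector = [2, 1, 3]

slices = slice_weights(weights, slice_bits=2, num_slices=4)
partials = [dot(vector, s) for s in slices]  # one partial result per engine
combined = tree_reduce(partials, slice_bits=2)
final = truncate(combined, width=16)

print(final == dot(vector, weights))
```

With a 16-bit truncation width the result fits and equals the full-precision dot product; a narrower width would discard bits, which is the trade-off a truncation register of this kind would control.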
for multiplication or division {(G06G7/19 and G06G7/24 take precedence; measuring electric power G01R21/00)} · CPC title
Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22) · CPC title
using multiple magnetic layers (G11C11/155 takes precedence) · CPC title
Digital stores characterised by the use of storage elements not covered by groups G11C11/00, G11C23/00, or G11C25/00 · CPC title
Data managing, e.g. manipulating data before writing or reading out, data bus switches or control circuits therefor · CPC title