Vector computational unit
US-2024427729-A1 · Dec 26, 2024 · US
US12340304B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12340304-B2 |
| Application number | US-202117398791-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 10, 2021 |
| Priority date | Aug 10, 2021 |
| Publication date | Jun 24, 2025 |
| Grant date | Jun 24, 2025 |
Official abstract text for this publication.
Methods and apparatus for performing machine learning tasks, and in particular, to a neural-network-processing architecture and circuits for improved handling of partial accumulation results in weight-stationary operations, such as operations occurring in compute-in-memory (CIM) processing elements (PEs). One example PE circuit for machine learning generally includes an accumulator circuit, a flip-flop array having an input coupled to an output of the accumulator circuit, a write register, and a first multiplexer having a first input coupled to an output of the write register, having a second input coupled to an output of the flip-flop array, and having an output coupled to a first input of the first accumulator circuit.
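The partial-accumulation path named in the abstract can be pictured with a small behavioral model. This is only a sketch under stated assumptions: the class name `PartialSumPath`, the `lanes` parameter, and the `step` and `load_write_register` methods are hypothetical, and bit widths, clocking, and control signals are not taken from the publication.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PartialSumPath:
    """Hypothetical behavioral model of the accumulator / flip-flop array /
    write register / multiplexer loop. Not the patented circuit itself."""

    lanes: int
    flip_flops: List[int] = field(init=False)      # flip-flop array contents
    write_register: List[int] = field(init=False)  # values staged for restore

    def __post_init__(self) -> None:
        self.flip_flops = [0] * self.lanes
        self.write_register = [0] * self.lanes

    def load_write_register(self, values: List[int]) -> None:
        # Assumption: values arrive from outside the PE, for example
        # partial sums that were saved earlier and are now brought back.
        if len(values) != self.lanes:
            raise ValueError("expected one value per lane")
        self.write_register = list(values)

    def step(self, incoming: List[int], select_write_register: bool) -> List[int]:
        # Multiplexer: pick restored partial sums or the locally held ones.
        mux_out = self.write_register if select_write_register else self.flip_flops
        # Accumulator: add the incoming data to the multiplexer output,
        # then latch the result in the flip-flop array.
        self.flip_flops = [m + x for m, x in zip(mux_out, incoming)]
        return list(self.flip_flops)


path = PartialSumPath(lanes=2)
path.step([1, 2], select_write_register=False)       # accumulate locally
local = path.step([3, 4], select_write_register=False)
path.load_write_register([100, 200])                  # restore saved sums
resumed = path.step([5, 6], select_write_register=True)
```

In this model the multiplexer select is what lets an interrupted accumulation resume from restored values instead of from whatever the flip-flop array currently holds.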
Opening claim text (preview).
What is claimed is:
1. A processing element (PE) circuit comprising: a first accumulator circuit; a flip-flop array having an input coupled to an output of the first accumulator circuit; a write register; a first multiplexer having a first input coupled to an output of the write register, having a second input coupled to an output of the flip-flop array, and having an output coupled to a first input of the first accumulator circuit; an adder circuit; and an accumulator-and-shifter circuit having an input coupled to an output of the adder circuit and having an output coupled to a second input of the first accumulator circuit.
2. The PE circuit of claim 1, further comprising a read register having an input coupled to the output of the flip-flop array.
3. The PE circuit of claim 2, further comprising a write bus coupled to an output of the read register.
4. The PE circuit of claim 3, further comprising a read bus coupled to an input of the write register.
5. A neural network circuit comprising a plurality of PE circuits, wherein at least one of the plurality of PE circuits comprises the PE circuit of claim 4, the neural network circuit further comprising: a memory coupled to the write bus and to the read bus; and a global memory coupled to the read bus, wherein another one of the plurality of PE circuits has an output coupled to a second input of the first accumulator circuit.
6. The neural network circuit of claim 5, wherein the other one of the plurality of PE circuits does not include a write register.
7. The PE circuit of claim 1, further comprising a read bus coupled to an input of the write register, wherein the read bus is configured to couple to at least one of a tightly coupled memory or a global memory, external to the PE circuit.
8. The PE circuit of claim 1, further comprising: a second accumulator circuit; and a second multiplexer having a first input coupled to an output of the second accumulator circuit and having an output coupled to the first input of the first accumulator circuit.
9. The PE circuit of claim 1, wherein the PE circuit is a digital compute-in-memory (DCIM) PE circuit and wherein the PE circuit further comprises: a DCIM array; a bit-column adder tree circuit coupled to the DCIM array; and a weight-shift adder tree circuit coupled to the bit-column adder tree circuit.
10. The PE circuit of claim 9, wherein the DCIM array comprises a plurality of compute-in-memory cells and wherein at least one of the compute-in-memory cells comprises an eight-transistor (8T) static random-access memory (SRAM) cell.
11. A method of neural network processing, comprising: receiving, at a first input of a multiplexer, first data from a write register; receiving, at a second input of the multiplexer, second data from a flip-flop array; receiving, at an accumulator circuit, third data from a processing element (PE) circuit; selecting, with the multiplexer, data to output to the accumulator circuit between the first data and the second data; and accumulating, with the accumulator circuit, the selected output data from the multiplexer and the third data received from the PE circuit to generate accumulated data, wherein the PE circuit comprises: an adder circuit; and an accumulator-and-shifter circuit having an input coupled to an output of the adder circuit and having an output coupled to an input of the accumulator circuit.
12. The method of claim 11, further comprising: outputting the accumulated data to the flip-flop array; shifting, with the flip-flop array, the accumulated data to a read register; and writing the accumulated data from the read register to a memory via a write bus.
13. The method of claim 11, further comprising: outputting the accumulated data to the flip-flop array; shifting, with the flip-flop array, the accumulated data to a read register; processing the accumulated data from the read register with digital post-processing logic; and writing the processed, accumulated data to a memory via a write bus coupled between the digital post-processing logic and the memory.
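Claims 1 and 9 name an adder circuit, an accumulator-and-shifter circuit, a bit-column adder tree, and a weight-shift adder tree, but the claim preview does not spell out their arithmetic. One scheme commonly used in digital compute-in-memory designs is bit-serial multiply-accumulate, and the sketch below assumes that scheme purely for illustration. The function names, the unsigned 4-bit widths, and the MSB-first ordering are all assumptions, not details from the publication.

```python
from typing import List


def bit_column_sums(weights: List[int], act_bits: List[int], weight_bits: int) -> List[int]:
    """Assumed bit-column adder tree: for each weight bit position, count the
    rows where both that weight bit and the current activation bit are 1."""
    return [
        sum(((w >> b) & 1) & a for w, a in zip(weights, act_bits))
        for b in range(weight_bits)
    ]


def weight_shift_sum(column_sums: List[int]) -> int:
    """Assumed weight-shift adder tree: weight each column sum by its bit position."""
    return sum(s << b for b, s in enumerate(column_sums))


def bit_serial_dot(weights: List[int], activations: List[int],
                   weight_bits: int = 4, act_bits: int = 4) -> int:
    """Unsigned dot product computed one activation bit per cycle, MSB first."""
    acc = 0
    for k in reversed(range(act_bits)):
        bits = [(x >> k) & 1 for x in activations]
        partial = weight_shift_sum(bit_column_sums(weights, bits, weight_bits))
        # Assumed accumulator-and-shifter: shift the running total left by
        # one bit, then add the partial sum for this activation bit.
        acc = (acc << 1) + partial
    return acc


result = bit_serial_dot([3, 5, 2], [1, 2, 3])  # same value as 3*1 + 5*2 + 2*3
```

Under these assumptions, `result` is the value that would be presented to the second input of the first accumulator circuit, where it is combined with the multiplexer output as recited in claim 11.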
using electronic means · CPC title
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title
Multiplying only · CPC title
Adding; Subtracting (G06F7/483 - G06F7/491, G06F7/544 - G06F7/556 take precedence) · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title