Folding column adder architecture for digital compute in memory

US12524372B2 · US · B2

Patent metadata
Publication number: US-12524372-B2
Application number: US-202117391718-A
Country: US
Kind code: B2
Filing date: Aug 2, 2021
Priority date: Aug 2, 2021
Publication date: Jan 13, 2026
Grant date: Jan 13, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Certain aspects provide an apparatus for performing machine learning tasks, and in particular, to computation-in-memory architectures. One aspect provides a circuit for in-memory computation. The circuit generally includes: a plurality of memory cells on each of multiple columns of a memory, the plurality of memory cells being configured to store multiple bits representing weights of a neural network, wherein the plurality of memory cells on each of the multiple columns are on different word-lines of the memory; multiple addition circuits, each coupled to a respective one of the multiple columns; a first adder circuit coupled to outputs of at least two of the multiple addition circuits; and an accumulator coupled to an output of the first adder circuit.
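The datapath the abstract describes (per-column addition circuits, a first adder circuit that combines their outputs, and an accumulator) can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the patent's implementation: it assumes unsigned weights and activations, that column c stores bit c of each weight, that each row is one word-line, and that activations are applied one bit-plane at a time. The function names `column_sums`, `shift_add`, and `cim_dot` are hypothetical.

```python
from typing import List


def column_sums(weight_bits: List[List[int]], act_bits: List[int]) -> List[int]:
    """Per-column addition: for each column, count the rows (word-lines)
    where both the activation bit and the stored weight bit are 1."""
    n_cols = len(weight_bits[0])
    return [
        sum(a & row[c] for a, row in zip(act_bits, weight_bits))
        for c in range(n_cols)
    ]


def shift_add(col_sums: List[int]) -> int:
    """First adder stage: combine column sums, weighting column c by 2**c
    (assumption: column c holds weight bit c)."""
    return sum(s << c for c, s in enumerate(col_sums))


def cim_dot(weights: List[int], activations: List[int],
            w_bits: int, a_bits: int) -> int:
    """Bit-serial dot product: one activation bit-plane per cycle,
    with a single accumulator collecting the shifted partial results."""
    # Rows are word-lines; columns are weight bit positions.
    weight_bits = [[(w >> c) & 1 for c in range(w_bits)] for w in weights]
    acc = 0
    for t in range(a_bits):
        act_bits = [(a >> t) & 1 for a in activations]
        acc += shift_add(column_sums(weight_bits, act_bits)) << t
    return acc
```

For example, `cim_dot([3, 5, 7, 2], [1, 2, 3, 4], w_bits=3, a_bits=3)` returns the same value as the ordinary dot product of the two vectors, which is 42.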

First claim

Opening claim text (preview).

What is claimed is:

1. A circuit for in-memory computation, comprising: a plurality of memory cells on each of multiple columns of a memory, the plurality of memory cells being configured to store multiple bits representing weights of a neural network, wherein the plurality of memory cells on each of the multiple columns are on different word-lines of the memory; multiple addition circuits, each coupled to a respective one of the multiple columns; a first adder circuit coupled to outputs of at least two of the multiple addition circuits; and an accumulator coupled to an output of the first adder circuit, wherein one or more portions of the first adder circuit are configured to be selectively disabled.

2. The circuit of claim 1, wherein each of the multiple addition circuits comprises an adder tree coupled to the plurality of memory cells on the respective one of the multiple columns.

3. The circuit of claim 1, wherein each of the multiple addition circuits comprises another accumulator.

4. The circuit of claim 1, wherein a first portion of the first adder circuit is configured to be selectively disabled during a first computation cycle, and wherein a second portion of the first adder circuit is configured to be selectively disabled during a second computation cycle.

5. The circuit of claim 1, further comprising a plurality of second adder circuits, wherein each of the second adder circuits is coupled between a corresponding one of the multiple addition circuits and the respective one of the multiple columns.

6. The circuit of claim 5, wherein each of the second adder circuits comprises an adder tree coupled to two or more of the word-lines.

7. The circuit of claim 6, wherein the adder tree is configured to add output signals of the memory cells that are on the respective one of the multiple columns and the two or more of the word-lines.

8. The circuit of claim 5, further comprising a sense amplifier coupled between each of the second adder circuits and the respective one of the multiple columns.

9. The circuit of claim 1, wherein the first adder circuit comprises an adder tree configured to add output signals of the at least two of the multiple addition circuits.

10. The circuit of claim 9, wherein one or more adders of the adder tree comprise a bit-shift and add circuit.

11. The circuit of claim 1, further comprising a clock generator circuit having a first output configured to output a first clock signal and having a second output configured to output a second clock signal, wherein: the multiple addition circuits are coupled to the first output of the clock generator and are configured to operate based on the first clock signal; and the first adder circuit is coupled to the second output of the clock generator and is configured to operate based on the second clock signal, the second clock signal having a different frequency than the first clock signal.

12. The circuit of claim 11, wherein the clock generator circuit comprises a frequency multiplier configured to generate the second clock signal based on the first clock signal.

13. The circuit of claim 1, further comprising a plurality of half latch circuits, each half latch circuit being coupled between the first adder circuit and one of the multiple addition circuits.

14. The circuit of claim 1, wherein: the plurality of memory cells are configured to be sequentially activated based on different activation inputs; and the accumulator is configured to accumulate output signals of the first adder circuit after the plurality of memory cells are sequentially activated.

15. The circuit of claim 1, wherein the accumulator is the only accumulator coupled to the output of the first adder circuit.

16. The circuit of claim 1, wherein: the multiple columns comprise a first subset of the multiple columns and a second subset of the multiple columns; and the first subset is activated during a first computation cycle.

17. The circuit of claim 16, wherein the second subset is activated during a second computation cycle, the second computation cycle being after the first computation cycle.

18. The circuit of claim 16, wherein: at least some of the memory cells on each of the word-lines are configured to store one of the weights of the neural network; and a quantity of the first subset of the multiple columns is associated with a quantity of bits of the one of the weights.

19. The circuit of claim 16, further comprising a clock gating circuit having outputs coupled to the multiple addition circuits and configured to deactivate a clock signal associated with processing signals from the second subset of the multiple columns.

20. A method for in-memory computation, comprising: adding, via each of multiple addition circuits, output signals on a respective one of multiple columns of a memory, wherein a plurality of memory cells are on each of the multiple columns, the plurality of memory cells storing multiple bits representing weights of a neural network, wherein the plurality of memory cells on each of the multiple columns are on different word-lines of the memory; adding, via a first adder circuit, output signals of at least two of the multiple addition circuits; accumulating, via an accumulator, output signals of the first adder circuit; and selectively disabling one or more portions of the first adder circuit based on a number of bits associated with each of the weights.

21. The method of claim 20, wherein adding the output signals on the respective one of the multiple columns comprises accumulating output signals of the memory cells on the respective one of the multiple columns after two or more of the word-lines are sequentially activated.

22. The method of claim 21, further comprising adding, via a plurality of second adder circuits, wherein each of the second adder circuits is coupled between a corresponding one of the multiple addition circuits and the respective one of the multiple columns, output signals of the memory cells that are on the respective one of the multiple columns and the two or more of the word-lines.

23. The method of claim 22, further comprising sensing, via a sense amplifier coupled between each of the second adder circuits and the respective one of the multiple columns, the output signals of the memory cells that are on the respective one of the multiple columns and the two or more of the word-lines, wherein the adding via the second adder circuits is based on the sensed output signals.

24. The method of claim 20, wherein the adding the output signals of the at least two of the multiple addition circuits comprises performing a bit-shift and addition operation on the at least two of the multiple addition circuits.

25. The method of claim 20, further comprising: generating a first clock signal, wherein the multiple addition circuits operate based on the first clock signal; and generating a second clock signal, wherein the first adder circuit operates based on the second clock signal, the second clock signal having a different frequency than the first clock signal.

26. The method of claim 20, further comprising sequentially activating the plurality of memory cells based on different activation inputs, wherein the accumulating the output signals of the first adder circuit occurs after the plurality of memory cells are s…
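Claims 4, 16, 17, and 20 describe disabling portions of the first adder circuit and activating column subsets in separate computation cycles. The following is one possible behavioral reading of those claims, offered as a sketch only. It assumes the adder combines column sums by shift-and-add with column c weighted by 2**c, and that a disabled portion simply contributes nothing; the claims do not specify either detail. The names `gated_shift_add`, `enable_mask`, and `two_cycle_sum` are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple


def gated_shift_add(col_sums: Sequence[int],
                    enabled: Sequence[bool]) -> Tuple[int, int]:
    """Shift-and-add over enabled column inputs only.

    Returns (sum, number_of_active_inputs). Skipped inputs stand in for
    a portion of the adder circuit being switched off (assumption).
    """
    total, active = 0, 0
    for c, (s, en) in enumerate(zip(col_sums, enabled)):
        if en:
            total += s << c
            active += 1
    return total, active


def enable_mask(n_cols: int, weight_bits: int) -> List[bool]:
    """Enable only as many column inputs as the weights have bits."""
    return [c < weight_bits for c in range(n_cols)]


def two_cycle_sum(col_sums: Sequence[int],
                  first_subset: Iterable[int]) -> int:
    """Process the first column subset in cycle 1 and the remaining
    columns in cycle 2, accumulating the two partial sums."""
    first = set(first_subset)
    cycle1 = [c in first for c in range(len(col_sums))]
    cycle2 = [not e for e in cycle1]
    acc = 0
    for mask in (cycle1, cycle2):
        partial, _ = gated_shift_add(col_sums, mask)
        acc += partial
    return acc
```

In this model, an 8-input adder fed 4-bit weights uses `enable_mask(8, 4)`, so only four inputs are active. Splitting the columns across two cycles with `two_cycle_sum` gives the same total as a single pass with every input enabled, which is the property a two-cycle scheme would need to preserve.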

Assignees

Qualcomm Inc

Inventors

Classifications

  • G06F7/5443 (Primary)

    Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

  • Non-logic devices, e.g. operational amplifiers · CPC title

  • Bit-line management or control circuits · CPC title

  • Word line control circuits, e.g. word line drivers, - boosters, - pull-up, - pull-down, - precharge · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12524372B2 cover?
Certain aspects provide an apparatus for performing machine learning tasks, and in particular, to computation-in-memory architectures. One aspect provides a circuit for in-memory computation. The circuit generally includes: a plurality of memory cells on each of multiple columns of a memory, the plurality of memory cells being configured to store multiple bits representing weights of a neural n…
Who is the assignee on this patent?
Qualcomm Inc
What technology area does this patent fall under?
Primary CPC classification G06F7/5443. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 13, 2026 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).