Compute near memory convolution accelerator

US11726950B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11726950-B2
Application number	US-201916586975-A
Country	US
Kind code	B2
Filing date	Sep 28, 2019
Priority date	Sep 28, 2019
Publication date	Aug 15, 2023
Grant date	Aug 15, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A compute near memory (CNM) convolution accelerator enables a convolutional neural network (CNN) to use dedicated acceleration to achieve efficient in-place convolution operations with less impact on memory and energy consumption. A 2D convolution operation is reformulated as 1D row-wise convolution. The 1D row-wise convolution enables the CNM convolution accelerator to process input activations row-by-row, while using the weights one-by-one. Lightweight access circuits provide the ability to stream both weights and input rows as vectors to MAC units, which in turn enables modules of the CNM convolution accelerator to implement convolution for both [1×1] and chosen [n×n] sized filters.
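The reformulation described in the abstract can be illustrated with a minimal sketch. It assumes a single input channel, a square filter, no padding, and the cross-correlation convention commonly used in CNNs; the function names `conv2d_direct`, `conv1d_row`, and `conv2d_rowwise` are hypothetical and do not come from the patent. The sketch only shows that summing 1D row-wise convolutions, with weights applied one by one to a whole input row, gives the same result as a direct 2D convolution.

```python
def conv2d_direct(x, w, stride=1):
    """Reference 2D convolution (cross-correlation), single channel, no padding."""
    n = len(w)
    out_h = (len(x) - n) // stride + 1
    out_w = (len(x[0]) - n) // stride + 1
    return [[sum(w[r][k] * x[i * stride + r][j * stride + k]
                 for r in range(n) for k in range(n))
             for j in range(out_w)]
            for i in range(out_h)]


def conv1d_row(x_row, w_row, stride, out_w):
    """1D row-wise convolution: each weight is applied to the whole row in turn."""
    psum = [0] * out_w
    for k, weight in enumerate(w_row):      # weights used one by one
        for j in range(out_w):              # one lane per output position
            psum[j] += weight * x_row[j * stride + k]
    return psum


def conv2d_rowwise(x, w, stride=1):
    """2D convolution expressed as a sum of 1D row-wise convolutions."""
    n = len(w)
    out_h = (len(x) - n) // stride + 1
    out_w = (len(x[0]) - n) // stride + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for r in range(n):                  # filter row r meets input row i*stride + r
            psum = conv1d_row(x[i * stride + r], w[r], stride, out_w)
            out[i] = [a + b for a, b in zip(out[i], psum)]
    return out
```

In this sketch the inner loop over `j` stands in for a vector of MAC lanes working on one row at a time, which is the property the abstract attributes to the row-wise formulation.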

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit comprising: a memory to store one or more channels of a same filter row of a filter, each channel of the same filter row to be stored contiguously in the memory, row by row, in a channel-wise order; an input buffer to receive one or more channels of input row vectors of input activations streamed to the input buffer, row by row, in the channel-wise order; circuitry, including: a multiplexer circuit to select a selected weight from a stored filter row, and a multiplexer array to access the input activations from the input buffer based on a stride input and a weight position of the selected weight; and at least one array of multiply and accumulate (MAC) units coupled to the circuitry, the at least one array of MAC units to compute, from the selected weight and the input activations, a partial sum for a convolution; and wherein the circuitry enables access to the memory and the input buffer by the at least one array of MAC units to accelerate the convolution.

2. The integrated circuit of claim 1, wherein: the stride input is a number applied in the circuitry to shift access to an input row vector of the input activations streamed to the input buffer by buffer positions of the input buffer equal to the number; and the weight position is relative to weight positions of neighboring weights of the stored filter row from which the selected weight was selected.

3. The integrated circuit of claim 2, further comprising an output buffer to store the partial sum computed by the at least one array of MAC units.

4. The integrated circuit of claim 3, wherein a width of the output buffer is coordinated with a width of the input buffer, the width of the output buffer equal to a number of the MAC units in the at least one array of MAC units.

5. The integrated circuit of claim 1, wherein the circuitry and the at least one array of MAC units comprise a compute near memory (CNM) circuit block of a CNM accelerator, the integrated circuit further comprising a systolic array of CNM circuit blocks arranged to accumulate partial sums computed by respective arrays of MAC units in the systolic array of CNM circuit blocks into an output feature map representing the convolution.

6. The integrated circuit of claim 5, wherein the one or more channels of input row vectors of input activations streamed to the input buffer are reused in each CNM circuit block in the systolic array of CNM circuit blocks, row by row, in the channel-wise order.

7. The integrated circuit of claim 5, wherein one or more channels of same filter rows are distributed to the systolic array of CNM circuit blocks in row-wise order, the distributed filter rows to be stored contiguously in the memory of each CNM circuit block, row by row, in the channel-wise order.

8. The integrated circuit of claim 1, wherein the memory includes any of a static random access memory (SRAM) and a register file (RF).
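The data path recited in claims 1 to 5 can be modeled in software as a rough sketch. Everything below is an illustrative assumption rather than the patented circuit: the function names are hypothetical, and the indexing rule (MAC lane `j` reads buffer slot `j * stride + position`) is one plausible reading of how the stride input and weight position steer the multiplexer array.

```python
def select_weight(filter_row, position):
    """Stand-in for the multiplexer circuit: pick one weight from a stored filter row."""
    return filter_row[position]


def select_activations(input_buffer, stride, position, num_macs):
    """Stand-in for the multiplexer array (assumed rule: lane j reads slot j*stride + position)."""
    return [input_buffer[j * stride + position] for j in range(num_macs)]


def mac_step(output_buffer, weight, activations):
    """One multiply-accumulate per lane; output buffer width equals the number of MAC lanes."""
    return [acc + weight * a for acc, a in zip(output_buffer, activations)]


def cnm_block(filter_row_channels, input_row_channels, stride, num_macs):
    """Partial sum for one filter row, accumulated over channels in channel-wise order."""
    output_buffer = [0] * num_macs
    for filter_row, input_buffer in zip(filter_row_channels, input_row_channels):
        for position in range(len(filter_row)):
            weight = select_weight(filter_row, position)
            acts = select_activations(input_buffer, stride, position, num_macs)
            output_buffer = mac_step(output_buffer, weight, acts)
    return output_buffer


def accumulate_blocks(filter_rows, input_rows, stride, num_macs):
    """Hypothetical chain of blocks, one per filter row, summing partial sums into one output row."""
    output_row = [0] * num_macs
    for filter_row_channels, input_row_channels in zip(filter_rows, input_rows):
        psum = cnm_block(filter_row_channels, input_row_channels, stride, num_macs)
        output_row = [a + b for a, b in zip(output_row, psum)]
    return output_row
```

For example, `cnm_block([[1, 2, 3], [1, 0, -1]], [[1, 2, 3, 4, 5], [5, 4, 3, 2, 1]], 1, 3)` accumulates a two-channel filter row over three lanes. The sequential loop in `accumulate_blocks` only approximates the systolic accumulation of claim 5, where the blocks would operate concurrently.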

Assignees

Intel Corp

Inventors

Classifications

G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06F15/8046 · Primary
Systolic arrays · CPC title
G06F17/153
Multidimensional correlation or convolution · CPC title
G06N3/063
using electronic means · CPC title
G06F7/5443 · Primary
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

Patent family

Related publications grouped by family.

View patent family 69178126

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11726950B2 cover?: A compute near memory (CNM) convolution accelerator enables a convolutional neural network (CNN) to use dedicated acceleration to achieve efficient in-place convolution operations with less impact on memory and energy consumption. A 2D convolution operation is reformulated as 1D row-wise convolution. The 1D row-wise convolution enables the CNM convolution accelerator to process input activation…
Who is the assignee on this patent?: Intel Corp
What technology area does this patent fall under?: Primary CPC classification G06F15/8046. Mapped technology areas include Physics.
When was this patent published?: Publication date Aug 15, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Systems and methods for reducing power consumption of convolution operations for artificial neural networks

Accelerating 2d convolutional layer mapping on a dot product architecture

Convolutional operation device with dimensional conversion

Systolic convolutional neural network

Efficient direct convolution using simd instructions

Method and Apparatus for Performing Different Types of Convolution Operations with the Same Processing Elements
