Neural network weight distribution from a grid of memory elements

US11521085B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11521085-B2
Application number	US-202016842035-A
Country	US
Kind code	B2
Filing date	Apr 7, 2020
Priority date	Apr 7, 2020
Publication date	Dec 6, 2022
Grant date	Dec 6, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Neural inference chips for computing neural activations are provided. In various embodiments, a neural inference chip comprises at least one neural core, a memory array, an instruction buffer, and an instruction memory. The instruction buffer has a position corresponding to each of a plurality of elements of the memory array. The instruction memory provides at least one instruction to the instruction buffer. The instruction buffer advances the at least one instruction between positions in the instruction buffer. The instruction buffer provides the at least one instruction to at least one of the plurality of elements of the memory array from its associated position in the instruction buffer when the memory of the at least one of the plurality of elements contains data associated with the at least one instruction. Each element of the memory array provides a data block from its memory to its horizontal buffer in response to the arrival of an associated instruction from the instruction buffer. The horizontal buffer of each element of the memory array provides a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core.
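The pipelined behavior described in the abstract can be sketched as a small cycle-level simulation. This is an illustration only, not the patented design. The names `Element`, `simulate`, and the string keys are hypothetical, and the sketch assumes that instructions and data blocks both move one position per cycle in the same direction, toward a single neural core at the right end of a one-row array, and that each instruction matches data in exactly one element.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Element:
    """One memory-array element: a local memory plus a horizontal buffer."""
    memory: Dict[str, str]        # hypothetical: instruction key -> data block
    hbuf: Optional[str] = None    # horizontal buffer, holds at most one block


def simulate(elements: List[Element], program: List[str],
             cycles: int) -> List[Tuple[int, str]]:
    """Return (cycle, block) pairs for blocks that reach the neural core."""
    n = len(elements)
    ibuf: List[Optional[str]] = [None] * n   # one position per element
    pending = deque(program)                 # stands in for instruction memory
    delivered: List[Tuple[int, str]] = []

    for cycle in range(cycles):
        # 1. Horizontal buffers shift one element toward the core (assumed
        #    to sit to the right of the last element).
        if elements[-1].hbuf is not None:
            delivered.append((cycle, elements[-1].hbuf))
        for i in range(n - 1, 0, -1):
            elements[i].hbuf = elements[i - 1].hbuf
        elements[0].hbuf = None

        # 2. The instruction buffer advances one position; the instruction
        #    memory supplies a new instruction at position 0.
        ibuf = [pending.popleft() if pending else None] + ibuf[:-1]

        # 3. An element whose memory holds data for the instruction at its
        #    position copies that block into its horizontal buffer.
        for i, el in enumerate(elements):
            instr = ibuf[i]
            if instr is not None and instr in el.memory:
                if el.hbuf is not None:
                    raise RuntimeError(f"buffer collision at element {i}")
                el.hbuf = el.memory[instr]

    return delivered


if __name__ == "__main__":
    row = [Element({"w_a": "A"}), Element({"w_b": "B"}), Element({"w_c": "C"})]
    print(simulate(row, ["w_a", "w_b", "w_c"], cycles=6))
```

Under these assumptions, the block requested by an instruction issued at cycle t reaches the core at cycle t + n in an n-element row, regardless of which element stored it, so blocks arrive in program order at one block per cycle.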

First claim

Opening claim text (preview).

What is claimed is:

1. A neural inference chip for computing neural activations, the neural inference chip comprising: at least one neural core; a memory array operatively coupled to the at least one neural core, the memory array comprising a plurality of elements, each element comprising a memory and a horizontal buffer, the horizontal buffer of each element of the memory array being in communication with either the horizontal buffer of another of the elements of the memory array or to the at least one neural core; an instruction buffer in communication with the memory array, the instruction buffer having a position corresponding to each of the plurality of elements of the memory array; an instruction memory in communication with the instruction buffer, wherein the instruction memory is adapted to provide at least one instruction to the instruction buffer, the instruction buffer is adapted to advance the at least one instruction between positions in the instruction buffer, the instruction buffer is adapted to provide the at least one instruction to at least one of the plurality of elements of the memory array from its associated position in the instruction buffer when the memory of the at least one of the plurality of elements contains data associated with the at least one instruction, each of the plurality of elements of the memory array is adapted to provide a data block from its memory to its horizontal buffer in response to the arrival of an associated instruction from the instruction buffer, the horizontal buffer of each element of the memory array is adapted to provide a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core.

2. The neural inference chip of claim 1, wherein: the instruction buffer is adapted to advance instructions between positions in the instruction buffer at a rate of one position per cycle, the horizontal buffer of each element of the memory array is adapted to provide a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core at a rate of one data block per cycle.

3. The neural inference chip of claim 1, comprising an array of neural cores, the array of neural cores comprising the at least one neural core and having a plurality of rows.

4. The neural inference chip of claim 1, wherein the memory array is one-dimensional, the plurality of elements of the memory array being arranged in one row and a plurality of columns.

5. The neural inference chip of claim 1, wherein the memory array is two-dimensional, the plurality of elements of the memory array being arranged in a plurality of rows and a plurality of columns.

6. The neural inference chip of claim 5, wherein each element of the memory array further comprises a vertical buffer, the vertical buffer of each element of the memory array being in communication with the vertical buffer of another element of the memory array.

7. The neural inference chip of claim 6, wherein: each of the plurality of elements of the memory array is adapted to provide a data block from its memory to its vertical buffer in response to the arrival of an associated instruction from the instruction buffer, each of the plurality of elements of the memory array is adapted to provide the data block from its vertical buffer to its horizontal buffer, the vertical buffer of each element of the memory array is adapted to provide the data block to the vertical buffer of another of the elements of the memory array.

8. The neural inference chip of claim 7, wherein: the horizontal buffer of each element of the memory array is adapted to provide a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core at a rate of one data block per cycle, the vertical buffer of each element of the memory array is adapted to provide a data block to the vertical buffer of another of the elements of the memory array at a rate of one data block per cycle.

9. The neural inference chip of claim 6, wherein each element of the memory array further comprises a layover buffer, the layover buffer of each element of the memory array being in communication with the horizontal buffer and the vertical buffer of that element of the memory array.

10. The neural inference chip of claim 9, wherein: each of the plurality of elements of the memory array is adapted to provide a data block from its memory to its vertical buffer in response to the arrival of an associated instruction from the instruction buffer, each of the plurality of elements of the memory array is adapted to provide the data block from its vertical buffer to its layover buffer, each of the plurality of elements of the memory array is adapted to provide the data block from its layover buffer to its horizontal buffer, the vertical buffer of each element of the memory array is adapted to provide the data block to the vertical buffer of another of the elements of the memory array.

11. The neural inference chip of claim 10, wherein: the horizontal buffer of each element of the memory array is adapted to provide a data block to the horizontal buffer of another of the elements of the memory array or to the at least one neural core at a rate of one data block per cycle, the vertical buffer of each element of the memory array is adapted to provide a data block to the vertical buffer of another of the elements of the memory array at a rate of one data block per cycle, the layover buffer of each element of the memory array is adapted to provide a data block to the horizontal buffer of that element of the memory array at a rate of one data block per cycle.

12. The neural inference chip of claim 10, wherein: the instruction memory is adapted to provide a plurality of instructions to the instruction buffer per cycle, each position of the instruction buffer is adapted to store a plurality of instructions, and the instruction buffer is adapted to advance a plurality of instructions between positions in the instruction buffer per cycle.

13. A neural inference chip for computing neural activations, the neural inference chip comprising: at least one neural core; a memory array operatively coupled to the at least one neural core, the memory array comprising a plurality of elements, each element comprising a memory, a horizontal buffer, and a vertical buffer, the horizontal buffer of each element of the memory array being in communication with either the horizontal buffer of another of the elements of the memory array or to the at least one neural core and the vertical buffer of each element of the memory array being in communication with the vertical buffer of another element of the memory array; a plurality of instruction buffers in communication with the memory array, each of plurality of instruction buffers having a position corresponding to one of the plurality of elements of the memory array; a plurality of instruction memories, each in communication with one of the plurality of instruction buffers, wherein each instruction memory is adapted to provide at least one instruction to its instruction buffer, each instruction buffer is adapted to advance the at least one instruction between positions in that instruction buffer, each instruction buffer is adapted to provide the at least one instruction to at least one of the plurality of elements of the memory array from its associated position in that instruction buffer when the memory of the at least one of the plurality of elements contains data associated with the at least one instruction, each of the plurality of
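The buffer sequence recited in claims 9 through 11 (memory to vertical buffer, vertical buffer to layover buffer, layover buffer to horizontal buffer, then horizontal buffers toward the core) can be traced with a short helper. This is a sketch under stated assumptions, not the claimed design: the function `route`, the idea of a destination row, and the placement of the neural core to the right of the last column are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

Hop = Tuple[str, int, int]  # (buffer kind, row, column)


def route(src_row: int, src_col: int, dest_row: int, n_cols: int) -> List[Hop]:
    """List the buffers a data block occupies on its way to the core.

    Assumptions (not from the patent text): the block first travels along
    its column to dest_row, then turns through the layover buffer, then
    travels right along dest_row to a core beyond column n_cols - 1.
    """
    if not 0 <= src_col < n_cols:
        raise ValueError("src_col must lie inside the array")

    # Memory hands the block to the vertical buffer of the same element.
    hops: List[Hop] = [("vertical", src_row, src_col)]

    # Vertical buffers pass the block to neighbouring elements in the column.
    step = 1 if dest_row > src_row else -1
    row = src_row
    while row != dest_row:
        row += step
        hops.append(("vertical", row, src_col))

    # The layover buffer bridges the vertical and horizontal buffers.
    hops.append(("layover", dest_row, src_col))

    # Horizontal buffers carry the block along the row toward the core.
    for col in range(src_col, n_cols):
        hops.append(("horizontal", dest_row, col))
    return hops


if __name__ == "__main__":
    for hop in route(src_row=0, src_col=1, dest_row=2, n_cols=3):
        print(hop)
```

If every buffer forwards one data block per cycle, as claim 11 recites, the length of the returned list gives a rough cycle count for the trip under these assumptions.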

Assignees

Inventors

Classifications

G06N5/04 (Primary)
Inference or reasoning models · CPC title
G06N3/063 (Primary)
using electronic means · CPC title
G06F17/16
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
G06F17/153 (Primary)
Multidimensional correlation or convolution · CPC title

Patent family

Related publications grouped by family.

View patent family 74418445

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11521085B2 cover?: Neural inference chips for computing neural activations are provided. In various embodiments, a neural inference chip comprises at least one neural core, a memory array, an instruction buffer, and an instruction memory. The instruction buffer has a position corresponding to each of a plurality of elements of the memory array. The instruction memory provides at least one instruction to the instr…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?: Publication date Dec 6, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Instruction distribution in an array of neural network cores

Multi-memory on-chip computational network

Neuromorphic computer with reconfigurable memory mapping for various neural network topologies

Processor with memory array operable as either last level cache slice or neural network unit memory

Neural network hardware accelerator architectures and operating method thereof

Deep Learning Neural Network Classifier Using Non-volatile Memory Array

Conditional parallel processing in fully-connected neural networks