What technology area does this patent fall under?

Primary CPC classification G06N3/063. Mapped technology areas include Physics.

When was this patent published?

Publication date Mar 8, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Multi-mode low-precision inner-product computation circuits for massively parallel neural inference engine

US11270196B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11270196-B2
Application number	US-201916653366-A
Country	US
Kind code	B2
Filing date	Oct 15, 2019
Priority date	Oct 15, 2019
Publication date	Mar 8, 2022
Grant date	Mar 8, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Neural inference chips for computing neural activations are provided. In various embodiments, the neural inference chip is adapted to: receive an input activation tensor comprising a plurality of input activations; receive a weight tensor comprising a plurality of weights; Booth recode each of the plurality of weights into a plurality of Booth-coded weights, each Booth coded value having an order; multiply the input activation tensor by the Booth coded weights, yielding a plurality of results for each input activation, each of the plurality of results corresponding to the orders of the Booth-coded weights; for each order of the Booth-coded weights, sum the corresponding results, yielding a plurality of partial sums, one for each order; and compute a neural activation from a sum of the plurality of partial sums.
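The data flow in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented circuit: the abstract does not say which Booth radix is used, so radix-4 recoding of signed two's-complement weights is assumed here, and the function names (`booth_recode_radix4`, `booth_inner_product`, `relu`) are hypothetical. ReLU stands in for the unspecified nonlinear activation function.

```python
def booth_recode_radix4(w: int, bits: int = 8) -> list[int]:
    """Return radix-4 Booth digits of a signed integer, lowest order first.

    Assumption: radix-4 recoding; each digit is in {-2, -1, 0, 1, 2} and
    w == sum(d * 4**k for k, d in enumerate(digits)).
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    if not -(1 << (bits - 1)) <= w < (1 << (bits - 1)):
        raise ValueError("weight does not fit in the given bit width")
    u = w & ((1 << bits) - 1)  # two's-complement bit pattern of w
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for k in range(bits // 2):
        b0 = (u >> (2 * k)) & 1
        b1 = (u >> (2 * k + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_inner_product(activations, weights, bits: int = 8):
    """Inner product computed order by order, as the abstract describes."""
    recoded = [booth_recode_radix4(w, bits) for w in weights]
    # One partial sum per Booth order: sum over inputs of activation * digit.
    partial_sums = [
        sum(a * digits[k] for a, digits in zip(activations, recoded))
        for k in range(bits // 2)
    ]
    # Shift each partial sum by its order (radix 4 means 2 bits per order).
    total = sum(p << (2 * k) for k, p in enumerate(partial_sums))
    return partial_sums, total


def relu(x: int) -> int:
    """Hypothetical choice of nonlinear activation function."""
    return max(0, x)


partial_sums, total = booth_inner_product([3, 1, 4], [7, -3, 5], bits=4)
activation = relu(total)
```

With 4-bit weights there are two Booth orders, so the example produces two partial sums that are shifted and added to recover the ordinary dot product. Multiplying by a digit in {-2, -1, 0, 1, 2} needs only a shift and an optional negation, which is the usual motivation for Booth recoding in low-precision hardware.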

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product for computing neural activations, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a neural inference chip to cause the neural inference chip to perform a method comprising: receiving an input activation tensor comprising a plurality of input activations, the input activation tensor representing an image, each of the plurality of input activations corresponding to a value at a location in the image; receiving a weight tensor comprising a plurality of weights; Booth recoding each of the plurality of weights into a plurality of Booth-coded weights, each Booth coded value having an order; multiplying the input activation tensor by the Booth coded weights, yielding a plurality of results for each input activation, each of the plurality of results corresponding to the orders of the Booth-coded weights; for each order of the Booth-coded weights, summing the corresponding results, yielding a plurality of partial sums, one for each order; and computing a neural activation from a sum of the plurality of partial sums.

2. The computer program product of claim 1, wherein the input activation tensor has a dimension of one.

3. The computer program product of claim 1, wherein the weight tensor has a dimension of two.

4. The computer program product of claim 1, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to its corresponding order.

5. The computer program product of claim 1, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to a precision of the input activations.

6. The computer program product of claim 1, wherein computing the neural activation comprises applying a nonlinear activation function to the sum of the plurality of partial sums.

7. The computer program product of claim 1, wherein summing said corresponding results comprises applying a plurality of carry-save adders.

8. A computer program product for computing neural activations, the computer program product comprising a computer readable storage medium having program instructions embodied therewith, the program instructions executable by a neural inference chip to cause the neural inference chip to perform a method comprising: receiving an input activation tensor comprising a plurality of input activations, the input activation tensor representing an image, each of the plurality of input activations corresponding to a value at a location in the image; receiving a weight tensor comprising a plurality of weights; Booth recoding each of the plurality of input activations into a plurality of Booth-coded input activations, each Booth coded value having an order; multiplying the weight tensor by the Booth coded input activations, yielding a plurality of results for each weight, each of the plurality of results corresponding to the orders of the Booth-coded input activations; for each order of the Booth-coded input activations, summing the corresponding results, yielding a plurality of partial sums, one for each order; and computing a neural activation from a sum of the plurality of partial sums.

9. The computer program product of claim 8, wherein the input activation tensor has a dimension of one.

10. The computer program product of claim 8, wherein the weight tensor has a dimension of two.

11. The computer program product of claim 8, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to its corresponding order.

12. The computer program product of claim 8, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to a precision of the input activations.

13. The computer program product of claim 8, wherein computing the neural activation comprises applying a nonlinear activation function to the sum of the plurality of partial sums.

14. The computer program product of claim 8, wherein summing said corresponding results comprises applying a plurality of carry-save adders.

15. A neural inference chip for computing neural activations, the neural inference chip adapted to: receive an input activation tensor comprising a plurality of input activations, the input activation tensor representing an image, each of the plurality of input activations corresponding to a value at a location in the image; receive a weight tensor comprising a plurality of weights; Booth recode each of the plurality of weights into a plurality of Booth-coded weights, each Booth coded value having an order; multiply the input activation tensor by the Booth coded weights, yielding a plurality of results for each input activation, each of the plurality of results corresponding to the orders of the Booth-coded weights; for each order of the Booth-coded weights, sum the corresponding results, yielding a plurality of partial sums, one for each order; compute a neural activation from a sum of the plurality of partial sums.

16. The neural inference chip of claim 15, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to its corresponding order.

17. The neural inference chip of claim 15, wherein computing the neural activation comprises shifting each of the plurality of partial sums according to a precision of the input activations.

18. The neural inference chip of claim 15, wherein computing the neural activation comprises applying a nonlinear activation function to the sum of the plurality of partial sums.

19. The neural inference chip of claim 15, wherein summing said corresponding results comprises applying a plurality of carry-save adders.

20. A neural inference chip for computing neural activations, the neural inference chip adapted to: receive an input activation tensor comprising a plurality of input activations, the input activation tensor representing an image, each of the plurality of input activations corresponding to a value at a location in the image; receive a weight tensor comprising a plurality of weights; Booth recode each of the plurality of input activations into a plurality of Booth-coded input activations, each Booth coded value having an order; multiply the weight tensor by the Booth coded input activations, yielding a plurality of results for each weight, each of the plurality of results corresponding to the orders of the Booth-coded input activations; for each order of the Booth-coded input activations, sum the corresponding results, yielding a plurality of partial sums, one for each order; compute a neural activation from a sum of the plurality of partial sums.
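Claims 7, 14, and 19 recite summing the per-order results with carry-save adders. As a rough software analogy, the sketch below models a 3:2 carry-save compressor on Python integers and uses it to accumulate a list of products, with a single carry-propagating addition at the end. The function names `carry_save_add` and `carry_save_sum` are hypothetical, and the linear chain of compressors is an assumption made for brevity; the claims do not specify an adder arrangement, and real hardware would more likely use a tree.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """3:2 compressor: reduce three addends to a sum word and a carry word.

    The invariant is a + b + c == sum_word + carry_word. No carry ripples
    across bit positions here; each bit is computed independently.
    Python ints behave like unbounded two's-complement values, so the
    identity also holds for negative operands.
    """
    sum_word = a ^ b ^ c                               # bitwise sum, no carries
    carry_word = ((a & b) | (a & c) | (b & c)) << 1    # majority bits, shifted up
    return sum_word, carry_word


def carry_save_sum(values) -> int:
    """Accumulate values in redundant (sum, carry) form, then resolve once."""
    sum_word, carry_word = 0, 0
    for v in values:
        sum_word, carry_word = carry_save_add(sum_word, carry_word, v)
    # The only carry-propagating addition happens here, at the very end.
    return sum_word + carry_word


# Order-0 products from a small example: activations [3, 1, 4] times
# Booth digits [-1, 1, 1] (illustrative values, not taken from the patent).
order0_partial_sum = carry_save_sum([3 * -1, 1 * 1, 4 * 1])
```

Keeping the running total as a separate sum word and carry word is what lets many addends be combined without waiting for carries to propagate on every step.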

Assignees

Inventors

Classifications

G06N3/048
Activation functions · CPC title
G06N3/063 · Primary
using electronic means · CPC title
G06N3/0481
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 72840508

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11270196B2 cover?: Neural inference chips for computing neural activations are provided. In various embodiments, the neural inference chip is adapted to: receive an input activation tensor comprising a plurality of input activations; receive a weight tensor comprising a plurality of weights; Booth recode each of the plurality of weights into a plurality of Booth-coded weights, each Booth coded value having an ord…
Who is the assignee on this patent?: IBM