Neural network processing unit including approximate multiplier and system on chip including the same

US12223288B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12223288-B2
Application number  US-201916239046-A
Country             US
Kind code           B2
Filing date         Jan 3, 2019
Priority date       Jan 9, 2018
Publication date    Feb 11, 2025
Grant date          Feb 11, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A neural network processing unit may be configured to perform an approximate multiplication operation and a system on chip may include the neural network processing unit. The neural network processing unit may include a plurality of neural processing units and may perform a computation based on one or more instances of input data and a plurality of weights. At least one neural processing unit is configured to receive a first value and a second value and perform an approximate multiplication operation based on the first value and the second value and is further configured to perform a stochastic rounding operation based on an output value of the approximate multiplication operation.
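
The two operations named in the abstract can be illustrated with a minimal sketch. It is not taken from the patent: the Q-format width (`FRAC_BITS`), the helper names, and the particular approximation (zeroing the lowest bits of each operand before multiplying) are assumptions chosen only to make the idea concrete. The patent's actual multiplier circuit is not described on this page.

```python
import random

FRAC_BITS = 8  # assumed fixed-point format with 8 fractional bits (hypothetical)


def to_fixed(x, frac_bits=FRAC_BITS):
    """Convert a real number to a fixed-point integer."""
    return int(round(x * (1 << frac_bits)))


def approx_multiply(a, b, drop_bits=2):
    """Hypothetical approximate multiplier: zero the lowest `drop_bits`
    bits of each operand, then multiply. Stands in for whatever
    approximate circuit the hardware actually uses."""
    mask = ~((1 << drop_bits) - 1)
    return (a & mask) * (b & mask)


def stochastic_round(value, shift, rng):
    """Discard `shift` low bits, rounding up with probability equal to
    the discarded fraction, so the result is unbiased on average."""
    floor = value >> shift
    remainder = value & ((1 << shift) - 1)
    round_up = rng.randrange(1 << shift) < remainder
    return floor + (1 if round_up else 0)


rng = random.Random(0)
product = approx_multiply(to_fixed(1.5), to_fixed(2.0))  # 2 * FRAC_BITS fractional bits
result = stochastic_round(product, FRAC_BITS, rng)       # back to FRAC_BITS fractional bits
print(result / (1 << FRAC_BITS))
```

Because the rounding direction is random, a single result can be off by one unit in the last place, but the average over many roundings tends toward the unrounded value. That property is the usual reason stochastic rounding is paired with low-precision training.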

First claim

Opening claim text (preview).

What is claimed is:

1. A neural network processing unit configured to perform a computation based on one or more instances of input data and a plurality of weights, the neural network processing unit comprising: processing circuitry configured to output at least a first control signal and a second control signal to at least one neural processor (NPU), the first and second control signals respectively configured to enable a selection by the at least one NPU between a training mode and an inference mode; and a plurality of neural processors (NPUs) configured to implement a neural network, and including the at least one NPU, wherein the at least one NPU of the plurality of NPUs is configured to switch to a fixed-point approximate multiplication training mode in response to receiving the first control signal, receive a first value and a second value while in the fixed-point approximate multiplication training mode, perform a fixed-point approximate multiplication operation based on the first value and the second value in response to receiving the first value and the second value while in the fixed-point approximate multiplication training mode, review an output value for a loss of accuracy, the output value including a result of the fixed-point approximate multiplication operation and the review including performing a stochastic rounding operation on the output value, and determining, based on a result of the stochastic rounding operation, the loss of accuracy for the output value, and train the neural network by tuning a parameter of the at least one NPU based on the determined loss, and wherein the at least one NPU of the plurality of NPUs is configured to select a general multiplication inference mode in response to receiving the second control signal, receive an input value while in the general multiplication inference mode, and perform a general multiplication operation based on the input value and the tuned parameter in response to receiving the input value while in the general multiplication inference mode.

2. The neural network processing unit of claim 1, wherein the at least one NPU is further configured to alternatively select one element of the one or more instances of input data and an output value of one NPU of the plurality of NPUs, and output the selected one element as the first value.

3. The neural network processing unit of claim 1, wherein the second value includes at least one weight of the plurality of weights.

4. The neural network processing unit of claim 1, wherein the at least one NPU is further configured to accumulate one or more output values of the fixed-point approximate multiplication operation; and perform an addition operation based on the output value of the approximate multiplication operation and an output value of the accumulating.

5. The neural network processing unit of claim 4, wherein the at least one NPU is configured to perform the stochastic rounding operation on the output value of the accumulating.

6. The neural network processing unit of claim 1, wherein the at least one NPU includes a fixed-point-type device.

7. A system on chip, comprising: one or more semiconductor intellectual property cores (IPs); processing circuitry configured to output at least a first control signal and a second control signal to at least one neural processor (NPU), the first and second control signals respectively configured to enable a selection by the at least one NPU between a training mode and an inference mode; and a neural network processing unit configured implement a neural network and to receive input data from the one or more IPs, and perform a neural network computation based on the input data and a plurality of weights, the neural network processing unit including a plurality of neural processors (NPUs) and the plurality of NPUs include the at least one NPU, wherein the at least one NPU of the plurality of NPUs is configured to switch to a fixed-point approximate multiplication training mode in response to receiving the first control signal, receive a first value and a second value while in the fixed-point approximate multiplication training mode, perform a fixed-point approximate multiplication operation on the first value and the second value in response to receiving the first value and the second value while in the fixed-point approximate multiplication training mode, review an output value for a loss of accuracy, the output value including a result of the approximate multiplication operation and the review including perform a stochastic rounding operation on the output value to output a post activation regarding the output of the approximate multiplication operation, and determine, based on the result of the stochastic rounding operation, the loss of accuracy for the output value, and train the neural network by tuning a parameter of the at least one NPU based on the determined loss, and wherein the at least one NPU of the plurality of NPUs is configured to select a general multiplication inference mode in response to receiving the second control signal, receive an input value while in the general multiplication inference mode, and perform a general multiplication operation based on the input value and the tuned parameter in response to receiving the input value while in the general multiplication inference mode.

8. The system on chip of claim 7, wherein the neural network processing unit further includes data random access memory (data RAM) configured to receive training data from the one or more IPs in the fixed-point approximate multiplication training mode and store the training data.

9. The system on chip of claim 8, wherein the at least one NPU is configured to receive training data output from the data RAM and an output value of one of the plurality of NPUs, select one of the training data and the output value, and output the selected one of the training data and the output value as the first value.

10. The system on chip of claim 7, wherein the second value includes at least one weight of the plurality of weights.

11. The system on chip of claim 7, wherein the at least one NPU is configured to accumulate one or more output values of the fixed-point approximate multiplication operation, perform an addition operation based on the output value of the fixed-point approximate multiplication operation and an output value of the accumulating, and perform the stochastic rounding operation on the output value of the accumulating.

12. A neural network processing unit configured to perform a training operation based on one or more instances of training data and a plurality of weights in a training mode or to perform an inference operation based on one or more instances of input data and the plurality of weights in an inference mode, the neural network processing unit comprising: a controller configured to output at a first control signal and a second control signal to at least one neural processor (NPU), the first and second control signals respectively configured to enable a selection by the at least one NPU between the training mode and the inference mode; and a plurality of neural processors (NPUs) including the at least one NPU, wherein the at least one NPU of the plurality of NPUs is configured to, switch to a fixed-point approximate multiplication training mode in response to receiving the first control signal, receive a first value and a second value while in the fixed-point approximate multiplication training mode, perform a fixed-point approximate multiplication operation on the first value and the second value in the training mode in response to receiving the fir
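
The mode selection in the independent claims and the accumulate-then-round arrangement of claims 4, 5, and 11 can be pictured with a rough behavioral sketch. Everything named here is hypothetical: the class `NeuralProcessorSketch`, the encoding of the control signals as the integers 1 and 2, the bit-dropping approximation, and the round-to-nearest used in inference mode are assumptions for illustration, not details from the claims.

```python
import random
from enum import Enum


class Mode(Enum):
    TRAINING = 1   # fixed-point approximate multiplication training mode
    INFERENCE = 2  # general multiplication inference mode


class NeuralProcessorSketch:
    """Behavioral model of one NPU; all names and encodings are hypothetical."""

    def __init__(self, frac_bits=8, drop_bits=2, seed=0):
        self.frac_bits = frac_bits
        self.drop_bits = drop_bits
        self.rng = random.Random(seed)
        self.mode = Mode.INFERENCE

    def control(self, signal):
        # Assumed encoding: 1 = first control signal, 2 = second control signal.
        if signal not in (1, 2):
            raise ValueError("unknown control signal")
        self.mode = Mode.TRAINING if signal == 1 else Mode.INFERENCE

    def _multiply(self, a, b):
        if self.mode is Mode.TRAINING:
            # Assumed approximation: ignore the lowest bits of each operand.
            mask = ~((1 << self.drop_bits) - 1)
            return (a & mask) * (b & mask)
        return a * b  # general (exact) multiplication

    def _round(self, acc):
        shift = self.frac_bits
        floor = acc >> shift
        remainder = acc & ((1 << shift) - 1)
        if self.mode is Mode.TRAINING:
            # Stochastic rounding applied to the accumulated value.
            return floor + (1 if self.rng.randrange(1 << shift) < remainder else 0)
        # Assumption: plain round-to-nearest when inferring.
        return floor + (1 if remainder >= (1 << (shift - 1)) else 0)

    def mac(self, values, weights):
        """Multiply pairs, accumulate the products, then round once."""
        acc = 0
        for v, w in zip(values, weights):
            acc += self._multiply(v, w)
        return self._round(acc)


npu = NeuralProcessorSketch()
npu.control(1)                # enter training mode
print(npu.mac([257], [256]))  # approximate product
npu.control(2)                # enter inference mode
print(npu.mac([257], [256]))  # exact product
```

Rounding once after accumulation, rather than after every product, keeps the intermediate sum at full width and limits the rounding error to a single step per output.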

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Supervised learning · CPC title

  • G06N3/063 (Primary)

    using electronic means · CPC title

  • Architecture, e.g. interconnection topology · CPC title

  • Rounding · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12223288B2 cover?
A neural network processing unit may be configured to perform an approximate multiplication operation and a system on chip may include the neural network processing unit. The neural network processing unit may include a plurality of neural processing units and may perform a computation based on one or more instances of input data and a plurality of weights. At least one neural processing unit i…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/063. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 11, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).