Apparatus employing user-specified binary point fixed point arithmetic

US10228911B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10228911-B2
Application number  US-201615090796-A
Country             US
Kind code           B2
Filing date         Apr 5, 2016
Priority date       Oct 8, 2015
Publication date    Mar 12, 2019
Grant date          Mar 12, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus includes a plurality of arithmetic logic units each having an accumulator and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value. A register is programmable with an indication of a number of fractional bits of the integer accumulated values and an indication of a number of fractional bits of integer outputs. A first bit width of the accumulator is greater than twice a second bit width of the integer outputs. A plurality of adjustment units scale and saturate the first bit width integer accumulated values to generate the second bit width integer outputs based on the indications of the number of fractional bits of the integer accumulated values and outputs programmed into the register.
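The scale-and-saturate step described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented circuit: the function names, the 8-bit output width, the signed two's-complement interpretation, and clamping at both ends of the range are all choices made for the example.

```python
def saturate(value: int, bits: int) -> int:
    """Clamp value to the signed two's-complement range of `bits` bits.
    Assumption: symmetric clamping at both the minimum and the maximum."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def accumulate(inputs, weights) -> int:
    """Integer multiply-accumulate. Python ints are unbounded, so the wide
    accumulator is modeled simply as an int that never overflows."""
    acc = 0
    for x, w in zip(inputs, weights):
        acc += x * w
    return acc


def scale_and_saturate(acc: int, acc_frac_bits: int,
                       out_frac_bits: int, out_bits: int = 8) -> int:
    """Move the binary point from acc_frac_bits to out_frac_bits, then
    saturate to the narrower output width (out_bits is hypothetical)."""
    shift = acc_frac_bits - out_frac_bits
    scaled = acc >> shift if shift >= 0 else acc << -shift
    return saturate(scaled, out_bits)


# Inputs and weights each carry 4 fractional bits, so every product
# (and the accumulated sum) carries 8 fractional bits.
xs = [16, 24]          # 1.0 and 1.5 with 4 fractional bits
ws = [8, 32]           # 0.5 and 2.0 with 4 fractional bits
acc = accumulate(xs, ws)                    # 896, i.e. 3.5 with 8 fractional bits
out = scale_and_saturate(acc, 8, 4)         # 56, i.e. 3.5 with 4 fractional bits
```

The two fractional-bit counts play the role of the values programmed into the register: the arithmetic itself stays purely integer, and only the final shift depends on where the user has placed the binary point.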

First claim

Opening claim text (preview).

The invention claimed is:

1. An apparatus for a neural network unit, comprising: a plurality of arithmetic logic units each having: an accumulator; and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value; a register programmable with: a number of fractional bits of the integer accumulated value; a number of fractional bits of an integer output, wherein a first bit width of the integer accumulated value is greater than twice a second bit width of the integer output; and a round control value; and a plurality of activation function units each having: a first multiplexer, coupled to the accumulator and the register, wherein the first multiplexer has a first input receiving the integer accumulated value shifted right according to the number of fractional bits of the integer output, a second input receiving a rounded version of the integer accumulated value, and an output outputting one of the first input and the second input as an output value according to the round control value; a saturator, coupled to the output of the first multiplexer, generating a saturator output by saturating the output value of the first multiplexer to a maximum value expressible in a canonical form if the output value of the first multiplexer is greater than the maximum value expressible in the canonical form; and a plurality of activation function circuits, each coupled to the saturator, each performing a different activation function on the saturator output to produce an activation result, wherein the integer output is generated according to the activation results.

2. The apparatus of claim 1, wherein each of the plurality of activation function units further comprises an output binary point aligner coupled to the accumulator and the first input of the first multiplexer, wherein the output binary point aligner calculates a difference between the number of fractional bits of the integer output and the number of fractional bits of the integer accumulated value, and shifts the integer accumulated value right by the difference to provide to the first input of the first multiplexer.

3. The apparatus of claim 1, further comprising: the accumulator has at least Q bits of storage to store the integer accumulated value; and Q is a sufficient number of bits to accumulate a series of P of the integer results without loss of precision.

4. The apparatus of claim 3, further comprising: Q is M plus log2 P, where M is a bit width of the integer results.

5. The apparatus of claim 4, further comprising: P is a predetermined maximum permissible number of the series of the integer results to generate the integer accumulated value.

6. The apparatus of claim 1, further comprising: a memory that hold the integer inputs and provides them to the plurality of arithmetic logic units.

7. The apparatus of claim 6, wherein the plurality of activation function units write the integer outputs to the memory.

8. The apparatus of claim 1, further comprising: each of the plurality of arithmetic logic units comprises: an integer multiplier that performs an integer multiplication of a pair of the integer inputs to generate an integer product; and an integer adder that performs an integer addition of the integer product and the integer accumulated value to generate an integer sum for storage back into the accumulator.

9. The apparatus of claim 1, each of the plurality of activation function units further comprising: a rounder, coupled to the accumulator and the second input of the first multiplexer, rounding the integer accumulated values based on the least significant J bits of the integer accumulated value to generate the rounded version of the integer accumulated value, where J is a difference between the number of fractional bits of the integer outputs and the number of fractional bits of the integer accumulated values.

10. The apparatus of claim 1, wherein the integer inputs have a bit width that is the same as the second bit width of the integer outputs.

11. A method for a neural network unit, comprising: programming a register with: a number of fractional bits of an integer accumulated value; and a number of fractional bits of an integer output, wherein a first bit width of the integer accumulated value is greater than twice a second bit width of the integer output; and a round control value; by each of a plurality of arithmetic logic units having an accumulator and an integer arithmetic unit: performing, by the integer arithmetic unit, integer arithmetic operations on integer inputs; and accumulating integer results of a series of the integer arithmetic operations into the accumulator as the integer accumulated value; by each of a plurality of activation function units having a first multiplexer, a saturator, and a plurality of activation function circuits, wherein the first multiplexer is coupled to the accumulator and the register, wherein the saturator is coupled to the output of the first multiplexer, and wherein each of the plurality of activation function circuits is coupled to the saturator: receiving, by a first input of the first multiplexer, the integer accumulated value shifted right according to the number of fractional bits of the integer output; receiving, by a second input of the first multiplexer, a rounded version of the integer accumulated value; outputting, by an output of the first multiplexer, one of the first input and the second input as an output value according to the round control value; generating, by the saturator, a saturator output by saturating the output value of the first multiplexer to a maximum value expressible in a canonical form if the output value of the first multiplexer is greater than the maximum value expressible in the canonical form; and performing, by each of the plurality of activation modules, a different activation function on the saturator output to produce an activation result, wherein the integer output is generated according to the activation results.

12. The method of claim 11, wherein each of the plurality of activation function units further comprises an output binary point aligner coupled to the accumulator and the first input of the first multiplexer, the method further comprising: calculating, by the output binary point aligner, a difference between the number of fractional bits of the integer outputs and the number of fractional bits of the integer accumulated values; and shifting the integer accumulated value right by the difference to provide to the first input of the first multiplexer.

13. The method of claim 11, further comprising: the accumulator has at least Q bits of storage to store the integer accumulated value; and Q is a sufficient number of bits to accumulate a series of P of the integer results without loss of precision.

14. The method of claim 13, further comprising: Q is M plus log2 P, where M is a bit width of the integer results.

15. The method of claim 11, wherein each of the plurality of activation function units further comprises a rounder coupled to the accumulator and the second input of the first multiplexer, the method further comprising: rounding, by the rounder, the integer accumulated values based on the least significant J bits of the integer accumulated value to generate the rounded version of the integer accumulated value, where J is a difference between the number of fractional bits of the integer outputs and the number of fractional
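Two points in the claims lend themselves to a short worked sketch: the accumulator sizing rule (Q is M plus log2 P) and the round-control multiplexer that picks between a shifted and a rounded accumulator value. The Python below is illustrative only. The function names are hypothetical, log2 P is rounded up so that P need not be a power of two, and round-to-nearest (add half, then shift) is an assumed rounding mode; the claims quoted here say only that rounding is based on the least significant J bits.

```python
def accumulator_bits(m: int, p: int) -> int:
    """Q = M + ceil(log2(P)): bits needed to sum P results of M bits each
    without overflow. (p - 1).bit_length() equals ceil(log2(p)) for p >= 1."""
    if m < 1 or p < 1:
        raise ValueError("m and p must be positive")
    return m + (p - 1).bit_length()


def shift_only(acc: int, j: int) -> int:
    """First multiplexer input (as modeled here): drop the low J bits."""
    return acc >> j


def round_nearest(acc: int, j: int) -> int:
    """Second multiplexer input (as modeled here): round using the low J bits.
    Assumption: round half up, implemented as add-half-then-shift."""
    if j == 0:
        return acc
    return (acc + (1 << (j - 1))) >> j


def select_output(acc: int, j: int, round_control: bool) -> int:
    """Model of the first multiplexer: the round control value picks
    between the rounded and the merely shifted accumulator value."""
    return round_nearest(acc, j) if round_control else shift_only(acc, j)


# 16-bit products accumulated up to 1024 times need 16 + 10 = 26 bits.
q = accumulator_bits(16, 1024)

# acc = 23 with J = 2 dropped bits represents 5.75.
truncated = select_output(23, 2, round_control=False)   # 5
rounded = select_output(23, 2, round_control=True)      # 6
```

Here `j` stands for the J of the claims, taken as the number of fractional bits of the accumulated value minus the number of fractional bits of the output, and assumed non-negative. In a full pipeline the selected value would then pass through the saturator before reaching the activation function circuits.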

Assignees

Via Alliance Semiconductor Co Ltd

Inventors

Classifications

  • Activation functions

  • Combinations of networks

  • for shifting, e.g. justifying, scaling, normalising {(digital stores in which the information is moved stepwise, e.g. shift-registers G11C19/00; digital stores in which the information circulates G11C21/00)}

  • Significance control

  • Rounding

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10228911B2 cover?
An apparatus includes a plurality of arithmetic logic units each having an accumulator and an integer arithmetic unit that receives and performs integer arithmetic operations on integer inputs and accumulates integer results of a series of the integer arithmetic operations into the accumulator as an integer accumulated value. A register is programmable with an indication of a number of fraction…
Who is the assignee on this patent?
Via Alliance Semiconductor Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F7/49947. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 12, 2019 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).