Neural network unit
US-2018225116-A1 · Aug 9, 2018 · US
US12067375B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12067375-B2 |
| Application number | US-202217932537-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 15, 2022 |
| Priority date | Nov 27, 2019 |
| Publication date | Aug 20, 2024 |
| Grant date | Aug 20, 2024 |
Official abstract text for this publication.
Systems and methods are provided to perform multiply-accumulate operations of at least one normalized number in a systolic array. The systolic array can obtain a first input and detect that the first input is denormal. Based on determining the first input is denormal, the systolic array can generate a first normalized number by normalizing the first input. Processing elements of the systolic array can include a multiplier and an adder. The multiplier can multiply the first normalized number by a second normal or normalized number to generate a multiplier product and the adder can add an input partial sum to the multiplier product to generate an addition result.
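The detect, normalize, multiply, and accumulate steps described in the abstract can be illustrated with a minimal Python sketch. It assumes, purely for illustration, a 16-bit IEEE 754 half-precision input (5-bit exponent, 10-bit fraction) and a hypothetical 17-bit internal format with a 6-bit exponent; the text shown here does not fix those widths. The names `is_denormal`, `normalize`, `to_float`, and `mac` are invented for this example, and infinities and NaNs are rejected rather than handled.

```python
from typing import Tuple

FRAC_BITS = 10   # FP16 fraction width (assumed input format)
FP16_BIAS = 15   # FP16 exponent bias
WIDE_BIAS = 31   # bias of the assumed 6-bit expanded exponent
FRAC_MASK = (1 << FRAC_BITS) - 1


def is_denormal(bits: int) -> bool:
    """A half-precision value is denormal when its exponent field is zero
    and its fraction field is non-zero."""
    exponent = (bits >> FRAC_BITS) & 0x1F
    fraction = bits & FRAC_MASK
    return exponent == 0 and fraction != 0


def normalize(bits: int) -> Tuple[int, int, int]:
    """Return (sign, wide_exponent, fraction) in the assumed 17-bit format.

    A wide_exponent of 0 is reserved here for zero; every other value
    carries an implicit leading 1 in front of the fraction.
    """
    sign = (bits >> 15) & 1
    exponent = (bits >> FRAC_BITS) & 0x1F
    fraction = bits & FRAC_MASK
    if exponent == 0x1F:
        raise ValueError("infinity and NaN are outside this sketch")
    if exponent == 0 and fraction == 0:
        return sign, 0, 0
    if is_denormal(bits):
        # Count leading zeros in the 10-bit fraction, then shift the first
        # set bit out of the field so it becomes the implicit leading 1.
        leading_zeros = FRAC_BITS - fraction.bit_length()
        shift = leading_zeros + 1
        fraction = (fraction << shift) & FRAC_MASK
        # Denormals share the unbiased exponent 1 - bias; each shifted
        # position lowers it by one. The wider field keeps it positive.
        wide_exponent = 1 - FP16_BIAS - shift + WIDE_BIAS
    else:
        wide_exponent = exponent - FP16_BIAS + WIDE_BIAS
    return sign, wide_exponent, fraction


def to_float(sign: int, wide_exponent: int, fraction: int) -> float:
    """Decode the assumed wide format into a Python float."""
    if wide_exponent == 0:
        return -0.0 if sign else 0.0
    value = (1 + fraction / (1 << FRAC_BITS)) * 2.0 ** (wide_exponent - WIDE_BIAS)
    return -value if sign else value


def mac(a_bits: int, b_bits: int, partial_sum: float) -> float:
    """Multiply two normalized operands and add the input partial sum."""
    product = to_float(*normalize(a_bits)) * to_float(*normalize(b_bits))
    return partial_sum + product
```

With these assumptions, `normalize(0x0001)` (the smallest positive half-precision denormal, 2^-24) returns `(0, 7, 0)`: the fraction is shifted left by ten positions and the expanded exponent absorbs the shift, so the multiplier only ever sees operands with an implicit leading 1.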
Opening claim text (preview).
What is claimed is:
1. A systolic circuit comprising: a systolic array comprising processing elements; and a first normalizer comprising a first denormal detector, the first normalizer configured to: receive a first input represented in floating-point with a first bit-length; detect by the first denormal detector, that the first input is denormal; and generate a first normalized number represented in floating-point with a second bit-length, wherein the second bit-length is greater than the first bit-length; wherein an individual processing element of the systolic array is configured to: multiply the first normalized number by a second number to generate a multiplier product; and add an input partial sum with the multiplier product to generate an addition result.
2. The systolic circuit of claim 1, wherein the processing elements are arranged into a plurality of rows.
3. The systolic circuit of claim 1, wherein to detect that the first input is denormal, the first normalizer is configured to detect that the first input is denormal based at least in part on a value of an exponent of the first input or a value of a significand of the first input.
4. The systolic circuit of claim 1, wherein the second number is: a second normalized number; or a second normal number.
5. The systolic circuit of claim 1, wherein the first normalizer is further configured to: receive a second input represented in floating-point with a third bit-length; detect by the first denormal detector, that the second input is denormal; and generate a second normalized number, wherein the second normalized number is represented in floating-point with a fourth bit-length, wherein the fourth bit-length is greater than the third bit-length, wherein the second number is the second normalized number.
6. The systolic circuit of claim 1, wherein the systolic circuit further comprises a second normalizer, the second normalizer comprising a second denormal detector, the second normalizer configured to: receive a second input represented in floating-point with a third bit-length; detect by the second denormal detector, that the second input is denormal; and generate a second normalized number, wherein the second normalized number is represented in floating-point with a fourth bit-length, wherein the fourth bit-length is greater than the third bit-length, wherein the second number is the second normalized number.
7. The systolic circuit of claim 1, wherein the first normalizer is further configured to: receive a second input; and detect by the first denormal detector, that the second input is normal, wherein the second input comprises the second number.
8. The systolic circuit of claim 1, wherein the systolic circuit further comprises a second normalizer, the second normalizer comprising a second denormal detector, the second normalizer configured to: receive a second input; and detect by the second denormal detector, that the second input is normal, wherein the second input comprises the second number.
9. The systolic circuit of claim 1, wherein individual processing elements of the systolic array further comprise: a multiplier; and an adder; wherein the multiplier and adder lack support for inputs provided in denormal form.
10. The systolic circuit of claim 1, wherein the first normalizer is further configured to: receive an input data element and a weight; generate one or more of a normalized input data element or a normalized weight; and select a normalized input data element or a normalized weight to be produced.
11. The systolic circuit of claim 1, wherein the first normalizer further comprises: a first leading zero encoder or counter configured to detect a number of leading zeros in a significand of the first input; a first exponent expander configured to expand a numerical range of an exponent of the first input; and a first shifter configured to shift the significand of the first input based at least in part on the number of leading zeros in the significand of the first input.
12. The systolic circuit of claim 1, wherein: the first normalizer is further configured to convert the first input into the first normalized number, wherein the first normalizer is configured to support at least n-bit floating-point numbers, wherein n can be any number; and individual processing elements of the systolic array further comprise: a multiplier configured to multiply at least two n-bit numbers; and an adder configured to add two m-bit numbers, wherein m is greater than n.
13. A method for systolic processing by a processing element in a systolic array of processing elements, the method comprising: receiving, by a normalizer of the processing element, a first input represented in floating-point with a first bit-length; detecting, by a first denormal detector of the normalizer, that the first input is denormal; generating a first normalized number represented in floating-point with a second bit-length; multiplying, by the processing element, the first normalized number by a second number to generate a multiplier product; and adding, by the processing element, an input partial sum with the multiplier product to generate an addition result.
14. The method of claim 13, wherein the second number is: a second normalized number, or a second normal number.
15. The method of claim 13, further comprising: receiving, by the normalizer, a second input represented in floating-point with a third bit-length; detecting, by the first denormal detector, that the second input is denormal; and generating a second normalized number, wherein the second normalized number is represented in floating-point with a fourth bit-length, wherein the fourth bit-length is greater than the third bit-length, wherein the second number is the second normalized number.
16. The method of claim 13, further comprising: receiving, by the normalizer, a second input; and detecting, by the first denormal detector, that the second input is normal, wherein the second input comprises the second number.
17. Non-transitory computer-readable media including computer-executable instructions that, when executed by a systolic circuit comprising a normalizer and an array of processing elements, cause the systolic circuit to: receive, by the normalizer, a first input represented in floating-point with a first bit-length; detect, by a denormal detector of the normalizer, that the first input is denormal; generate a first normalized number represented in floating-point with a second bit-length; multiply the first normalized number by a second number to generate a multiplier product; and add an input partial sum with the multiplier product to generate an addition result.
18. The non-transitory computer-readable media of claim 17, wherein the second number is: a normalized number, or a normal number.
19. The non-transitory computer-readable media of claim 17, wherein execution of the computer-executable instructions by the systolic circuit further causes the systolic circuit to: detect that a second input is denormal, wherein the second input is represented in floating point with a third bit-length; and generate the second number based on the second input, wherein the second number is represented in floating-point with a fourth bit-length, wherein the fourth bit-length is greater than the third bit-length.
20. The non-transitory computer-readable media of claim 17, wherein execution of the computer-executable instruc
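
The independent claims have each processing element multiply an operand by a second number and add an input partial sum to the product. The following Python sketch shows how such elements could chain partial sums through an array. It is a simplified illustration and not the claimed circuit: it assumes operands are already normal or normalized, that each element holds one weight, that input data is shared along a row, and that partial sums flow down each column. The claims state only that elements are arranged in rows, so this dataflow, the cycle-free timing, and the names `ProcessingElement` and `run_array` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProcessingElement:
    weight: float  # assumed to be already normal or normalized

    def step(self, data: float, partial_sum_in: float) -> float:
        product = data * self.weight        # multiplier product
        return partial_sum_in + product     # addition result


def run_array(weights: List[List[float]], inputs: List[float]) -> List[float]:
    """Push one input vector through the array.

    weights[r][c] is held by the element in row r, column c, and inputs[r]
    feeds every element in row r. Each column starts with a partial sum of
    zero, and each element's addition result becomes the input partial sum
    of the element below it.
    """
    if len(weights) != len(inputs):
        raise ValueError("one input value is needed per row")
    grid = [[ProcessingElement(w) for w in row] for row in weights]
    sums = [0.0] * len(weights[0])
    for row, data in zip(grid, inputs):
        sums = [pe.step(data, s) for pe, s in zip(row, sums)]
    return sums


if __name__ == "__main__":
    # Two rows, two columns: each output is one column's dot product.
    print(run_array([[1.0, 2.0], [3.0, 4.0]], [10.0, 100.0]))  # [310.0, 420.0]
```

Real systolic hardware staggers these operations across clock cycles; the loop above collapses that timing to show only how the partial sums accumulate.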
Convolutional networks [CNN, ConvNet] · CPC title
Systolic arrays · CPC title
Combinations of networks · CPC title
to perform miscellaneous control operations, e.g. NOP · CPC title
with variable precision · CPC title