Floating-point adder, semiconductor device, and control method for floating-point adder
US-2016248439-A1 · Aug 25, 2016 · US
US2025199762A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2025199762-A1 |
| Application number | US-202519069183-A |
| Country | US |
| Kind code | A1 |
| Filing date | Mar 3, 2025 |
| Priority date | Mar 27, 2019 |
| Publication date | Jun 19, 2025 |
| Grant date | — |
Official abstract text for this publication.
A programmable device may be configured to support machine learning training operations using matrix multiplication circuitry. In some embodiments, the multiplication is implemented on a systolic array. The systolic array includes an array of processing elements, each of which includes hybrid floating-point dot-product circuitry.
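As a rough illustration of the systolic-array multiplication the abstract mentions, the sketch below simulates an array of processing elements cycle by cycle. It assumes an output-stationary dataflow, in which each processing element keeps its own accumulator and operands arrive skewed by row and column. That dataflow, the function name `systolic_matmul`, and the use of ordinary Python floats in place of the hybrid floating-point dot-product circuitry are assumptions made for illustration and are not taken from the publication.

```python
def systolic_matmul(a, b):
    """Simulate an n x m grid of processing elements computing a @ b.

    Assumed schedule (not from the patent): at cycle t, the element at
    row i, column j consumes operand index k = t - i - j, which models
    operands entering from the left and top edges with a one-cycle skew.
    """
    n, k_dim, m = len(a), len(b), len(b[0])
    acc = [[0.0] * m for _ in range(n)]  # one accumulator per element

    # The last element (n-1, m-1) finishes at cycle (n-1)+(m-1)+(k_dim-1).
    total_cycles = n + m + k_dim - 2
    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                k = t - i - j
                if 0 <= k < k_dim:
                    # Multiply-accumulate performed by element (i, j).
                    acc[i][j] += a[i][k] * b[k][j]
    return acc


if __name__ == "__main__":
    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each accumulator only ever receives products from its own row of the first matrix and column of the second, so the result matches an ordinary matrix product while the skewed schedule mirrors how data ripples through the array.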
Opening claim text (preview).
What is claimed is:
1. An integrated circuit, comprising: a floating-point multiplier; a fixed-point multiplier; and an adder coupled to the floating-point multiplier and the fixed-point multiplier, wherein the adder generates output data based on receiving a first signal from the floating-point multiplier and a second signal from the fixed-point multiplier.
2. The integrated circuit of claim 1, wherein the floating-point multiplier comprises hard logic circuitry, and wherein the fixed-point multiplier comprises hard logic circuitry and soft logic circuitry.
3. The integrated circuit of claim 1, wherein the floating-point multiplier and the fixed-point multiplier receive input signals of a first floating-point format, and wherein the floating-point multiplier outputs signals in a second floating-point format that is different than the first floating-point format.
4. The integrated circuit of claim 3, wherein the fixed-point multiplier outputs signals in a third floating-point format that is different than the first and second floating-point formats.
5. The integrated circuit of claim 4, comprising a format conversion circuit coupled to the floating-point multiplier, wherein the format conversion circuit converts the first signal from the second floating-point format to the third floating-point format having a greater number of exponent bits than the first floating-point format.
6. The integrated circuit of claim 3, wherein the first floating-point format is a BFLOAT16 format having one sign bit, eight exponent bits, and at most seven fraction bits.
7. The integrated circuit of claim 4, wherein the adder generates an amount of truncation for the third floating-point format and the third floating-point format has an adjustable number of fraction bits.
8. The integrated circuit of claim 1, comprising interface circuitry configurable to receive a first data matrix and a second data matrix from off-chip memory.
9. The integrated circuit of claim 8, comprising a load circuit coupled to the interface circuitry, wherein the load circuit receives first matrix data and second matrix data.
10. The integrated circuit of claim 9, comprising a multiplier circuit configurable to generate the first signal and the second signal based on loading the first matrix data and the second matrix data in the floating-point multiplier and the fixed-point multiplier.
11. The integrated circuit of claim 10, wherein the multiplier circuit generates the first signal and the second signal based at least in part by: loading a first portion of the first matrix data and the second matrix data in the floating-point multiplier; and loading a second portion of the first matrix data and the second matrix data in the fixed-point multiplier.
12. The integrated circuit of claim 1, comprising accumulation storage to receive the output data.
13. The integrated circuit of claim 12, wherein the adder generates the output data based on feedback data from the accumulation storage.
14. The integrated circuit of claim 1, comprising circuitry to compensate a latency discrepancy between routing to the floating-point multiplier and routing to the fixed-point multiplier.
15. A machine learning training circuit, comprising: a load circuit configurable to receive, from off-chip memory, first matrix data and second matrix data; a multiplier circuit configurable to generate result data based on loading the first matrix data and the second matrix data in a floating-point multiplier and in a fixed-point multiplier; and a store circuit configurable to write, to the off-chip memory, the result data.
16. The machine learning training circuit of claim 15, comprising one or more systolic arrays, wherein the multiplier circuit is configurable to generate result data based on loading the first matrix data and the second matrix data in the floating-point multiplier and in the fixed-point multiplier using the one or more systolic arrays.
17. The machine learning training circuit of claim 15, wherein the multiplier circuit generates the result data based at least in part by: loading a first portion of the first matrix data and the second matrix data in the floating-point multiplier; and loading a second portion of the first matrix data and the second matrix data in the fixed-point multiplier.
18. Circuitry, comprising: a floating-point multiplier; a fixed-point multiplier; and one or more delay registers to delay first input data transmitted to the floating-point multiplier relative to second input data transmitted to the fixed-point multiplier.
19. The circuitry of claim 18, wherein a delay added via the one or more delay registers is configurable to compensate for a latency discrepancy between the floating-point multiplier and the fixed-point multiplier.
20. The circuitry of claim 18, comprising an adder coupled to the floating-point multiplier and the fixed-point multiplier, wherein the adder generates output data based on the first input data and the second input data.
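The BFLOAT16 layout recited in the claims (one sign bit, eight exponent bits, seven fraction bits) can be made concrete with a small sketch. The code below splits a value into those three fields and then forms a product by multiplying the significands as plain integers and adding the exponents, which is one plausible way a fixed-point multiplier could operate on floating-point inputs. The claims do not spell out the internal arithmetic, so this decomposition, all function names, the truncating conversion from float32, and the restriction to normal (non-zero, non-subnormal, finite) values are assumptions made for illustration.

```python
import struct


def float_to_bf16_bits(x):
    """Keep the upper 16 bits of the float32 encoding (truncation, no rounding)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16


def bf16_bits_to_float(bits):
    """Expand a 16-bit BFLOAT16 pattern back to a Python float."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits << 16))
    return x


def bf16_fields(bits):
    """Return (sign, biased exponent, fraction): 1 + 8 + 7 bits."""
    return (bits >> 15) & 0x1, (bits >> 7) & 0xFF, bits & 0x7F


def multiply_via_integer_mantissas(a_bits, b_bits):
    """Multiply two normal BFLOAT16 values with an integer significand product.

    Hypothetical helper: handles normal numbers only (no zero, subnormal,
    infinity, or NaN inputs).
    """
    sa, ea, fa = bf16_fields(a_bits)
    sb, eb, fb = bf16_fields(b_bits)
    ma = 0x80 | fa  # restore the implicit leading 1: 8-bit significand
    mb = 0x80 | fb
    product = ma * mb  # integer product, scaled by 2**-14
    exponent = (ea - 127) + (eb - 127)  # remove the bias of 127 from each
    value = product * 2.0 ** (exponent - 14)
    return -value if sa ^ sb else value


if __name__ == "__main__":
    a = float_to_bf16_bits(1.5)
    b = float_to_bf16_bits(-2.5)
    print(hex(a), hex(b), multiply_via_integer_mantissas(a, b))
```

Because each significand is only eight bits wide, the integer product fits in sixteen bits, which suggests why narrow formats such as BFLOAT16 pair naturally with small fixed-point multipliers.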
Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)} · CPC title
with variable precision · CPC title
Conversion to or from floating-point codes · CPC title
Half or full adders, i.e. basic adder cells for one denomination · CPC title
Activation functions · CPC title