Floating point accumulation

US2025244951A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2025244951-A1
Application number: US-202418428446-A
Country: US
Kind code: A1
Filing date: Jan 31, 2024
Priority date: Jan 31, 2024
Publication date: Jul 31, 2025
Grant date: (none listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There is provided an apparatus, a system, a chip-containing product, a method and a medium. The apparatus comprises decoder circuitry responsive to a floating point accumulate instruction identifying pairs of floating point operands and an accumulation source, and processing circuitry comprising a plurality of arithmetic combination units to perform an arithmetic operation to combine the pairs of operands, and summation circuitry to perform an arithmetically precise summation operation to calculate an intermediate result by summing results generated by the arithmetic combination units. The intermediate result is independent of an order in which the arithmetic results are summed. The processing circuitry comprises accumulation circuitry to accumulate the intermediate result into the accumulation source and rounding circuitry to perform a rounding operation after accumulating, and is configured to propagate additional precision information relating to the arithmetic results, the intermediate result, and/or the prior accumulation value to the rounding circuitry.
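The behaviour the abstract describes (exact products, an order-independent exact sum, and a single rounding after accumulation) can be illustrated in software. The following is a minimal sketch, not the patented circuitry: it uses Python's `fractions.Fraction` as a stand-in for the "additional precision information", and the function names `fused_dot_accumulate` and `naive_dot_accumulate` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction
from itertools import permutations


def fused_dot_accumulate(acc, pairs):
    """Accumulate sum(a * b) into acc exactly, then round once.

    Fraction arithmetic is exact, so the intermediate result does not
    depend on the order of the pairs. This models the deferred rounding
    described in the abstract; it is an illustration, not the hardware.
    """
    exact = Fraction(acc)
    for a, b in pairs:
        exact += Fraction(a) * Fraction(b)  # no intermediate rounding
    return float(exact)  # single rounding to the nearest double


def naive_dot_accumulate(acc, pairs):
    """Conventional loop that rounds after every multiply and every add."""
    total = acc
    for a, b in pairs:
        total += a * b
    return total


# Three products whose exact sum is 1.0, presented in every possible order.
pairs = [(1e16, 1.0), (1.0, 1.0), (-1e16, 1.0)]
fused_results = {fused_dot_accumulate(0.0, p) for p in permutations(pairs)}
naive_results = {naive_dot_accumulate(0.0, p) for p in permutations(pairs)}
print(sorted(fused_results))  # one value regardless of order
print(sorted(naive_results))  # more than one value, depending on order
```

With these inputs the exact approach returns the same value for all six orderings, while the step-by-step loop loses the small term whenever it is added to a large partial sum.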

First claim

Opening claim text (preview).

We claim:

1. An apparatus comprising: decoder circuitry responsive to a floating point accumulate instruction identifying a plurality of pairs of floating point operands and an accumulation source, to generate control signals to trigger a floating point accumulation operation; and processing circuitry responsive to the control signals, to perform the floating point accumulation operation, the processing circuitry comprising: a plurality of arithmetic combination units configured, for each pair of operands of the plurality of pairs of floating point operands, to perform an arithmetic operation to combine that pair of operands; summation circuitry configured to receive arithmetic results generated by each of the plurality of arithmetic combination units and to perform an arithmetically precise summation operation to calculate an intermediate result by summing the arithmetic results, wherein the intermediate result is independent of an order in which the arithmetic results are summed; accumulation circuitry configured to retrieve a prior accumulation value from the accumulation source and to accumulate the intermediate result into the accumulation source; and rounding circuitry configured to perform a rounding operation after accumulating the intermediate result into the accumulation source, wherein the processing circuitry is configured to propagate additional precision information to the rounding circuitry, the additional precision information relating to at least one of the arithmetic results, the intermediate result, or the prior accumulation value.

2. The apparatus of claim 1, wherein the processing circuitry is configured to defer rounding of results of the arithmetic operation for each pair of operands until the rounding operation, and to defer rounding of results of the arithmetically precise summation operation until the rounding operation.

3. The apparatus of claim 1, wherein: the processing circuitry is configured to support at least one floating point format specifying at least a significand portion having a significand bit width and an exponent portion having an exponent bit width; and the additional precision information comprises additional significand information extending a bit width of at least one of the arithmetic results by at least one additional bit relative to a bit width of the significand portion of the plurality of pairs of floating point operands.

4. The apparatus of claim 3, wherein the bit width of the significand portion of each of the arithmetic results is greater than or equal to a sum of the bit width of the significand portions of each of the pair of operands from which that arithmetic result is generated.

5. The apparatus of claim 3, wherein the processing circuitry comprises alignment circuitry to perform at least one alignment operation to align the arithmetic results to a common magnitude prior to the arithmetically precise summation operation.

6. The apparatus of claim 5, wherein the common magnitude is determined based on at least one of: a magnitude of a largest exponent of the arithmetic results; and a summation of a maximum calculable arithmetic exponent and a characteristic multiplicity factor determined as a ceiling value of a base-2 logarithm of a multiplicity of the plurality of pairs of floating point operands.

7. The apparatus of claim 6, wherein the summation circuitry is configured to output the intermediate result having an intermediate result bit width extended to comprise: a most significant portion comprising sufficient bit width to store the significand portion of the prior accumulation value; and a least significant portion having a bit width sufficient to store the intermediate result.

8. The apparatus of claim 7, wherein the accumulation circuitry comprises accumulator alignment circuitry to perform an accumulator alignment operation to align the significand portion of the prior accumulation value to the common magnitude based on a magnitude stored in the exponent portion of the prior accumulation value and the common magnitude.

9. The apparatus of claim 8, wherein when the magnitude stored in the exponent portion of the prior accumulation value exceeds the common magnitude by an amount greater than a bit width of the most significant portion, the accumulator alignment operation comprises aligning the significand portion of the prior accumulation value to the most significant portion.

10. The apparatus of claim 8, wherein the accumulator alignment circuitry is configured: to shift the significand portion of the prior accumulation value by a shift amount equal to a difference between the magnitude of the exponent portion of the prior accumulation value and the common magnitude; when the shift amount indicates a shift in a first direction, to shift the significand of the prior accumulation value in the first direction by the shift amount; and when the shift amount indicates a shift in a second direction opposite to the first direction, to shift the significand portion of the prior accumulation value in the first direction by an amount equal to a difference between a predetermined correction amount and the shift amount and to shift the significand portion of the prior accumulation value by the predetermined correction amount in the second direction.

11. The apparatus of claim 3, wherein: the accumulation circuitry is configured to output an unrounded result obtained by accumulating the intermediate result into the accumulation source; and the apparatus comprises normalising circuitry configured, prior to the rounding operation, to normalise the unrounded result.

12. The apparatus of claim 3, wherein: the processing circuitry comprises alignment circuitry to perform at least one alignment operation to align the arithmetic results to a common magnitude prior to the arithmetically precise summation operation; the floating point accumulation operation is a multi-cycle operation performed over a plurality of instruction cycles; the arithmetic operation for each pair of operands is performed in a different instruction cycle to the arithmetically precise summation operation; the at least one alignment operation comprises a first alignment operation performed in a same clock cycle as the arithmetic operation and a second alignment operation performed in a same clock cycle as the arithmetically precise summation operation; and a bit width of an output of the first alignment operation is less than a bit width of the output of the second alignment operation.

13. The apparatus of claim 12, wherein: the accumulation circuitry comprises accumulator alignment circuitry to perform an accumulator alignment operation to align the significand portion of the prior accumulation value to the common magnitude based on a magnitude stored in the exponent portion of the prior accumulation value and the common magnitude; and the accumulator alignment operation is performed in a same clock cycle as accumulating the intermediate result into the accumulation source.

14. The apparatus of claim 1, wherein the floating point accumulate instruction is a floating point dot product instruction, and the floating point accumulation operation is a dot product with accumulation operation.

15. The apparatus of claim 14, wherein: the plurality of arithmetic combination units comprises N arithmetic combination units; the decoder circuitry is responsive to at least two types of floating point dot product instruction including an N-product floating point dot product instruction specifying N pairs of floating point oper
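One way to read claims 4 to 6 is as fixed-point alignment: each exact product carries a significand as wide as the two operand significands combined, all products are placed on one shared grid, and ceil(log2(N)) extra bits of headroom absorb the carries from adding N terms. The sketch below illustrates that reading in software under stated assumptions. It uses unbounded Python integers and aligns to the smallest product exponent so that no bits are dropped, whereas a hardware datapath would have a fixed width; the names `decompose`, `multiplicity_factor` and `aligned_sum` are hypothetical and do not come from the patent.

```python
import math
from fractions import Fraction


def decompose(x):
    """Split a finite double into (m, e) with x == m * 2**e and m an integer."""
    frac, exp = math.frexp(x)  # x == frac * 2**exp, 0.5 <= |frac| < 1
    return int(frac * (1 << 53)), exp - 53  # scaling by 2**53 is exact


def multiplicity_factor(n):
    """ceil(log2(n)) for n >= 1, computed with integer arithmetic only."""
    return (n - 1).bit_length()


def aligned_sum(pairs):
    """Sum the exact products of `pairs` on one shared fixed-point grid.

    Returns (total, lsb, common_magnitude), where the exact sum equals
    total * 2**lsb and its magnitude is below 2**common_magnitude.
    """
    products = []
    for a, b in pairs:
        (ma, ea), (mb, eb) = decompose(a), decompose(b)
        products.append((ma * mb, ea + eb))  # exact, up to 106 bits wide
    lsb = min(e for _, e in products)  # assumption: keep every low bit
    top = max(e + m.bit_length() for m, e in products)
    common_magnitude = top + multiplicity_factor(len(products))
    total = sum(m << (e - lsb) for m, e in products)  # integer add: any order
    assert abs(total) < 1 << (common_magnitude - lsb)  # headroom suffices
    return total, lsb, common_magnitude


def to_fraction(total, lsb):
    """Convert the fixed-point result back to an exact rational value."""
    return Fraction(total) * Fraction(2) ** lsb


pairs = [(1.5, 2.0), (0.25, -4.0), (3.0, 0.125)]
total, lsb, common_magnitude = aligned_sum(pairs)
print(to_fraction(total, lsb))  # exact value of 3.0 - 1.0 + 0.375
```

Because the aligned terms are plain integers, their sum is the same in any order, which matches the order-independence stated in claim 1. The internal assertion checks that the headroom bound holds for the given inputs.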

Assignees

Inventors

Classifications

  • Mantissa overflow or underflow in handling floating-point numbers · CPC title

  • Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

  • Rounding · CPC title

  • G06F7/483 · Primary

    Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2025244951A1 cover?
There is provided an apparatus, a system, a chip-containing product, a method and a medium. The apparatus comprises decoder circuitry responsive to a floating point accumulate instruction identifying pairs of floating point operands and an accumulation source, and processing circuitry comprising a plurality of arithmetic combination units to perform an arithmetic operation to combine the pairs …
Who is the assignee on this patent?
Advanced Risc Mach Ltd
What technology area does this patent fall under?
Primary CPC classification G06F7/49915. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 31, 2025 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).