Process for performing floating point multiply-accumulate operations with precision based on exponent differences for saving power

US12175209B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12175209-B2
Application number  US-202117352372-A
Country             US
Kind code           B2
Filing date         Jun 21, 2021
Priority date       Jun 21, 2021
Publication date    Dec 24, 2024
Grant date          Dec 24, 2024

Abstract

Official abstract text for this publication.

A process for a floating point multiplier-accumulator (MAC) is operative on N pairs of floating point values using N MAC processes operating concurrently, each MAC process operating on a pair of values comprising an input value and a coefficient value. Each MAC process simultaneously generates an integer form fraction accompanied by a sign bit and an exponent difference computed by subtracting an exponent sum from a maximum exponent sum of all exponent sums. A range estimating process determines a possible range of values from the exponent differences and determines an adder precision. A summing process adds all of the integer form fractions using the determined adder precision, and converts the sum to a floating point value using the maximum exponent sum, sign bit of the summed integer form fractions, and optionally performs a 2's complement of the summed integer form fraction if the sign bit is negative.
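The data flow in the abstract can be approximated in software. The following is a minimal Python sketch, not the patented hardware: it assumes a 24-bit mantissa with the hidden bit restored, uses unbounded Python integers in place of a fixed-width adder, and does not model the adder-precision selection. The names `decompose`, `mac`, `FRAC_BITS`, and `guard_bits` are hypothetical and chosen only for illustration.

```python
import math

FRAC_BITS = 24  # assumed mantissa width, hidden bit included


def decompose(x):
    """Split x into (sign, integer mantissa, exponent).

    For nonzero x: abs(x) is approximately mantissa * 2**(exponent - FRAC_BITS).
    Inputs with more than FRAC_BITS significant bits are truncated.
    """
    if x == 0.0:
        return 0, 0, 0
    m, e = math.frexp(abs(x))            # abs(x) == m * 2**e, 0.5 <= m < 1
    return (1 if x < 0 else 0), int(m * (1 << FRAC_BITS)), e


def mac(inputs, coeffs, guard_bits=8):
    """Sum of products using integer fractions aligned to the largest exponent sum."""
    terms = []
    for x, c in zip(inputs, coeffs):
        sx, mx, ex = decompose(x)
        sc, mc, ec = decompose(c)
        # sign by XOR, fraction by integer multiply, exponent by addition
        terms.append((sx ^ sc, mx * mc, ex + ec))

    max_exp_sum = max(e for _, _, e in terms)

    total = 0
    for sign, frac, exp_sum in terms:
        exp_diff = max_exp_sum - exp_sum     # always >= 0
        value = frac << guard_bits           # append zeros below the fraction
        if sign:
            value = -value                   # complement negative terms
        total += value >> exp_diff           # arithmetic right shift aligns the term

    # Convert the integer sum back to floating point with the maximum exponent sum.
    return math.ldexp(float(total), max_exp_sum - 2 * FRAC_BITS - guard_bits)
```

For example, `mac([1.5, 2.0, -0.75], [2.0, 0.5, 4.0])` computes 3 + 1 - 3 with a single integer addition chain and one final conversion, rather than rounding after each product.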

First claim

Opening claim text (preview).

I claim:

1. A process for performing floating point multiplier-accumulator (MAC) operations on N pairs of values, each pair of values comprising an input value and a coefficient value, the process comprising:

    a sign process operative with an exclusive OR gate configured to compute, for each of the N pairs, an exclusive OR operation performed on a sign bit of the input value and a sign bit of the coefficient value and generating a sign bit;

    a multiplier process operative with a hardware multiplier, the hardware multiplier computing, for each of the N pairs of values, an integer multiplication of a mantissa of the input value and a mantissa of the coefficient value and outputting a fraction;

    an maximum exponent sum process having maximum sum detection hardware and computing, for each of the N pairs of values, an exponent difference between a maximum exponent sum from all exponent sums of the N pairs and exponent sum for a pair of values;

    a Pad, Complement, Shift (PCS) process operating on pad hardware, complement hardware, and shift hardware, the PCS process performing, for each of the N pairs of values: a pad operation by pre-pending 0 values and appending 0 values to an associated fraction to form a first value; complementing the first value if an associated sign bit is negative to generate a second value; shifting the second value to the right by an associated exponent difference value to generate a PCS value;

    computing a sum of all PCS values to form a PCS sum;

    a normalize process operating using hardware configured to perform normalization, the normalize process normalizing the PCS sum, extracting a final sign bit from the normalized PCS sum, performing a 2s complement of the normalized PCS sum if the final sign bit is negative to form a final mantissa;

    a concatenation process concatenating the final sign bit, the final mantissa, and a final exponent computed from an adjusted maximum exponent, number of leading 0s in the sum of all PCS values, and number of PCS pre-pended 0s into a final floating point result.

2. The process of claim 1 where the fraction of each of the N pairs of values has a precision determined by the exponent difference.

3. The process of claim 2 where the fraction of each of the N pairs of values has a precision of 4 bits when an associated exponent difference is greater than 24.

4. The process of claim 2 where the fraction of each of the N pairs of values has a precision of 8 bits when an associated exponent difference is greater than 21.

5. The process of claim 1 where the fraction of each of the N pairs of values has a precision of 12 bits when an associated exponent difference is larger than 12.

6. The process of claim 1 where an exponent difference for a MAC processor that does not have a largest exponent sum is incremented if an associated integer multiplication does not overflow and an associated exponent difference is 0.

7. The process of claim 1 where an exponent difference for a MAC processor that does not have a largest exponent sum is decremented if an associated integer multiplication has an overflow and an associated exponent difference is greater than 0.

8. The process of claim 1 where the maximum exponent sum is incremented if an exponent difference is 0 and an associated multiplication has an overflow.

9. The process of claim 1 where an estimate of minimum value and maximum value is based on an associated exponent difference.

10. The process of claim 1 where computing the sum of all PCS values is done with a variable precision based on an estimate of mantissa range.

11. The process of claim 9 where a sum of minimum values and a sum of maximum values determines a precision of a computed sum of PCS values.

12. The process of claim 11 where the computed sum of PCS values has a full precision and a less than full precision, and the less than full precision is enabled when a sum of exponent processor maximum values and a sum of exponent processor minimum values are either both positive values or both negative values.

13. The process of claim 11 where the computed sum of PCS values is performed using configurable cascadable 8 bit adders.

14. The process of claim 13 where the sum of all PCS values is computed in at least one of a 16 bit mode, a 24 bit mode, and a 32 bit mode.

15. A process for a floating point multiplier-accumulator (MAC) coupled to hardware configured to multiply and accumulate, the process multiplying and accumulating N pairs of values, each pair of values comprising an input value and a coefficient value, the process operative on a plurality N of MAC processes, each MAC process receiving a corresponding input value and an associated coefficient value, each MAC process comprising:

    a sign process coupled to exclusive OR hardware, the sign process performing an exclusive OR operation on a sign bit of the corresponding input value and a sign bit of the associated coefficient value, and outputting a sign bit;

    a mantissa process coupled to a hardware multiplier and configured to perform an integer multiplication of a hidden bit restored mantissa of the corresponding input value with a hidden bit restored mantissa of the associated coefficient value and output a fraction, upon a fraction overflow condition, the mantissa process dividing the output fraction by two and asserting an exponent increment;

    an exponent process coupled to a hardware adder, the exponent process generating an exponent sum of an exponent of the corresponding input value and an exponent of the associated coefficient value, the exponent process receiving a maximum exponent from a centralized find maximum exponent process, the exponent process modifying the maximum exponent and also outputting an exponent difference computed by subtracting the exponent sum from the maximum exponent, the exponent process also using the exponent difference and sign bit to estimate a minimum value and a maximum value;

    a Pad, Complement, Shift (PCS) process coupled to a hardware shift register, the PCS process receiving the output fraction from the mantissa process and also the sign bit from the sign process, the PCS process configured to pad the fraction by pre-pending and appending 0s to the fraction to generate a first value, thereafter generating a second value by performing a two's complement of the first value if the sign bit is negative and otherwise taking no action on the first value, the PCS process configured to performing a shift operation on the second value by right shifting the second value by the exponent difference to generate a PCS output;

    the centralized find maximum exponent process receiving the exponent sum from each exponent process of a first pipeline stage, the centralized find maximum exponent process outputting a maximum exponent value corresponding to a maximum exponent process sum;

    a central range process operative to sum minimum values from the exponent process and also to sum maximum values from each exponent process, the central range process forming an adder precision based on the sum of minimum values and the sum of maximum values;

    an adder process coupled to a second hardware adder, the adder process summing N PCS output values to a single value, the adder process configured to perform addition using the adder precision;

    a final stage process normalizing the single value, generating a final stage mantissa by performing a 2s complement of the single value if the single value is negative, generating a final stage sign bit, and concatenating the final stage sign bit, the final stage mantissa, and an adjusted maximum exponent into a MAC result.

16. The
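Two of the power-saving ideas in the dependent claims can be illustrated with a short Python sketch: the per-term fraction precision of claims 3 to 5, and the sign test of claim 12 that enables the reduced-precision adder. Several details here are assumptions not stated in the claims: the three thresholds are combined into one lookup, the full fraction width (`full_bits`) is a placeholder, and the per-term bounds assume hidden-bit mantissas in [1, 2), so each product magnitude lies in [1, 4) before alignment. All function names are hypothetical.

```python
import math


def fraction_precision(exp_diff, full_bits=24):
    """Bits of the product fraction kept for a given exponent difference."""
    if exp_diff > 24:
        return 4        # claim 3
    if exp_diff > 21:
        return 8        # claim 4
    if exp_diff > 12:
        return 12       # claim 5
    return full_bits    # assumption: the claims do not state the full width


def truncate_fraction(frac, frac_width, exp_diff, full_bits=24):
    """Zero all but the most significant kept bits of a frac_width-bit fraction."""
    keep = min(fraction_precision(exp_diff, full_bits), frac_width)
    drop = frac_width - keep
    return (frac >> drop) << drop


def term_bounds(sign, exp_diff):
    """Assumed (min, max) of one aligned product, relative to the largest term."""
    lo = math.ldexp(1.0, -exp_diff)   # smallest magnitude: 1.0 * 1.0, shifted right
    hi = math.ldexp(4.0, -exp_diff)   # upper bound: just under 2.0 * 2.0, shifted right
    return (-hi, -lo) if sign else (lo, hi)


def reduced_precision_allowed(signs, exp_diffs):
    """Claim 12 test: the summed minimum and summed maximum share a sign."""
    bounds = [term_bounds(s, d) for s, d in zip(signs, exp_diffs)]
    sum_min = sum(b[0] for b in bounds)
    sum_max = sum(b[1] for b in bounds)
    return (sum_min > 0 and sum_max > 0) or (sum_min < 0 and sum_max < 0)
```

A term shifted far to the right contributes only a few bits to the sum, so computing its low-order fraction bits is wasted work. Likewise, when the bounds show the sum cannot cancel toward zero, its leading bits stay significant and the low-order adder bits matter less. How the range estimate maps onto the 16, 24, and 32 bit adder modes of claim 14 is not modeled here.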

Assignees

Ceremorphic Inc

Inventors

Classifications

  • Multiplying · CPC title

  • Pipelining · CPC title

  • G06F7/5443 · Primary

    Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12175209B2 cover?
A process for a floating point multiplier-accumulator (MAC) is operative on N pairs of floating point values using N MAC processes operating concurrently, each MAC process operating on a pair of values comprising an input value and a coefficient value. Each MAC process simultaneously generates an integer form fraction accompanied by a sign bit and an exponent difference computed by subtracting …
Who is the assignee on this patent?
Ceremorphic Inc
What technology area does this patent fall under?
Primary CPC classification G06F7/5443. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 24, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).