Process for dual mode floating point multiplier-accumulator with high precision mode for near zero accumulation results

US12197889B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12197889-B2
Application number	US-202117352374-A
Country	US
Kind code	B2
Filing date	Jun 21, 2021
Priority date	Jun 21, 2021
Publication date	Jan 14, 2025
Grant date	Jan 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A process for a floating point multiplier-accumulator (MAC) is operative on N pairs of floating point values using N MAC processes operating concurrently, each MAC process operating on a pair of values comprising an input value and a coefficient value. Each MAC process simultaneously generates: an integer form fraction at a first bitwidth and a second bitwidth greater than the first bitwidth, a sign bit, and an exponent difference computed by subtracting an exponent sum from a maximum exponent sum of all exponent sums. The integer form fractions of the first bitwidths are provided to an adder tree using the first bitwidth, and if the sum has an excess percentage of leading 0s, then the second bitwidth is used by an adder tree using the second bitwidth to form a great precision integer form fraction. The sign, integer form fraction, and maximum exponent are provided to an normalizer which generates a floating point result.
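The dual-bitwidth idea in the abstract can be illustrated with a small software sketch. This is only an analogy under stated assumptions, not the patented hardware: the function names (`to_fixed`, `align`, `dual_mode_mac`), the 16-bit and 48-bit widths, and the 50% threshold default are all hypothetical choices made for illustration, and the sign is applied after shifting rather than through a 2s complement step.

```python
import math

PRODUCT_BITS = 48  # assumed: two 24-bit mantissas multiplied together


def to_fixed(x, bits=24):
    """Split x into (sign, exponent, integer mantissa), truncating to `bits` bits."""
    m, e = math.frexp(abs(x))  # abs(x) == m * 2**e with 0.5 <= m < 1, or m == 0
    return (1 if x < 0 else 0), e, int(m * (1 << bits))


def align(frac, diff, width):
    """Keep the top `width` bits of a 48-bit fraction, then right shift by `diff`."""
    shift = (PRODUCT_BITS - width) + diff
    return frac >> shift if shift >= 0 else frac << -shift


def dual_mode_mac(inputs, coeffs, narrow=16, wide=48, threshold=0.5):
    """Return (sum of products, True if the wide sum was needed)."""
    terms = []
    for a, b in zip(inputs, coeffs):
        sa, ea, ma = to_fixed(a)
        sb, eb, mb = to_fixed(b)
        terms.append((sa ^ sb, ea + eb, ma * mb))  # sign, exponent sum, fraction
    max_exp = max(exp_sum for _, exp_sum, _ in terms)

    def tree(width):
        # Signed sum of fractions aligned to the maximum exponent sum.
        total = 0
        for sign, exp_sum, frac in terms:
            value = align(frac, max_exp - exp_sum, width)
            total += -value if sign else value
        return total

    total, width = tree(narrow), narrow
    leading_zeros = max(0, narrow - abs(total).bit_length())
    use_wide = leading_zeros / narrow > threshold
    if use_wide:
        total, width = tree(wide), wide  # redo the sum at the second bitwidth
    return math.ldexp(total, max_exp - width), use_wide
```

With inputs `[1.0, 1.0]` and coefficients `[1.0 + 2**-20, -1.0]`, the two products nearly cancel: the 16-bit sum is all zeros, so the sketch falls back to the 48-bit sum and recovers `2**-20` instead of returning zero.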

First claim

Opening claim text (preview).

I claim:

1. A process for a floating point multiplier-accumulator (MAC) multiplying and accumulating N operands, each operand comprising an input value and a coefficient value, the process operative on a MAC controller comprising register-pipelined stages, the process comprising:
a plurality N of MAC processes operating in parallel on a first register-pipeline stage, each MAC process receiving a unique one of the N operands comprising an associated input value and an associated coefficient value, each MAC process of the first register-pipeline stage comprising:
a sign process operative on exclusive OR hardware and performing an exclusive OR operation on a sign bit of the associated input value and a sign bit of the associated coefficient value, the sign process outputting a corresponding sign bit;
a mantissa process operative on a hardware multiplier and performing an integer multiplication of a mantissa of the associated input value and a mantissa of the associated coefficient value and outputting a fraction, the mantissa process asserting an exponent increment and dividing the fraction by two if an overflow condition occurs;
an exponent process operative on summing and subtracting hardware and determining an exponent sum of an exponent of the associated input value and an exponent of the associated coefficient value, the exponent process receiving a maximum exponent sum value from a centralized find maximum exponent process, the exponent process incrementing the maximum exponent sum value when the exponent increment is asserted and an exponent difference is zero, the exponent process also outputting the exponent difference between the maximum exponent sum value and the exponent sum;
a Pad, Complement, Shift (PCS) process operative in a second register-pipeline stage and receiving the fraction from the first pipeline stage mantissa process, the corresponding sign bit from the sign process, and the exponent difference, the PCS process returning a 2s complement when the sign bit is negative, padding the fraction by pre-pending and appending 0s to the fraction to generate a first value, and right shifting by the exponent difference and outputting a result as a PCS first output value having a first bitwidth, and also outputting the result as a PCS second output value having a second bitwidth greater than the first bitwidth;
the centralized find maximum exponent process operative on comparator hardware and receiving an exponent sum from each exponent process of the MAC processes, identifying a maximum exponent sum and outputting the maximum exponent sum;
storing the N PCS second output values in a pipeline register of the second register-pipeline stage;
summing N PCS output first values using first summing hardware, and using the first bitwidth to output a first sum;
summing N PCS output second values using second summing hardware, and using the second bitwidth when the first sum has more than a threshold percentage of leading 0s and returning a second sum;
normalizing hardware outputting a floating point value by normalizing the second sum to generate a sign bit, a mantissa, and a number N of left shift bit positions to remove the leading 0s from the second sum to generate an output sum, a final stage process operative on the normalizing hardware thereafter forming a final output by concatenating the sign bit, the mantissa, and an exponent derived from the maximum exponent.

2. The process of claim 1 where the exponent of the floating point output value derived from the maximum exponent is derived by subtracting N from the maximum exponent and also subtracting an exponent correction.

3. The process of claim 2 where the exponent derived from the maximum exponent is an 8 bit value and the exponent correction is 127 and performed in either each MAC process exponent process, or the final stage.

4. The process of claim 1 where normalizing the second sum comprises: when the most significant bit (MSB) of the output sum is set, replacing the output sum with a 2s complement of the output sum, thereafter replacing the output sum with a left shifted output sum a number N of bit positions until no leading 0 bits remain and setting the sign bit.

5. The process of claim 2 where the exponent's precision is 8 bits and the exponent correction comprises subtracting 127.

6. The process of claim 1 where a stall condition is asserted when summing the N PCS output first values generates a sum with a number of leading 0 bits of a first final stage process mantissa which exceeds a threshold, the stall condition resulting in summing the N PCS output second values using the second bitwidth to be performed after the stall condition.

7. The process of claim 1 where the summing N PCS output second values using the second bitwidth is performed when the summing N PCS output first values using the first bitwidth has more than 50% or 75% leading 0s of the first bitwidth.

8. The process of claim 1 where the PCS process is operative with a bitwidth determined from a MAC process exponent difference.

9. The process of claim 1 where the exponent difference of each MAC process is incremented when the mantissa process does not overflow and an associated exponent difference is not 0.

10. The process of claim 1 where the exponent difference of each MAC process is decremented when the mantissa process has an overflow and an associated exponent difference is not 0.

11. The process of claim 1 where the maximum exponent is incremented when the exponent difference is 0 and an associated mantissa process has an overflow.

12. The process of claim 1 where the mantissa's precision is 4 bits when the exponent difference is greater than 24.

13. The process of claim 1 where the mantissa's precision is 8 bits when the exponent difference is greater than 21.

14. The process of claim 1 where the mantissa's precision is 12 bits when the exponent difference is larger than 12.

15. A process for a floating point multiplier-accumulator (MAC) comprising a first pipeline stage process operating on a first pipeline stage and a second pipeline stage process operating on a second pipeline stage, the process simultaneously multiplying and accumulating N operands in parallel operations, each operand comprising an input value and a coefficient value, the first pipeline stage process including a process operating on a MAC controller, the first pipeline stage process comprising:
a plurality N of MAC processes, each MAC process receiving an associated input value and associated coefficient value, each MAC process operating simultaneously and in parallel with other MAC processes of the first pipeline stage, the first pipeline stage comprising:
a sign process performing an exclusive OR operation with exclusive OR gates on a sign bit of the associated input value and a sign bit of the associated coefficient value resulting in a sign bit output;
a mantissa process configured to use an integer multiplier to perform an integer multiplication of a hidden bit restored mantissa of the associated input value and a hidden bit restored mantissa of the associated coefficient value and outputting a resulting fraction, and upon an overflow condition of the resulting fraction, the mantissa process dividing the resulting fraction by two and asserting an exponent increment;
an exponent process configured to use hardware adders and generating an exponent sum of an exponent of the associated input value and an exponent of the associated coefficient value, the exponent process receiving a maximum exponent from a centralized find maximum exponent sum process, the exponent process modifying the maximum e
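As a rough software illustration of the per-pair sign, mantissa, and exponent steps recited in the claims above, the sketch below works on IEEE 754 single-precision bit fields. It is a reading of the claim language under assumptions, not the claimed hardware: the names `fields`, `mac_stage1`, and `product_value` are hypothetical, inputs are assumed to be normal float32 values, and the check function removes both exponent biases (2 × 127) to obtain a plain value rather than reproducing the single 127 correction the claims describe for the biased output exponent.

```python
import math
import struct

BIAS = 127  # float32 exponent bias


def fields(x):
    """Unpack a float32 into (sign, biased exponent, 24-bit mantissa with hidden bit).

    Assumes x is a normal float32 value (not zero, subnormal, infinity, or NaN).
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, (bits & 0x7FFFFF) | (1 << 23)


def mac_stage1(input_value, coefficient):
    """Return (sign, fraction, exponent sum, exponent increment) for one pair."""
    s_a, e_a, m_a = fields(input_value)
    s_b, e_b, m_b = fields(coefficient)
    sign = s_a ^ s_b              # exclusive OR of the two sign bits
    fraction = m_a * m_b          # integer product, in [1, 4) scaled by 2**46
    exp_sum = e_a + e_b           # still carries both biases (2 * 127)
    increment = fraction >= (1 << 47)  # overflow: product is 2.0 or more
    if increment:
        fraction >>= 1            # divide by two; the exponent absorbs the factor
    return sign, fraction, exp_sum, increment


def product_value(sign, fraction, exp_sum, increment):
    """Rebuild the product as a Python float to check the stage-1 outputs."""
    exponent = exp_sum - 2 * BIAS + (1 if increment else 0)
    return math.ldexp(-fraction if sign else fraction, exponent - 46)
```

For example, `product_value(*mac_stage1(1.5, -3.0))` rebuilds `-4.5`: the mantissa product 1.5 × 1.5 = 2.25 overflows, is halved to 1.125, and the exponent is incremented to compensate.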

Assignees

Ceremorphic Inc

Inventors

Finch Dylan

Classifications

G06F7/4876
Multiplying · CPC title
G06F2207/3884
Pipelining · CPC title
G06F7/5443 · Primary
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title

Patent family

Related publications grouped by family.

View patent family 84489244

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12197889B2 cover?: A process for a floating point multiplier-accumulator (MAC) is operative on N pairs of floating point values using N MAC processes operating concurrently, each MAC process operating on a pair of values comprising an input value and a coefficient value. Each MAC process simultaneously generates: an integer form fraction at a first bitwidth and a second bitwidth greater than the first bitwidth, a…
Who is the assignee on this patent?: Ceremorphic Inc
What technology area does this patent fall under?: Primary CPC classification G06F7/5443. Mapped technology areas include Physics.
When was this patent published?: Publication date Jan 14, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Apparatuses and methods to accelerate matrix multiplication

Compressing like-magnitude partial products in multiply accumulation

Reconfigurable Processor Circuit Architecture

Radix-1000 decimal floating-point numbers and arithmetic units using a skewed representation of the fraction

Arithmetic processing apparatus and controlling method therefor

Multi-path fused multiply-add with power control

Reconfigurable multi-precision integer dot-product hardware accelerator for machine-learning applications
