Ultra-low precision floating-point fused multiply-accumulate unit

US11455142B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11455142-B2
Application number	US-201916432358-A
Country	US
Kind code	B2
Filing date	Jun 5, 2019
Priority date	Jun 5, 2019
Publication date	Sep 27, 2022
Grant date	Sep 27, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments for implementing a fused multiply-multiply-accumulate (“FMMA”) unit by one or more processors in a computing system. Mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two product having a larger exponent may be determined in parallel. The addend may be aligned relative to the alternative product having the larger exponent. The product having the smallest exponent may be aligned relative to the alternative product having the larger exponent according to the alignment shift amount.
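The data flow in the abstract can be sketched in a few lines of Python. This is only an illustrative software model under stated assumptions, not the patented circuit: the `Fp` toy operand type, the `shift` helper, the `fmma` function, and the `guard` parameter are hypothetical names introduced here, operands are modeled as exact integers rather than fixed-width fields, and the steps that hardware would perform in parallel run sequentially. Normalization and rounding to a target precision are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fp:
    """Toy operand (hypothetical): value = (-1)**sign * mant * 2**exp."""
    sign: int
    exp: int
    mant: int  # non-negative integer mantissa

    def value(self) -> float:
        return (-1) ** self.sign * self.mant * 2.0 ** self.exp


def shift(m: int, amount: int) -> int:
    """Right-shift by `amount`, dropping low bits; negative amounts shift left."""
    return m >> amount if amount >= 0 else m << -amount


def fmma(a: Fp, b: Fp, c: Fp, d: Fp, addend: Fp, guard: int = 8) -> Fp:
    """Model a*b + c*d + addend with a single alignment-and-add step."""
    # Two products: mantissas multiply, exponents add, signs XOR.
    p1 = Fp(a.sign ^ b.sign, a.exp + b.exp, a.mant * b.mant)
    p2 = Fp(c.sign ^ d.sign, c.exp + d.exp, c.mant * d.mant)

    # The exponent difference of the products is the alignment shift
    # for whichever product has the smaller exponent.
    big, small = (p1, p2) if p1.exp >= p2.exp else (p2, p1)
    small_shift = big.exp - small.exp
    # Alignment shift for the addend, relative to the larger product.
    addend_shift = big.exp - addend.exp

    # Express all three terms in units of 2**(big.exp - guard).
    # `guard` is an assumed number of extra low-order bits that are kept;
    # bits shifted out below that point are discarded.
    terms = [
        (big.sign, big.mant << guard),
        (small.sign, shift(small.mant, small_shift - guard)),
        (addend.sign, shift(addend.mant, addend_shift - guard)),
    ]
    total = sum(-m if s else m for s, m in terms)
    return Fp(int(total < 0), big.exp - guard, abs(total))
```

For example, with `a = Fp(0, 0, 3)`, `b = Fp(0, 1, 5)`, `c = Fp(0, -2, 3)`, `d = Fp(1, 0, 2)` and `addend = Fp(0, -1, 1)`, the exact result 3·10 + 0.75·(−2) + 0.5 is 29.0, which `fmma(a, b, c, d, addend).value()` reproduces with the default eight guard bits. With `guard=0` the smaller product and the addend are shifted out entirely and the model returns 30.0, which shows how the number of retained bits affects the result.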

First claim

Opening claim text (preview).

The invention claimed is:

1. A method, by one or more processors, for implementing a fused multiply-multiply-accumulate (FMMA) operation in a computing environment, comprising: receiving, by the one or more processors, an instruction stored in a memory, wherein the instruction contains at least two operands of mixed bit-precision formats; and executing the instruction, wherein, when executing the instruction, the one or more processors implement a FMMA unit to perform an internal rounding operation associated with floating point arithmetic of the instruction by performing each of: determining by multiplier circuitry within the FMMA unit, in parallel, mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two products having a larger exponent, wherein the mantissas are pre-shifted prior to aligning the addend and the product relative to the alternative product, and wherein the addend and the product having the smallest exponent are aligned prior to receiving a select signal indicating to a selector to select between one of the pre-shifted mantissas when performing the alignment of the addend and the product relative to the alternative product; aligning, by aligning circuitry within the FMMA unit, the addend relative to the alternative product having the larger exponent; and aligning, by the aligning circuitry, the product having the smallest exponent relative to the alternative product having the larger exponent according to the alignment shift amount for the product of the two products having the smallest exponent.

2. The method of claim 1, further including adding or subtracting the mantissas of the two products according to a sign of the addend and the two products.

3. The method of claim 1, further including retaining a selected number of bits while discarding an alternative number of bits of the product for aligning the product having the smallest exponent relative to the alternative product having the larger exponent.

4. The method of claim 1, further including retaining a selected number of bits while discarding an alternative number of bits of the addend for aligning the addend relative to the alternative product having the larger exponent.

5. The method of claim 1, further including normalizing and rounding an intermediate summation or difference of aligned mantissas for each of the two products and the aligned addend to a targeted precision.

6. The method of claim 1, further including: performing a mixed-precision FMMA operation by using one or more inputs, one or more outputs, or a combination thereof in a selected format; or performing a hybrid-fused FMMA operation by enabling a very low precision format (VLP) operand to use a plurality of formats.

7. The method of claim 1, wherein the FMMA unit implements both a half-precision fused multiple add (FMA) operation and a very low precision format (VLP) FMMA operation, wherein the VLP is a format using less than sixteen bits comprising a sign bit, exponent bits (e), and mantissa bits (m), and the FMMA unit is selectively configured to perform the FMA operation or the FMMA operation.

8. A system for implementing a fused multiply-multiply-accumulate (FMMA) operation in a computing environment, comprising: one or more hardware memory storing executable instructions; one or more hardware processors; and a FMMA unit implemented within the one or more hardware processors, wherein the one or more hardware processors are configured to: receive, by the one or more hardware processors, one of the executable instructions stored in the one or more memory, wherein the instruction contains at least two operands of mixed bit-precision formats; and execute the one of the executable instructions by implementing the FMMA unit to perform an internal rounding operation associated with floating point arithmetic performing each of: determining by multiplier circuitry within the FMMA unit, in parallel, mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two products having a larger exponent, wherein the mantissas are pre-shifted prior to aligning the addend and the product relative to the alternative product, and wherein the addend and the product having the smallest exponent are aligned prior to receiving a select signal indicating to a selector to select between one of the pre-shifted mantissas when performing the alignment of the addend and the product relative to the alternative product; aligning, by aligning circuitry within the FMMA unit, the addend relative to the alternative product having the larger exponent; and aligning, by the aligning circuitry, the product having the smallest exponent relative to the alternative product having the larger exponent according to the alignment shift amount for the product of the two products having the smallest exponent.

9. The system of claim 8, wherein the executable instructions further add or subtract the mantissas of the two products according to a sign of the addend and the two products.

10. The system of claim 8, wherein the executable instructions further retain a selected number of bits while discarding an alternative number of bits of the product for aligning the product having the smallest exponent relative to the alternative product having the larger exponent.

11. The system of claim 8, wherein the executable instructions further retain a selected number of bits while discarding an alternative number of bits of the addend for aligning the addend relative to the alternative product having the larger exponent.

12. The system of claim 8, wherein the executable instructions further normalize and round an intermediate summation or difference of aligned mantissas for each of the two products and the aligned addend to a targeted precision.

13. The system of claim 8, wherein the executable instructions further: perform a mixed-precision FMMA operation by using one or more inputs, one or more outputs, or a combination thereof in a selected format; or perform a hybrid-fused FMMA operation by enabling a very low precision format (VLP) operand to use a plurality of formats.

14. The system of claim 8, wherein the FMMA unit implements both a half-precision fused multiple add (FMA) operation and a very low precision format (VLP) FMMA operation, wherein the VLP is a format using less than sixteen bits comprising a sign bit, exponent bits (e), and mantissa bits (m), and the FMMA unit is selectively configured to perform the FMA operation or the FMMA operation.

15. A computer program product for, by a processor, implementing a fused multiply-multiply-accumulate (FMMA) operation in a computing environment, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that receives, by the processor, an instruction stored in a memory, wherein the instruction contains at least two operands of mixed bit-precision formats; and an executable portion that executes the instruction, wherein, when executing the instruction, the one or more processors implement a FMMA unit to perform an internal rounding operation associated with floating point arithmetic of the instruc
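Claims 7 and 14 describe the very low precision (VLP) format only as fewer than sixteen bits split into a sign bit, exponent bits (e), and mantissa bits (m). As a rough illustration of what such a layout could mean, the sketch below decodes a bit pattern for a given (e, m) pair. The function name `decode_vlp`, the IEEE-754-style exponent bias, the hidden leading one, and the subnormal handling are all assumptions made for this example; the claim text shown here does not specify them, and special values such as infinity or NaN are not modeled.

```python
def decode_vlp(bits: int, e: int, m: int) -> float:
    """Decode a (1 sign, e exponent, m mantissa) bit pattern to a float.

    Assumes IEEE-754-style conventions (bias, hidden bit, subnormals);
    these are illustrative choices, not taken from the claims.
    """
    if e < 1 or m < 0 or 1 + e + m >= 16:
        raise ValueError("expected a format with fewer than sixteen bits")

    sign = (bits >> (e + m)) & 1
    exp_field = (bits >> m) & ((1 << e) - 1)
    frac = bits & ((1 << m) - 1)

    bias = (1 << (e - 1)) - 1  # assumption: IEEE-style bias
    if exp_field == 0:
        # Assumption: exponent field 0 encodes subnormals (no hidden bit).
        magnitude = frac * 2.0 ** (1 - bias - m)
    else:
        # Normal number: restore the hidden leading 1 in front of the fraction.
        magnitude = ((1 << m) + frac) * 2.0 ** (exp_field - bias - m)
    return -magnitude if sign else magnitude
```

With an 8-bit layout of e = 4 and m = 3 under these assumptions, `decode_vlp(0b0_0111_000, 4, 3)` gives 1.0 and `decode_vlp(0b1_1000_100, 4, 3)` gives -3.0.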

Assignees

IBM

Inventors

Classifications

G06F7/5443 · Primary
Sum of products (for applications thereof, see the relevant places, e.g. G06F17/10, H03H17/00) · CPC title
G06F7/483
Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

Patent family

Related publications grouped by family.

View patent family 73651518

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11455142B2 cover?

Embodiments for implementing a fused multiply-multiply-accumulate (“FMMA”) unit by one or more processors in a computing system. Mantissas for two products, an exponent difference of the two products serving as an alignment shift amount for a product of the two products having a smallest exponent, and an alignment shift amount for an addend relative to an alternative product of the two product …

Who is the assignee on this patent?

IBM

What technology area does this patent fall under?

Primary CPC classification G06F7/5443. Mapped technology areas include Physics.

When was this patent published?

Publication date Sep 27, 2022 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Instructions for fused multiply-add operations with variable precision input operands

Closepath fast incremented sum in a three-path fused multiply-add design

Generalized acceleration of matrix multiply accumulate operations

Compute optimization mechanism

Floating-Point Multiply-Add with Down-Conversion

Binary fused multiply-add floating-point calculations
