Integrated circuits with machine learning extensions

US12056461B2 · US · B2

Patent metadata
Field | Value
Publication number | US-12056461-B2
Application number | US-202117484845-A
Country | US
Kind code | B2
Filing date | Sep 24, 2021
Priority date | Nov 20, 2017
Publication date | Aug 6, 2024
Grant date | Aug 6, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An integrated circuit with specialized processing blocks are provided. A specialized processing block may be optimized for machine learning algorithms and may include a multiplier data path that feeds an adder data path. The multiplier data path may be decomposed into multiple partial product generators, multiple compressors, and multiple carry-propagate adders of a first precision. Results from the carry-propagate adders may be added using a floating-point adder of the first precision. Results from the floating-point adder may be optionally cast to a second precision that is higher or more accurate than the first precision. The adder data path may include an adder of the second precision that combines the results from the floating-point adder with zero, with a general-purpose input, or with other dot product terms. Operated in this way, the specialized processing block provides a technical improvement of greatly increasing the functional density for implementing machine learning algorithms.
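The data flow in the abstract can be modeled in software as a rough sketch. The example below assumes the first precision is IEEE half precision (FP16) and the second is single precision (FP32); the abstract names neither format, and the function names `fp16`, `fp32`, `dsp_block`, and `dot` are hypothetical. It uses two products per block, and the chained input stands in for the "zero, general-purpose input, or other dot product terms" options.

```python
import struct


def _round(fmt, x):
    """Round a Python float to the nearest value representable in `fmt`."""
    return struct.unpack(fmt, struct.pack(fmt, x))[0]


def fp16(x):
    # Assumed "first precision": IEEE 754 half precision.
    return _round("e", x)


def fp32(x):
    # Assumed "second precision": IEEE 754 single precision.
    return _round("f", x)


def dsp_block(a0, b0, a1, b1, chain_in=0.0):
    """Model one block: two low-precision products, a low-precision sum,
    a cast to the higher precision, then a higher-precision add."""
    # Multiplier data path: two products rounded to the first precision.
    p0 = fp16(fp16(a0) * fp16(b0))
    p1 = fp16(fp16(a1) * fp16(b1))
    # Floating-point adder of the first precision.
    s = fp16(p0 + p1)
    # Cast to the second precision and combine with the chained input
    # (pass chain_in=0.0 to add zero).
    return fp32(fp32(s) + fp32(chain_in))


def dot(a, b):
    """Chain blocks so each one adds its pair of products to the running sum."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    a, b = list(a), list(b)
    if len(a) % 2:  # pad odd-length vectors with a zero term
        a.append(0.0)
        b.append(0.0)
    acc = 0.0
    for i in range(0, len(a), 2):
        acc = dsp_block(a[i], b[i], a[i + 1], b[i + 1], acc)
    return acc


print(dot([1.0, 2.0, 3.0, 4.0], [0.5, 0.25, 2.0, 1.0]))
```

Keeping the products and the first sum in the narrower format is what lets more multipliers fit in one block, while the wider final adder limits the rounding error that builds up across a long chain.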

First claim

Opening claim text (preview).

What is claimed is:

1. An integrated circuit, comprising: a multiplier data path operable in a floating-point mode and a fixed-point mode; an adder data path configurable to receive signals from the multiplier data path only during the floating-point mode; and an adder operable to receive multiplier output signals from the multiplier data path and to feed the signals into the multiplier data path, wherein the adder comprises an additional adder, and the adder and the additional adder have different precisions.
2. The integrated circuit of claim 1, wherein the multiplier data path is decomposed into a plurality of smaller multiplier data paths.
3. The integrated circuit of claim 2, wherein each of the smaller multiplier data paths comprises: a partial product generator; a compressor operable to receive signals from the partial product generator; and a carry-propagate adder operable to receive signals from the compressor and to generate a corresponding product.
4. The integrated circuit of claim 1, wherein the adder is operable to perform arithmetic operations on the signals from the multiplier data path with one or more floating-point values.
5. The integrated circuit of claim 4, wherein the adder receives the one or more floating-point values via a multiplexing data path.
6. The integrated circuit of claim 5, wherein the multiplexing data path is operable to: receive a first floating-point input from the multiplier data path, a second floating-point input from an adjacent digital signal processing (DSP) block, and a third floating-point input from an input to the multiplexing data path; and select at least two from the first floating-point input, the second floating-point input, and the third floating-point input to transmit to the adder data path.
7. A method of operating an integrated circuit, comprising: receiving a selection of a floating-point mode for the integrated circuit, wherein the selection is made between the floating-point mode and an integer mode; generating, via a multiplier data path, an output signal, wherein the multiplier data path is decomposed into a plurality of smaller multiplier data paths that each includes a partial product generator, a compressor operable to receive signals from the partial product generator, and a carry-propagate adder operable to receive a compressor output from the compressor and to generate a corresponding product to generate the output signal; and transmitting, via the multiplier data path, the output signal to an adder to act on the output signal to generate signals and to feed the signals into an adder data path when the integrated circuit is in the floating-point mode.
8. The method of claim 7, wherein the multiplier data path is operable in the floating-point mode and the integer mode.
9. The method of claim 7, wherein the multiplier data path comprises a multiplier with a first precision, where the multiplier is operable to output a product with the first precision.
10. The method of claim 9, wherein the multiplier data path generates the output signal based at least on the product.
11. The method of claim 7, comprising transmitting, via the multiplier data path, the output signal to an adjacent digital signal processing (DSP) block.
12. An integrated circuit, comprising: a multiplier data path comprising a plurality of data paths operable to generate a first floating-point product with a first precision through a first data path of the plurality of data paths and a second floating-point product with the first precision through a second data path of the plurality of data paths, where the multiplier data path is operable in a floating-point mode and a fixed-point mode; and an adder data path comprising: a first floating-point adder of the first precision, wherein the first floating-point adder is operable to add the first floating-point product and the second floating-point product to generate a first sum with the first precision; and a second floating-point adder of a second precision greater than the first precision, wherein the second floating-point adder is operable to add the first sum with an input.
13. The integrated circuit of claim 12, wherein the input to the second floating-point adder is received from a first adjacent digital signal processing (DSP) block.
14. The integrated circuit of claim 13, wherein the multiplier data path transmits the first floating-point product and the second floating-point product to a second adjacent DSP block.
15. The integrated circuit of claim 12, wherein the adder data path is operable to pass the first sum through the second floating-point adder unchanged.
16. The integrated circuit of claim 12, wherein the multiplier data path is operable to generate a third floating-point product with the second precision through the first data path and the second data path.
17. The integrated circuit of claim 12, wherein the second floating-point adder is operable to output an accumulated value using a feedback path.
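Claims 3 and 7 describe each smaller multiplier data path as a partial product generator, a compressor, and a carry-propagate adder. The sketch below shows one conventional way those three stages fit together for an unsigned integer multiply. It assumes a simple AND-array generator and a chain of 3:2 carry-save compressors; the claims specify neither the generator type nor the compressor topology, and all function names are hypothetical.

```python
def partial_products(a, b, width):
    """Generate one shifted copy of `a` for every set bit of `b`."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(width)]


def compress_3_to_2(x, y, z):
    """Carry-save step: reduce three operands to a sum word and a carry word
    without propagating carries, so that x + y + z == s + c."""
    s = x ^ y ^ z                              # bitwise sum, carries ignored
    c = ((x & y) | (x & z) | (y & z)) << 1     # carries, moved up one position
    return s, c


def compress(terms):
    """Apply 3:2 compression repeatedly until only two operands remain."""
    terms = list(terms)
    while len(terms) > 2:
        x, y, z = terms.pop(), terms.pop(), terms.pop()
        terms.extend(compress_3_to_2(x, y, z))
    while len(terms) < 2:
        terms.append(0)
    return terms[0], terms[1]


def carry_propagate_add(s, c):
    """Final stage: a single ordinary addition resolves all pending carries."""
    return s + c


def multiply(a, b, width=8):
    """Unsigned multiply of two `width`-bit values through the three stages."""
    s, c = compress(partial_products(a, b, width))
    return carry_propagate_add(s, c)


print(multiply(13, 11))
```

The point of the middle stage is that only one carry-propagating addition happens per product, however many partial products there are.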

Assignees

Intel Corp

Inventors

Classifications

  • Accepting both fixed-point and floating-point numbers · CPC title

  • Reconfigurable for different fixed word lengths · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • Adding; Subtracting {(G06F7/4833, G06F7/4836 take precedence)} · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F7/5443. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 6, 2024 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).