Lossless exponent and lossy mantissa weight compression for training deep neural networks

US11615301B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11615301-B2
Application number: US-201916559241-A
Country: US
Kind code: B2
Filing date: Sep 3, 2019
Priority date: Sep 3, 2019
Publication date: Mar 28, 2023
Grant date: Mar 28, 2023

Abstract

Official abstract text for this publication.

Systems, methods, and apparatuses are provided for compressing values. A plurality of parameters may be obtained from a memory, each parameter comprising a floating-point number that is used in a relationship between artificial neurons or nodes in a model. A mantissa value and an exponent value may be extracted from each floating-point number to generate a set of mantissa values and a set of exponent values. The set of mantissa values may be compressed to generate a mantissa lookup table (LUT) and a plurality of mantissa LUT index values. The set of exponent values may be encoded to generate an exponent LUT and a plurality of exponent LUT index values. The mantissa LUT, mantissa LUT index values, exponent LUT, and exponent LUT index values may be provided to one or more processing entities to train the model.
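The pipeline in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the patented implementation: the function names (`separate`, `encode_exponents`, `compress_mantissas`), the use of `np.frexp` to split each value, the 24-bit fixed-point scale, the choice to keep the sign with the mantissa, and the plain 1-D k-means with quantile initialization are all hypothetical choices made for the example.

```python
import numpy as np

FRAC_BITS = 24  # assumption: float32 has 24 significant bits, so m * 2**24 is an exact integer


def separate(params):
    """Split float32 values into signed fixed-point mantissas and integer exponents."""
    # np.frexp gives params == m * 2**e with 0.5 <= |m| < 1 (and m == 0 for zero).
    m, e = np.frexp(np.asarray(params, dtype=np.float32))
    mant = np.round(m.astype(np.float64) * (1 << FRAC_BITS)).astype(np.int64)
    return mant, e.astype(np.int32)


def encode_exponents(exps):
    """Lossless encoding: every distinct exponent gets its own LUT slot."""
    lut, idx = np.unique(exps, return_inverse=True)
    return lut, idx.reshape(-1)


def compress_mantissas(mant, k=16, iters=20):
    """Lossy compression: 1-D k-means, with centroids rounded to fixed point."""
    values = mant.astype(np.float64)
    # Start from evenly spaced quantiles so the initial centroids cover the data.
    centroids = np.quantile(values, np.linspace(0.0, 1.0, k))
    for _ in range(iters):
        idx = np.abs(values[:, None] - centroids[None, :]).argmin(axis=1)
        for j in range(k):
            members = values[idx == j]
            if members.size:  # leave an empty cluster's centroid where it is
                centroids[j] = members.mean()
    lut = np.round(centroids).astype(np.int64)
    # Final assignment against the rounded (fixed-point) centroids.
    idx = np.abs(mant[:, None] - lut[None, :]).argmin(axis=1)
    return lut, idx
```

With `k=16`, each parameter's mantissa is represented by a 4-bit index into a 16-entry table, while the exponent indices reproduce the original exponents exactly. The two functions operate on independent arrays, so they could run in parallel.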

First claim

Opening claim text (preview).

What is claimed is:

1. A system for compressing values, comprising: one or more processors; and one or more memory devices that store program code configured to be executed by the one or more processors, the program code comprising: a floating-point number separator configured to: obtain a plurality of parameters from a parameter memory, each parameter comprising a floating-point number that is used in a relationship between artificial neurons or nodes in a model; extract a mantissa value and an exponent value from each floating-point number to generate a set of mantissa values and a set of exponent values; a mantissa compressor configured to compress the set of mantissa values to generate a mantissa lookup table and a plurality of mantissa lookup table index values, each parameter being assigned one of the plurality of mantissa lookup table index values; an exponent encoder configured to encode the set of exponent values to generate an exponent lookup table and a plurality of exponent lookup table index values, each parameter being assigned one of the plurality of exponent lookup table index values; and a compressed parameter communicator configured to provide the mantissa lookup table, mantissa lookup table index values, exponent lookup table, and exponent lookup table values to at least one processing entity to train the model.

2. The system of claim 1, wherein the at least one processing entity comprises at least one hardware accelerator, and wherein the model comprises a deep-neural network.

3. The system of claim 2, wherein the at least one processing entity is configured to: generate a set of decompressed fixed-point values based at least on the mantissa lookup table, the mantissa lookup table index values, the exponent lookup table, and the exponent lookup table index values; convert the set of decompressed fixed-point values into a set of decompressed floating-point parameters; and train the deep neural network using the set of decompressed floating-point parameters.

4. The system of claim 1, wherein the mantissa compressor is configured to compress the set of mantissa values to generate the mantissa lookup table by: partitioning the set of mantissa values into a plurality of mantissa clusters, each cluster comprising a fixed-point cluster centroid; and populating the mantissa lookup table with the fixed-point cluster centroids, each mantissa lookup table index value identifying a particular one of the fixed-point cluster centroids.

5. The system of claim 1, wherein the encoded set of exponent values is lossless.

6. The system of claim 1, wherein the mantissa compressor is configured to compress the set of mantissa values in parallel with the exponent encoder encoding the set of exponent values.

7. The system of claim 1, wherein each floating-point number is one of a single-precision floating-point number or a double-precision floating-point number.

8. A method for compressing values, comprising: obtaining a plurality of parameters from a parameter memory, each parameter comprising a floating-point number that is used in a relationship between artificial neurons or nodes in a model; extracting a mantissa value and an exponent value from each floating-point number to generate a set of mantissa values and a set of exponent values; compressing the set of mantissa values to generate a mantissa lookup table and a plurality of mantissa lookup table index values, each parameter being assigned one of the plurality of mantissa lookup table index values; encoding the set of exponent values to generate an exponent lookup table and a plurality of exponent lookup table index values, each parameter being assigned one of the plurality of exponent lookup table index values; and providing the mantissa lookup table, mantissa lookup table index values, exponent lookup table, and exponent lookup table values to at least one processing entity to train the model.

9. The method of claim 8, wherein the at least one processing entity comprises at least one hardware accelerator, and wherein the model comprises a deep-neural network.

10. The method of claim 9, further comprising: generating a set of decompressed fixed-point values based at least on the mantissa lookup table, the mantissa lookup table index values, the exponent lookup table, and the exponent lookup table index values; converting the set of decompressed fixed-point values into a set of decompressed floating-point parameters; and training the deep neural network using the set of decompressed floating-point parameters.

11. The method of claim 8, wherein the compressing the set of mantissa values to generate the mantissa lookup table comprises: partitioning the set of mantissa values into a plurality of mantissa clusters, each cluster comprising a fixed-point cluster centroid; and populating the mantissa lookup table with the fixed-point cluster centroids, each mantissa lookup table index value identifying a particular one of the fixed-point cluster centroids.

12. The method of claim 8, wherein the encoded set of exponent values is lossless.

13. The method of claim 8, wherein the compressing the set of mantissa values is performed in parallel with the encoding the set of exponent values.

14. The method of claim 8, wherein each floating-point number is one of a single-precision floating-point number or a double-precision floating-point number.

15. A device comprising: a floating-point number separator circuit configured to: obtain a plurality of parameters from a parameter memory, each parameter comprising a floating-point number that is used in a relationship between artificial neurons or nodes in a model; and extract a mantissa value and an exponent value from each floating-point number to generate a set of mantissa values and a set of exponent values; a mantissa compressor circuit configured to compress the set of mantissa values to generate a mantissa lookup table and a plurality of mantissa lookup table index values, each parameter being assigned one of the plurality of mantissa lookup table index values; an exponent encoder circuit configured to encode the set of exponent values to generate an exponent lookup table and a plurality of exponent lookup table index values, each parameter being assigned one of the plurality of exponent lookup table index values; and a compressed parameter outputting circuit configured to output the mantissa lookup table, mantissa lookup table index values, exponent lookup table, and exponent lookup table values for use by at least one processing entity to train the model.

16. The device of claim 15, wherein the at least one processing entity comprises at least one hardware accelerator, and wherein the model comprises a deep-neural network.

17. The device of claim 16, wherein the at least one processing entity comprises circuitry configured to: generate a set of decompressed fixed-point values based at least on the mantissa lookup table, the mantissa lookup table index values, the exponent lookup table, and the exponent lookup table index values; convert the set of decompressed fixed-point values into a set of decompressed floating-point parameters; and train the deep neural network using the set of decompressed floating-point parameters.

18. The device of claim 15, wherein the mantissa compressor circuit is configured to: partition the set of mantissa values into a plurality of mantissa clusters, each cluster comprising a fixed-point cluster centroid; and populate the mantissa lookup table with the fixed-poin
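
The decompression steps recited in claims 3, 10, and 17 (table lookups, then fixed-point to floating-point conversion) might look roughly like the sketch below. It is one possible reading, not the patent's circuit: the `decompress` function, the Q24 fixed-point scale (stored mantissa equals the real mantissa times 2**24), the signed mantissa entries, and the tiny hand-built tables are all assumptions made for illustration.

```python
import numpy as np

FRAC_BITS = 24  # assumed fixed-point scale: stored mantissa = round(m * 2**24)


def decompress(m_lut, m_idx, e_lut, e_idx):
    """Rebuild float32 parameters from two lookup tables and per-parameter indices."""
    # Step 1: table lookups give one fixed-point mantissa and one exponent per parameter.
    mant_fixed = np.asarray(m_lut, dtype=np.int64)[m_idx]
    exps = np.asarray(e_lut, dtype=np.int32)[e_idx]
    # Step 2: convert fixed point to floating point and reapply the exponent.
    return np.ldexp(mant_fixed / float(1 << FRAC_BITS), exps).astype(np.float32)


# Hypothetical tables: three mantissa centroids and three exponents.
m_lut = [-(3 << 22), 1 << 23, 3 << 22]  # -0.75, 0.5, 0.75 in Q24
e_lut = [-1, 0, 2]
weights = decompress(m_lut, [1, 2, 0, 2], e_lut, [1, 0, 2, 2])
print(weights)  # values: 0.5, 0.375, -3.0, 3.0
```

Because each parameter stores only two small indices, the receiving accelerator needs just the two tables plus the index streams to rebuild an approximate weight tensor before a training step.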

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Distributed learning, e.g. federated learning

  • Quantised networks; Sparse networks; Compressed networks

  • Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)}

  • Parallelization

  • using electronic means

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11615301B2 cover?
Systems, methods, and apparatuses are provided for compressing values. A plurality of parameters may be obtained from a memory, each parameter comprising a floating-point number that is used in a relationship between artificial neurons or nodes in a model. A mantissa value and an exponent value may be extracted from each floating-point number to generate a set of mantissa values and a set of ex…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/08. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 28, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).