Flow for quantized neural networks

US11645493B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11645493-B2
Application number  US-201815972054-A
Country             US
Kind code           B2
Filing date         May 4, 2018
Priority date       May 4, 2018
Publication date    May 9, 2023
Grant date          May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus are disclosed supporting a design flow for developing quantized neural networks. In one example of the disclosed technology, a method includes quantizing a normal-precision floating-point neural network model into a quantized format. For example, the quantized format can be a block floating-point format, where two or more elements of tensors in the neural network share a common exponent. A set of test inputs is applied to a normal-precision floating-point model and to the corresponding quantized model, and the respective output tensors are compared. Based on this comparison, hyperparameters or other attributes of the neural networks can be adjusted. Further, quantization parameters determining the widths of data and selection of shared exponents for the block floating-point format can be selected. An adjusted, quantized neural network is retrained and programmed into a hardware accelerator.
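To make the shared-exponent idea in the abstract concrete, the following is a minimal sketch of block floating-point quantization in plain Python. It is not taken from the patent: the function names, the rule of taking the exponent from the largest-magnitude element, and the signed-integer mantissa layout are all assumptions chosen for illustration.

```python
import math

def shared_exponent(values):
    """Pick one exponent for the whole block (assumed rule: largest magnitude)."""
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return 0
    # math.frexp returns (m, e) with largest == m * 2**e and 0.5 <= m < 1
    return math.frexp(largest)[1]

def quantize_block(values, mantissa_bits):
    """Return (shared exponent, signed integer mantissas) for one block."""
    exp = shared_exponent(values)
    scale = 2.0 ** (exp - (mantissa_bits - 1))   # value of one mantissa step
    limit = 2 ** (mantissa_bits - 1) - 1         # largest signed mantissa
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return exp, mantissas

def dequantize_block(exp, mantissas, mantissa_bits):
    """Rebuild approximate floating-point values from the block representation."""
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    return [m * scale for m in mantissas]

block = [0.5, -0.25, 0.75, 0.125]
exp, mantissas = quantize_block(block, mantissa_bits=4)
restored = dequantize_block(exp, mantissas, mantissa_bits=4)
```

For this particular block and a 4-bit mantissa, the sketch stores one exponent (0) and the mantissas [4, -2, 6, 1], and the restored values match the inputs exactly. Blocks whose elements differ widely in magnitude lose their small elements, which is why the choice of which elements share an exponent matters.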

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: prior to applying a quantization operation, reshaping or splitting at least one tensor that is included among multiple tensors of a normal-precision neural network model, said reshaping or splitting facilitates handling of shared exponents over granulates that are finer than an entire tensor; performing the quantization operation by quantizing the normal-precision neural network model, which comprises the multiple tensors of normal-precision floating-point numbers, producing a quantized neural network model in a quantized-precision format; evaluating the quantized neural network model by applying input tensors to an input layer of the quantized neural network model, producing quantized output; comparing the quantized output to output generated by applying the input tensors to the normal-precision floating-point model; and based on the comparing, selecting a new quantized-precision format having at least one quantization parameter different than the quantized-precision format, wherein the at least one different quantization parameter comprises, for at least one layer of the quantized neural network, at least a parameter to share an exponent on a per-row basis and/or a parameter to share an exponent on a per-column basis.

2. The method of claim 1, wherein: the quantized-precision format is a block floating-point format where at least two elements of the quantized neural network model share a common exponent.

3. The method of claim 1, further comprising: based on the comparing, retraining the quantized neural network model by adjusting at least one or more training parameters used to train the normal-precision neural network and training the quantized neural network with the adjusted at least one training parameter.

4. The method of claim 3, wherein the adjusted at least one of the training parameters comprises at least one of the following: a batch size, a momentum value, a number of training epochs, or a drop out rate.

5. The method of claim 3, further comprising: producing the normal-precision neural network by training an untrained normal-precision neural network according to one or more training parameters at a selected learning rate; and wherein the adjusting the at least one of the training parameters comprises adjusting a learning rate to be lower than the selected learning rate used to train the untrained normal-precision neural network.

6. The method of claim 1, further comprising: quantizing the normal-precision neural network model to produce a re-quantized neural network model in the new quantized-precision format.

7. The method of claim 6, wherein the at least one different quantization parameter comprises, for at least one layer of the quantized neural network, at least one of: a bit width used to represent bit widths of node weight mantissas, a bit width used to represent bit widths of node weight exponents, a bit width used to represent bit widths of activation value mantissas, a bit width used to represent bit widths of activation value exponents, a tile size for a shared exponent, or a parameter specifying a method of common exponent selection.

8. The method of claim 1, further comprising, based on the comparing, sparsifying at least one weight of the quantized neural network.

9. The method of claim 1, further comprising: based on the comparing, changing a hyperparameter used to train the normal-precision neural network or the quantized-precision network and retraining the quantized-precision neural network with the changed hyperparameter; and wherein the changed hyperparameter includes one of a number of hidden layers in the normal-precision neural network, a node type for a layer of the normal-precision neural network, or a learning rate for training the neural network.

10. The method of claim 1, wherein the normal-precision neural network model is quantized according to a set of one or more quantization parameters, the method further comprising: based on the comparing, adjusting at least one of the quantization parameters; and retraining the quantized neural network model using the adjusted at least one of the quantization parameters.

11. A quantization-enabled system for modeling a neural network comprising tensors representing node weights and edges, the system comprising: one or more processors; and one or more computer readable storage media that store computer-readable instructions that are executable by the one or more processors to cause the system to: prior to applying a quantization operation, reshape or split at least one tensor that is included among multiple tensors of a normal-precision neural network model, said reshaping or splitting facilitates handling of shared exponents over granulates that are finer than an entire tensor; transform the normal-precision neural network model to a block floating-point format neural network model according to a set of quantization parameters, the block floating-point format model including at least one shared exponent; apply input tensors to an input layer of the block floating-point format neural network model, producing first output values; calculate differences between the first output values and second output values generated by applying the input tensors to the normal-precision neural network model; and responsive to the calculated differences, select a new block floating-point format having at least one parameter different than the set of quantization parameters, wherein the at least one different parameter comprises, for at least one layer of the block floating-point format neural network model, a parameter to share an exponent on a per-row basis and/or a parameter to share an exponent on a per-column basis.

12. The system of claim 11, wherein execution of the computer-readable instructions further causes the system to: retrain the normal-precision neural network model by adjusting a hyperparameter and retraining the normal-precision neural network model with the adjusted hyperparameter.

13. The system of claim 11, wherein execution of the computer-readable instructions further causes the system to: retrain the block floating-point format neural network model by adjusting a hyperparameter and retraining the block floating-point format neural network model with the adjusted hyperparameter.

14. The system of claim 11, wherein execution of the computer-readable instructions further causes the system to: perform the quantization operation by quantizing the normal-precision neural network model to produce a re-quantized neural network model in the new block floating-point format.

15. The system of claim 14, wherein the at least one different parameter comprises, for at least one layer of block floating-point format neural network model: a bit width used to represent bit widths of node weight mantissas, a bit width used to represent bit widths of node weight exponents, a bit width used to represent bit widths of activation value mantissas, a bit width used to represent bit widths of activation value exponents, a tile size for a shared exponent, or a parameter specifying a method of common exponent selection.

16. The system of claim 11, further comprising: a hardware accelerator configured to evaluate the block floating-point format neural network model by receiving input tensors, processing operations for nodes of the block floating-point neural network model representing in the block floating-point format, and produce an output tensor; and wherein the processors are configured to configure the hardware accelerator with the block floati
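The flow recited in claim 1 (split a tensor so exponents are shared over pieces finer than the whole tensor, quantize, compare against the normal-precision output, then select a per-row or per-column sharing parameter) can be sketched roughly as follows. This is an illustrative reading, not the patented implementation: the single matrix-vector "layer", the worst-case absolute error metric, and every function name here are assumptions.

```python
import math

def quantize_group(values, mantissa_bits):
    """Quantize one group to a shared exponent; return the dequantized values."""
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return [0.0] * len(values)
    exp = math.frexp(largest)[1]                 # assumed rule: exponent of the largest element
    scale = 2.0 ** (exp - (mantissa_bits - 1))
    limit = 2 ** (mantissa_bits - 1) - 1
    return [max(-limit, min(limit, round(v / scale))) * scale for v in values]

def quantize_matrix(weights, mantissa_bits, share):
    """Split the matrix into rows or columns, then quantize each piece separately."""
    if share == "per_row":
        return [quantize_group(row, mantissa_bits) for row in weights]
    if share == "per_column":
        cols = [quantize_group(list(col), mantissa_bits) for col in zip(*weights)]
        return [list(row) for row in zip(*cols)]  # transpose back to row-major
    raise ValueError(f"unknown sharing mode: {share}")

def matvec(weights, x):
    """Stand-in for evaluating one layer: a plain matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def output_error(weights, quantized, inputs):
    """Largest absolute difference between normal-precision and quantized outputs."""
    worst = 0.0
    for x in inputs:
        ref = matvec(weights, x)
        got = matvec(quantized, x)
        worst = max(worst, max(abs(r - g) for r, g in zip(ref, got)))
    return worst

def select_format(weights, inputs, mantissa_bits=4):
    """Compare both sharing modes on the test inputs and keep the more accurate one."""
    errors = {}
    for share in ("per_row", "per_column"):
        quantized = quantize_matrix(weights, mantissa_bits, share)
        errors[share] = output_error(weights, quantized, inputs)
    best = min(errors, key=errors.get)
    return best, errors

weights = [[8.0, 0.5], [4.0, 0.25]]              # columns differ widely in magnitude
best, errors = select_format(weights, [[1.0, 1.0]], mantissa_bits=3)
```

In this toy case the columns hold values of similar magnitude while the rows do not, so sharing an exponent per column reproduces the reference output exactly and per-row sharing drops the small entries. A fuller flow in the spirit of claims 3 to 10 would also vary bit widths and tile sizes and retrain after each selection, which this sketch omits.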

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

  • G06N3/0495 (Primary): Quantised networks; Sparse networks; Compressed networks

  • Supervised learning

  • Convolutional networks [CNN, ConvNet]

  • modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645493B2 cover?
Methods and apparatus are disclosed supporting a design flow for developing quantized neural networks. In one example of the disclosed technology, a method includes quantizing a normal-precision floating-point neural network model into a quantized format. For example, the quantized format can be a block floating-point format, where two or more elements of tensors in the neural network share a c…
Who is the assignee on this patent?
Microsoft Technology Licensing, LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/0495. Mapped technology areas include Physics.
When was this patent published?
Publication date May 9, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).