Method and apparatus for generating fixed-point quantized neural network

US11588496B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11588496-B2
Application number: US-201816051788-A
Country: US
Kind code: B2
Filing date: Aug 1, 2018
Priority date: Aug 4, 2017
Publication date: Feb 21, 2023
Grant date: Feb 21, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of generating a fixed-point quantized neural network includes analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network, determining a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the floating-point parameter values based on the statistical distribution for each channel, determining fractional lengths of a bias and a weight for each channel among the parameters of the fixed-point expression for each channel based on a result of performing a convolution operation, and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths.
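
The channel-wise step in the abstract can be illustrated with a minimal sketch. It is not the patented procedure: it assumes a signed 8-bit word and uses only the largest magnitude observed in each channel as the statistic, whereas the claims also allow a mean, variance, standard deviation, or minimum value. The names `fractional_length`, `quantize`, and `bit_width` are hypothetical.

```python
import math


def fractional_length(values, bit_width=8):
    """Pick a fractional length so the largest magnitude in `values`
    fits in a signed fixed-point word of `bit_width` bits.

    Assumption: the range to cover is simply max(|v|) for the channel.
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1  # every non-sign bit can be fractional
    # frexp gives max_abs = m * 2**e with 0.5 <= m < 1, so e is the
    # number of integer bits needed (it may be negative for small values).
    _, int_bits = math.frexp(max_abs)
    return bit_width - 1 - int_bits


def quantize(value, frac_len, bit_width=8):
    """Round `value` to a signed integer code with `frac_len` fractional bits,
    saturating at the limits of the word."""
    code = round(value * 2 ** frac_len)
    lo, hi = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    return max(lo, min(hi, code))


# Hypothetical per-channel weights with different ranges.
weights_per_channel = {0: [0.9, -0.5], 1: [3.5, -2.0]}
frac_lens = {ch: fractional_length(w) for ch, w in weights_per_channel.items()}
print(frac_lens)  # {0: 7, 1: 5}
```

Because each channel gets its own fractional length, a channel with small values keeps more fractional bits than a channel with large values, which is the point of quantizing per channel rather than per layer.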

First claim

Opening claim text (preview).

What is claimed is:

1. A method of generating a fixed-point quantized neural network, the method comprising: analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network; determining a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the floating-point parameter values based on the statistical distribution for each channel; determining fractional lengths of a bias and a weight for each channel among the parameters for each channel based on a result of performing a convolution operation, including determining a maximum fractional length and/or a minimum fractional length among fractional lengths of fixed-point expressions respectively corresponding to the result of performing the convolution operation; and generating a fixed-point quantized neural network in which the bias and the weight for each channel have the determined fractional lengths being different for at least some channels included in layers of the fixed-point quantized neural network, including performing a channel-wise quantization for each channel included in layers of the pre-trained floating-point neural network.

2. The method of claim 1, wherein the analyzing of the statistical distribution comprises obtaining statistics for each channel of floating-point parameter values of weights, input activations, and output activations used in each channel during pre-training of the pre-trained floating-point neural network.

3. The method of claim 1, wherein the convolution operation comprises a partial sum operation between a plurality of channels, the partial sum operation comprises a plurality of multiply-accumulate (MAC) operations and an Add operation, and the determining of the fractional lengths comprises determining the fractional length of the bias based on fractional lengths of input activations and fractional lengths of weights input to the MAC operations.

4. The method of claim 3, wherein the determining of the fractional length of the bias comprises determining the fractional length of the bias based on a maximum fractional length among fractional lengths of fixed-point expressions corresponding to results of the MAC operations.

5. The method of claim 4, wherein the partial sum operation comprises: a first MAC operation between a first input activation of a first channel of an input feature map of the feature maps and a first weight of a first channel of the kernel; a second MAC operation between a second input activation of a second channel of the input feature map and a second weight of a second channel of the kernel; and an Add operation between a result of the first MAC operation, a result of the second MAC operation, and the bias, and the determining of the fractional length of the bias further comprises: obtaining a first fractional length of a first fixed-point expression corresponding to the result of the first MAC operation; obtaining a second fractional length of a second fixed-point expression corresponding to the result of the second MAC operation; and determining the fractional length of the bias to be a maximum fractional length among the first fractional length and the second fractional length.

6. The method of claim 5, further comprising bit-shifting a fractional length of a fixed-point expression having a smaller fractional length among the first fixed-point expression and the second fixed-point expression based on the determined fractional length of the bias, wherein the fixed-point quantized neural network comprises information about an amount of the bit-shifting.

7. The method of claim 3, wherein the determining of the fractional length of the bias comprises determining the fractional length of the bias to be a minimum fractional length among fractional lengths of fixed-point expressions respectively corresponding to results of the MAC operations, and the determining of the fractional lengths further comprises determining the fractional length of the weight for each channel by decreasing the fractional length of the weight by a difference between the fractional length of one of the fixed-point expressions corresponding to the result of one of the MAC operations to which the weight was input and the determined fractional length of the bias.

8. The method of claim 3, wherein the partial sum operation comprises: a first MAC operation between a first input activation of a first channel of an input feature map of the feature maps and a first weight of a first channel of the kernel; a second MAC operation between a second input activation of a second channel of the input feature map and a second weight of a second channel of the kernel; and an Add operation between a result of the first MAC operation, a result of the second MAC operation, and the bias, the determining of the fractional lengths further comprises: obtaining a first fractional length of a first fixed-point expression corresponding to the result of the first MAC operation; and obtaining a second fractional length of a second fixed-point expression corresponding to the result of the second MAC operation, the determining of the fractional length of the bias comprises determining the fractional length of the bias to be a minimum fractional length among the first fractional length and the second fractional length, and the determining of the fractional lengths further comprises tuning a fractional length of the weight input to one of the first MAC operation and the second MAC operation that produces a result having a fixed-point expression having the minimum fractional length by decreasing the fractional length of the weight by a difference between the first fractional length and the second fractional length.

9. The method of claim 1, wherein the statistical distribution for each channel is a distribution approximated by a normal distribution or a Laplace distribution, and the determining of the fixed-point expression of each of the parameters for each channel comprises determining the fixed-point expression based on a fractional length for each channel determined based on any one or any combination of any two or more of a mean, a variance, a standard deviation, a maximum value, and a minimum value of the parameter values for each channel obtained from the statistical distribution for each channel.

10. The method of claim 1, further comprising retraining, after the determining of the fractional lengths is completed, the fixed-point quantized neural network with the determined fractional lengths of the bias and the weight for each channel set as constraints of the fixed-point quantized neural network to fine tune the fixed-point quantized neural network.

11. A non-transitory computer-readable medium storing instructions that, when executed by a processor, cause the processor to perform the method of claim 1.

12. An apparatus for generating a fixed-point quantized neural network, the apparatus comprising: a memory configured to store at least one program; and a processor configured to execute the at least one program, wherein the processor executing the at least one program configures the processor to: analyze a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network, determine a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the floating-point parameter values based on the statistical distribution for each channe
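
The two ways of choosing the bias fractional length described in claims 4 to 7 can be sketched as follows. This is an illustrative reading, not the patent's implementation. It assumes the fractional length of each MAC result is the sum of the fractional lengths of its input activation and weight, which is the usual property of a fixed-point product but is not stated in the text shown here. All function names are hypothetical.

```python
def mac_fractional_lengths(act_fls, weight_fls):
    """Fractional length of each MAC result, one per input channel.

    Assumption: multiplying two fixed-point values adds their
    fractional lengths.
    """
    return [a + w for a, w in zip(act_fls, weight_fls)]


def bias_fl_by_max(act_fls, weight_fls):
    """Claims 4-6 style: the bias takes the largest MAC fractional length,
    and results with a smaller one are shifted left to align before the Add.
    The shift amounts are returned so they can be stored with the network."""
    mac_fls = mac_fractional_lengths(act_fls, weight_fls)
    bias_fl = max(mac_fls)
    shifts = [bias_fl - fl for fl in mac_fls]
    return bias_fl, shifts


def bias_fl_by_min(act_fls, weight_fls):
    """Claim 7 style: the bias takes the smallest MAC fractional length,
    and each weight's fractional length is reduced by the gap between its
    MAC result and the bias, so no shift is needed before the Add."""
    mac_fls = mac_fractional_lengths(act_fls, weight_fls)
    bias_fl = min(mac_fls)
    tuned_weight_fls = [w - (fl - bias_fl) for w, fl in zip(weight_fls, mac_fls)]
    return bias_fl, tuned_weight_fls


# Hypothetical two-channel partial sum.
act_fls = [4, 5]     # fractional lengths of the input activations
weight_fls = [7, 5]  # fractional lengths of the weights

print(bias_fl_by_max(act_fls, weight_fls))  # (11, [0, 1])
print(bias_fl_by_min(act_fls, weight_fls))  # (10, [6, 5])
```

In this example the maximum-based choice keeps full precision and pays for it with a one-bit shift on the second channel at inference time. The minimum-based choice drops one fractional bit from the first weight so that both MAC results already share the bias's fractional length.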

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

  • Multidimensional correlation or convolution · CPC title

  • Transfer learning · CPC title

  • Activation functions · CPC title

  • Convolutional networks [CNN, ConvNet] · CPC title

  • G06N3/0495 · Primary

    Quantised networks; Sparse networks; Compressed networks · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11588496B2 cover?
A method of generating a fixed-point quantized neural network includes analyzing a statistical distribution for each channel of floating-point parameter values of feature maps and a kernel for each channel from data of a pre-trained floating-point neural network, determining a fixed-point expression of each of the parameters for each channel statistically covering a distribution range of the fl…
Who is the assignee on this patent?
Samsung Electronics Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/0495. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 21, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).