Method and apparatus for neural network quantization

US11625577B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11625577-B2
Application number	US-202016738338-A
Country	US
Kind code	B2
Filing date	Jan 9, 2020
Priority date	Jan 9, 2019
Publication date	Apr 11, 2023
Grant date	Apr 11, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determining one or more layers, from among the layers, to be quantized with a lower-bit precision based on the analyzed statistic, and generating a second neural network by quantizing the determined one or more layers with the lower-bit precision.
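The layer-ranking idea in the abstract can be illustrated with a small sketch. This is not the patented implementation. It assumes that a weight snapshot is stored for each layer after every learning cycle, that the "initial weight" of a cycle is the snapshot taken just before that cycle's update, and that the per-cycle mean squares are averaged into one statistic per layer. The names `mean_square_difference`, `layer_statistics`, and `rank_layers` are hypothetical.

```python
from typing import Dict, List

Weights = List[float]  # flattened weights of one layer (assumed representation)


def mean_square_difference(initial: Weights, updated: Weights) -> float:
    """Mean of the squared element-wise weight differences for one cycle."""
    return sum((u - i) ** 2 for i, u in zip(initial, updated)) / len(initial)


def layer_statistics(history: Dict[str, List[Weights]]) -> Dict[str, float]:
    """Compute one statistic per layer.

    history[name] holds the snapshots [w_0, w_1, ..., w_T] of a layer, where
    w_t is the weight after cycle t (assumption). The statistic is the
    per-cycle mean square difference averaged over the T cycles (assumption).
    """
    stats = {}
    for name, snapshots in history.items():
        per_cycle = [
            mean_square_difference(before, after)
            for before, after in zip(snapshots, snapshots[1:])
        ]
        stats[name] = sum(per_cycle) / len(per_cycle)
    return stats


def rank_layers(stats: Dict[str, float]) -> List[str]:
    """Sort layer names in ascending order of the statistic."""
    return sorted(stats, key=stats.get)


# Toy data: two layers, two learning cycles each.
history = {
    "conv1": [[0.0, 0.0], [0.5, -0.5], [1.0, -1.0]],
    "fc": [[1.0, 1.0], [1.1, 0.9], [1.1, 1.0]],
}
stats = layer_statistics(history)
ranking = rank_layers(stats)
```

In this toy data the weights of `fc` change much less per cycle than those of `conv1`, so `fc` is ranked first and would be the earlier candidate for the lower bit precision.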

First claim

Opening claim text (preview).

What is claimed is:

1. A method for neural network quantization on a neural network including a plurality of layers, the method comprising: performing a plurality of cycles of feedforward and backpropagation learning on each layer of a first neural network having a first-bit precision; obtaining, for each cycle of the feedforward and backpropagation learning, a weight difference between an initial weight and an updated weight, the updated weight being determined by the backpropagation learning of each cycle; analyzing, for each layer of the first neural network, a statistic of the weight differences; determining, based on the analyzed statistic, one or more layers, from among the plurality of layers, to be quantized with a second bit precision that is lower than the first bit precision; and generating a second neural network by quantizing the determined one or more layers with the second bit precision.

2. The method of claim 1, wherein the statistic comprises performing a mean square of each weight difference of each cycle for each of the layers.

3. The method of claim 1, further comprising sorting the plurality of layers in an order of a size of the analyzed statistic, wherein the determining of the one or more layers to be quantized comprises identifying, from among the sorted layers, the one or more layers having a relatively small analyzed statistic size.

4. The method of claim 3, wherein the determining of the one or more layers to be quantized comprises identifying, using a binary search algorithm and in response to an accuracy loss of the second neural network in which the one or more layers among the sorted layers are quantized with the second bit precision is equal or within a threshold in comparison with the first neural network in which the one or more layers among the sorted layers are not quantized with the second bit precision, the one or more layers to be quantized.

5. The method of claim 4, wherein the accuracy loss comprises a recognition rate of the first neural network.

6. The method of claim 3, wherein the determining of the one or more layers to be quantized comprises determining a number of layers from among the sorted layers to be the one or more layers in ascending order based on a size of the analyzed statistic.

7. The method of claim 3, wherein the determining of the one or more layers to be quantized comprises selecting to not determine a layer having a smallest analyzed statistic size from among the sorted layers to be the one or more layers to be quantized.

8. The method of claim 1, further comprising, quantizing, in response to the first neural network having layers of floating-point parameters of the first bit precision, layers other than the one or more layers, of the plurality of layers, to layers of fixed-point parameters of a fourth bit precision that is lower than the first bit precision and higher than the second bit precision, wherein the quantized second neural network comprises the determined one or more layers having fixed-point parameters of the second bit precision and the layers have fixed-point parameters of the fourth bit precision.

9. A non-transitory computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the method defined in claim 1.

10. A method for neural network quantization, the method comprising: performing feedforward and backpropagation learning for a plurality of cycles on a first neural network having a first bit precision; obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of layers in the first neural network; analyzing a statistic of the weight differences for each of the layers; determining one or more layers, from among the layers, to be quantized with a second bit precision lower than the first bit precision, based on the analyzed statistic; and generating a second neural network by quantizing the determined one or more layers with the second bit precision, wherein the first neural network has layers of fixed point parameters of the first bit precision and is quantized from a third neural network having layers of floating point parameters of a third bit precision that is higher than the first bit precision, and the quantized second neural network comprises the determined one or more layers have fixed-point parameters of the second bit precision and other layers with the fixed-point parameters of the first bit precision.

11. An apparatus for neural network quantization on a neural network including a plurality of layers, the apparatus comprising: a processor configured to: perform a plurality of cycles of feedforward and backpropagation learning on each layer of the first neural network having a first bit precision; obtain, for each cycle of the feedforward and backpropagation learning, a weight difference between an initial weight and an updated weight, the updated weight being determined by the backpropagation learning of each cycle; analyze, for each layer of the first neural network, a statistic of weight differences; determine, based on the analyzed statistic, one or more layers, from among the plurality of layers, to be quantized with a second bit precision that is lower than the first bit precision; and generate a second neural network by quantizing the determined one or more layers with the second bit precision.

12. The apparatus of claim 11, wherein the statistic comprises performing a mean square of each weight difference of each cycle for each of the layers.

13. The apparatus of claim 11, wherein the processor is further configured to: sort the plurality of layers in an order of a size of the analyzed statistic; and identify, from among the sorted layers, the one or more layers having relatively small analyzed statistic size.

14. The apparatus of claim 13, wherein the processor is further configured to identify, using a binary search algorithm and in response to an accuracy loss of the second neural network in which the one or more layers among the sorted layers are quantized with the second bit precision is equal or within a threshold in comparison with the first neural network in which the one or more layers among the sorted layers are not quantized with the second bit precision, the one or more layers to be quantized.

15. The apparatus of claim 14, wherein the accuracy loss comprises a recognition rate of the neural network.

16. The apparatus of claim 13, wherein the processor is further configured to determine a number of layers from among the sorted layers to be the one or more layers in ascending order based on a size of the analyzed statistic.

17. The apparatus of claim 13, wherein the processor is further configured to not determine a layer having a smallest analyzed statistic size from among the sorted layers to be the one or more layers to be quantized.

18. The apparatus of claim 11, wherein the first neural network has layers of fixed point parameters of the first bit precision and is quantized from a third neural network having layers of floating point parameters of a third bit precision that is higher than the first bit precision, and the quantized second neural network comprises the determined one or more layers have fixed-point parameters of the second bit precision and other layers with the fixed-point parameters of the first bit precision.

19. The apparatus of claim 11, wherein the processor is further configured to quantize, in response to the first neural network ha
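Claims 3 and 4 describe sorting the layers by the statistic and using a binary search with an accuracy-loss threshold to decide which layers to quantize. The sketch below shows one possible reading of that step, not the claimed procedure itself. It assumes the candidates are always a prefix of the ascending-sorted layer list and that the accuracy loss does not decrease as the prefix grows. The function `max_layers_within_threshold`, the `accuracy_loss` callback, and the toy per-layer costs are all hypothetical.

```python
from typing import Callable, List, Sequence


def max_layers_within_threshold(
    sorted_layers: Sequence[str],
    accuracy_loss: Callable[[Sequence[str]], float],
    threshold: float,
) -> List[str]:
    """Binary-search the longest prefix of sorted_layers that can be quantized.

    accuracy_loss(layers) is assumed to return the loss of the network with
    `layers` quantized to the lower precision, relative to the unquantized
    network. The search assumes this loss is non-decreasing in prefix length.
    """
    lo, hi = 0, len(sorted_layers)  # a prefix of length lo is known to be acceptable
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the range always shrinks
        if accuracy_loss(sorted_layers[:mid]) <= threshold:
            lo = mid  # mid layers are fine, try quantizing more
        else:
            hi = mid - 1  # too much loss, try fewer layers
    return list(sorted_layers[:lo])


# Toy stand-in for evaluating a quantized network: each layer contributes a
# fixed loss in percentage points (made-up numbers, for illustration only).
toy_cost = {"fc": 1, "conv3": 3, "conv2": 4, "conv1": 10}


def toy_accuracy_loss(layers: Sequence[str]) -> float:
    return float(sum(toy_cost[name] for name in layers))


chosen = max_layers_within_threshold(
    ["fc", "conv3", "conv2", "conv1"], toy_accuracy_loss, threshold=5.0
)
```

With these toy costs, quantizing the first two layers loses 4 points and adding the third would lose 8, so the search settles on `fc` and `conv3`. Each probe stands in for a full evaluation of a candidate quantized network, which is why a binary search (about log2 of the layer count probes) is attractive compared with testing every prefix.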

Assignees

Samsung Electronics Co Ltd

Inventors

Classifications

G06N3/09
Supervised learning · CPC title
G06N3/0495 · Primary
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/0499
Feedforward networks · CPC title

Patent family

Related publications grouped by family.

View patent family 68654403

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11625577B2 cover?: According to a method and apparatus for neural network quantization, a quantized neural network is generated by performing learning of a neural network, obtaining weight differences between an initial weight and an updated weight determined by the learning of each cycle for each of layers in the first neural network, analyzing a statistic of the weight differences for each of the layers, determ…
Who is the assignee on this patent?: Samsung Electronics Co Ltd
What technology area does this patent fall under?: Primary CPC classification G06N3/0495. Mapped technology areas include Physics.
When was this patent published?: Publication date Apr 11, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).