Dynamic quantization of neural networks

US12282852B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12282852-B2
Application number	US-202318363408-A
Country	US
Kind code	B2
Filing date	Aug 1, 2023
Priority date	Dec 28, 2017
Publication date	Apr 22, 2025
Grant date	Apr 22, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus for applying dynamic quantization of a neural network is described herein. The apparatus includes a scaling unit and a quantizing unit. The scaling unit is to calculate an initial desired scale factors of a plurality of inputs, weights and a bias and apply the input scale factor to a summation node. Also, the scaling unit is to determine a scale factor for a multiplication node based on the desired scale factors of the inputs and select a scale factor for an activation function and an output node. The quantizing unit is to dynamically requantize the neural network by traversing a graph of the neural network.
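The abstract gives no formulas, so the following is only a minimal sketch of one common fixed-point reading of it: when quantized inputs and weights are multiplied, the product carries the combined scale factor, and the bias has to be quantized with that same combined scale before it can be added at the summation node. The function names, the scale values 64 and 128, and the sample numbers are all hypothetical and are not taken from the patent.

```python
def quantize(values, scale):
    """Scale floats and round to the nearest integer (no clamping in this sketch)."""
    return [round(v * scale) for v in values]


def affine_int(xq, wq, bq):
    """Integer multiply-accumulate: sum(w * x) + b, with all operands already quantized."""
    return sum(w * x for w, x in zip(wq, xq)) + bq


# Hypothetical scale factors chosen for the inputs and the weights.
s_x, s_w = 64.0, 128.0
# Each product w * x carries the combined scale, so the bias must use it too
# before it can be added at the summation node (assumption of this sketch).
s_acc = s_x * s_w

x = [0.5, -0.25]
w = [0.75, 0.5]
b = 0.125

xq = quantize(x, s_x)            # integer inputs
wq = quantize(w, s_w)            # integer weights
bq = quantize([b], s_acc)[0]     # bias quantized at the accumulator scale

acc = affine_int(xq, wq, bq)     # integer accumulator
y = acc / s_acc                  # back to floating point for comparison
```

With these sample values the integer path reproduces the floating point result 0.75 * 0.5 + 0.5 * (-0.25) + 0.125 exactly, because every operand is representable at the chosen scales. In general the two results differ by a rounding error.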

First claim

Opening claim text (preview).

What is claimed is:
1. At least one non-transitory computer readable medium comprising instructions to cause at least one processor circuit to at least: determine a scale factor based on a maximum value over a window of past first values; train a neural network based on second values scaled by the scale factor; and quantize respective ones of the scaled second values.
2. The at least one non-transitory computer readable medium of claim 1, wherein the neural network is a floating point neural network.
3. The at least one non-transitory computer readable medium of claim 1, wherein the neural network is a long short term memory (LSTM) network.
4. The at least one non-transitory computer-readable medium of claim 1, wherein the window of the past first values includes a past maximum value.
5. The at least one non-transitory computer-readable medium of claim 1, wherein the second values correspond to weights of the neural network.
6. The at least one non-transitory computer-readable medium of claim 1, wherein the second values correspond to third values to be operated upon by the neural network.
7. The at least one non-transitory computer readable medium of claim 1, wherein one or more of the at least one processor circuit is to perform the quantizing of the respective ones of the scaled second values from floating point values to integer values.
8. The at least one non-transitory computer readable medium of claim 1, wherein one or more of the at least one processor circuit is to apply the scale factor to the second values in floating point form before the quantization to preserve resolution of the second values in integer form after the quantization.
9. An apparatus comprising: interface circuitry; computer-readable instructions; and programmable circuitry to at least one of instantiate or execute the computer-readable instructions to at least: determine a scale factor based on a maximum value over a window of past first values; train a neural network based on a plurality of second values scaled by the scale factor; and quantize respective ones of the scaled second values.
10. The apparatus of claim 9, wherein the neural network includes a floating point neural network.
11. The apparatus of claim 9, wherein the neural network includes a long short term memory (LSTM) network.
12. The apparatus of claim 9, wherein the window of the past first values includes a past maximum value.
13. The apparatus of claim 9, wherein the second values correspond to weights of the neural network.
14. The apparatus of claim 9, wherein the second values correspond to third values to be operated upon by the neural network.
15. The apparatus of claim 9, wherein the programmable circuitry is to perform the quantizing of the respective ones of the scaled second values from floating point values to integer values.
16. The apparatus of claim 9, wherein the programmable circuitry is to apply the scale factor to the second values in floating point form before the quantization to preserve resolution of the second values in integer form after the quantization.
17. A method comprising: determining a scale factor based on a maximum value over a window of past first values; training a neural network based on a plurality of second values scaled by the scale factor; and quantizing respective ones of the plurality of scaled second values.
18. The method of claim 17, wherein the neural network is at least one of a floating point neural network or a long short term memory (LSTM) network.
19. The method of claim 17, wherein the window of the past first values includes a past maximum value.
20. The method of claim 17, wherein the plurality of second values correspond to at least one of weights of the neural network or third values to be operated upon by the neural network.
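The scale-and-quantize steps recited in the independent claims can be illustrated with a minimal sketch. It assumes a symmetric signed 16-bit target range, a fixed-length sliding window, and round-to-nearest with clamping; none of these choices is stated in the claims, the names scale_factor and quantize are hypothetical, and the training step is left out entirely.

```python
from collections import deque


def scale_factor(window, int_max=32767):
    """Scale factor that maps the largest magnitude in the window to int_max."""
    peak = max(abs(v) for v in window)
    # Guard against an all-zero window (assumption: fall back to a scale of 1).
    return int_max / peak if peak > 0 else 1.0


def quantize(values, scale, int_max=32767):
    """Scale in floating point, round, then clamp to the signed integer range."""
    out = []
    for v in values:
        q = round(v * scale)
        out.append(max(-int_max, min(int_max, q)))
    return out


# Hypothetical window holding the most recently observed values.
window = deque([0.5, -2.0, 1.25], maxlen=4)
scale = scale_factor(window)                  # peak magnitude in the window is 2.0

# Values outside the range seen in the window saturate instead of overflowing.
quantized = quantize([0.5, -2.0, 3.0], scale)

# A larger value entering the window lowers the scale factor on the next update.
window.append(4.0)
updated_scale = scale_factor(window)
```

Applying the scale factor while the values are still in floating point, as claims 8 and 16 describe, means the rounding step discards as little resolution as the integer range allows. The clamp in this sketch is one possible way to handle values that exceed the windowed maximum.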

Assignees

Intel Corp

Inventors

Deisher Michael E

Classifications

G06N3/0442
characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU] · CPC title
G06N3/0495 · Primary
Quantised networks; Sparse networks; Compressed networks · CPC title
G06N3/0464
Convolutional networks [CNN, ConvNet] · CPC title
G06N3/048
Activation functions · CPC title
G06F7/023
adaptive, e.g. self learning · CPC title

Patent family

Related publications grouped by family.

View patent family 65230133

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12282852B2 cover?

An apparatus for applying dynamic quantization of a neural network is described herein. The apparatus includes a scaling unit and a quantizing unit. The scaling unit is to calculate an initial desired scale factors of a plurality of inputs, weights and a bias and apply the input scale factor to a summation node. Also, the scaling unit is to determine a scale factor for a multiplication node bas…

Who is the assignee on this patent?

Intel Corp

What technology area does this patent fall under?

Primary CPC classification G06N3/0495. Mapped technology areas include Physics.

When was this patent published?

Publication date Apr 22, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).