Methods and systems for digital neural processing with discrete-level synapes and probabilistic STDP
US-9129220-B2 · Sep 8, 2015 · US
US12511535B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12511535-B2 |
| Application number | US-202117551572-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 15, 2021 |
| Priority date | May 25, 2017 |
| Publication date | Dec 30, 2025 |
| Grant date | Dec 30, 2025 |
Official abstract text for this publication.
Provided are a neural network method and an apparatus, the method including obtaining a set of floating point data processed in a layer included in a neural network, determining a weighted entropy based on data values included in the set of floating point data, adjusting quantization levels assigned to the data values based on the weighted entropy, and quantizing the data values included in the set of floating point data in accordance with the adjusted quantization levels.
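The weighted entropy named in the abstract can be illustrated with a small sketch. The abstract gives no formula, so this assumes one plausible form, S = -Σ Iₙ · Pₙ · log(Pₙ), where Pₙ is the relative frequency of quantization level n and Iₙ is an importance weight for that level. The function name `weighted_entropy` and its arguments are hypothetical and do not come from the patent text.

```python
import math
from collections import Counter


def weighted_entropy(level_indices, importance_per_level):
    """Assumed form: S = -sum_n I_n * P_n * log(P_n).

    level_indices: the quantization level assigned to each data value.
    importance_per_level: mapping from level index to its importance weight
    (how the importance is chosen is an assumption left to the caller).
    """
    total = len(level_indices)
    if total == 0:
        return 0.0
    counts = Counter(level_indices)
    entropy = 0.0
    for level, count in counts.items():
        p = count / total  # relative frequency of this level
        entropy -= importance_per_level[level] * p * math.log(p)
    return entropy
```

With all importance weights set to 1.0 this reduces to ordinary Shannon entropy in nats, so the importance weights are what bias the measure toward levels holding more significant values.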
Opening claim text (preview).
What is claimed is:

1. A processor-implemented neural network method, the method comprising:
obtaining a set of weights and a set of activations from a set of floating point data processed in a layer included in a neural network;
quantizing the set of weights using a clustering based quantization method;
quantizing the set of activations using a logarithm data representation-based quantization method, which is different from the clustering based quantization method, in consideration that, while the set of weights is fixed after training the neural network, the set of activations varies in accordance with input data in an inference process implementing the neural network, with the quantizing the set of activations comprising, using one or more processors:
repeating, for a plurality of iterations, for a current iteration among the plurality of iterations, quantizing the set of activations into a corresponding plurality of log-based quantization levels based on a corresponding one or more parameters that control a corresponding first size of a first value corresponding to a first quantization level among the corresponding plurality of log-based quantization levels and a corresponding first interval size between the corresponding plurality of log-based quantization levels, for the current iteration, determining the activation weighted entropy for the corresponding plurality of log-based quantization levels of the current iteration by applying corresponding determined importance weights for the corresponding plurality of log-based quantization levels of the current iteration to relative, compared to a total number of activations in the set of activations, activation frequencies of each of the corresponding plurality of log-based quantization levels of the current iteration, and setting the corresponding one or more parameters of a next iteration among the plurality of iterations, while the determined activation weighted entropy of the current iteration is not determined to be maximized and the current iteration is not a final iteration among the plurality of iterations;
quantizing the set of activations into a final plurality of log-based quantization levels that are based on the set corresponding one or more parameters of one of the plurality of iterations that has a determined maximized activation weighted entropy;
executing, based on input data provided to the neural network, the neural network using the set of activations that have been quantized into the final plurality of log-based quantization levels; and
indicating a result of the executing of the neural network.

2. The method of claim 1, wherein, the obtaining of the set of activations, the repeating, the quantizing of the set of activations into the final plurality of log-based quantization levels, the executing of the neural network, and the indicating are performed for each of a plurality of layers included in the neural network.

3. The method of claim 1, further comprising training the layer of the neural network based on the set of activations that have been quantized into the final plurality of log-based quantization levels, wherein the executing of the neural network and the indicating of the result are operations of the training of the layer of the neural network.

4. The method of claim 1, wherein the floating point data includes weight quantization levels, assigned for the set of weights, that are adjusted based on a weight weighted entropy, and wherein the quantizing of the set of weights using the clustering based quantization method is performed in accordance with the adjusted weight quantization levels.

5. The method of claim 1, wherein the corresponding determined importance weights of each of the plurality of iterations are based on respective set importances of the each of the activations.

6. A neural network apparatus, the apparatus comprising:
one or more processors; and
one or more memories comprising code, which when executed by the one or more processors configures the one or more processors to:
obtain a set of weights and a set of activations from a set of floating point data processed in a layer included in a neural network;
quantize the set of weights using a clustering based quantization method;
quantize the set of activations using a logarithm data representation-based quantization method, which is different from the clustering based quantization method, in consideration that, while the set of weights is fixed after training the neural network, the set of activations varies in accordance with input data in an inference process implementing the neural network, with the quantization of the set of activations comprising a repetition, for a plurality of iterations, for a current iteration among the plurality of iterations, a quantization of the set of activations into a corresponding plurality of log-based quantization levels based on a corresponding one or more parameters that control a corresponding first size of a first value corresponding to a first quantization level among the corresponding plurality of log-based quantization levels and a corresponding first interval size between the corresponding plurality of log-based quantization levels, for the current iteration, a determination of the activation weighted entropy for the corresponding plurality of log-based quantization levels of the current iteration through an application of corresponding determined importance weights for the corresponding plurality of log-based quantization levels of the current iteration to relative, compared to a total number of activations in the set of activations, activation frequencies of each of the corresponding plurality of log-based quantization levels of the current iteration, and set the corresponding one or more parameters of a next iteration among the plurality of iterations, while the determined activation weighted entropy of the current iteration is not determined to be maximized and the current iteration is not a final iteration among the plurality of iterations;
quantize the quantized set of activations into a final plurality of log-based quantization levels that are based on the set corresponding one or more parameters of one of the plurality of iterations that has a determined maximized activation weighted entropy;
execute, based on input data provided to the neural network, the neural network using the set of activations that have been quantized into the final plurality of log-based quantization levels; and
indicate a result of the execution of the neural network.

7. The apparatus of claim 6, wherein the execution of the code configures the one or more processors to perform the obtaining of the set of activations, the repeating, the quantizing of the set of activations into the final plurality of log-based quantization levels, the executing of the neural network, and the indicating for each of a plurality of layers included in the neural network.

8. The apparatus of claim 6, wherein the floating point data includes weight quantization levels, assigned for the set of weights, that are adjusted based on a weight weighted entropy, and wherein the quantizing of the set of weights using the clustering based quantization method is performed in accordance with the adjusted weight quantization levels.

9. The apparatus of claim 6, wherein the corresponding determined importance weights of each of the plurality of iterations are based on respective set importances of the each of the activations.

10. The apparatus of claim 6, wherein the execution of the code configures the one or more processors to train the layer of the neural network based on the set of activations quantized into the corresponding
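The iterative activation quantization recited in the independent claims can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented implementation: the parameter names `first_exp` (controlling the value of the first non-zero level) and `step` (the interval between levels in the log2 domain), the use of squared activation values as importance, the non-negative activation assumption, and the exhaustive search over a small candidate grid are all hypothetical choices made for the example.

```python
import math
from itertools import product


def log_quantize(activations, first_exp, step, num_levels):
    """Assign each activation a log-based level index.

    Assumed layout: level 0 represents zero, and level k >= 1 represents
    2 ** (first_exp + (k - 1) * step). Activations are assumed non-negative.
    """
    indices = []
    for a in activations:
        if a <= 0.0:
            indices.append(0)
            continue
        k = round((math.log2(a) - first_exp) / step) + 1
        indices.append(min(max(k, 0), num_levels - 1))  # clip to valid levels
    return indices


def weighted_entropy(activations, indices, num_levels):
    """Assumed form: -sum_n I_n * P_n * log(P_n).

    P_n is the fraction of activations in level n; I_n is taken here as the
    mean squared activation value in level n (an assumption).
    """
    total = len(activations)
    if total == 0:
        return 0.0
    entropy = 0.0
    for level in range(num_levels):
        members = [a for a, k in zip(activations, indices) if k == level]
        if not members:
            continue
        p = len(members) / total
        importance = sum(a * a for a in members) / len(members)
        entropy -= importance * p * math.log(p)
    return entropy


def search_parameters(activations, first_exps, steps, num_levels):
    """Try each candidate (first_exp, step) pair; keep the highest score."""
    best = None
    for first_exp, step in product(first_exps, steps):
        indices = log_quantize(activations, first_exp, step, num_levels)
        score = weighted_entropy(activations, indices, num_levels)
        if best is None or score > best[0]:
            best = (score, first_exp, step)
    return best
```

A call such as `search_parameters(acts, [-1, 0, 1], [1, 2], 4)` returns a `(score, first_exp, step)` tuple for the best candidate; the activations would then be quantized once more with those parameters to obtain the final levels. A grid search only finds the maximum among the supplied candidates, which is a simplification of the iterate-until-maximized loop described in the claims.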
Learning methods · CPC title
Architecture, e.g. interconnection topology · CPC title
Combinations of networks · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
using electronic means · CPC title