Approximation of non-linear functions in fixed point using look-up tables
US-2018060278-A1 · Mar 1, 2018 · US
US10262259B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10262259-B2 |
| Application number | US-201514936594-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 9, 2015 |
| Priority date | May 8, 2015 |
| Publication date | Apr 16, 2019 |
| Grant date | Apr 16, 2019 |
Official abstract text for this publication.
A method for selecting bit widths for a fixed point machine learning model includes evaluating a sensitivity of model accuracy to bit widths at each computational stage of the model. The method also includes selecting a bit width for parameters and/or intermediate calculations in the computational stages of the model. The bit width for the parameters and the bit width for the intermediate calculations may be different. The selected bit width may be determined based on the sensitivity evaluation.
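The sensitivity evaluation described in the abstract can be illustrated with a small sketch. This is not the patented method: it assumes a toy two-stage NumPy network, quantizes only the weights of one stage at a time, and uses output SQNR as the accuracy measure. The helper names (`quantize`, `sqnr_db`, `sensitivity`, `select_bit_widths`), the candidate bit widths, and the 30 dB target are all hypothetical choices made for illustration.

```python
import numpy as np

def quantize(x, bit_width):
    """Round x onto a signed fixed-point grid (assumed format: range set by max |x|)."""
    x = np.asarray(x, dtype=np.float64)
    max_abs = np.max(np.abs(x))
    if max_abs == 0:
        return x.copy()
    int_bits = int(np.floor(np.log2(max_abs))) + 1  # bits needed left of the binary point
    frac_bits = bit_width - 1 - int_bits            # one bit is reserved for the sign
    scale = 2.0 ** frac_bits
    qmin, qmax = -(2 ** (bit_width - 1)), 2 ** (bit_width - 1) - 1
    return np.clip(np.round(x * scale), qmin, qmax) / scale

def sqnr_db(reference, approx):
    """Signal-to-quantization-noise ratio in dB between two outputs."""
    noise = np.sum((reference - approx) ** 2)
    if noise == 0:
        return float("inf")
    return float(10.0 * np.log10(np.sum(reference ** 2) / noise))

def forward(x, weights):
    """Toy model: linear stages with ReLU between them (an assumption, not from the source)."""
    h = x
    for i, w in enumerate(weights):
        h = h @ w
        if i < len(weights) - 1:
            h = np.maximum(h, 0.0)
    return h

def sensitivity(x, weights, candidates):
    """Output SQNR when only one stage's weights are quantized to a given bit width."""
    reference = forward(x, weights)
    table = {}
    for stage in range(len(weights)):
        for bits in candidates:
            trial = list(weights)
            trial[stage] = quantize(weights[stage], bits)
            table[(stage, bits)] = sqnr_db(reference, forward(x, trial))
    return table

def select_bit_widths(table, n_stages, candidates, target_db):
    """Pick the smallest candidate per stage that meets the target; fall back to the widest."""
    chosen = {}
    for stage in range(n_stages):
        ok = [b for b in sorted(candidates) if table[(stage, b)] >= target_db]
        chosen[stage] = ok[0] if ok else max(candidates)
    return chosen

rng = np.random.default_rng(0)
weights = [rng.normal(size=(8, 16)), rng.normal(size=(16, 4))]
x = rng.normal(size=(32, 8))
candidates = [4, 8, 12, 16]
table = sensitivity(x, weights, candidates)
chosen = select_bit_widths(table, len(weights), candidates, target_db=30.0)
print(chosen)
```

Quantizing one stage at a time keeps the per-stage effect separable, which is one simple way to read "sensitivity at each computational stage". A fuller treatment would also quantize the intermediate calculations (activations), which the abstract says may use a different bit width from the parameters.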
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method for selecting bit widths for values of a fixed point machine learning model stored in a memory of a neural computing device, comprising: applying, to an input received at the neural computing device, the model to classify the input; evaluating, while applying the model to the input, an amount of system resources for the neural computing device and a sensitivity of model accuracy to bit widths at a computational stage of the model; dynamically selecting a new bit width for values corresponding to one or more of parameters and intermediate calculations in the computational stage of the model based at least in part on at least one of the amount of system resources, the model accuracy, or a combination thereof being less than a threshold; and applying the model to classify the input with the new bit width.
2. The method of claim 1, in which the model accuracy comprises a signal quantization to noise ratio (SQNR) at an output of the model or classification accuracy.
3. The method of claim 1, in which: the model comprises a neural network and the computational stage is a layer of the neural network; the parameters comprise one or more of bias values and weights; and the intermediate calculations comprise activation values.
4. The method of claim 3, in which the new bit width is based at least in part on connectivity of the network.
5. The method of claim 4, in which the connectivity comprises a fully connected configuration, a convolutional configuration, or a configuration with a specific sparsity.
6. The method of claim 5, in which a bit width for a fully connected layer is less than a bit width for a convolutional layer of the neural network.
7. The method of claim 6, in which the weights and/or the bias values of the fully connected layer and the convolutional layer are random in a transfer learning arrangement.
8. The method of claim 3, in which selecting of the new bit width is based at least in part on whether the new bit width is for a bias value, weight, or activation value.
9. The method of claim 3, in which the new bit width for one or more of the bias values, the weights, and the activation values is based at least in part on a number of weights per layer, a number of activation values per layer, filter size per layer, filter stride per layer, and number of filters per layer in the neural network.
10. The method of claim 3, further comprising fine-tuning the network after selecting one or more of the new bit width for the bias values, the activation values, and the weights of each layer.
11. The method of claim 1, in which a bit width for the intermediate calculations of the computational stage is less than a bit width for the parameters in the computational stage.
12. The method of claim 1, further comprising: injecting noise into the computational stage of the model; determining a model accuracy for the computational stage of the injected noise; and selecting a level of injected noise that provides a desired level of model accuracy.
13. The method of claim 1, further comprising dynamically selecting the new bit width based at least in part on performance specifications or user input.
14. The method of claim 1, in which an output layer uses a floating point number format.
15. A neural computing device for selecting bit widths for values of a fixed point machine learning model, comprising: a memory; and at least one processor coupled to the memory, the at least one processor configured: to apply, to an input received at the neural computing device, the model to classify the input; to evaluate, while applying the model to the input, an amount of system resources for the neural computing device and a sensitivity of model accuracy to bit widths at a computational stage of the model; to dynamically select a new bit width for values corresponding to one or more of parameters and intermediate calculations in the computational stage of the model based at least in part on at least one of the amount of system resources, the model accuracy, or a combination thereof being less than a threshold, the values stored in the memory; and to apply the model to classify the input with the new bit width.
16. The neural computing device of claim 15, in which the model accuracy comprises a signal quantization to noise ratio (SQNR) at an output of the model or classification accuracy.
17. The neural computing device of claim 15, in which: the model comprises a neural network and the computational stage is a layer of the neural network; the parameters comprise one or more of bias values and weights; and the intermediate calculations comprise activation values.
18. The neural computing device of claim 17, in which the at least one processor is further configured to select the new bit width based at least in part on connectivity of the network.
19. The neural computing device of claim 18, in which the connectivity comprises a fully connected configuration, a convolutional configuration or a configuration with a specific sparsity.
20. The neural computing device of claim 19, in which a bit width for a fully connected layer is less than a bit width for a convolutional layer of the neural network.
21. The neural computing device of claim 20, in which one or more of the weights or the bias values of the fully connected layer and the convolutional layer are random in a transfer learning arrangement.
22. The neural computing device of claim 17, in which the at least one processor is further configured to select the new bit width based at least in part on whether the new bit width is for a bias value, weight, or activation value.
23. The neural computing device of claim 17, in which the at least one processor is further configured to select the new bit width for one or more of the bias values, the weights, and the activation values based at least in part on a number of weights per layer, a number of activation values per layer, filter size per layer, filter stride per layer, and number of filters per layer in the neural network.
24. The neural computing device of claim 17, in which the at least one processor is further configured to fine-tune the network after selecting one or more of the new bit width for the bias values, the activation values, and the weights of each layer.
25. The neural computing device of claim 15, in which the at least one processor is further configured to select a bit width for the intermediate calculations of the computational stage to be less than a bit width for the parameters in the computational stage.
26. The neural computing device of claim 15, in which the at least one processor is further configured: to inject noise into the computational stage of the model; to determine a model accuracy for the computational stage of the injected noise; and to select a level of injected noise that provides a desired level of model accuracy.
27. The neural computing device of claim 15, in which the at least one processor is further configured to dynamically select the new bit width based at least in part on performance specifications or user input.
28. The neural computing device of claim 15, in which an output layer of the model uses a floating point number format.
29. An apparatus for selecting bit widths for values of a fixed point machine
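
The noise-injection step recited in claims 12 and 26 can be pictured with a rough sketch. The claims do not specify the noise distribution, the model, or the accuracy measure, so everything below is an assumption made for illustration: Gaussian noise added to the hidden activations of a toy two-stage NumPy network, output SQNR as the accuracy measure, and the hypothetical helpers `forward_with_noise` and `select_noise_level`.

```python
import numpy as np

def sqnr_db(reference, approx):
    """Signal-to-noise ratio in dB between a clean and a perturbed output."""
    noise = np.sum((reference - approx) ** 2)
    if noise == 0:
        return float("inf")
    return float(10.0 * np.log10(np.sum(reference ** 2) / noise))

def forward_with_noise(x, w1, w2, sigma, rng):
    """Toy two-stage model; Gaussian noise of std sigma is injected after stage 1."""
    h = np.maximum(x @ w1, 0.0)
    if sigma > 0:
        h = h + rng.normal(scale=sigma, size=h.shape)
    return h @ w2

def select_noise_level(x, w1, w2, levels, target_db, seed=0):
    """Return the largest injected-noise level whose output SQNR still meets target_db."""
    reference = forward_with_noise(x, w1, w2, 0.0, np.random.default_rng(seed))
    best = None
    for sigma in sorted(levels):
        # Re-seed so every level sees the same noise pattern, scaled by sigma.
        noisy = forward_with_noise(x, w1, w2, sigma, np.random.default_rng(seed))
        if sqnr_db(reference, noisy) >= target_db:
            best = sigma
    return best

rng = np.random.default_rng(1)
x = rng.normal(size=(32, 8))
w1 = rng.normal(size=(8, 16))
w2 = rng.normal(size=(16, 4))
levels = [0.001, 0.01, 0.1, 1.0]
best = select_noise_level(x, w1, w2, levels, target_db=40.0)
print(best)
```

The function returns `None` when no candidate level meets the target. How a tolerated noise level would then be translated into a bit width is not stated in the claim text shown here, so the sketch stops at selecting the level.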
Learning methods · CPC title
Interfaces, programming languages or software development kits, e.g. for simulating neural networks · CPC title
using electronic means · CPC title
for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title