US12093148B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12093148-B2 |
| Application number | US-202117547972-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 10, 2021 |
| Priority date | Jun 12, 2019 |
| Publication date | Sep 17, 2024 |
| Grant date | Sep 17, 2024 |
Official abstract text for this publication.
The technical solution involves a board card including a storage component, an interface apparatus, a control component, and an artificial intelligence chip. The artificial intelligence chip is connected to the storage component, the control component, and the interface apparatus, respectively; the storage component is used to store data; the interface apparatus is used to implement data transfer between the artificial intelligence chip and an external device; and the control component is used to monitor a state of the artificial intelligence chip. The board card is used to perform an artificial intelligence operation.
Opening claim text (preview).
The invention claimed is:
1. A method for adjusting a data bit width in a convolution neural network layer during a neural network computation, comprising: obtaining a data bit width used to perform a quantization on data to be quantized, wherein the data to be quantized includes at least one type of neurons, weights, gradients, or biases, the data bit width indicates the data bit width of the quantized data after the data to be quantized being quantized; performing a quantization on a group of data to be quantized based on the data bit width to convert the group of data to be quantized to a group of quantized data, wherein the group of quantized data has the data bit width; comparing the group of data to be quantized with the group of quantized data to determine a quantization error correlated with the data bit width; adjusting the data bit width based on the determined quantization error; and applying the adjusted data bit width during quantization in the convolution neural network layer.
2. The method of claim 1, wherein the comparing of the group of data to be quantized with the group of quantized data to determine the quantization error correlated with the data bit width includes: determining a quantization interval according to the data bit width; and determining the quantization error according to the quantization interval, the group of the quantized data and the group of data to be quantized.
3. The method of claim 2, wherein the determining the quantization error according to the quantization interval, the group of the quantized data and the group of the data to be quantized includes: inversely quantizing the group of quantized data according to the quantization interval to obtain a group of inversely quantized data, wherein a data format of the group of inversely quantized data is the same with a data format of the group of the data to be quantized; and determining a quantization error according to the group of inversely quantized data and the group of data to be quantized.
4. The method of claim 1, wherein the adjusting the data bit width based on the determined quantization error includes: comparing the quantization error and a preset threshold, wherein the preset threshold includes at least one of a first threshold and a second threshold; and adjusting the data bit width according to a comparison result.
5. The method of claim 4, wherein the adjusting the data bit width according to the comparison result includes: increasing the data bit width when the quantization error is greater than or equal to the first threshold; wherein the increasing the data bit width includes: increasing the data bit width according to a first preset bit width stride to determine an adjusted data bit width; wherein the method further comprises: iteratively performing the quantization on the group of data to be quantized based on the adjusted data bit width to convert the group of data to be quantized to another group of quantized data, wherein the other group of quantized data has the adjusted data bit width; and comparing the group of data to be quantized with the other group of quantized data to determine another quantization error correlated with an adjusted data bit width until the other quantization error is less than the first preset threshold.
6. The method of claim 4, wherein the adjusting the data bit width according to the comparison result includes: decreasing the data bit width when the quantization error is less than or equal to a second threshold; wherein the decreasing the data bit width includes: decreasing the data bit width according to a second preset bit width stride to determine an adjusted bit width; wherein the method further comprises: iteratively performing the quantization on the group of data to be quantized based on the adjusted data bit width to convert the group of data to be quantized to another group of quantized data, wherein the other group of quantized data has the adjusted data bit width; and determining another quantization error correlated with the adjusted data bit width based on the group of data to be quantized and the other group of quantized data, until the other quantization error is greater than the second preset threshold.
7. The method of claim 4, wherein the adjusting the data bit width according to the comparison result includes: maintaining the data bit width when the quantization error is between the first threshold and the second threshold.
8. The method of claim 1, further comprising: updating a quantization parameter configured to perform the quantization on the group of data to be quantized based on the group of data to be quantized and the adjusted bit width; and performing the quantization on the group of data to be quantized based on an updated quantization parameter.
9. The method of claim 1, further comprising: obtaining a data variation range of data to be quantized; and according to the data variation range of the data to be quantized, determining a target iteration interval to adjust the data bit width according to the target iteration interval, wherein the target iteration interval includes at least one iteration.
10. The method of claim 9, wherein the determining the target iteration interval according to the data variation range of the data to be quantized includes: determining the target iteration interval according to the first error, wherein the target iteration interval is negatively correlated with the first error.
11. The method of claim 9, wherein the obtaining of the data variation range of the data to be quantized includes: obtaining a variation trend of the data bit width; and determining the data variation range of the data to be quantized according to a variation range of a point location and the variation trend of the data bit width.
12. A device for adjusting a data bit width in a convolution neural network layer during a neural network computation, comprising: an obtaining circuit configured to obtain a data bit width used to perform a quantization on data to be quantized, wherein the data to be quantized includes at least one type of neurons, weights, gradients, or biases, the data bit width indicates the data bit width of the quantized data after the data to be quantized being quantized; a quantization circuit configured to perform a quantization on a group of data to be quantized based on the data bit width to convert the group of data to be quantized to a group of quantized data, wherein the group of quantized data has the data bit width; and a determination circuit configured to compare the group of data to be quantized with the group of quantized data to determine a quantization error correlated with the data bit width, and adjust the data bit width based on the determined quantization error, before the adjusted data bit width is applied during quantization in the convolution neural network layer.
13. An artificial intelligence chip comprising the device of claim 12.
14. A non-transitory computer readable storage medium, wherein a computer program is stored in the non-transitory computer readable storage medium, and the method of claim 1 are implemented when the computer program is executed by a processor.
15. An electronic device comprising the artificial intelligence chip of claim 13.
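The loop recited in claims 1 through 7 (quantize, inversely quantize, measure the error, then widen, narrow, or keep the bit width) can be illustrated with a minimal Python sketch. It is not the patented implementation. The claims do not fix a quantization scheme or an error formula, so the symmetric fixed-point grid, the relative L1 error, the default stride values, the `min_bits`/`max_bits` bounds, and every function name below are assumptions chosen for illustration.

```python
from typing import List, Sequence, Tuple


def quantization_interval(data: Sequence[float], bit_width: int) -> float:
    """Step size of a symmetric signed grid covering max|x| (assumed scheme)."""
    max_abs = max(abs(x) for x in data)
    levels = 2 ** (bit_width - 1) - 1  # largest positive integer code
    return max_abs / levels if max_abs > 0 else 1.0


def quantize(data: Sequence[float], bit_width: int, interval: float) -> List[int]:
    """Map each value to the nearest integer code, clamped to the bit width."""
    levels = 2 ** (bit_width - 1) - 1
    return [max(-levels, min(levels, round(x / interval))) for x in data]


def dequantize(codes: Sequence[int], interval: float) -> List[float]:
    """Inverse quantization: return codes to the original data format."""
    return [q * interval for q in codes]


def quantization_error(data: Sequence[float], bit_width: int) -> float:
    """Relative L1 error between the data and its round trip (assumed metric)."""
    interval = quantization_interval(data, bit_width)
    restored = dequantize(quantize(data, bit_width, interval), interval)
    total = sum(abs(x) for x in data)
    if total == 0:
        return 0.0
    return sum(abs(r - x) for r, x in zip(restored, data)) / total


def adjust_bit_width(
    data: Sequence[float],
    bit_width: int,
    first_threshold: float,
    second_threshold: float,
    first_stride: int = 2,   # hypothetical "first preset bit width stride"
    second_stride: int = 1,  # hypothetical "second preset bit width stride"
    min_bits: int = 2,       # hypothetical safety bounds, not in the claims
    max_bits: int = 32,
) -> Tuple[int, float]:
    """Return (adjusted bit width, quantization error at that width)."""
    error = quantization_error(data, bit_width)
    if error >= first_threshold:
        # Error too large: widen until it falls below the first threshold.
        while error >= first_threshold and bit_width < max_bits:
            bit_width = min(bit_width + first_stride, max_bits)
            error = quantization_error(data, bit_width)
    elif error <= second_threshold:
        # Error smaller than needed: narrow until it exceeds the second threshold.
        while error <= second_threshold and bit_width > min_bits:
            bit_width = max(bit_width - second_stride, min_bits)
            error = quantization_error(data, bit_width)
    # Otherwise the error lies between the thresholds and the width is kept.
    return bit_width, error


if __name__ == "__main__":
    sample = [0.11, -0.52, 0.37, 0.93, -0.78]
    print(adjust_bit_width(sample, 4, first_threshold=0.01, second_threshold=0.0001))
```

In this sketch the narrowing branch stops at the first width whose error exceeds the second threshold, following the wording of claim 6. It does not check whether that error has also crossed the first threshold, so a practical implementation would likely need an extra guard there.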
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Quantised networks; Sparse networks; Compressed networks · CPC title
Knowledge representation; Symbolic representation · CPC title
Backpropagation, e.g. using gradient descent · CPC title