Method and apparatus for neural network quantization
US-2020218962-A1 · Jul 9, 2020 · US
US11568222B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11568222-B2 |
| Application number | US-201916367067-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 27, 2019 |
| Priority date | Mar 27, 2019 |
| Publication date | Jan 31, 2023 |
| Grant date | Jan 31, 2023 |
Official abstract text for this publication.
A computing device includes one or more processors, random access memory (RAM), and a non-transitory computer-readable storage medium storing instructions for execution by the one or more processors. The computing device receives first data and classifies the first data using a neural network that includes at least one quantized layer. The classifying includes reading values from the random access memory for a set of weights of the at least one quantized layer of the neural network using first read parameters corresponding to a first error rate.
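The read path described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: it assumes binary weights stored as 0/1 bits, models the "first error rate" as an independent flip probability per bit read, and scores a single neuron with an XNOR-and-popcount match count. The names `read_weights_with_errors` and `binary_neuron`, the 5% rate, and the vector length are hypothetical choices made for illustration.

```python
import random


def read_weights_with_errors(stored_bits, error_rate, rng):
    """Return a copy of stored_bits with each bit flipped with probability error_rate.

    Assumption: read errors are independent per bit. Real memory may behave differently.
    """
    return [bit ^ 1 if rng.random() < error_rate else bit for bit in stored_bits]


def binary_neuron(input_bits, weight_bits):
    """XNOR-popcount neuron: count positions where input and weight agree.

    Outputs 1 when at least half of the positions match, else 0.
    """
    matches = sum(1 for x, w in zip(input_bits, weight_bits) if x == w)
    return 1 if 2 * matches >= len(weight_bits) else 0


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 1024

# Hypothetical stored binary weights for one neuron.
stored = [rng.getrandbits(1) for _ in range(n)]

# Hypothetical input that agrees with the stored weights on roughly 75% of positions.
inputs = [w if rng.random() < 0.75 else w ^ 1 for w in stored]

# Output with error-free reads versus reads at an assumed 5% error rate.
clean = binary_neuron(inputs, stored)
noisy_weights = read_weights_with_errors(stored, 0.05, rng)
noisy = binary_neuron(inputs, noisy_weights)

observed_rate = sum(a != b for a, b in zip(stored, noisy_weights)) / n
print(clean, noisy, round(observed_rate, 3))
```

In this toy setup the neuron's decision depends on a majority of bit matches, so a small fraction of flipped weight bits shifts the match count without necessarily changing the output. That is one intuition for why a binary layer might tolerate a nonzero read error rate; the sketch does not measure accuracy on a real network.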
Opening claim text (preview).
What is claimed is:

1. A method, comprising: performing, at a computing device that includes one or more processors, a random access memory (RAM), and a non-transitory computer-readable storage medium including instructions for execution by the one or more processors, a set of operations including: receiving first data; and classifying the first data using a neural network that includes at least one quantized layer, wherein the neural network further includes at least one floating point layer, wherein the classifying includes reading values from the random access memory for a set of weights of the at least one quantized layer of the neural network using first read parameters corresponding to a first error rate, wherein the at least one quantized layer comprises at least half of an amount of the RAM used to store the quantized and floating point layers of the neural network.

2. The method of claim 1, wherein the classifying does not include performing error detection.

3. The method of claim 1, wherein the first error rate comprises read disturb errors and retention errors.

4. The method of claim 1, wherein the one or more processors reside on a same chip as the random access memory.

5. The method of claim 1, wherein the RAM is magnetic RAM.

6. The method of claim 1, wherein the first read parameters include a read current selected such that the computing device reads values from the RAM at the first error rate.

7. The method of claim 1, wherein the first error rate is greater than 0.5%.

8. The method of claim 7, wherein the first error rate is less than 20%.

9. The method of claim 7, wherein the first error rate is less than 20%.

10. The method of claim 1, wherein the neural network comprises an XNOR neural network.

11. The method of claim 1, wherein each of the at least one quantized layer comprises a binary layer.

12. The method of claim 1, wherein the first read parameters include a read current selected such that the computing device reads values from the RAM at the first error rate.

13. The method of claim 1, wherein the first error rate is greater than 0.5%.

14. An electronic system, comprising: one or more processors; a random access memory (RAM); read circuitry configured to read data from the RAM; and a non-transitory computer-readable storage medium including instructions for execution by the one or more processors, a set of operations including: receiving first data; and classifying the first data using a neural network that includes at least one quantized layer, wherein the neural network further includes at least one floating point layer, wherein the classifying includes reading values from the random access memory for a set of weights of the at least one quantized layer of the neural network using first read parameters corresponding to a first error rate, wherein the at least one quantized layer comprises at least half of an amount of the RAM used to store the quantized and floating point layers of the neural network.

15. The electronic system of claim 14, wherein the electronic system comprises a chip.

16. The electronic system of claim 14, wherein the electronic system comprises a smartphone.

17. A method, comprising: performing, at a computing device that includes one or more processors, a random access memory (RAM), and a non-transitory computer-readable storage medium including instructions for execution by the one or more processors, a set of operations including: receiving first data; and classifying the first data using a neural network that includes at least one quantized layer, wherein the classifying includes reading values from the random access memory for a set of weights of the at least one quantized layer of the neural network using first read parameters corresponding to a first error rate, wherein 50%, 60%, 70%, 80%, or 90% of the weights for the entire neural network are binary weights.

18. The method of claim 17, wherein the classifying does not include performing error detection.

19. The method of claim 17, wherein the first error rate comprises read disturb errors and retention errors.
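The numeric conditions in the claims (the share of RAM occupied by quantized layers, the fraction of binary weights, and the error-rate window of claims 7 and 8) reduce to simple arithmetic. The sketch below works through that arithmetic for a made-up network. It is illustrative only and is not a claim construction: the `Layer` type, the layer sizes, the 32-bit width assumed for floating point weights, and the 8-bit cutoff used to label a layer "quantized" are all assumptions that do not come from the patent text.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    num_weights: int
    bits_per_weight: int  # 1 for a binary layer; 32 is assumed here for floating point


def layer_bits(layer):
    """Storage needed for one layer's weights, in bits."""
    return layer.num_weights * layer.bits_per_weight


def quantized_ram_share(layers, quantized_max_bits=8):
    """Fraction of weight storage used by layers at or below quantized_max_bits per weight."""
    total = sum(layer_bits(l) for l in layers)
    quantized = sum(layer_bits(l) for l in layers if l.bits_per_weight <= quantized_max_bits)
    return quantized / total


def binary_weight_fraction(layers):
    """Fraction of all weights (by count, not by storage) that are 1-bit weights."""
    total = sum(l.num_weights for l in layers)
    binary = sum(l.num_weights for l in layers if l.bits_per_weight == 1)
    return binary / total


def in_error_rate_window(rate, low=0.005, high=0.20):
    """True when rate is greater than 0.5% and less than 20%."""
    return low < rate < high


# Hypothetical network: floating point first and last layers, binary layers in between.
network = [
    Layer("input_fp", 10_000, 32),
    Layer("hidden_binary", 2_000_000, 1),
    Layer("output_fp", 5_000, 32),
]

share = quantized_ram_share(network)      # 2,000,000 / 2,480,000 bits
fraction = binary_weight_fraction(network)  # 2,000,000 / 2,015,000 weights
print(f"quantized RAM share: {share:.3f}")
print(f"binary weight fraction: {fraction:.3f}")
print(in_error_rate_window(0.05))
```

For these made-up sizes the binary layers hold about 99% of the weights by count but about 81% of the storage, because each floating point weight is assumed to take 32 times the space of a binary one.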
using elements in which the storage effect is based on magnetic spin effect · CPC title
Neural networks · CPC title
using electronic means · CPC title
Architecture, e.g. interconnection topology · CPC title
Convolutional networks [CNN, ConvNet] · CPC title