Hardware Implementation of a Convolutional Neural Network
US-2017323196-A1 · Nov 9, 2017 · US
US11429862B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11429862-B2 |
| Application number | US-201816133446-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 17, 2018 |
| Priority date | Mar 20, 2018 |
| Publication date | Aug 30, 2022 |
| Grant date | Aug 30, 2022 |
Official abstract text for this publication.
Techniques are disclosed for training a deep neural network (DNN) for reduced computational resource requirements. A computing system includes a memory for storing a set of weights of the DNN. The DNN includes a plurality of layers. For each layer of the plurality of layers, the set of weights includes weights of the layer and a set of bit precision values includes a bit precision value of the layer. The weights of the layer are represented in the memory using values having bit precisions equal to the bit precision value of the layer. The weights of the layer are associated with inputs to neurons of the layer. Additionally, the computing system includes processing circuitry for executing a machine learning system configured to train the DNN. Training the DNN comprises optimizing the set of weights and the set of bit precision values.
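As a rough illustration of the abstract's idea that each layer stores its weights at that layer's own bit precision, the sketch below quantizes every layer to `2**b` levels. The uniform min-to-max quantizer and the names `quantize_layer` and `quantize_network` are assumptions made for illustration; the abstract does not specify a quantization scheme.

```python
from typing import List


def quantize_layer(weights: List[float], bits: int) -> List[float]:
    """Snap weights onto 2**bits evenly spaced levels between min and max.

    Assumption: a uniform quantizer over the layer's own value range.
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so there is nothing to quantize.
        return list(weights)
    step = (hi - lo) / (2 ** bits - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def quantize_network(layers: List[List[float]],
                     bit_precisions: List[int]) -> List[List[float]]:
    """Quantize each layer with that layer's own bit precision value."""
    if len(layers) != len(bit_precisions):
        raise ValueError("need one bit precision value per layer")
    return [quantize_layer(w, b) for w, b in zip(layers, bit_precisions)]


if __name__ == "__main__":
    layers = [[0.0, 0.3, 0.6, 1.0], [-1.0, 0.1, 1.0]]
    # Two layers with different bit precisions (compare claim 3).
    print(quantize_network(layers, [2, 1]))
```

Giving each layer its own `bits` value is what lets a training procedure trade accuracy against storage layer by layer.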
Opening claim text (preview).
What is claimed is:

1. A computing system that trains a deep neural network (DNN) for reduced computing resource requirements, the computing system comprising: a memory storing a set of weights of the DNN, the DNN including a plurality of layers, wherein for each layer of the plurality of layers, the set of weights includes weights of the layer and a set of bit precision values includes a bit precision value of the layer, the weights of the layer being represented in the memory using values having bit precisions equal to the bit precision value of the layer, the weights of the layer being associated with inputs to neurons of the layer; and processing circuitry for executing a machine learning system configured to train the DNN, wherein training the DNN comprises optimizing the set of weights and the set of bit precision values.

2. The computing system of claim 1, wherein the machine learning system is configured such that, as part of training the DNN, the machine learning system: applies a backpropagation algorithm over a plurality of iterations, wherein each iteration of the backpropagation algorithm updates the set of weights and optimizes the set of bit precision values.

3. The computing system of claim 1, wherein two or more of the layers of the DNN have different bit precision values.

4. The computing system of claim 1, wherein: the set of weights is a first set of weights, the memory stores a second set of weights that includes a fixed precision set of weights for each layer in the plurality of layers, each weight in the second set of weights having a bit precision equal to a predefined maximum bit precision value, and the machine learning system is configured such that, as part of training the DNN, the machine learning system performs a plurality of iterations to train the DNN, wherein the machine learning system is configured such that, as part of performing the plurality of iterations, the machine learning system, for each iteration of the plurality of iterations: uses the second set of weights as weights of inputs of neurons in the DNN to calculate a first output data set based on a first input data set, determines a loss function; updates the second set of weights based on the loss function; updates the set of bit precision values based on the loss function; and after updating the second set of weights and after updating the set of bit precision values, updates the first set of weights based on the updated second set of weights and the updated set of bit precision values, and the machine learning system is further configured to use the first set of weights as the weights of the inputs of the neurons in the DNN to calculate a second output data set based on a second input data set.

5. The computing system of claim 4, wherein the machine learning system is configured such that, as part of determining the loss function, the machine learning system: determines a first operand, the first operand being an intermediate loss function; determines a second operand such that the second operand is equal to a multiplication product of a value of a first hyperparameter and a sum of quantization errors for each of the layers in the plurality of layers; determines a third operand such that the third operand is equal to a multiplication product of the value of a second hyperparameter and $\sum_{i=1}^{N} 2^{b_i}$, where $i$ is an index, $N$ is a total number of layers in the plurality of layers, and $b_i$ is the bit precision value for the i′th layer in the plurality of layers; and determines the loss function as the sum of the first operand, the second operand, and the third operand.

6. The computing system of claim 5, wherein the machine learning system is further configured to: for each layer of the plurality of layers, determine the quantization errors for the layer based on differences between weights of the layer in the first set of weights and weights of the layer in the second set of weights.

7. The computing system of claim 5, wherein: the first input data set comprises a batch of training data-label pairs, the machine learning system is configured such that, as part of determining the first operand, the machine learning system determines the first operand such that the first operand is equal to: $-\frac{1}{B}\sum_{i=1}^{B}\log\left(X^{(N)}_{i,y_i}\right)$ where $B$ is a total number of data-label pairs in the batch of data-label pairs, each label in the batch of data-label pairs is an element in a set of labels that includes $B$ labels, $i$ is an index, $\log(\cdot)$ is a logarithm function, $N$ is the total number of layers in the plurality of layers, $y_i$ is the i′th label in the set of labels, and $X^{(N)}_{i,y_i}$ is output of the N′th layer of the plurality of layers when the DNN is given as input the data of the i′th data-label pair of the batch of data-label pairs, wherein the data-label pairs in the batch of data-label pairs are independent identically distributed data-label pairs.

8. The computing system of claim 4, wherein the machine learning system is configured such that, as part of updating the set of bit precision values, the machine learning system: determines the updated set of bit precision values such that the updated set of bit precision values is set equal to: $b - \mu \cdot \operatorname{sign}\left(\frac{\partial l(\tilde{w})}{\partial b}\right)$ where $b$ is the set of bit precision values, $\mu$ is a learning rate, $\tilde{w}$ is the first set of weights, and $\frac{\partial l(\tilde{w})}{\partial b}$ is a partial derivative of the loss function with respect to the set of bit precision values, and $\operatorname{sign}(\cdot)$ is a function that returns a sign of an argument of t
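The three-operand loss of claims 5 to 7 and the sign-based bit precision update of claim 8 can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the patented implementation: the function names, the argument layout, and the use of a squared difference as the per-layer quantization error are all hypothetical (claim 6 only says the error is "based on differences" between the two weight sets).

```python
import math
from typing import List, Sequence


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def total_loss(true_label_probs: Sequence[float],
               fixed_weights: Sequence[Sequence[float]],
               quantized_weights: Sequence[Sequence[float]],
               bits: Sequence[int],
               lambda1: float,
               lambda2: float) -> float:
    """Sum of the three operands described in claim 5."""
    # First operand (claim 7): mean negative log of the final-layer
    # output for the true label of each data-label pair in the batch.
    batch = len(true_label_probs)
    first = -sum(math.log(p) for p in true_label_probs) / batch

    # Second operand: first hyperparameter times the summed per-layer
    # quantization error. Assumption: squared weight differences.
    quant_error = sum(
        sum((w - q) ** 2 for w, q in zip(layer_w, layer_q))
        for layer_w, layer_q in zip(fixed_weights, quantized_weights)
    )
    second = lambda1 * quant_error

    # Third operand: second hyperparameter times sum over layers of 2**b_i.
    third = lambda2 * sum(2 ** b for b in bits)
    return first + second + third


def update_bits(bits: Sequence[float],
                loss_grads: Sequence[float],
                mu: float) -> List[float]:
    """Claim 8 update: b - mu * sign(dl/db), applied per layer."""
    return [b - mu * sign(g) for b, g in zip(bits, loss_grads)]


if __name__ == "__main__":
    loss = total_loss([0.9, 0.8], [[0.31, -0.52]], [[0.25, -0.5]],
                      bits=[4], lambda1=0.1, lambda2=0.01)
    print(loss)
    print(update_bits([4, 4, 4], [0.2, -0.7, 0.0], mu=1))
```

Because the update uses only the sign of the gradient, each bit precision value moves by exactly `mu` per step (or stays put when the gradient is zero), so with an integer `mu` the bit precisions remain integers.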
Analogue means · CPC title
Activation functions · CPC title
Combinations of networks · CPC title
Recurrent networks, e.g. Hopfield networks · CPC title
Convolutional networks [CNN, ConvNet] · CPC title