US11449754B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11449754-B1 |
| Application number | US-202217672713-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 16, 2022 |
| Priority date | Sep 12, 2021 |
| Publication date | Sep 20, 2022 |
| Grant date | Sep 20, 2022 |
Official abstract text for this publication.
The present invention discloses a neural network training method for a memristor memory for memristor errors, which is mainly used for solving the problem of decrease in inference accuracy of a neural network based on the memristor memory due to a process error and a dynamic error. The method comprises the following steps: performing modeling on a conductance value of a memristor under the influence of the process error and the dynamic error, and performing conversion to obtain a distribution of corresponding neural network weights; constructing a prior distribution of the weights by using the weight distribution obtained after modeling, and performing Bayesian neural network training based on variational inference to obtain a variational posterior distribution of the weights; and converting a mean value of the variational posterior of the weights into a target conductance value of the memristor memory.
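The modeling and mapping steps summarized in the abstract can be illustrated numerically. The following is a minimal sketch, not the patented implementation: it assumes the linear error function f(g0) = c * g0 given in claim 2, evaluates it at the target conductance as worded in claim 1, and scales the weight variance by c1**2, which is the standard rule for a linear map of a Gaussian and an assumption of this sketch. All function names and numeric values are hypothetical.

```python
import numpy as np

def mapping_constants(w_min, w_max, g_min, g_max):
    """Constants of the linear conductance-to-weight mapping w = c0 + c1 * g."""
    c1 = (w_max - w_min) / (g_max - g_min)
    c0 = w_min - c1 * g_min
    return c0, c1

def conductance_stats(g0, dg_global, c, var_local, var_dynamic):
    """Mean and variance of the modeled (Gaussian) actual conductance.

    mean = target conductance + global conductance error
    var  = f(g0)^2 * (var_local + var_dynamic), with f(g0) = c * g0 (assumed form)
    """
    mean = g0 + dg_global
    var = (c * g0) ** 2 * (var_local + var_dynamic)
    return mean, var

def weight_prior(g0, dg_global, c, var_local, var_dynamic, c0, c1):
    """Map the conductance statistics to a Gaussian prior over the weights."""
    mean_g, var_g = conductance_stats(g0, dg_global, c, var_local, var_dynamic)
    # Assumption: a linear map w = c0 + c1 * g scales the variance by c1**2.
    return c0 + c1 * mean_g, c1 ** 2 * var_g

def weights_to_conductance(w_mean, c0, c1):
    """Invert the linear mapping to turn posterior mean weights into conductances."""
    return (w_mean - c0) / c1

# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
c0, c1 = mapping_constants(w_min=-1.0, w_max=1.0, g_min=1.0, g_max=5.0)
g0 = np.array([2.0, 4.0])
prior_mean, prior_var = weight_prior(g0, 0.1, 0.5, 0.03, 0.01, c0, c1)
g_back = weights_to_conductance(prior_mean, c0, c1)
```

In a full pipeline the values passed to `weights_to_conductance` would be the means of the trained variational posterior rather than the prior means used here for the round-trip check.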
Opening claim text (preview).
What is claimed is:

1. A neural network training method for a memristor memory for memristor errors, comprising the following steps:

step 1: performing modeling on neural network weights to be deployed on the memristor memory under the influence of a process error and a dynamic error, which comprises the following steps:

step 1-1: decomposing an actual conductance value of one memristor on the memristor memory corresponding to each of the neural network weights into four parts: a target conductance value, a global conductance error, a local conductance error, and a dynamic conductance error,

step 1-2: approximating the local conductance error and the dynamic conductance error as products of a value of an error function taking the target conductance value as a variable and a local process error as well as the dynamic error;

step 1-3: respectively modeling the local process error and the dynamic error as Gaussian random variables, which are independent of each other;

step 1-4: obtaining modeling representation of the actual conductance value of one memristor on the memristor memory corresponding to each of the neural network weights under the influence of the process error and the dynamic error by means of results of the steps 1-1 to 1-3, which is a Gaussian random variable, wherein a mean value of the modeling representation is the sum of the target conductance value and the global conductance error, and a variance is a product of a square of the value of the error function taking the target conductance value as the variable and the sum of the variances of two Gaussian random variables corresponding to the local process error and the dynamic error; and

step 1-5: mapping the conductance value obtained by modeling into the neural network weights by using a conductance value-weight mapping relationship, thus obtaining statistical representation of the neural network weights;

step 2: performing Bayesian neural network training based on variational posterior by taking a statistical distribution of the neural network weights obtained by modeling in the step 1 as a prior distribution of the neural network weights, thus obtaining a variational posterior distribution of the neural network weights; and

step 3: computing a mean value of the variational posterior distribution of the neural network weights obtained in the step 2, mapping the mean value into a conductance by reversely using the conductance-weight mapping relationship, and taking the conductance as the actual conductance value of the memristor on the memristor memory.

2. The neural network training method for the memristor memory for memristor errors according to claim 1, wherein in the step 1-2, the error function taking the target conductance value as the variable is of the following form:

$$f(g_0) = c\,g_0$$

wherein $c$ is a constant indicating a process error level and a dynamic error level, and is set according to an error level in an application scenario; and $g_0$ is a vector consisting of the target conductance value of each memristor on the memristor memory.

3. The neural network training method for the memristor memory for memristor errors according to claim 1, wherein in the step 1-3, the Gaussian random variable $\Delta r_l$ for representing the local process error and the Gaussian random variable $\Delta r_d$ for representing the dynamic error obtained by modeling respectively meet the following Gaussian distributions:

$$\Delta r_l \sim N(0, \sigma_l^2), \qquad \Delta r_d \sim N(0, \sigma_d^2)$$

wherein $\sigma_l^2$ and $\sigma_d^2$ are variances of the two Gaussian distributions respectively and are measured experimentally, $N$ indicating the Gaussian distribution.

4. The neural network training method for the memristor memory for memristor errors according to claim 1, wherein in the step 1-4, the modeling representation obtained by modeling to represent the actual conductance value of one memristor on the memristor memory corresponding to each of the neural network weights under the influence of the process error and the dynamic error is of the following form:

$$g \sim N\left(g_0 + \Delta g_g,\; f(g_0 + \Delta g_g)^2\,(\sigma_l^2 + \sigma_d^2)\right)$$

wherein $g$ is the actual conductance value, $g_0$ is the target conductance value, $\Delta g_g$ is the global conductance error, $\sigma_l^2$ and $\sigma_d^2$ are variances of the two Gaussian distributions respectively, and parameters are all measured experimentally.

5. The neural network training method for the memristor memory for memristor errors according to claim 1, wherein in the step 1-5, an i-th element $w_i$ of the neural network weights $w = [w_1, w_2, \ldots, w_i, \ldots, w_n]^T$ obtained through the conductance value-weight mapping relationship is of the following form:

$$w_i = c_0 + c_1 g_i \sim N(\mu_i, \Psi(\mu_i)) = N\left(c_0 + c_1 (g_{0,i} + \Delta g_{g,i}),\; c_1 f(g_{0,i} + \Delta g_{g,i})^2\,(\sigma_{l,i}^2 + \sigma_{d,i}^2)\right)$$

wherein $g_i$ is an actual conductance value of a memristor corresponding to an i-th neural network weight, $\mu_i$ and $\Psi(\mu_i)$ are a mean value and a variance of $w_i$ respectively, $g_{0,i}$ is a target conductance value of the memristor corresponding to the i-th neural network weight, $\Delta g_{g,i}$ is a global conductance error of the memristor corresponding to the i-th neural network weight, $\sigma_{l,i}^2$ and $\sigma_{d,i}^2$ are variances of the local process error value and a dynamic error value of the memristor corresponding to the i-th neural network weight respectively, and $c_0$ and $c_1$ are two constant factors of conductance value-weight linear mapping, which are computed through the following expression:

$$c_1 = \frac{w_{\max} - w_{\min}}{g_{\max} - g_{\min}}, \qquad c_0 = w_{\min} - c_1\,g_{\min}$$

wherein $w_{\max}$ and $w_{\min}$ are a maximum value and a minimum value of all the neural network weights in the neural network respectively, and are obtained from the traditional neural network training, and $g_{\max}$ and $g_{\min}$ are a maximum value and a minimum value of the conductance range that can be adjusted by the memristor on the memristor memory, respectively.

6. The neural network training method for the memristor memory for memristor errors according to claim 1, wherein in the step 2, the variational posterior distribution $q(w|\theta)$ of the neural network weights $w$ is Gaussian distribution, $\theta$ is parameters of the Gaussian distribution, including the mean value and the variance, and a target function for the Bayesian neural network training based on variational inference is of the following form:

$$-\mathrm{ELBO}(g_0; \theta) = -\mathbb{E}_{q(w|\theta)}\left[\log P(D|w)\right] + \mathrm{KL}\left[q(w|\theta)\,\|\,P(w)\right]$$

wherein $g_0$ is a vector consisting of
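The training objective named in claim 6, a negative evidence lower bound with a Gaussian variational posterior, can be sketched as follows. This is a hedged illustration under stated assumptions: the prior and posterior are treated as factorized (diagonal) Gaussians, the expected log-likelihood is estimated by Monte Carlo sampling with the reparameterization trick (one common choice; the claim text shown here does not specify an estimator), and `log_lik` is a hypothetical user-supplied callable returning log P(D|w) for one weight sample.

```python
import numpy as np

def kl_diag_gaussians(mu_q, var_q, mu_p, var_p):
    """KL[q || p] between factorized Gaussians, summed over all weights."""
    return 0.5 * np.sum(
        np.log(var_p / var_q) + (var_q + (mu_q - mu_p) ** 2) / var_p - 1.0
    )

def neg_elbo(mu_q, var_q, mu_p, var_p, log_lik, n_samples=8, rng=None):
    """Monte Carlo estimate of -E_q[log P(D|w)] + KL[q(w|theta) || P(w)]."""
    rng = np.random.default_rng(0) if rng is None else rng
    # Reparameterization: w = mu + sigma * eps, with eps ~ N(0, 1).
    eps = rng.standard_normal((n_samples,) + np.shape(mu_q))
    w_samples = mu_q + np.sqrt(var_q) * eps
    expected_log_lik = np.mean([log_lik(w) for w in w_samples])
    return -expected_log_lik + kl_diag_gaussians(mu_q, var_q, mu_p, var_p)

# Toy usage: the prior mean and variance would come from the conductance model.
mu_p, var_p = np.array([1.0]), np.array([1.0])
mu_q, var_q = np.array([0.0]), np.array([1.0])
loss = neg_elbo(mu_q, var_q, mu_p, var_p, log_lik=lambda w: 0.0)
```

With a constant log-likelihood the loss reduces to the KL term alone, which makes the prior's pull on the posterior easy to inspect in isolation.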