Device and method of training a generative neural network
US-2022076119-A1 · Mar 10, 2022 · US
US11526690B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11526690-B2 |
| Application number | US-201916553223-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 28, 2019 |
| Priority date | Jan 22, 2019 |
| Publication date | Dec 13, 2022 |
| Grant date | Dec 13, 2022 |
Official abstract text for this publication.
A learning device includes one or more processors. The processors generate a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function. The processors calculate a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data. The partial data is obtained by dividing the learning data. The first partial gradient is a gradient of the objective function relating to the parameter for the partial data. The noise is calculated based on a second partial gradient calculated for another piece of the learning data. The processors update the parameter using the first partial gradient.
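The core step in the abstract, evaluating a partial gradient at a parameter that has been perturbed by noise and then updating the parameter, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the one-parameter model `w * x`, the squared-error objective, the helper names `partial_gradient` and `noisy_step`, the learning rate, and the factor `0.01` used to turn a previous partial gradient into noise.

```python
def partial_gradient(w, shard):
    """Gradient of the mean squared error (w*x - y)^2 over one shard, w.r.t. w."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def noisy_step(w, shard, noise, lr=0.1):
    """Evaluate the partial gradient at the perturbed parameter, then update w."""
    g = partial_gradient(w + noise, shard)  # plays the role of the "first partial gradient"
    return w - lr * g, g


# One piece of learning data (a mini-batch), divided into two pieces of partial data.
batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
shards = [batch[:2], batch[2:]]

# Hypothetical stand-in for a partial gradient computed on another mini-batch.
previous_partial_gradient = 0.5
noise = 0.01 * previous_partial_gradient  # the scaling factor is an assumption

w_new, g = noisy_step(0.0, shards[0], noise)
```

The update itself is ordinary gradient descent; the only departure is that the gradient is taken at `w + noise` rather than at `w`.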
Opening claim text (preview).
What is claimed is:
1. A learning device comprising: one or more processors configured to: generate a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; calculate a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and update the parameter using the first partial gradient.
2. The learning device according to claim 1, wherein the one or more processors calculate an overall gradient using a plurality of partial gradients calculated for a plurality of pieces of partial data obtained by dividing the learning data, the overall gradient being a gradient of the objective function relating to the parameter for the learning data, and calculate the noise using the overall gradient and the second partial gradient.
3. The learning device according to claim 2, wherein the one or more processors calculate the noise using a difference between the overall gradient and the second partial gradient.
4. The learning device according to claim 2, wherein the one or more processors calculate a second parameter for the learning data by updating a first parameter calculated for the other learning data using the overall gradient, and calculate the first partial gradient using the second parameter added with the noise and the partial data.
5. The learning device according to claim 1, wherein the one or more processors calculate the first partial gradient using a value that is obtained by adding the noise to a first parameter calculated for the other learning data, and the partial data.
6. A learning method comprising: generating a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; calculating a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and updating the parameter using the first partial gradient.
7. A computer program product having a non-transitory computer readable medium including programmed instructions, wherein the instructions, when executed by a computer, cause the computer to perform: generating a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; and calculating a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and updating the parameter using the first partial gradient.
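One possible reading of dependent claims 2 to 4 can be sketched as a loop over mini-batches: the overall gradient is the average of the per-shard partial gradients, the noise for the next mini-batch is derived from the difference between that overall gradient and each partial gradient, and the parameter updated with the overall gradient is the one that receives the noise on the next pass. This is an illustrative interpretation, not the patented implementation. The toy model `w * x`, the squared-error objective, equal-sized shards, the function name `train`, and the hypothetical `noise_scale` and `lr` values are all assumptions; the overall gradient here is also simply the mean of partial gradients taken at the perturbed parameters.

```python
def partial_gradient(w, shard):
    """Gradient of the mean squared error (w*x - y)^2 over one shard, w.r.t. w."""
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def train(batches, num_shards=2, lr=0.05, noise_scale=0.01, w=0.0):
    """Gradient descent where each shard sees the parameter plus shard-specific noise."""
    noise = [0.0] * num_shards  # no previous mini-batch yet, so start without noise
    for batch in batches:
        # Divide the learning data into equal-sized pieces of partial data.
        shards = [batch[i::num_shards] for i in range(num_shards)]
        # Partial gradients, each evaluated at the parameter plus its noise.
        partials = [partial_gradient(w + n, s) for n, s in zip(noise, shards)]
        # Overall gradient for the mini-batch (mean of equal-sized shard gradients).
        overall = sum(partials) / num_shards
        # Noise for the next mini-batch: scaled difference between the overall
        # gradient and each partial gradient (the scale is an assumption).
        noise = [noise_scale * (overall - g) for g in partials]
        # Update the parameter with the overall gradient; this updated value is
        # what the next mini-batch perturbs.
        w = w - lr * overall
    return w


# Toy data drawn from y = 2x, repeated as 50 identical mini-batches.
batches = [[(1.0, 2.0), (2.0, 4.0), (1.0, 2.0), (2.0, 4.0)]] * 50
w_final = train(batches)
```

On this toy problem the noise shrinks together with the gradients, so the parameter still settles near the least-squares solution `w = 2`.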
Backpropagation, e.g. using gradient descent · CPC title
Organisation of the process, e.g. bagging or boosting · CPC title
using neural networks · CPC title
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
Learning methods · CPC title