Learning device, learning method, and computer program product

US11526690B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11526690-B2
Application number  US-201916553223-A
Country             US
Kind code           B2
Filing date         Aug 28, 2019
Priority date       Jan 22, 2019
Publication date    Dec 13, 2022
Grant date          Dec 13, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A learning device includes one or more processors. The processors generate a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function. The processors calculate a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data. The partial data is obtained by dividing the learning data. The first partial gradient is a gradient of the objective function relating to the parameter for the partial data. The noise is calculated based on a second partial gradient calculated for another piece of the learning data. The processors update the parameter using the first partial gradient.
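
The update loop described in the abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the patented implementation: a one-parameter linear model with a squared-error objective stands in for the neural network, and the noise is assumed to be a scaled copy of the partial gradient computed for the previous piece of learning data (the abstract only says the noise is "based on" that gradient). The names `partial_gradient`, `split`, `train`, `lr`, and `noise_scale` are hypothetical.

```python
import random


def partial_gradient(w, part):
    """Gradient of 0.5 * (w*x - y)**2 w.r.t. w, averaged over one piece of partial data."""
    return sum((w * x - y) * x for x, y in part) / len(part)


def split(learning_data, n_parts):
    """Divide one piece of learning data (a mini-batch) into n_parts pieces of partial data."""
    size = len(learning_data) // n_parts
    return [learning_data[i * size:(i + 1) * size] for i in range(n_parts)]


def train(pieces, w=0.0, lr=0.1, n_parts=2, noise_scale=0.1):
    """Run one learning process per piece of learning data and return the learned parameter."""
    noise = [0.0] * n_parts  # no earlier learning data yet, so start without noise
    for learning_data in pieces:
        parts = split(learning_data, n_parts)
        # First partial gradients: evaluated at the parameter with noise added.
        grads = [partial_gradient(w + noise[k], parts[k]) for k in range(n_parts)]
        # Update the parameter using the partial gradients (here: their mean).
        w -= lr * sum(grads) / n_parts
        # Assumption: the noise for the next piece of learning data is a scaled
        # copy of the partial gradients just computed for this one.
        noise = [noise_scale * g for g in grads]
    return w


# Synthetic learning data generated from y = 3 * x (illustrative only).
rng = random.Random(0)
pieces = []
for _ in range(200):
    xs = [rng.uniform(0.5, 1.5) for _ in range(8)]
    pieces.append([(x, 3.0 * x) for x in xs])

w = train(pieces)
print(w)
```

Because the synthetic targets are noise-free, the partial gradients shrink toward zero as `w` approaches the generating slope, so the injected noise shrinks with them.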

First claim

Opening claim text (preview).

What is claimed is:

1. A learning device comprising: one or more processors configured to: generate a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; calculate a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and update the parameter using the first partial gradient.

2. The learning device according to claim 1, wherein the one or more processors calculate an overall gradient using a plurality of partial gradients calculated for a plurality of pieces of partial data obtained by dividing the learning data, the overall gradient being a gradient of the objective function relating to the parameter for the learning data, and calculate the noise using the overall gradient and the second partial gradient.

3. The learning device according to claim 2, wherein the one or more processors calculate the noise using a difference between the overall gradient and the second partial gradient.

4. The learning device according claim 2, wherein the one or more processors calculate a second parameter for the learning data by updating a first parameter calculated for the other learning data using the overall gradient, and calculate the first partial gradient using the second parameter added with the noise and the partial data.

5. The learning device according to claim 1, wherein the one or more processors calculate the first partial gradient using a value that is obtained by adding the noise to a first parameter calculated for the other learning data, and the partial data.

6. A learning method comprising: generating a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; calculating a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and updating the parameter using the first partial gradient.

7. A computer program product having a non-transitory computer readable medium including programmed instructions, wherein the instructions, when executed by a computer, cause the computer to perform: generating a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function; and calculating a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the learning data out of the plurality of pieces of learning data, the partial data being obtained by dividing the learning data, the first partial gradient being a gradient of the objective function relating to the parameter for the partial data, the noise being calculated based on a second partial gradient calculated for another piece of the learning data; and updating the parameter using the first partial gradient.
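
One possible reading of dependent claims 2 to 4 is sketched below. It is an illustration under assumptions, not the claimed implementation: the overall gradient is taken as the mean of the partial gradients from the other (earlier) piece of learning data, the noise is a scaled difference between that overall gradient and each second partial gradient, and the update is plain gradient descent. The names `noisy_parameters`, `lr`, and `noise_scale`, the scaling itself, and all numeric values are hypothetical.

```python
def partial_gradient(w, part):
    """Gradient of 0.5 * (w*x - y)**2 w.r.t. w, averaged over one piece of partial data."""
    return sum((w * x - y) * x for x, y in part) / len(part)


def noisy_parameters(first_param, second_partial_grads, lr, noise_scale):
    """Return the second parameter and one noisy parameter per piece of partial data."""
    # Claim 2 (as read here): overall gradient from the partial gradients.
    overall = sum(second_partial_grads) / len(second_partial_grads)
    # Claim 4 (as read here): second parameter = first parameter updated
    # with the overall gradient.
    second_param = first_param - lr * overall
    # Claim 3 (as read here): noise from the difference between the overall
    # gradient and each second partial gradient. The scaling is an assumption.
    noises = [noise_scale * (overall - g) for g in second_partial_grads]
    return second_param, [second_param + n for n in noises]


# Illustrative values: a first parameter and two second partial gradients
# from the earlier piece of learning data.
second_param, noisy = noisy_parameters(1.0, [0.2, 0.6], lr=0.5, noise_scale=0.5)

# First partial gradients for the current learning data, each evaluated at the
# second parameter plus noise, on its own piece of partial data.
parts = [[(1.0, 3.0), (2.0, 6.0)], [(0.5, 1.5), (1.5, 4.5)]]
first_partial_grads = [partial_gradient(p, part) for p, part in zip(noisy, parts)]
print(second_param, noisy, first_partial_grads)
```

In this sketch the noise terms average to zero, so the mean of the noisy parameters equals the second parameter. When `noise_scale` equals `lr`, each noisy parameter reduces algebraically to `first_param - lr * g`, the parameter that a step along that single partial gradient would have produced. Both observations are properties of the sketch and are not statements taken from the patent.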

Assignees

Toshiba Kk

Inventors

Classifications

  • G06N3/084 · Primary

    Backpropagation, e.g. using gradient descent · CPC title

  • Organisation of the process, e.g. bagging or boosting · CPC title

  • using neural networks · CPC title

  • characterised by the process organisation or structure, e.g. boosting cascade · CPC title

  • Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11526690B2 cover?
A learning device includes one or more processors. The processors generate a plurality of pieces of learning data to be used in a plurality of learning processes, respectively, to learn a parameter of a neural network using an objective function. The processors calculate a first partial gradient using a partial data and the parameter added with noise, with respect to at least a part of the lear…
Who is the assignee on this patent?
Toshiba Kk
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 13, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).