What technology area does this patent fall under?

Primary CPC classification G06N3/045. Mapped technology areas include Physics.

When was this patent published?

Publication date Dec 23, 2021 (A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Learning method and information processing apparatus

US2021397948A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2021397948-A1
Application number	US-202117197099-A
Country	US
Kind code	A1
Filing date	Mar 10, 2021
Priority date	Jun 18, 2020
Publication date	Dec 23, 2021
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A memory holds a model including a plurality of layers including their respective parameters and training data. A processor starts learning processing, which repeatedly calculates an error of an output of the model by using the training data, calculates an error gradient, which indicates a gradient of the error with respect to the parameters, for each of the layers, and updates the parameters based on the error gradients. The processor calculates a difference between a first error gradient calculated in a first iteration in the learning processing and a second error gradient calculated in a second iteration after the first iteration for a first layer among the plurality of layers. In a case where the difference is less than a threshold, the processor skips the calculating of the error gradient and the updating of the parameter for the first layer in a third iteration after the second iteration.
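The skipping rule in the abstract can be illustrated with a minimal Python sketch. Several choices here are assumptions made for illustration and are not taken from the publication: the per-layer "difference" is computed as the absolute change in the mean absolute gradient, the threshold is a fixed constant, a layer stays skipped for the rest of training once it qualifies, and the toy model treats each layer as an independent quadratic error so that its gradient is simply `param - target`. The names `grad_norm`, `gradient_difference`, `LayerSkipTracker`, and `train` are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

Gradient = List[float]


def grad_norm(grad: Gradient) -> float:
    """Mean absolute gradient of one layer (an assumed summary statistic)."""
    return sum(abs(g) for g in grad) / len(grad)


def gradient_difference(first: Gradient, second: Gradient) -> float:
    """Assumed difference measure between two iterations' gradients."""
    return abs(grad_norm(second) - grad_norm(first))


class LayerSkipTracker:
    """Remembers each layer's previous gradient and flags layers to skip."""

    def __init__(self, layer_names: Iterable[str], threshold: float) -> None:
        names = list(layer_names)
        self.threshold = threshold
        self.prev: Dict[str, Optional[Gradient]] = {n: None for n in names}
        self.skipped: Dict[str, bool] = {n: False for n in names}

    def is_skipped(self, name: str) -> bool:
        return self.skipped[name]

    def observe(self, name: str, grad: Gradient) -> None:
        """Compare with the previous iteration; mark the layer if it settled."""
        prev = self.prev[name]
        if prev is not None and gradient_difference(prev, grad) < self.threshold:
            self.skipped[name] = True
        self.prev[name] = list(grad)


def train(params: Dict[str, List[float]], targets: Dict[str, List[float]],
          lr: float, threshold: float, iterations: int) -> Dict[str, int]:
    """Toy loop: returns how many gradient computations each layer needed."""
    tracker = LayerSkipTracker(params.keys(), threshold)
    computations = {name: 0 for name in params}
    for _ in range(iterations):
        for name in params:
            if tracker.is_skipped(name):
                continue  # no gradient calculation and no update for this layer
            # Gradient of 0.5 * (p - t)^2 with respect to p (toy error only).
            grad = [p - t for p, t in zip(params[name], targets[name])]
            computations[name] += 1
            params[name] = [p - lr * g for p, g in zip(params[name], grad)]
            tracker.observe(name, grad)
    return computations


counts = train(
    params={"layer1": [0.1], "layer2": [10.0]},
    targets={"layer1": [0.0], "layer2": [0.0]},
    lr=0.5, threshold=0.01, iterations=20,
)
print(counts)  # {'layer1': 5, 'layer2': 11}
```

In this toy run the layer that starts closer to its target settles first and is skipped after 5 gradient computations, while the other layer needs 11 of the 20 iterations. A real implementation would obtain the gradients from backpropagation through the full model rather than from independent per-layer errors.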

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable recording medium storing therein a computer program that causes a computer to execute a process comprising: starting learning processing for generating a model including a plurality of layers, each of which includes a parameter, the learning processing including repeatedly calculating an error of an output of the model by using training data, calculating an error gradient, which indicates a gradient of the error with respect to the parameters, for each of the layers, and updating the parameters based on the error gradients; calculating a difference between a first error gradient calculated in a first iteration in the learning processing and a second error gradient calculated in a second iteration after the first iteration for a first layer among the plurality of layers; and skipping, in a case where the difference is less than a threshold, the calculating of the error gradient of the first layer and the updating of the corresponding parameter in a third iteration after the second iteration.

2. The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes determining the threshold, based on an initial error gradient, which indicates an error gradient calculated in an iteration executed in response to the starting of the learning processing.

3. The non-transitory computer-readable recording medium according to claim 2, wherein the process further includes setting a learning rate used to update the parameters and changing the learning rate based on a number of iterations in which the learning processing has been executed, and wherein the determining of the threshold includes changing the threshold, based on a different initial error gradient, which indicates an error gradient calculated in an iteration executed in response to change of the learning rate.

4. The non-transitory computer-readable recording medium according to claim 1, wherein the process further includes, in each iteration in the learning processing, calculating different errors from different training data by using a plurality of processing nodes, calculating different error gradients from the different errors by using the plurality of processing nodes, synthesizing the different error gradients by allowing the plurality of processing nodes to communicate with each other, and updating the parameters, based on the synthesized error gradients, and wherein the skipping includes skipping the calculating of the different error gradients, the communicating among the plurality of processing nodes, and the updating of the parameter for the first layer.

5. The non-transitory computer-readable recording medium according to claim 1, wherein the model is a multilayer neural network.

6. The non-transitory computer-readable recording medium according to claim 1, wherein the second iteration is immediately after the first iteration, and the third iteration is immediately after the second iteration.

7. The non-transitory computer-readable recording medium according to claim 1, wherein the calculating of the difference includes calculating the difference for each of the plurality of layers, and wherein the skipping includes determining, among the plurality of layers, a layer whose difference is less than the threshold and skipping the calculating of the error gradient and the updating of the parameter for the determined layer.

8. A learning method comprising: starting, by a processor, learning processing for generating a model including a plurality of layers, each of which includes a parameter, the learning processing including repeatedly calculating an error of an output of the model by using training data, calculating an error gradient, which indicates a gradient of the error with respect to the parameters, for each of the layers, and updating the parameters based on the error gradients; calculating, by the processor, a difference between a first error gradient calculated in a first iteration in the learning processing and a second error gradient calculated in a second iteration after the first iteration for a first layer among the plurality of layers; and skipping, by the processor, in a case where the difference is less than a threshold, the calculating of the error gradient of the first layer and the updating of the corresponding parameter in a third iteration after the second iteration.

9. An information processing apparatus comprising: a memory configured to hold a model including a plurality of layers, each of which includes a parameter, and training data; and a processor configured to start learning processing, which includes repeatedly calculating an error of an output of the model by using the training data, calculating an error gradient, which indicates a gradient of the error with respect to the parameters, for each of the layers, and updating the parameters based on the error gradients, calculate a difference between a first error gradient calculated in a first iteration in the learning processing and a second error gradient calculated in a second iteration after the first iteration for a first layer among the plurality of layers, and skip, in a case where the difference is less than a threshold, the calculating of the error gradient and the updating of the parameter for the first layer in a third iteration after the second iteration.
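Claims 2 and 3 tie the threshold to an initial error gradient and re-derive it when the learning rate changes. The following Python sketch shows one way that bookkeeping could look. The claims do not give a formula, so scaling the reference gradient by a fixed `ratio` is an assumption, as are the milestone-style learning-rate schedule and the names `grad_norm` and `AdaptiveThreshold`.

```python
from typing import Dict, List, Optional


def grad_norm(grad: List[float]) -> float:
    """Mean absolute gradient (an assumed summary statistic)."""
    return sum(abs(g) for g in grad) / len(grad)


class AdaptiveThreshold:
    """Derives the skip threshold from a reference ("initial") gradient.

    The reference is taken at the first iteration and again at the first
    iteration after each learning-rate change.
    """

    def __init__(self, ratio: float, initial_lr: float,
                 lr_milestones: Dict[int, float]) -> None:
        self.ratio = ratio                  # hypothetical scaling factor
        self.lr = initial_lr
        self.lr_milestones = lr_milestones  # {iteration: new learning rate}
        self.threshold: Optional[float] = None
        self._needs_reference = True

    def step(self, iteration: int, grad: List[float]) -> float:
        """Update the learning rate and threshold; return the threshold."""
        if iteration in self.lr_milestones:
            self.lr = self.lr_milestones[iteration]
            self._needs_reference = True    # learning rate changed
        if self._needs_reference:
            self.threshold = self.ratio * grad_norm(grad)
            self._needs_reference = False
        assert self.threshold is not None
        return self.threshold


schedule = AdaptiveThreshold(ratio=0.5, initial_lr=0.1, lr_milestones={3: 0.01})
for it, grad in enumerate([[2.0, -2.0], [1.0, 1.0], [0.8, 0.8], [0.4, -0.4]]):
    print(it, schedule.lr, schedule.step(it, grad))
```

The threshold stays fixed between learning-rate changes and is recomputed from the gradient seen at iteration 3, where the rate drops. Tying the threshold to the gradient scale at the start of each learning-rate phase keeps the skip test relative rather than absolute, which is one plausible reading of why the claims link the two.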

Assignees

Fujitsu Ltd

Inventors

Classifications

G06F18/2115
by evaluating different subsets according to an optimisation criterion, e.g. class separability, forward selection or backward elimination · CPC title
G06N3/045 (Primary)
Combinations of networks · CPC title
G06N3/048
Activation functions · CPC title
G06N3/09
Supervised learning · CPC title
G06N3/0985
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

Patent family

Related publications grouped by family.

View patent family 74873564

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021397948A1 cover?: A memory holds a model including a plurality of layers including their respective parameters and training data. A processor starts learning processing, which repeatedly calculates an error of an output of the model by using the training data, calculates an error gradient, which indicates a gradient of the error with respect to the parameters, for each of the layers, and updates the parameters b…
Who is the assignee on this patent?: Fujitsu Ltd