Learning method and information processing apparatus

US2021406683A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2021406683-A1
Application number: US-202117226279-A
Country: US
Kind code: A1
Filing date: Apr 9, 2021
Priority date: Jun 25, 2020
Publication date: Dec 30, 2021
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A process includes starting a learning process for building a model including multiple layers each including a parameter. The learning process executes iterations, each including calculating output error of the model using training data and updating the parameter value based on the output error. The process also includes selecting two or more candidate layers representing candidates for layers, where the updating is to be suppressed, based on results of a first iteration of the learning process. The process also includes calculating, based on the number of iterations executed up to the first iteration, a ratio value which becomes larger when the number of iterations executed is greater, and determining, amongst the candidate layers, one or more layers, where the updating is to be suppressed at a second iteration following the first iteration. The number of one or more layers is determined according to the ratio value.
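The ratio step described above can be sketched in a few lines. This is only an illustrative sketch, not the patent's implementation: the function names, the `steepness` parameter, and the choice to center the curve at the midpoint of training are assumptions made for the example. The logistic form follows the sigmoid curve mentioned in claim 3, and the product of candidate count and ratio follows claim 2; rounding down is an assumption.

```python
import math


def suppression_ratio(iteration, total_iterations, steepness=10.0):
    """Hypothetical schedule: a logistic curve over training progress.

    Returns a value in (0, 1) that grows as more iterations have run.
    The midpoint (0.5) and steepness are assumed, not taken from the patent.
    """
    progress = iteration / total_iterations
    return 1.0 / (1.0 + math.exp(-steepness * (progress - 0.5)))


def num_layers_to_suppress(num_candidates, ratio):
    """Number of candidate layers to freeze: candidates times ratio.

    Rounding down is an assumption; the source only states a multiplication.
    """
    return int(num_candidates * ratio)


if __name__ == "__main__":
    for it in (0, 25, 50, 75, 100):
        r = suppression_ratio(it, 100)
        print(it, round(r, 3), num_layers_to_suppress(8, r))
```

With a schedule of this shape, few candidate layers are frozen early in training and most are frozen near the end, which matches the abstract's requirement that the ratio grow with the iteration count.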

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable recording medium storing therein a computer program that causes a computer to execute a process comprising: starting a learning process for building a model including a plurality of layers which each include a parameter, the learning process executing iterations, each of which includes calculating output error of the model using training data and updating a value of the parameter of each of the plurality of layers based on the output error; selecting, amongst the plurality of layers, two or more candidate layers representing candidates for layers, in each of which the updating of the value of the parameter is to be suppressed, based on execution results of a first iteration of the learning process; and calculating, based on a number of the iterations executed up to the first iteration, a ratio value which increases with an increase in the number of the iterations executed, and determining, amongst the two or more candidate layers, one or more layers, in each of which the updating of the value of the parameter is to be suppressed at a second iteration following the first iteration, a number of the one or more layers being determined according to the ratio value.

2. The non-transitory computer-readable recording medium according to claim 1, wherein: the number of the one or more layers determined according to the ratio value is calculated by multiplying a number of the two or more candidate layers by the ratio value.

3. The non-transitory computer-readable recording medium according to claim 1, wherein: the ratio value corresponding to the number of the iterations executed is calculated based on a sigmoid curve.

4. The non-transitory computer-readable recording medium according to claim 1, wherein: the updating of the value of the parameter is performed at the second iteration in each remaining layer other than the one or more layers whose number is determined according to the ratio value amongst the two or more candidate layers.

5. The non-transitory computer-readable recording medium according to claim 1, wherein: each of the iterations of the learning process includes calculating an error gradient indicating a gradient of the output error with respect to the parameter and updating the value of the parameter based on the error gradient, and the selecting of the two or more candidate layers includes monitoring each of the plurality of layers for an inter-iteration change in the error gradient and selecting each of the two or more candidate layers whose inter-iteration change is below a threshold.

6. The non-transitory computer-readable recording medium according to claim 1, wherein: the model is a multi-layer neural network.

7. The non-transitory computer-readable recording medium according to claim 1, wherein: each of the iterations of the learning process includes calculating an error gradient indicating a gradient of the output error with respect to the parameter and updating the value of the parameter based on the error gradient, the process further includes calculating, for each of the plurality of layers, an average of the error gradients across the iterations executed up to the first iteration, and each of the one or more layers whose number is determined according to the ratio value is determined based on the average of the error gradients.

8. The non-transitory computer-readable recording medium according to claim 1, wherein: each of the iterations of the learning process includes calculating an error gradient indicating a gradient of the output error with respect to the parameter and updating the value of the parameter based on the error gradient, the process further includes monitoring each of the plurality of layers for an inter-iteration change in the error gradient and calculating, for each of the plurality of layers, an average of the inter-iteration changes across the iterations executed up to the first iteration, and each of the one or more layers whose number is determined according to the ratio value is determined based on the average of the inter-iteration changes.

9. The non-transitory computer-readable recording medium according to claim 1, wherein: the plurality of layers is classified into a plurality of blocks, each including two or more layers, and each of the one or more layers whose number is determined according to the ratio value is determined based on identity of a block to which the layer belongs.

10. The non-transitory computer-readable recording medium according to claim 1, wherein: each of the one or more layers whose number is determined according to the ratio value is determined based on spacing of the one or more layers.

11. The non-transitory computer-readable recording medium according to claim 1, wherein: each of the one or more layers whose number is determined according to the ratio value is determined based on proximity thereof to an input of the model.

12. A learning method comprising: starting, by a processor, a learning process for building a model including a plurality of layers which each include a parameter, the learning process executing iterations, each of which includes calculating output error of the model using training data and updating a value of the parameter of each of the plurality of layers based on the output error; selecting, by the processor, amongst the plurality of layers, two or more candidate layers representing candidates for layers, in each of which the updating of the value of the parameter is to be suppressed, based on execution results of a first iteration of the learning process; and calculating, by the processor, based on a number of the iterations executed up to the first iteration, a ratio value which increases with an increase in the number of the iterations executed, and determining, amongst the two or more candidate layers, one or more layers, in each of which the updating of the value of the parameter is to be suppressed at a second iteration following the first iteration, a number of the one or more layers being determined according to the ratio value.

13. An information processing apparatus comprising: a memory configured to store training data and a model including a plurality of layers which each include a parameter; and a processor configured to execute a process including: starting a learning process executing iterations, each of which includes calculating output error of the model using the training data and updating a value of the parameter of each of the plurality of layers based on the output error, selecting, amongst the plurality of layers, two or more candidate layers representing candidates for layers, in each of which the updating of the value of the parameter is to be suppressed, based on execution results of a first iteration of the learning process, and calculating, based on a number of the iterations executed up to the first iteration, a ratio value which increases with an increase in the number of the iterations executed, and determining, amongst the two or more candidate layers, one or more layers, in each of which the updating of the value of the parameter is to be suppressed at a second iteration following the first iteration, a number of the one or more layers being determined according to the ratio value.
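As a rough illustration of how claims 2, 4, 5, and 11 might fit together, the sketch below picks candidate layers whose gradient changed little between two iterations and then freezes a ratio-dependent number of them, preferring layers nearest the input. Everything here is assumed for the example: the function names, the use of one scalar gradient norm per layer, the threshold value, and the convention that layer index 0 is closest to the input. It is not the patent's implementation.

```python
def select_candidates(prev_grad_norms, curr_grad_norms, threshold):
    """Indices of layers whose inter-iteration gradient change is below threshold.

    Each list holds one (hypothetical) scalar gradient norm per layer.
    """
    return [
        i
        for i, (prev, curr) in enumerate(zip(prev_grad_norms, curr_grad_norms))
        if abs(curr - prev) < threshold
    ]


def choose_suppressed(candidates, ratio):
    """Freeze int(len(candidates) * ratio) candidates, nearest the input first.

    Assumes lower layer indices are closer to the model input.
    """
    count = int(len(candidates) * ratio)
    return sorted(candidates)[:count]


if __name__ == "__main__":
    prev = [0.50, 0.40, 0.90, 0.30, 0.80]
    curr = [0.49, 0.41, 0.60, 0.30, 0.79]
    candidates = select_candidates(prev, curr, threshold=0.05)
    frozen = choose_suppressed(candidates, ratio=0.5)
    still_updated = [i for i in candidates if i not in frozen]
    print(candidates, frozen, still_updated)
```

In this example layer 2 is never a candidate because its gradient is still changing, and the candidates that are not chosen keep being updated at the next iteration, as claim 4 describes. Claims 7 to 10 name other ways to pick among candidates (average gradients, block identity, spacing), which would replace the sort in `choose_suppressed`.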

Assignees

Inventors

Classifications

  • Selection of the most significant subset of features · CPC title

  • G06N3/084 · Primary

    Backpropagation, e.g. using gradient descent · CPC title

  • G06N3/045 · Primary

    Combinations of networks · CPC title

  • Distributed learning, e.g. federated learning · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2021406683A1 cover?
A process includes starting a learning process for building a model including multiple layers each including a parameter. The learning process executes iterations, each including calculating output error of the model using training data and updating the parameter value based on the output error. The process also includes selecting two or more candidate layers representing candidates for layers,…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06N3/084. Mapped technology areas include Physics.
When was this patent published?
Publication date: Dec 30, 2021 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).