Comparison method and comparison apparatus

US11423263B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11423263-B2
Application number  US-201815940132-A
Country             US
Kind code           B2
Filing date         Mar 29, 2018
Priority date       Mar 31, 2017
Publication date    Aug 23, 2022
Grant date          Aug 23, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A processor builds a plurality of learning models using training data of a plurality of first sample sizes according to a first machine learning algorithm, and calculates a plurality of measured prediction performances. The processor calculates a plurality of estimated variances on the basis of relationship information indicating the relationship between expected value and variance with respect to prediction performance and the plurality of measured prediction performances. The processor creates a first prediction performance curve through a regression analysis using the plurality of measured prediction performances and the plurality of estimated variances. The processor calculates a first evaluation value on the basis of the first prediction performance curve and a second sample size. The processor compares the first evaluation value with a second evaluation value calculated based on a second prediction performance curve corresponding to a second machine learning algorithm and the second sample size.
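The flow in the abstract (measure performance at several small sample sizes, fit a variance-weighted curve, then compare algorithms at a larger second sample size) can be sketched as follows. This is a minimal illustration, not the patented implementation: the curve form `score = c - a * n**(-0.5)`, the fixed exponent, the use of the predicted score itself as the evaluation value, and all numbers are assumptions chosen so that the fit reduces to weighted linear least squares. The names `fit_curve` and `predict` are hypothetical.

```python
def fit_curve(sizes, scores, variances, exponent=0.5):
    """Weighted least-squares fit of score = c - a * n**(-exponent).

    The curve form and exponent are assumptions for illustration.
    Each point is weighted by 1 / estimated variance, so noisier
    measurements influence the fit less.
    """
    xs = [n ** -exponent for n in sizes]
    ws = [1.0 / v for v in variances]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, scores)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, scores))
    slope = sxy / sxx
    c = my - slope * mx          # upper limit the curve approaches
    return c, -slope             # (c, a) with score = c - a * x


def predict(curve, n, exponent=0.5):
    """Evaluate the fitted curve at sample size n."""
    c, a = curve
    return c - a * n ** -exponent


# Illustrative (made-up) measurements at the first sample sizes.
sizes = [100, 400, 1600]
variances = [0.004, 0.002, 0.001]      # assumed estimated variances
scores_a = [0.70, 0.80, 0.85]          # first algorithm
scores_b = [0.80, 0.825, 0.8375]       # second algorithm

curve_a = fit_curve(sizes, scores_a, variances)
curve_b = fit_curve(sizes, scores_b, variances)

# Evaluation values at a larger, second sample size.
second_size = 10000
eval_a = predict(curve_a, second_size)
eval_b = predict(curve_b, second_size)
better = "A" if eval_a > eval_b else "B"
print(better, round(eval_a, 4), round(eval_b, 4))
```

In this made-up data the second algorithm scores higher at every measured size, yet the fitted curves rank the first algorithm higher at the second sample size, which is the kind of comparison the abstract describes.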

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable storage medium storing a program that causes a computer to perform a process comprising: building a plurality of learning models using training data of a plurality of first sample sizes according to a first machine learning algorithm and calculating a plurality of measured prediction performances, the training data being extracted from an identical data population, the plurality of measured prediction performances respectively indicating results of measuring prediction performance of the plurality of learning models; calculating a plurality of estimated expected losses and an estimated expected bias, based on the plurality of measured prediction performances, and calculating a plurality of estimated variances using the plurality of estimated expected losses and the estimated expected bias, based on relationship information, the plurality of estimated variances respectively indicating results of estimating variances of the prediction performance at the plurality of first sample sizes, the relationship information indicating relationship among expected loss indicating an expected value of an error rate in prediction, expected bias indicating a lower limit of the expected loss, and variance of the prediction performance, the plurality of estimated expected losses respectively indicating results of estimating expected losses with respect to the plurality of first sample sizes, the estimated expected bias indicating a result of estimating the expected bias; creating a first prediction performance curve through a regression analysis using the plurality of measured prediction performances and the plurality of estimated variances, the first prediction performance curve representing relationship between sample size and the prediction performance and being a curve in which the prediction performance approaches a fixed upper limit of the prediction performance; calculating a first evaluation value of the first machine learning algorithm, based on the first prediction performance curve and a second sample size; and comparing the first evaluation value with a second evaluation value of a second machine learning algorithm, the second evaluation value being calculated based on a second prediction performance curve corresponding to the second machine learning algorithm and the second sample size.

2. The non-transitory computer-readable storage medium according to claim 1, wherein the relationship information indicates that the variance of the prediction performance is proportional to a sum of the expected loss and the expected bias and is proportional to a difference between the expected loss and the expected bias.

3. The non-transitory computer-readable storage medium according to claim 1, wherein the creating of the first prediction performance curve includes assigning a plurality of weights to the plurality of measured prediction performances according to the plurality of estimated variances in such a way that a weight to be assigned is increased as an estimated variance decreases, and carrying out the regression analysis using the plurality of measured prediction performances and the plurality of weights.

4. A comparison method comprising: building, by a processor, a plurality of learning models using training data of a plurality of first sample sizes according to a first machine learning algorithm and calculating a plurality of measured prediction performances, the training data being extracted from an identical data population, the plurality of measured prediction performances respectively indicating results of measuring prediction performance of the plurality of learning models; calculating, by the processor, a plurality of estimated expected losses and an estimated expected bias, based on the plurality of measured prediction performances, and calculating a plurality of estimated variances using the plurality of estimated expected losses and the estimated expected bias, based on relationship information, the plurality of estimated variances respectively indicating results of estimating variances of the prediction performance at the plurality of first sample sizes, the relationship information indicating relationship among expected loss indicating an expected value of an error rate in prediction, expected bias indicating a lower limit of the expected loss, and variance of the prediction performance, the plurality of estimated expected losses respectively indicating results of estimating expected losses with respect to the plurality of first sample sizes, the estimated expected bias indicating a result of estimating the expected bias; creating, by the processor, a first prediction performance curve through a regression analysis using the plurality of measured prediction performances and the plurality of estimated variances, the first prediction performance curve representing relationship between sample size and the prediction performance and being a curve in which the prediction performance approaches a fixed upper limit of the prediction performance; calculating, by the processor, a first evaluation value of the first machine learning algorithm, based on the first prediction performance curve and a second sample size; and comparing, by the processor, the first evaluation value with a second evaluation value of a second machine learning algorithm, the second evaluation value being calculated based on a second prediction performance curve corresponding to the second machine learning algorithm and the second sample size.

5. A comparison apparatus comprising: a memory configured to store therein a plurality of measured prediction performances and relationship information, the plurality of measured prediction performances respectively indicating results of measuring prediction performance of a plurality of learning models, the plurality of learning models being built using training data of a plurality of first sample sizes according to a first machine learning algorithm, the training data being extracted from an identical data population, the relationship information indicating relationship among expected loss indicating an expected value of an error rate in prediction, expected bias indicating a lower limit of the expected loss, and variance of the prediction performance; and a processor configured to perform a process including calculating a plurality of estimated expected losses and an estimated expected bias, based on the plurality of measured prediction performances, and calculating a plurality of estimated variances using the plurality of estimated expected losses and the estimated expected bias, based on the relationship information, the plurality of estimated variances respectively indicating results of estimating variances of the prediction performance at the plurality of first sample sizes, the plurality of estimated expected losses respectively indicating results of estimating expected losses with respect to the plurality of first sample sizes, the estimated expected bias indicating a result of estimating the expected bias, creating a first prediction performance curve through a regression analysis using the plurality of measured prediction performances and the plurality of estimated variances, the first prediction performance curve representing relationship between sample size and the prediction performance and being a curve in which the prediction performance approaches a fixed upper limit of the prediction performance, calculating a first evaluation value of the first machine learning algorithm, based on the first prediction performance curve and a second sample size, and comparing the first evaluation value with a second evaluation value of a second machine learning algorithm, the second evaluation value being calculated based on a second prediction performa…
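Claims 2 and 3 describe a variance that is proportional to both the sum and the difference of expected loss and expected bias, and regression weights that grow as the estimated variance shrinks. A minimal sketch of one reading of those two relationships is below. The proportionality constant `k`, the choice of inverse variance as the weighting rule, the function names, and the input numbers are all assumptions; the claim preview does not say how the expected losses and the expected bias are estimated, so they are taken here as given inputs.

```python
def estimated_variance(expected_loss, expected_bias, k=1.0):
    """Variance proportional to (loss + bias) and to (loss - bias).

    k is a hypothetical proportionality constant. Because the
    expected bias is a lower limit of the expected loss, the
    difference (and hence the variance) is non-negative.
    """
    return k * (expected_loss + expected_bias) * (expected_loss - expected_bias)


def regression_weights(variances):
    """One weighting that increases as variance decreases: 1 / variance.

    Inverse-variance weighting is an assumed choice; the claim only
    requires that smaller variance gets larger weight.
    """
    return [1.0 / v for v in variances]


# Made-up estimated expected losses at three first sample sizes,
# decreasing toward an assumed expected bias of 0.10.
losses = [0.30, 0.20, 0.15]
bias = 0.10

variances = [estimated_variance(loss, bias) for loss in losses]
weights = regression_weights(variances)
for loss, var, w in zip(losses, variances, weights):
    print(f"loss={loss:.2f} variance={var:.4f} weight={w:.2f}")
```

Under this reading, the estimated variance shrinks as the expected loss approaches the expected bias, so measurements taken at larger sample sizes receive more weight in the regression.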

Assignees

Fujitsu Ltd

Inventors

Classifications

  • based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11423263B2 cover?
A processor builds a plurality of learning models using training data of a plurality of first sample sizes according to a first machine learning algorithm, and calculates a plurality of measured prediction performances. The processor calculates a plurality of estimated variances on the basis of relationship information indicating the relationship between expected value and variance with respect…
Who is the assignee on this patent?
Fujitsu Ltd
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 23, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).