What technology area does this patent fall under?

Primary CPC classification G06N20/00. Mapped technology areas include Physics.

When was this patent published?

Publication date: Mar 28, 2017 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

High accuracy learning by boosting weak learners

US9607246B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9607246-B2
Application number	US-201313953813-A
Country	US
Kind code	B2
Filing date	Jul 30, 2013
Priority date	Jul 30, 2012
Publication date	Mar 28, 2017
Grant date	Mar 28, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system, apparatus, method, and computer-readable medium for optimizing classifiers are disclosed. The optimization process can include receiving one or more training examples. The optimization process can further include assigning a loss parameter to each training example. The optimization process can further include optimizing each loss parameter of each training sample based on a sample variance of each training example using a non-linear function. The optimization process can further include estimating a classifier from the one or more weighted training samples. The optimization process can further include assigning a loss parameter to the classifier based on a number of training examples that the classifier correctly classified and a number of training examples that the classifier incorrectly classified. The optimization process can further include adding the weighted classifier to an overall classifier.

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: receiving by a processor, training examples representing classifiable events or objects including label data identifying respective classes of the events or objects, wherein each training example comprises one or more elements; associating a respective loss parameter with each of the training examples, a value of each respective loss parameter being initialized to an initial value; (a) calculating a weight of each training example based on a sample variance of the training example using a non-linear function; (b) optimizing a weak learner in a pool of weak learners or selecting a classifier from a pool of classifiers, to minimize an exponential loss of the weak learner or classifier on the weighted training examples, without evaluating all the weak learners in the pool of weak learners and without evaluating all of the classifiers in the pool of classifiers; (c) calculating a coefficient for the optimized weak learner or the selected classifier which is proportional to a logarithm of the ratio of the sum of the assigned weights corresponding to the examples classified correctly by the optimized weak learner or the selected classifier and the sum of the assigned weights corresponding to the examples incorrectly classified by the optimized weak learner or the selected classifier; (d) updating the loss parameters to the product of each with the exponential loss of the weak learner or classifier on its respective training example; repeating the operations defined in clauses a through d until a stop criterion is met; forming a linear combination of the optimized weak learners or the selected classifiers obtained from multiple iterations of operations a through d, each weighted by a respective one of the coefficients calculated in operation c; and outputting data representing said linear combination.
2. The method of claim 1, wherein the calculating a weight from each of the loss parameters that is a non-linear function of the loss parameter includes calculating the function $u_i \leftarrow \lambda n w_i^2 + (1-\lambda) w_i$, where $w_i$ is the loss parameter, $n$ is the number of training examples, $\lambda$ is a constant between 0 and 1 and $u_i$ is the weight.
3. The method of claim 1, wherein updating the loss parameters includes calculating a factor such that the sum of the loss parameters over all the training examples is equal to one.
4. The method of claim 1, wherein the operation c is such that a penalty responsive to variance is included in a cost function that is effectively reduced through successive iterations of operation c.
5. The method of claim 1, wherein the optimizing a weak learner or selecting a classifier from a pool of classifiers includes selecting an optimal classifier from a pool of classifiers.
6. The method of claim 5, wherein the pool of classifiers are adapted for responding to specific features of an object or event to be classified.
7. The method of claim 1, further comprising employing the linear combination as classifier including applying a signal containing data thereto and outputting a signal containing class data therefrom.
8. The method of claim 1, wherein a cost function of the linear combination is minimized.
9. The method of claim 1, wherein a lowest weight is calculated for a training example with a highest sample variance, and a highest weight is calculated for a training example with a lowest sample variance.
10. The method of claim 1, wherein each classifier comprises a weak classifier that performs a classification better than random guessing; and wherein the linear combination comprises a strong classifier that performs a classification better than each weak classifier, and whose performance is correlated with a correct classification.
11. An apparatus, comprising: a processor configured to load and execute software instructions stored on a non-transitory computer readable medium, the software instructions, when executed, cause the processor to perform operations comprising: receiving one or more training examples representing classifiable events or objects including label data identifying respective classes of the events or objects, wherein each training example comprises one or more elements; associating a respective loss parameter with each training example of the one or more training examples, a value of each respective loss parameter being initialized to an initial value; (a) calculating a weight of each training example based on a sample variance of the training example using a non-linear function; (b) optimizing a weak learner in a pool of weak learners or selecting a classifier from a pool of classifiers, to minimize an exponential loss of the weak learner or classifier on the one or more weighted training examples, without evaluating all the weak learners in the pool of weak learners and without evaluating all of the classifiers in the pool of classifiers; (c) calculating a coefficient for the optimized weak learner or the selected classifier which is proportional to a logarithm of the ratio of the sum of the assigned weights corresponding to the examples classified correctly by the optimized weak learner or the selected classifier and the sum of the assigned weights corresponding to the examples incorrectly classified by the optimized weak learner or the selected classifier; (d) updating the loss parameters to the product of each with the exponential loss of the weak learner or classifier on its respective training example; and repeating the operations defined in clauses a through d until a stop criterion is met; forming a linear combination of the optimized weak learners or the selected classifiers obtained from multiple iterations of operations a through d, each weighted by a respective one of the coefficients calculated in operation c; and outputting data representing said linear combination.
12. The apparatus of claim 11, the operations further comprising: exponentially adjusting each weight of each training example based on a scalar parameter that defines a relationship between a risk of each training example and the sample variance of each training example.
13. The apparatus of claim 11, wherein $w$ represents each weight of each training example; wherein $n$ represents a number of the one or more weights of the one or more training examples; wherein $\lambda$ represents the scalar parameter; and wherein the non-linear function comprises a function, $u_i \leftarrow \lambda n w_i^2 + (1-\lambda) w_i$.
14. The apparatus of claim 11, the operations further comprising: adjusting each weight of each training example based on whether the classifier correctly classifies each training example; and iteratively repeating the optimizing, estimating, assigning the weight to the classifier, and the adding until a stopping criteria is met.
15. The apparatus of claim 11, the operations further comprising: exponentially decreasing a weight of a training example when the classifier correctly classifies the training example; and exponentially increasing a weight of a training example when the classifier incorrectly classifies the training example.
16. A non-transitory computer-readable medium having instructions stored thereon that, when executed by a processor, cause the process to perform operations, the operations comprising: receiving one or more training examples representing classifiable events or objects including label data identifying respective classes of the events or objects, wherein each training example comprises one or more elements; associating a respective loss parameter with each training example of the one or
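
The iteration recited in claim 1, steps (a) through (d), together with the weight function of claim 2 and the normalization of claim 3, can be illustrated with a minimal Python sketch. It is not the patented implementation and rests on several assumptions that do not come from the claim text: labels are +1 or -1, the weak learners are simple threshold "stumps" built by a hypothetical helper `make_stump`, the proportionality constant `scale` for the coefficient is set to 0.5, the stop criterion is a fixed number of rounds or a perfect weak learner, and the zero-error case is clamped with `eps`. Step (b) here searches the whole pool exhaustively for simplicity, whereas the claim describes a selection that avoids evaluating every weak learner.

```python
import math


def variance_weights(w, lam):
    """Claim 2 weight function: u_i = lam * n * w_i^2 + (1 - lam) * w_i."""
    n = len(w)
    return [lam * n * wi * wi + (1.0 - lam) * wi for wi in w]


def make_stump(feature, threshold, sign):
    """Hypothetical weak learner: +sign if x[feature] > threshold, else -sign."""
    def h(x):
        return sign if x[feature] > threshold else -sign
    return h


def boost(X, y, pool, lam=0.5, rounds=10, scale=0.5, eps=1e-12):
    """Return a list of (coefficient, weak learner) pairs."""
    n = len(X)
    w = [1.0 / n] * n            # loss parameters, initialized uniformly
    ensemble = []
    for _ in range(rounds):      # assumed stop criterion: fixed round budget
        u = variance_weights(w, lam)                      # step (a)

        def weighted_error(h):
            return sum(ui for ui, xi, yi in zip(u, X, y) if h(xi) != yi)

        # Step (b), simplified: exhaustive search over a small pool.
        h = min(pool, key=weighted_error)
        wrong = weighted_error(h)
        right = sum(u) - wrong
        if right <= wrong:       # no learner beats chance; stop early
            break
        # Step (c): coefficient proportional to log(right / wrong).
        alpha = scale * math.log(right / max(wrong, eps))
        ensemble.append((alpha, h))
        # Step (d): multiply each loss parameter by its exponential loss.
        w = [wi * math.exp(-yi * alpha * h(xi)) for wi, xi, yi in zip(w, X, y)]
        total = sum(w)           # claim 3: rescale so the parameters sum to one
        w = [wi / total for wi in w]
        if wrong == 0.0:         # perfect weak learner; nothing left to fix
            break
    return ensemble


def predict(ensemble, x):
    """Sign of the linear combination of weak learners."""
    score = sum(alpha * h(x) for alpha, h in ensemble)
    return 1 if score > 0 else -1


# Toy usage on a one-dimensional, linearly separable data set.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [-1, -1, 1, 1]
pool = [make_stump(0, t, s) for t in (0.5, 1.5, 2.5) for s in (1, -1)]
model = boost(X, y, pool)
print([predict(model, x) for x in X])
```

With `lam` set to 0 the weight function reduces to `u_i = w_i`, so the loop behaves like an ordinary exponential-loss boosting round; larger values of `lam` give more influence to the squared term.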

Assignees

Univ Columbia

Inventors

Classifications

G06V10/774
Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06F18/214
Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title
G06N20/00 · Primary
Machine learning · CPC title
G06F18/2148
characterised by the process organisation or structure, e.g. boosting cascade · CPC title
G06K9/6256 · Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 49994948

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9607246B2 cover?: A system, apparatus, method, and computer-readable medium for optimizing classifiers are disclosed. The optimization process can include receiving one or more training examples. The optimization process can further include assigning a loss parameter to each training example. The optimization process can further include optimizing each loss parameter of each training sample based on a sample var…
Who is the assignee on this patent?: Univ Columbia