Active learning in model training

US2024169253A1 · US · A1

Patent metadata
Field               Value
Publication number  US-2024169253-A1
Application number  US-202217992492-A
Country             US
Kind code           A1
Filing date         Nov 22, 2022
Priority date       Nov 22, 2022
Publication date    May 23, 2024
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Using a first dataset of labeled data, a model is trained by adjusting a feature extractor parameter, a classifier parameter, and a discriminator parameter of the model. Using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data is scored. A subset of the scored plurality of samples is selected for labeling. Responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data is augmented with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples. Using the augmented dataset of labeled data, the model is retrained. The retraining comprises further adjusting the feature extractor parameter, the classifier parameter, and the discriminator parameter of the model.
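
The train, score, select, augment, retrain cycle in the abstract can be sketched as a short loop. This is only an illustration under stated assumptions: the function names (`active_learning_loop`, `train_threshold`, `boundary_score`, `oracle`) are hypothetical, and the toy one-dimensional threshold model stands in for the patent's model with feature extractor, classifier, and discriminator parameters, which is not reproduced here.

```python
from typing import Callable, List, Tuple

Labeled = List[Tuple[float, int]]


def active_learning_loop(
    labeled: Labeled,
    unlabeled: List[float],
    train: Callable[[Labeled], float],
    score: Callable[[float, float], float],
    oracle: Callable[[float], int],
    batch_size: int,
    rounds: int,
) -> Tuple[float, Labeled, List[float]]:
    """Generic loop: train, score the pool, select, augment, retrain."""
    labeled = list(labeled)
    pool = list(unlabeled)
    model = train(labeled)  # initial training on the first labeled dataset
    for _ in range(rounds):
        if not pool:
            break
        # Score every unlabeled sample with the current model.
        order = sorted(range(len(pool)),
                       key=lambda i: score(model, pool[i]), reverse=True)
        chosen = order[:batch_size]  # highest-scoring samples go to labeling
        # Augment the labeled set with the chosen samples and their labels.
        labeled.extend((pool[i], oracle(pool[i])) for i in chosen)
        chosen_set = set(chosen)
        pool = [x for i, x in enumerate(pool) if i not in chosen_set]
        model = train(labeled)  # retrain on the augmented dataset
    return model, labeled, pool


# Toy stand-ins (assumptions, not the patent's model or scoring function).
def train_threshold(data: Labeled) -> float:
    """Place the decision threshold midway between the two classes."""
    zeros = [x for x, y in data if y == 0]
    ones = [x for x, y in data if y == 1]
    return (max(zeros) + min(ones)) / 2.0


def boundary_score(threshold: float, x: float) -> float:
    """Samples closer to the threshold score higher."""
    return -abs(x - threshold)


def oracle(x: float) -> int:
    """Plays the role of the human annotator."""
    return 1 if x >= 0.5 else 0


model, labeled_out, pool_out = active_learning_loop(
    labeled=[(0.0, 0), (1.0, 1)],
    unlabeled=[0.1, 0.3, 0.45, 0.55, 0.7, 0.9],
    train=train_threshold,
    score=boundary_score,
    oracle=oracle,
    batch_size=2,
    rounds=2,
)
```

In this toy run the loop queries the four samples nearest the decision boundary over two rounds and leaves the two easiest samples unlabeled, which is the behavior an uncertainty-driven selection is expected to show.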

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising:
    training, using a first dataset of labeled data, a model, wherein the training comprises adjusting a feature extractor parameter of the model, a classifier parameter of the model, and a discriminator parameter of the model;
    scoring, using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data, the scoring resulting in a scored plurality of samples;
    selecting, for labeling, a subset of the scored plurality of samples, the selecting resulting in a selected subset of the scored plurality of samples;
    augmenting, responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples, the augmenting resulting in an augmented dataset of labeled data; and
    retraining, using the augmented dataset of labeled data, the model, wherein the retraining comprises further adjusting the feature extractor parameter of the model, the classifier parameter of the model, and the discriminator parameter of the model.

2. The computer-implemented method of claim 1, wherein the discriminator parameter of the model is adjusted using a quantization loss of the model.

3. The computer-implemented method of claim 2, wherein scoring the plurality of samples of a dataset of unlabeled data comprises combining an uncertainty score, a diversity score and a class imbalance score of a sample in the plurality of samples.

4. The computer-implemented method of claim 3, wherein the uncertainty score is computed using a weighted sum, each addend in the weighted sum comprising an upper bound of losses incurred on an unlabeled data point divided by the number of data points belonging to a particular class of data points.

5. The computer-implemented method of claim 3, wherein the diversity score is computed using a number of data points in the first dataset of labeled data, a number of data points in the selected subset of the scored plurality of samples, and the parametric function, the parametric function quantifying how well the first dataset of labeled data represents a combined dataset, the combined dataset comprising the first dataset of labeled data and the dataset of unlabeled data.

6. The computer-implemented method of claim 3, wherein the class imbalance score is computed using a weighted sum, each addend in the weighted sum comprising a result of performing the model divided by the cube root of the number of data points belonging to a particular class of data points.

7. The computer-implemented method of claim 1, further comprising: subtracting, from a query budget, a number of samples in the selected subset of the scored plurality of samples.

8. The computer-implemented method of claim 7, further comprising: causing, responsive to determining that the query budget is greater than zero, labeling of the selected subset of the scored plurality of samples.

9. A computer program product comprising one or more computer readable storage medium, and program instructions collectively stored on the one or more computer readable storage medium, the program instructions executable by a processor to cause the processor to perform operations comprising:
    training, using a first dataset of labeled data, a model, wherein the training comprises adjusting a feature extractor parameter of the model, a classifier parameter of the model, and a discriminator parameter of the model;
    scoring, using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data, the scoring resulting in a scored plurality of samples;
    selecting, for labeling, a subset of the scored plurality of samples, the selecting resulting in a selected subset of the scored plurality of samples;
    augmenting, responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples, the augmenting resulting in an augmented dataset of labeled data; and
    retraining, using the augmented dataset of labeled data, the model, wherein the retraining comprises further adjusting the feature extractor parameter of the model, the classifier parameter of the model, and the discriminator parameter of the model.

10. The computer program product of claim 9, wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.

11. The computer program product of claim 9, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.

12. The computer program product of claim 9, wherein the discriminator parameter of the model is adjusted using a quantization loss of the model.

13. The computer program product of claim 12, wherein scoring the plurality of samples of a dataset of unlabeled data comprises combining an uncertainty score, a diversity score and a class imbalance score of a sample in the plurality of samples.

14. The computer program product of claim 13, wherein the uncertainty score is computed using a weighted sum, each addend in the weighted sum comprising an upper bound of losses incurred on an unlabeled data point divided by the number of data points belonging to a particular class of data points.

15. The computer program product of claim 13, wherein the diversity score is computed using a number of data points in the first dataset of labeled data, a number of data points in the selected subset of the scored plurality of samples, and the parametric function, the parametric function quantifying how well the first dataset of labeled data represents a combined dataset, the combined dataset comprising the first dataset of labeled data and the dataset of unlabeled data.

16. The computer program product of claim 13, wherein the class imbalance score is computed using a weighted sum, each addend in the weighted sum comprising a result of performing the model divided by the cube root of the number of data points belonging to a particular class of data points.

17. The computer program product of claim 9, further comprising: subtracting, from a query budget, a number of samples in the selected subset of the scored plurality of samples.

18. The computer program product of claim 17, further comprising: causing, responsive to determining that the query budget is greater than zero, labeling of the selected subset of the scored plurality of samples.

19. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer read
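
Claims 3 to 8 name three score components and a query budget without giving full formulas in this preview. The sketch below is one possible reading, not the patent's definition. Assumptions: the per-class weights, the use of predicted class probabilities as the "result of performing the model", the plain weighted sum used to combine the three scores (with hypothetical coefficients `alpha`, `beta`, `gamma`), and the choice to check the budget before subtracting from it. The diversity score is taken as a given number because the claims do not specify the parametric function.

```python
from typing import List, Sequence, Tuple


def uncertainty_score(loss_bounds: Sequence[float],
                      class_counts: Sequence[int],
                      weights: Sequence[float]) -> float:
    """Weighted sum; each addend is a loss upper bound divided by a class count."""
    return sum(w * b / n for w, b, n in zip(weights, loss_bounds, class_counts))


def imbalance_score(model_outputs: Sequence[float],
                    class_counts: Sequence[int],
                    weights: Sequence[float]) -> float:
    """Weighted sum; each addend is a model output divided by the cube root
    of a class count."""
    return sum(w * p / n ** (1.0 / 3.0)
               for w, p, n in zip(weights, model_outputs, class_counts))


def combined_score(uncertainty: float, diversity: float, imbalance: float,
                   alpha: float = 1.0, beta: float = 1.0,
                   gamma: float = 1.0) -> float:
    """Assumed combination rule: a weighted sum of the three components."""
    return alpha * uncertainty + beta * diversity + gamma * imbalance


def select_and_charge(scores: Sequence[float], batch_size: int,
                      budget: int) -> Tuple[List[int], int]:
    """Return (indices to send for labeling, remaining query budget)."""
    if budget <= 0:  # no budget left, so nothing is sent for labeling
        return [], budget
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = order[:batch_size]
    # Subtract the number of selected samples from the query budget.
    return chosen, budget - len(chosen)


# Illustrative numbers only: two classes with 8 and 27 labeled points.
counts = [8, 27]
weights = [0.5, 0.5]
u = uncertainty_score([1.6, 2.7], counts, weights)
c = imbalance_score([0.2, 0.8], counts, weights)
total = combined_score(u, 0.2, c)
chosen, remaining = select_and_charge([0.3, 0.9, 0.1, 0.5], batch_size=2, budget=5)
```

Dividing by the class count, or by its cube root, makes samples associated with sparsely populated classes score higher, which is how a scoring rule of this shape would counteract class imbalance in the labeled set.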

Assignees

Inventors

Classifications

  • G06F16/55 (Primary)

    Clustering; Classification · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • Clustering or classification · CPC title

  • Combinations of networks · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024169253A1 cover?
Using a first dataset of labeled data, a model is trained by adjusting a feature extractor parameter, a classifier parameter, and a discriminator parameter of the model. Using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data is scored. A subset of the scored plurality of samples is selected for labeli…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/55. Mapped technology areas include Physics.
When was this patent published?
Publication date May 23, 2024 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).