Method for clustering photos for pictoral storytelling
US-2024419384-A1 · Dec 19, 2024 · US
US2024169253A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2024169253-A1 |
| Application number | US-202217992492-A |
| Country | US |
| Kind code | A1 |
| Filing date | Nov 22, 2022 |
| Priority date | Nov 22, 2022 |
| Publication date | May 23, 2024 |
| Grant date | — |
Official abstract text for this publication.
Using a first dataset of labeled data, a model is trained by adjusting a feature extractor parameter, a classifier parameter, and a discriminator parameter of the model. Using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data is scored. A subset of the scored plurality of samples is selected for labeling. Responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data is augmented with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples. Using the augmented dataset of labeled data, the model is retrained. The retraining comprises further adjusting the feature extractor parameter, the classifier parameter, and the discriminator parameter of the model.
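The abstract describes a pool-based active learning loop: train, score the unlabeled pool, select a subset, obtain labels, augment the labeled set, retrain. The following is a minimal sketch of that control flow only, not the publication's implementation. The callables `train`, `score`, and `oracle`, the toy one-dimensional data, and the `batch_size` parameter are all hypothetical. The feature extractor, classifier, and discriminator parameters are folded into a single opaque `train` step, and the query budget handling is one possible reading of claims 7 and 8.

```python
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, ...]
Labeled = Tuple[Sample, int]


def active_learning_loop(
    labeled: Sequence[Labeled],
    unlabeled: Sequence[Sample],
    train: Callable[[Sequence[Labeled]], object],   # hypothetical: fits all model parameters
    score: Callable[[object, Sample], float],       # hypothetical: higher = more worth labeling
    oracle: Callable[[Sample], int],                # hypothetical: returns a label for a sample
    batch_size: int,
    query_budget: int,
):
    """Train, score, select, label, augment, and retrain until the budget is spent."""
    model = train(labeled)            # initial training on the first labeled dataset
    labeled = list(labeled)
    pool = list(unlabeled)
    while query_budget > 0 and pool:
        k = min(batch_size, query_budget, len(pool))
        # Score every unlabeled sample and keep the k highest-scoring ones.
        ranked = sorted(pool, key=lambda x: score(model, x), reverse=True)
        selected, pool = ranked[:k], ranked[k:]
        query_budget -= k             # subtract the number of selected samples
        # Augment the labeled dataset with the selected samples and their labels.
        labeled.extend((x, oracle(x)) for x in selected)
        model = train(labeled)        # retrain on the augmented dataset
    return model, labeled, pool, query_budget


# Toy stand-ins (assumptions for illustration only).
def toy_train(data: Sequence[Labeled]) -> float:
    """Return a decision threshold halfway between the two class means."""
    zeros = [x[0] for x, y in data if y == 0]
    ones = [x[0] for x, y in data if y == 1]
    return (sum(zeros) / len(zeros) + sum(ones) / len(ones)) / 2.0


def toy_score(threshold: float, x: Sample) -> float:
    """Samples closest to the threshold score highest."""
    return -abs(x[0] - threshold)


def toy_oracle(x: Sample) -> int:
    return 1 if x[0] >= 6.0 else 0


model, labeled, pool, budget = active_learning_loop(
    labeled=[((0.0,), 0), ((10.0,), 1)],
    unlabeled=[(1.0,), (4.0,), (5.5,), (7.0,), (9.0,)],
    train=toy_train,
    score=toy_score,
    oracle=toy_oracle,
    batch_size=2,
    query_budget=3,
)
print(len(labeled), sorted(pool), budget)
```

Passing the training, scoring, and labeling steps as callables keeps the loop independent of any particular model, which is why the sketch says nothing about how the three parameter groups are actually adjusted.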
Opening claim text (preview).
What is claimed is:
1. A computer-implemented method comprising: training, using a first dataset of labeled data, a model, wherein the training comprises adjusting a feature extractor parameter of the model, a classifier parameter of the model, and a discriminator parameter of the model; scoring, using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data, the scoring resulting in a scored plurality of samples; selecting, for labeling, a subset of the scored plurality of samples, the selecting resulting in a selected subset of the scored plurality of samples; augmenting, responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples, the augmenting resulting in an augmented dataset of labeled data; and retraining, using the augmented dataset of labeled data, the model, wherein the retraining comprises further adjusting the feature extractor parameter of the model, the classifier parameter of the model, and the discriminator parameter of the model.
2. The computer-implemented method of claim 1, wherein the discriminator parameter of the model is adjusted using a quantization loss of the model.
3. The computer-implemented method of claim 2, wherein scoring the plurality of samples of a dataset of unlabeled data comprises combining an uncertainty score, a diversity score and a class imbalance score of a sample in the plurality of samples.
4. The computer-implemented method of claim 3, wherein the uncertainty score is computed using a weighted sum, each addend in the weighted sum comprising an upper bound of losses incurred on an unlabeled data point divided by the number of data points belonging to a particular class of data points.
5. The computer-implemented method of claim 3, wherein the diversity score is computed using a number of data points in the first dataset of labeled data, a number of data points in the selected subset of the scored plurality of samples, and the parametric function, the parametric function quantifying how well the first dataset of labeled data represents a combined dataset, the combined dataset comprising the first dataset of labeled data and the dataset of unlabeled data.
6. The computer-implemented method of claim 3, wherein the class imbalance score is computed using a weighted sum, each addend in the weighted sum comprising a result of performing the model divided by the cube root of the number of data points belonging to a particular class of data points.
7. The computer-implemented method of claim 1, further comprising: subtracting, from a query budget, a number of samples in the selected subset of the scored plurality of samples.
8. The computer-implemented method of claim 7, further comprising: causing, responsive to determining that the query budget is greater than zero, labeling of the selected subset of the scored plurality of samples.
9. A computer program product comprising one or more computer readable storage medium, and program instructions collectively stored on the one or more computer readable storage medium, the program instructions executable by a processor to cause the processor to perform operations comprising: training, using a first dataset of labeled data, a model, wherein the training comprises adjusting a feature extractor parameter of the model, a classifier parameter of the model, and a discriminator parameter of the model; scoring, using the discriminator parameter and a parametric function of the feature extractor parameter, a plurality of samples of a dataset of unlabeled data, the scoring resulting in a scored plurality of samples; selecting, for labeling, a subset of the scored plurality of samples, the selecting resulting in a selected subset of the scored plurality of samples; augmenting, responsive to receiving a label of each of the selected subset of the scored plurality of samples, the first dataset of labeled data with the selected subset of the scored plurality of samples and the label of each of the selected subset of the scored plurality of samples, the augmenting resulting in an augmented dataset of labeled data; and retraining, using the augmented dataset of labeled data, the model, wherein the retraining comprises further adjusting the feature extractor parameter of the model, the classifier parameter of the model, and the discriminator parameter of the model.
10. The computer program product of claim 9, wherein the stored program instructions are stored in a computer readable storage device in a data processing system, and wherein the stored program instructions are transferred over a network from a remote data processing system.
11. The computer program product of claim 9, wherein the stored program instructions are stored in a computer readable storage device in a server data processing system, and wherein the stored program instructions are downloaded in response to a request over a network to a remote data processing system for use in a computer readable storage device associated with the remote data processing system, further comprising: program instructions to meter use of the program instructions associated with the request; and program instructions to generate an invoice based on the metered use.
12. The computer program product of claim 9, wherein the discriminator parameter of the model is adjusted using a quantization loss of the model.
13. The computer program product of claim 12, wherein scoring the plurality of samples of a dataset of unlabeled data comprises combining an uncertainty score, a diversity score and a class imbalance score of a sample in the plurality of samples.
14. The computer program product of claim 13, wherein the uncertainty score is computed using a weighted sum, each addend in the weighted sum comprising an upper bound of losses incurred on an unlabeled data point divided by the number of data points belonging to a particular class of data points.
15. The computer program product of claim 13, wherein the diversity score is computed using a number of data points in the first dataset of labeled data, a number of data points in the selected subset of the scored plurality of samples, and the parametric function, the parametric function quantifying how well the first dataset of labeled data represents a combined dataset, the combined dataset comprising the first dataset of labeled data and the dataset of unlabeled data.
16. The computer program product of claim 13, wherein the class imbalance score is computed using a weighted sum, each addend in the weighted sum comprising a result of performing the model divided by the cube root of the number of data points belonging to a particular class of data points.
17. The computer program product of claim 9, further comprising: subtracting, from a query budget, a number of samples in the selected subset of the scored plurality of samples.
18. The computer program product of claim 17, further comprising: causing, responsive to determining that the query budget is greater than zero, labeling of the selected subset of the scored plurality of samples.
19. A computer system comprising a processor and one or more computer readable storage media, and program instructions collectively stored on the one or more computer read
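Claims 3 and 6 (and their counterparts 13 and 16) describe a per-sample score that combines uncertainty, diversity, and class imbalance terms, with the class imbalance term dividing each addend by the cube root of a class count. The sketch below illustrates that arithmetic under several assumptions that do not come from the claims: the "result of performing the model" is taken to be a per-class output such as a predicted probability, the three terms are combined by a plain weighted sum, and the function names and default weights are hypothetical. The uncertainty and diversity terms are passed in as precomputed numbers because the claims do not give enough detail to reconstruct them.

```python
from typing import Optional, Sequence, Tuple


def class_imbalance_score(
    class_outputs: Sequence[float],
    class_counts: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted sum over classes of (model output) / (cube root of class count).

    Assumption: class_outputs[c] is the model's output for class c on one
    sample, and class_counts[c] is the number of data points in class c.
    """
    if weights is None:
        weights = [1.0] * len(class_outputs)
    total = 0.0
    for w, out, n in zip(weights, class_outputs, class_counts):
        if n <= 0:
            raise ValueError("class counts must be positive")
        total += w * out / (n ** (1.0 / 3.0))
    return total


def combined_score(
    uncertainty: float,
    diversity: float,
    imbalance: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> float:
    """Combine the three terms; a weighted sum is an assumption, not claimed."""
    w_u, w_d, w_i = weights
    return w_u * uncertainty + w_d * diversity + w_i * imbalance


# Example: equal model outputs, but class 1 has far fewer data points.
imbalance = class_imbalance_score([0.5, 0.5], [8, 1])
total = combined_score(uncertainty=0.2, diversity=0.3, imbalance=imbalance)
print(imbalance, total)
```

With equal model outputs, the class holding one data point contributes twice as much to the imbalance term as the class holding eight, so samples likely to belong to rare classes score higher under this reading.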