Optimization of Parameter Values for Machine-Learned Models
US-2020167691-A1 · May 28, 2020 · US
US2020257984A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2020257984-A1 |
| Application number | US-202016779035-A |
| Country | US |
| Kind code | A1 |
| Filing date | Jan 31, 2020 |
| Priority date | Feb 12, 2019 |
| Publication date | Aug 13, 2020 |
| Grant date | — |
Official abstract text for this publication.
The domain adaptation problem is addressed by using the predictions of a trained model over both the source and target domains to retrain the model with the assistance of an auxiliary model and a modified objective function. Inaccuracy in the model's predictions in the target domain is treated as noise and is reduced by using a robust learning framework during retraining, enabling unsupervised training in the target domain. Applications include object detection models, where noise in retraining is reduced by explicitly representing label noise and geometry noise in the objective function and using the ancillary model to inject information about label noise.
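The retraining idea in the abstract can be illustrated with a minimal sketch. This is not the patent's implementation: a linear softmax classifier stands in for the primary model, the ancillary model's confidences are assumed to be supplied as an array, and plain cross-entropy against those confidences stands in for the modified objective function. The label-noise and geometry-noise terms of the disclosure are not reproduced. The names `retrain_step`, `ancillary_probs`, and `lr` are hypothetical.

```python
import numpy as np

def softmax(z):
    # Numerically stable row-wise softmax.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def retrain_step(W, x_target, ancillary_probs, lr=0.1):
    """One unsupervised update on unlabeled target-domain items.

    W               : (D, C) parameters of the stand-in primary model
    x_target        : (N, D) target-domain items (no ground-truth labels)
    ancillary_probs : (N, C) confidences from an assumed ancillary model
    """
    p = softmax(x_target @ W)  # primary model predictions
    # Objective value: cross-entropy between ancillary confidences and predictions.
    loss = -np.mean(np.sum(ancillary_probs * np.log(p + 1e-12), axis=1))
    # Gradient of that loss with respect to W (ancillary confidences held fixed).
    grad = x_target.T @ (p - ancillary_probs) / len(x_target)
    return W - lr * grad, loss

# Toy usage with made-up numbers.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
anc = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
W = np.zeros((2, 2))
W, first_loss = retrain_step(W, x, anc)
for _ in range(50):
    W, last_loss = retrain_step(W, x, anc)
print(first_loss, last_loss)
```

Each call returns the loss measured before the update, so comparing the first and last values shows whether the stand-in objective is decreasing on the target items.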
Opening claim text (preview).
1. A method for training a machine learning system over an input space based on a training dataset comprising items in a source domain and associated ground-truth labels and a test dataset comprising items in a target domain, the method executed by at least one processor in communication with at least one memory and comprising: training a primary model in the at least one memory based on at least a first item of the training dataset and an associated first label; instantiating an ancillary model in the at least one memory, the ancillary model operable to classify objects in the input space with an associated confidence based on one or more parameters of the ancillary model; and retraining the one or more parameters of the primary model based on the test dataset, said retraining comprising: generating a prediction based on at least a second item of the test dataset by the primary model; generating an ancillary confidence associated with the prediction by the ancillary model; determining a value of an objective function based on the prediction and the ancillary confidence; and updating at least one of the one or more parameters of the primary model based on the value of the objective function.
2. The method according to claim 1 wherein the primary model comprises an object-detection model, the input space comprises at least one of: images and video, and generating a prediction based on at least the second item comprises generating, for at least the second item, a bounding box and an associated object classification.
3. The method according to claim 2 wherein generating the ancillary confidence associated with the prediction comprises extracting from the bounding box an extracted item comprising at least one of: an image and a video; and classifying the extracted item by the ancillary model.
4. The method according to claim 3 wherein: retraining the one or more parameters of the primary model comprises retraining the one or more parameters of the primary model based on a training item from the training dataset, a ground-truth label associated with the training item, and a test item from the test dataset; and determining the value of the objective function comprises: determining a first value of the objective function based on the test item, the bounding box of prediction of the primary model, and the ancillary confidence of the ancillary model; determining a second value of a second objective function based on the training item and the associated ground-truth label; and determining the value of the objective function based on the first and second values.
5. The method of claim 1 wherein the prediction comprises a predicted confidence and wherein determining the value of the objective function comprises determining a first value based on the predicted confidence, determining the second value based on the ancillary confidence, and determining the value of the objective function comprises determining the value of the objective function based on the first and second values.
6. The method of claim 5 wherein determining the value of the objective function comprises scaling the second value relative to the first value by a scaling factor.
7. The method of claim 6 wherein determining the value of the objective function comprises annealing the scaling factor from an initial value at a first stage of retraining to a later value at a second stage of retraining after the first stage of retraining.
8. The method of claim 6 wherein scaling the second value relative to the first value comprises determining a geometric mean of the first and second terms parametrized by the scaling factor.
9. A computing system comprising: at least one processor; at least one nontransitory processor-readable medium communicatively coupled to the at least one processor, the at least one nontransitory processor-readable medium which stores at least one of processor-executable instructions or data which, when executed by the at least one processor, cause the at least one processor to: train a primary model based on at least a first item of a training dataset and an associated first label; instantiate an ancillary model, the ancillary model operable to classify objects in an input space with an associated confidence based on one or more parameters of the ancillary model; and retrain the one or more parameters of the primary model based on the test dataset, said retraining comprising: generating a prediction based on at least a second item of the test dataset by the primary model; generating an ancillary confidence associated with the prediction by the ancillary model; determining a value of an objective function based on the prediction and the ancillary confidence; and updating at least one of the one or more parameters of the primary model based on the value of the objective function.
10. The system according to claim 9 wherein the primary model comprises an object-detection model, the input space comprises at least one of: images and video, and generating a prediction based on at least the second item comprises generating, for at least the second item, a bounding box and an associated object classification.
11. The system according to claim 10 wherein generating the ancillary confidence associated with the prediction comprises extracting from the bounding box an extracted item comprising at least one of: an image and a video; and classifying the extracted item by the ancillary model.
12. The system according to claim 11 wherein: retraining the one or more parameters of the primary model comprises retraining the one or more parameters of the primary model based on a training item from the training dataset, a ground-truth label associated with the training item, and a test item from the test dataset; and determining the value of the objective function comprises: determining a first value of the objective function based on the test item, the bounding box of prediction of the primary model, and the ancillary confidence of the ancillary model; determining a second value of a second objective function based on the training item and the associated ground-truth label; determining the value of the objective function based on the first and second values.
13. The system of claim 9 wherein the prediction comprises a predicted confidence and wherein determining the value of the objective function comprises determining a first value based on the predicted confidence, determining the second value based on the ancillary confidence, and determining the value of the objective function comprises determining the value of the objective function based on the first and second values.
14. The system of claim 13 wherein determining the value of the objective function comprises scaling the second value relative to the first value by a scaling factor.
15. The system of claim 14 wherein determining the value of the objective function comprises annealing the scaling factor from an initial value at a first stage of retraining to a later value at a second stage of retraining after the first stage of retraining.
16. The system of claim 14 wherein scaling the second value relative to the first value comprises determining a geometric mean of the first and second terms parametrized by the scaling factor.
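The combination of a predicted confidence and an ancillary confidence through a scaling factor, with annealing and a geometric mean, can be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation: the geometric mean is taken as a weighted, renormalized product of two class distributions, the annealing schedule is linear, and its direction and endpoints (0.0 to 0.5) are arbitrary choices. The names `annealed_scale`, `combine_confidences`, and `alpha` are hypothetical.

```python
def annealed_scale(step, total_steps, start=0.0, end=0.5):
    """Linearly move a scaling factor from `start` to `end` over retraining.

    The linear schedule and the default endpoints are assumptions.
    """
    if total_steps <= 1:
        return end
    frac = min(max(step / (total_steps - 1), 0.0), 1.0)
    return start + frac * (end - start)

def combine_confidences(predicted, ancillary, alpha):
    """Weighted geometric mean of two class distributions, renormalized.

    combined(c) is proportional to predicted(c)**(1 - alpha) * ancillary(c)**alpha,
    so alpha = 0 returns the predicted confidence and alpha = 1 the ancillary one.
    """
    eps = 1e-12  # keeps the product positive when an entry is exactly zero
    unnorm = [
        max(p, eps) ** (1.0 - alpha) * max(a, eps) ** alpha
        for p, a in zip(predicted, ancillary)
    ]
    total = sum(unnorm)
    return [u / total for u in unnorm]

# Toy usage: the ancillary model gains influence as retraining progresses.
predicted = [0.6, 0.3, 0.1]
ancillary = [0.2, 0.7, 0.1]
for step in (0, 5, 9):
    alpha = annealed_scale(step, total_steps=10)
    print(step, alpha, combine_confidences(predicted, ancillary, alpha))
```

A weighted geometric mean penalizes classes on which the two models disagree more strongly than an arithmetic average would, which is one reason such a form might be chosen; the source does not state the rationale.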
Combinations of networks · CPC title
Learning methods · CPC title
Supervised learning · CPC title
Convolutional networks [CNN, ConvNet] · CPC title
Transfer learning · CPC title