Techniques for machine learning model selection for domain generalization

US12406210B2 · US · B2

Patent metadata
Publication number: US-12406210-B2
Application number: US-202217663595-A
Country: US
Kind code: B2
Filing date: May 16, 2022
Priority date: May 16, 2022
Publication date: Sep 2, 2025
Grant date: Sep 2, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing device may perform training of a set of machine learning models on a first data set associated with a first domain. In some examples, the training may include, for each machine learning model of the set of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of a set of iterations, a moving average of the set of parameters calculated over a threshold number of previous iterations. The computing device may select a set of model states that are generated during the training of the plurality of machine learning models based on a validation performance of the set of model states performed during the training. The computing device may then generate an ensembled machine learning model by aggregating the set of machine learning models corresponding to the set of selected model states.
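The parameter averaging described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the plain gradient step, the window size, and the `start_iter` delay are all assumptions chosen for illustration, and the sketch keeps the averaged parameters as a separate state rather than writing them back into the model being trained.

```python
from collections import deque


def moving_average_params(history):
    """Element-wise mean of the parameter vectors held in `history`."""
    n = len(history)
    return [sum(values) / n for values in zip(*history)]


def train_with_moving_average(params, grad_fn, steps, lr=0.1, window=3, start_iter=0):
    """Run plain gradient steps and track a moving average of the parameters.

    Hypothetical helper: `window` stands in for the "threshold number of
    previous iterations", and `start_iter` delays the start of averaging.
    Returns (final raw parameters, averaged parameters).
    """
    history = deque(maxlen=window)  # keeps only the most recent `window` states
    averaged = list(params)
    for t in range(steps):
        grads = grad_fn(params)
        params = [p - lr * g for p, g in zip(params, grads)]
        if t >= start_iter:
            history.append(list(params))
            averaged = moving_average_params(history)
        else:
            # Before averaging starts, the averaged state mirrors the raw one.
            averaged = list(params)
    return params, averaged
```

For example, with the loss p² (gradient 2p), `train_with_moving_average([1.0], lambda p: [2 * x for x in p], steps=3, lr=0.25, window=2, start_iter=1)` returns raw parameters `[0.125]` and averaged parameters `[0.1875]`, the mean of the last two iterates.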

First claim

Opening claim text (preview).

What is claimed is:

1. A method for machine learning model training, comprising: performing training of a plurality of machine learning models on a first data set associated with a first domain, wherein the plurality of machine learning models comprises respective sets of parameters that are updated across a plurality of iterations during the training, wherein the training comprises, for each machine learning model of the plurality of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of the plurality of iterations, a moving average of the set of parameters calculated over a threshold number of previous iterations; selecting a plurality of model states that are generated during the training of the plurality of machine learning models, wherein the plurality of model states are selected based at least in part on a validation performance of the plurality of model states performed during the training; generating an ensembled machine learning model by aggregating the plurality of machine learning models corresponding to the plurality of selected model states; and performing a machine learning prediction using the ensembled machine learning model on a second data set associated with a second domain different from the first domain, wherein an output of the ensembled machine learning model is a dimension-wise average of respective outputs from the plurality of machine learning models in the ensembled machine learning model.

2. The method of claim 1, further comprising: determining, for one or more iterations of the plurality of iterations, a validation performance value associated with a current state of a machine learning model of the plurality of machine learning models; and selecting a model state for one or more machine learning models of the plurality of machine learning models based on a highest validation performance value for the corresponding machine learning model across all iterations of the plurality of iterations.

3. The method of claim 1, further comprising: determining, for a first iteration following the threshold number of previous iterations, a first set of values for the set of parameters of a machine learning model of the plurality of machine learning models; and determining, for a second iteration following the first iteration, a second set of values for the set of parameters of the machine learning model based at least in part on the moving average of the set of parameters calculated during the first iteration.

4. The method of claim 1, further comprising: starting a calculation of the moving average of the set of parameters after a configured number of iterations from a starting iteration of the plurality of iterations.

5. The method of claim 1, further comprising: determining, for a first iteration, a first validation performance value associated with a first state of a machine learning model of the plurality of machine learning models; determining, for a second iteration following the first iteration, a second validation performance value associated with a second state of the machine learning model of the plurality of machine learning models; and selecting the first state of the machine learning model for generation of the ensembled machine learning model based at least in part on determining that the second validation performance value is less than the first validation performance value.

6. The method of claim 1, wherein the plurality of machine learning models are trained using a gradient based technique.

7. The method of claim 1, wherein the respective outputs from the plurality of machine learning models comprises a respective vector for each machine learning model and the output of the ensembled machine learning model comprises the dimension-wise average of the respective vector.

8. An apparatus for machine learning model training, comprising: a processor; memory coupled with the processor; and instructions stored in the memory and executable by the processor to cause the apparatus to: perform training of a plurality of machine learning models on a first data set associated with a first domain, wherein the plurality of machine learning models comprises respective sets of parameters that are updated across a plurality of iterations during the training, wherein the training comprises, for each machine learning model of the plurality of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of the plurality of iterations, a moving average of the set of parameters calculated over a threshold number of previous iterations; select a plurality of model states that are generated during the training of the plurality of machine learning models, wherein the plurality of model states are selected based at least in part on a validation performance of the plurality of model states performed during the training; generate an ensembled machine learning model by aggregating the plurality of machine learning models corresponding to the plurality of selected model states; and perform a machine learning prediction using the ensembled machine learning model on a second data set associated with a second domain different from the first domain, wherein an output of the ensembled machine learning model is a dimension-wise average of respective outputs from the plurality of machine learning models in the ensembled machine learning model.

9. The apparatus of claim 8, wherein the instructions are further executable by the processor to cause the apparatus to: determine, for one or more iterations of the plurality of iterations, a validation performance value associated with a current state of a machine learning model of the plurality of machine learning models; and select a model state for one or more machine learning models of the plurality of machine learning models based on a highest validation performance value for the corresponding machine learning model across all iterations of the plurality of iterations.

10. The apparatus of claim 8, wherein the instructions are further executable by the processor to cause the apparatus to: determine, for a first iteration following the threshold number of previous iterations, a first set of values for the set of parameters of a machine learning model of the plurality of machine learning models; and determine, for a second iteration following the first iteration, a second set of values for the set of parameters of the machine learning model based at least in part on the moving average of the set of parameters calculated during the first iteration.

11. The apparatus of claim 8, wherein the instructions are further executable by the processor to cause the apparatus to: start a calculation of the moving average of the set of parameters after a configured number of iterations from a starting iteration of the plurality of iterations.

12. The apparatus of claim 8, wherein the instructions are further executable by the processor to cause the apparatus to: determine, for a first iteration, a first validation performance value associated with a first state of a machine learning model of the plurality of machine learning models; determine, for a second iteration following the first iteration, a second validation performance value associated with a second state of the machine learning model of the plurality of machine learning models; and select the first state of the machine learning model for generation of the ensembled machine learning model based at least in part on determining that the second validation performance value is less than the first validation performa
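The selection and aggregation steps recited in the claims (keeping, for each model, the state with the highest validation performance, then averaging the models' output vectors dimension by dimension) might look roughly like the sketch below. The record layout `(iteration, validation_score, state)`, the function names, and the representation of models as plain callables are hypothetical choices made for this illustration, not details taken from the patent.

```python
def select_best_states(validation_log):
    """Pick, per model, the recorded state with the highest validation score.

    `validation_log` maps a model id to a list of
    (iteration, validation_score, state) tuples (an assumed layout).
    On ties, max() keeps the earliest record, so a later state replaces an
    earlier one only if it scores strictly higher.
    """
    return {
        model_id: max(records, key=lambda record: record[1])
        for model_id, records in validation_log.items()
    }


def ensemble_predict(models, x):
    """Dimension-wise average of the output vectors of all models for input x."""
    outputs = [model(x) for model in models]
    n = len(outputs)
    return [sum(dimension) / n for dimension in zip(*outputs)]


# Usage with toy data: two models, three validated states each.
log = {
    "a": [(0, 0.6, "a0"), (1, 0.8, "a1"), (2, 0.7, "a2")],
    "b": [(0, 0.5, "b0"), (1, 0.5, "b1"), (2, 0.4, "b2")],
}
best = select_best_states(log)

# Stand-in "models": callables returning a two-dimensional output vector.
toy_models = [lambda x: [x, 2 * x], lambda x: [3 * x, 0.0]]
prediction = ensemble_predict(toy_models, 1.0)
```

Here `best["a"]` is the state recorded at iteration 1, `best["b"]` is the state from iteration 0 (the tie at 0.5 resolves to the earlier record), and `prediction` is `[2.0, 1.0]`.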

Assignees

Salesforce Inc

Inventors

Classifications

  • Selection of pattern recognition techniques, e.g. of classifiers in a multi-classifier system · CPC title

  • G06F18/217 · Primary

    Validation; Performance evaluation; Active pattern learning techniques · CPC title

  • G06N20/20 · Primary

    Ensemble learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12406210B2 cover?
A computing device may perform training of a set of machine learning models on a first data set associated with a first domain. In some examples, the training may include, for each machine learning model of the set of machine learning models, inputting, as values for a set of parameters of the respective sets of parameters and for an iteration of a set of iterations, a moving average of the set…
Who is the assignee on this patent?
Salesforce Inc
What technology area does this patent fall under?
Primary CPC classification G06F18/217. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 2, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).