Granular neural network architecture search over low-level primitives
US-2024428071-A1 · Dec 26, 2024 · US
US2023259769A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2023259769-A1 |
| Application number | US-202318168027-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 13, 2023 |
| Priority date | Feb 16, 2022 |
| Publication date | Aug 17, 2023 |
| Grant date | — |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a machine learning model to perform a machine learning task. In one aspect, a method includes: obtaining a set of training examples; obtaining, for each training example, a respective metadata label that characterizes the training example; and training the machine learning model over a sequence of training stages, including, at each training stage: identifying a selection criterion corresponding to the current training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples; selecting a proper subset of the set of training examples as training data for the current training stage in accordance with the selection criterion for the current training stage; and updating the machine learning model by training the machine learning model on the training data for the current training stage.
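The staged procedure in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the filing's implementation: the `(input, target, label)` tuple layout, the names `select_stage_data`, `train_in_stages`, and `mean_update`, the use of years as metadata labels, and the toy "model" (a single float) are all hypothetical. Each stage's selection criterion is represented as a mapping from allowable metadata labels to selection weights, and sampling is done with replacement for simplicity.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical example layout: (training_input, target_output, metadata_label).
Example = Tuple[float, float, int]


def select_stage_data(
    examples: Sequence[Example],
    label_weights: Dict[int, float],
    k: int,
    rng: random.Random,
) -> List[Example]:
    """Sample k eligible examples, weighted by their metadata label.

    The keys of `label_weights` act as the stage's allowable label set:
    an example is eligible only if its label is one of those keys.
    Sampling is with replacement, so duplicates are possible.
    """
    eligible = [ex for ex in examples if ex[2] in label_weights]
    if not eligible:
        return []
    weights = [label_weights[ex[2]] for ex in eligible]
    return rng.choices(eligible, weights=weights, k=k)


def train_in_stages(
    model: float,
    examples: Sequence[Example],
    stage_criteria: Sequence[Dict[int, float]],
    update: Callable[[float, List[Example]], float],
    k: int,
    seed: int = 0,
) -> float:
    """Run one update per stage and pass the updated model to the next stage."""
    rng = random.Random(seed)
    for label_weights in stage_criteria:
        stage_data = select_stage_data(examples, label_weights, k, rng)
        model = update(model, stage_data)
    return model


def mean_update(model: float, stage_data: List[Example]) -> float:
    """Toy update rule (assumption): move halfway toward the mean target."""
    if not stage_data:
        return model
    mean_target = sum(target for _, target, _ in stage_data) / len(stage_data)
    return model + 0.5 * (mean_target - model)


# Toy data: ten examples per year, with the year as the metadata label.
examples = [
    (float(i), float(year - 2019), year)
    for year in (2019, 2020, 2021)
    for i in range(10)
]

# Later stages admit newer labels and weight the newest label most heavily.
stage_criteria = [
    {2019: 1.0},
    {2019: 1.0, 2020: 2.0},
    {2019: 1.0, 2020: 2.0, 2021: 4.0},
]

final_model = train_in_stages(0.0, examples, stage_criteria, mean_update, k=8)
```

With `k` smaller than the number of examples, each stage necessarily trains on a proper subset of the full set. A real system would replace `mean_update` with an actual training step for the model in question.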
Opening claim text (preview).
What is claimed is:

1. A method performed by one or more computers for training a machine learning model to perform a machine learning task, the method comprising: obtaining a set of training examples; obtaining, for each training example, a respective metadata label that characterizes the training example; and training the machine learning model over a sequence of training stages, comprising, at each training stage before a last training stage in the sequence of training stages: identifying a selection criterion corresponding to the current training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples, selecting a proper subset of the set of training examples as training data for the current training stage in accordance with the selection criterion for the current training stage, updating the machine learning model by training the machine learning model on the training data for the current training stage, and providing the updated machine learning model for further training at a next training stage in the sequence of training stages.

2. The method of claim 1, wherein for each training example, the metadata label for the training example defines a timestamp corresponding to the training example.

3. The method of claim 1, wherein for each training example, the metadata label for the training example defines a geographic feature corresponding to the training example.

4. The method of claim 1, wherein for each training stage in the sequence of training stages: the selection criterion corresponding to the training stage specifies a set of allowable metadata labels for the training stage; and each training example is eligible for selection at the training stage only if the metadata label of the training example is included in the set of allowable metadata labels for the training stage.

5. The method of claim 4, wherein for each training stage after a first training stage in the sequence of training stages: a maximum metadata label in the set of allowable metadata labels for the training stage is greater than a maximum metadata label in the set of allowable metadata labels for a preceding training stage.

6. The method of claim 4, wherein for one or more training stages in the sequence of training stages: the selection criterion corresponding to the training stage specifies a respective selection weight for each metadata label in the set of allowable metadata labels; and selecting a proper subset of the set of training examples as training data for the current training stage comprises: determining a probability distribution, over training examples having metadata labels included in the set of allowable metadata labels for the training stage, using the selection weights for the allowable metadata labels; and sampling a plurality of training examples having metadata labels included in the set of allowable metadata labels in accordance with the probability distribution.

7. The method of claim 6, wherein for one or more training stages in the sequence of training stages: the set of allowable metadata labels for the training stage comprises a plurality of metadata labels; and the selection criterion corresponding to the training stage specifies a higher selection weight for a maximum metadata label in the set of allowable metadata labels than for a minimum metadata label in the set of allowable metadata labels.

8. The method of claim 1, wherein for one or more training stages in the sequence of training stages, updating the machine learning model by training the machine learning model on the training data for the current training stage comprises: determining, for each training example in the training data for the current training stage, an error in a prediction generated by the machine learning model for the training example; updating the machine learning model using the errors in the predictions generated by the machine learning model for the training examples in the training data for the current training stage.

9. The method of claim 8, wherein the machine learning model is an ensemble model that comprises a plurality of base models, and wherein updating the machine learning model using the errors in the predictions generated by the machine learning model for training examples in the training data for the current training stage comprises: determining a prediction target for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example; generating one or more new base models that are each trained to generate the prediction targets for the training examples in the training data for the current training stage; and adding the new base models to the ensemble model.

10. The method of claim 9, wherein the new base models are decision trees.

11. The method of claim 8, wherein updating the machine learning model using the errors in the predictions generated by the machine learning model for the training examples in the training data for the current training stage comprises: determining a respective weight factor for each training example in the training data for the current training stage based on the error in the prediction generated by the machine learning model for the training example; training the machine learning model on the training data for the current training stage using the weight factors for the training examples, wherein the weight factor for a training example controls an impact of the training example on the training of the machine learning model.

12. The method of claim 11, wherein the machine learning model comprises a neural network, and wherein training the machine learning model on the training data for the current training stage using the weight factors for the training examples comprises, for each training example: generating a prediction for the training example using the neural network; determining gradients, with respect to neural network parameters of the neural network, of an objective function that depends on the prediction for the training example; scaling the gradients using the weight factor for the training example; and updating the neural network parameters of the neural network using the scaled gradients.

13. The method of claim 1, wherein training the machine learning model at the last stage in the sequence of training stages comprises: identifying a selection criterion corresponding to the last training stage that defines a criterion for selecting training examples based on the metadata labels of the training examples; selecting a proper subset of the set of training examples as training data for the last training stage in accordance with the selection criterion for the current training stage; updating the machine learning model by training the machine learning model on the training data for the last training stage; and providing the updated machine learning model for use in performing the machine learning task.

14. The method of claim 1, wherein each training example in the set of training examples comprises: (i) a training input to the machine learning model, and (ii) a target output to be generated by the machine learning model by processing the training input.

15. The method of claim 14, wherein the machine learning model performs a fire prediction task, wherein for each training example: (i) the training input characterizes a geographic region, and (ii) the target output defines, for each of one or more spatial locations in the
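The error-based weighting recited in claims 11 and 12 can be illustrated with a small sketch. It makes several assumptions that the claims do not specify: a two-parameter linear model stands in for the neural network, the objective is squared error, and the weighting rule `weight_factor` (larger error gives more influence, capped at 2) is invented for illustration. The function names and the toy data are likewise hypothetical.

```python
from typing import List, Tuple

Params = Tuple[float, float]        # (slope w, intercept b) of a linear model
Pair = Tuple[float, float]          # (training_input x, target_output y)


def weight_factor(error: float) -> float:
    """Hypothetical rule: larger prediction errors get more influence, capped at 2."""
    return min(1.0 + abs(error), 2.0)


def train_stage(params: Params, stage_data: List[Pair], lr: float = 0.05) -> Params:
    """One pass over the stage data with per-example gradient scaling."""
    w0, b0 = params
    # Weight factors come from the errors of the model as it enters the stage.
    factors = [weight_factor((w0 * x + b0) - y) for x, y in stage_data]

    w, b = params
    for (x, y), factor in zip(stage_data, factors):
        prediction = w * x + b
        residual = prediction - y
        # Gradients of 0.5 * (prediction - y)**2 with respect to w and b.
        grad_w = residual * x
        grad_b = residual
        # Scale the gradients by the example's weight factor, then update.
        w -= lr * factor * grad_w
        b -= lr * factor * grad_b
    return w, b


def mean_squared_error(params: Params, data: List[Pair]) -> float:
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


# Toy stage data drawn from y = 2x + 1.
stage_data = [(x / 4, 2.0 * (x / 4) + 1.0) for x in range(5)]

params: Params = (0.0, 0.0)
initial_mse = mean_squared_error(params, stage_data)
for _ in range(200):
    params = train_stage(params, stage_data)
final_mse = mean_squared_error(params, stage_data)
```

Computing the weight factors once, before any parameter update in the stage, keeps the weighting step separate from the training step. Recomputing the factor from each fresh prediction would be an equally plausible reading of the claim language.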