Systems and methods for progressive learning for machine-learned models to optimize training speed

US11450096B2 · US · B2

Patent metadata
Publication number: US-11450096-B2
Application number: US-202117564860-A
Country: US
Kind code: B2
Filing date: Dec 29, 2021
Priority date: Feb 4, 2021
Publication date: Sep 20, 2022
Grant date: Sep 20, 2022

Abstract

Official abstract text for this publication.

Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples. The method can include, for one or more second training iterations, training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.
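As a rough illustration of the abstract, the sketch below trains in stages and raises the regularization magnitude from one stage to the next. The abstract only requires that the second magnitude be greater than the first; the linear ramp, the equal split of samples across stages, and the names `regularization_schedule`, `train_progressively`, and `train_step` are assumptions made for this example, not details taken from the patent.

```python
import math
from typing import Any, Callable, List, Sequence


def regularization_schedule(num_stages: int, first: float, last: float) -> List[float]:
    """Return one magnitude per stage, rising linearly from `first` to `last`.

    The linear ramp is an assumption; the abstract only says later
    iterations use a greater magnitude than earlier ones.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    if last < first:
        raise ValueError("last must not be smaller than first")
    if num_stages == 1:
        return [first]
    step = (last - first) / (num_stages - 1)
    return [first + i * step for i in range(num_stages)]


def train_progressively(
    samples: Sequence[Any],
    num_stages: int,
    first: float,
    last: float,
    train_step: Callable[[Any, float], None],
) -> List[float]:
    """Run `train_step(sample, magnitude)` stage by stage.

    `train_step` is a hypothetical callback standing in for one training
    iteration; it receives the sample and the current magnitude.
    """
    magnitudes = regularization_schedule(num_stages, first, last)
    per_stage = math.ceil(len(samples) / num_stages)
    for stage, magnitude in enumerate(magnitudes):
        # Each stage consumes its own slice of the training samples.
        for sample in samples[stage * per_stage:(stage + 1) * per_stage]:
            train_step(sample, magnitude)
    return magnitudes
```

In a real training loop the magnitude might drive a dropout rate or an augmentation strength; which regularization techniques it controls is left open here.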

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for efficient machine-learned model training, comprising: obtaining, by a computing system comprising one or more computing devices, a plurality of training samples for a machine-learned model; for one or more first training iterations: training, by the computing system based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the machine-learned model using one or more respective first training samples of the plurality of training samples; and for one or more second training iterations: training, by the computing system based at least in part on a second regularization magnitude greater than the first regularization magnitude, the machine-learned model using one or more respective second training samples of the plurality of training samples.

2. The computer-implemented method of claim 1, wherein: obtaining the plurality of training samples for a machine-learned model further comprises determining, by the computing system, a first sample complexity for the one or more first training samples; and wherein, prior to training the machine-learned model using the one or more respective second training samples, the method comprises determining, by the computing system, a second sample complexity for the one or more second training samples, wherein the second sample complexity is greater than the first sample complexity.

3. The computer-implemented method of claim 2, wherein: the plurality of training samples comprises a respective plurality of training images; and determining the second sample complexity for the one or more second training samples comprises adjusting, by the computing system, a size of one or more second training images, wherein the size of the one or more second training images is greater than a size of one or more first training images.

4. The computer-implemented method of claim 1, wherein, prior to obtaining the plurality of training samples for the machine-learned model, the method comprises: generating, by the computing system using a machine-learned model search architecture, an initial machine-learned model comprising one or more first values for one or more respective parameters; determining, by the computing system, a first training speed of the initial machine-learned model; and generating, by the computing system using the machine-learned model search architecture, the machine-learned model, wherein the machine-learned model comprises one or more second values for the one or more respective parameters, and wherein at least one of the one or more second values is different than the one or more first values.

5. The computer-implemented method of claim 4, wherein method further comprises determining, by the computing system, a second training speed of the machine-learned model, wherein the second training speed is greater than the first training speed.

6. The computer-implemented method of claim 4, wherein the machine-learned model comprises a plurality of sequential model stages, wherein each model stage comprises one or more layers, and wherein a first model stage comprises fewer layers than a second model stage of the plurality of model stages.

7. The computer-implemented method of claim 1, wherein the one or more regularization techniques comprise at least one of: adjusting, by the computing system, a number of model channels of at least one layer of the machine-learned model; or adjusting, by the computing system, at least one characteristic of one or more training samples of the plurality of training samples.

8. The computer-implemented method of claim 1, wherein the second regularization magnitude is based at least in part on one or more respective training outputs from the one or more first training iterations.

9. A computing system for determination of models with optimized training speed, comprising: one or more processors; and one or more tangible, non-transitory computer readable media storing computer-readable instructions that when executed by the one or more processors cause the one or more processors to perform operations, the operations comprising: generating a first machine-learned model from a defined model search space, wherein the defined model search space comprises one or more searchable parameters, wherein the first machine-learned model comprises a one or more first values for the one or more searchable parameters; performing a model training process on the first machine-learned model to obtain first training data descriptive of a first training speed; generating a second machine-learned model from the defined model search space based at least in part on the first training data, wherein the second machine-learned model comprises one or more second values for the one or more searchable parameters, wherein at least one of the one or more second values is different than the one or more first values; and performing the model training process on the second machine-learned model to obtain second training data descriptive of a second training speed, wherein the second training speed is faster than the first training speed.

10. The computing system of claim 9, wherein plurality of model layers of the defined model search space comprises at least one of: a convolutional layer; or a fused convolutional layer.

11. The computing system of claim 9, wherein the second machine-learned model comprises a plurality of sequential model stages, wherein each model stage comprises one or more model layers, and wherein a first model stage comprises fewer model layers than a second model stage of the plurality of model stages.

12. The computing system of claim 9, wherein the first training data is further descriptive of a first model accuracy, and wherein the second training data is further descriptive of a second training accuracy greater than the first training accuracy.

13. The computing system of claim 12, wherein generating the second machine-learned model from the defined model search space is further based at least in part on the first training accuracy.

14. The computing system of claim 9, wherein performing a model training process on the first machine-learned model comprises: obtaining a plurality of training samples for the first machine-learned model; for one or more first training iterations: training, based at least in part on a first regularization magnitude configured to control a relative effect of one or more regularization techniques, the first machine-learned model using one or more respective first training samples of the plurality of training samples; and for one or more second training iterations: training, based at least in part on a second regularization magnitude greater than the first regularization magnitude, the first machine-learned model using one or more respective second training samples of the plurality of training samples.

15. The computing system of claim 14, wherein the plurality of training samples comprises a respective plurality of training images; and wherein determining the second sample complexity for the one or more second training samples comprises adjusting a size of one or more second training images, wherein the size of the one or more second training images is greater than a size of one or more first training images.

16. The computing system of claim 15, wherein the plurality of training samples comprises a respective plurality of training images; and determining the second sample complexity for the one or more second training samples compris
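
The search loop recited in claim 9 (generate a model from a search space, measure its training speed, then generate a second model with at least one different parameter value that trains faster) can be sketched roughly as follows. This is only an illustrative hill-climbing loop under stated assumptions: the claims do not name a search algorithm, and `sample_candidate`, `mutate`, `search_for_faster_model`, and the caller-supplied `measure_speed` callback are hypothetical names introduced for this example.

```python
import random
from typing import Callable, Dict, List, Sequence, Tuple

SearchSpace = Dict[str, Sequence[int]]  # searchable parameter -> allowed values
Candidate = Dict[str, int]              # searchable parameter -> chosen value


def sample_candidate(space: SearchSpace, rng: random.Random) -> Candidate:
    """Pick one value per searchable parameter to form a first model."""
    return {name: rng.choice(list(values)) for name, values in space.items()}


def mutate(candidate: Candidate, space: SearchSpace, rng: random.Random) -> Candidate:
    """Copy the candidate and change one parameter to a different value."""
    name = rng.choice(sorted(space))
    options = [v for v in space[name] if v != candidate[name]]
    child = dict(candidate)
    if options:
        child[name] = rng.choice(options)
    return child


def search_for_faster_model(
    space: SearchSpace,
    measure_speed: Callable[[Candidate], float],
    rounds: int,
    seed: int = 0,
) -> Tuple[Candidate, float, List[Tuple[Candidate, float]]]:
    """Keep whichever candidate trains fastest (assumed greedy strategy).

    `measure_speed` stands in for running the training process and
    reporting a speed where larger means faster.
    """
    rng = random.Random(seed)
    best = sample_candidate(space, rng)
    best_speed = measure_speed(best)
    history = [(best, best_speed)]
    for _ in range(rounds):
        child = mutate(best, space, rng)
        speed = measure_speed(child)
        history.append((child, speed))
        if speed > best_speed:  # the later model must train faster to replace it
            best, best_speed = child, speed
    return best, best_speed, history
```

A search that also weighs accuracy, as claims 12 and 13 describe, would need `measure_speed` to be replaced by a combined score; that extension is not shown here.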

Assignees

Google LLC

Inventors

Classifications

  • Quantised networks; Sparse networks; Compressed networks

  • Convolutional networks [CNN, ConvNet]

  • Hyperparameter optimisation; Meta-learning; Learning-to-learn

  • Supervised learning

  • Modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11450096B2 cover?
Systems and methods of the present disclosure can include a computer-implemented method for efficient machine-learned model training. The method can include obtaining a plurality of training samples for a machine-learned model. The method can include, for one or more first training iterations, training, based at least in part on a first regularization magnitude configured to control a relative …
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N3/045. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 20, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).