Dynamic tuning of training parameters for machine learning algorithms

US11397887B2 · US · B2

Patent metadata
Publication number: US-11397887-B2
Application number: US-201715716417-A
Country: US
Kind code: B2
Filing date: Sep 26, 2017
Priority date: Sep 26, 2017
Publication date: Jul 26, 2022
Grant date: Jul 26, 2022

Abstract

Official abstract text for this publication.

A system such as a service of a computing resource service provider includes executable code that, if executed by one or more processors, causes the one or more processors to initiate a training of a machine-learning model with a parameter for the training having a first value, the training to determine a set of parameters for the model, calculate output of the training, and change the parameter of the training to have a second value during the training based at least in part on the output. Training parameters may, in some cases, also be referred to as hyperparameters.
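The idea in the abstract, changing a training parameter while training is still running based on the training's own output, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy linear model, the function names `mean_squared_error` and `train`, and in particular the rule that halves the learning rate when the loss gets worse. The patent does not prescribe this rule.

```python
def mean_squared_error(w, b, xs, ys):
    """Training output: mean squared error of the line y = w*x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.5, iterations=300):
    """Fit y = w*x + b by gradient descent, re-tuning learning_rate mid-training.

    The halving rule below is an illustrative assumption, not the patent's method.
    """
    w, b = 0.0, 0.0  # model parameters determined by the training
    n = len(xs)
    best_loss = mean_squared_error(w, b, xs, ys)
    for _ in range(iterations):
        # Gradient of the mean squared error with respect to w and b.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        trial_w = w - learning_rate * grad_w
        trial_b = b - learning_rate * grad_b
        loss = mean_squared_error(trial_w, trial_b, xs, ys)
        if loss > best_loss:
            # Output got worse: reject the step and move the hyperparameter
            # from its current value to a new one during the training.
            learning_rate /= 2.0
        else:
            w, b, best_loss = trial_w, trial_b, loss
    return w, b, learning_rate


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]  # data generated from y = 2x + 1
w, b, final_rate = train(xs, ys)
print(w, b, final_rate)
```

The initial learning rate of 0.5 is deliberately too large for this data, so the first trial steps are rejected and the rate is lowered before the fit proceeds.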

First claim

Opening claim text (preview).

What is claimed is:
1. A computer-implemented method, comprising: initializing, by a computer system, a hyperparameter to a first value, wherein the hyperparameter controls at least part of a training process for generating a machine-learning model; calculating, by the computer system, a first output of an iteration of the training process based at least in part on applying a first value of the hyperparameter and training data to a machine-learning algorithm; selecting, based at least in part on the first output of the iteration of the training process, a second value for the hyperparameter from a plurality of hyperparameters values to modify, by the computer system, the hyperparameter; and calculating, by the computer system, a second output of a subsequent iteration of the training process, wherein the second output is calculated based at least in part on the second value of the hyperparameter.
2. The computer-implemented method of claim 1, further comprising calculating the second output based at least in part on applying a Bayesian optimization algorithm to the first output.
3. The computer-implemented method of claim 1, wherein the machine-learning model comprises a neural network.
4. The computer-implemented method of claim 1, further comprising: allocating, by the computer system, a set of computing resources based on detecting that the hyperparameter is modified to the second value; and wherein the calculating of the second output is performed using the set of computing resources.
5. A system, comprising: one or more processors; and memory that stores computer-executable instructions that, as a result of being executed, cause the one or more processors to: initiate a training of a machine-learning model with one or more hyperparameters for the training having at least a first value, the training to determine a set of parameters for the machine-learning model; calculate output of the training; and change the one or more hyperparameters of the training by applying a second value from a plurality of hyperparameter values during the training based at least in part on the output.
6. The system of claim 5, wherein the instructions to change the one or more hyperparameters of the training to have at least the second value, which if performed by the one or more processors, cause the system to compute the second value based at least in part on a result of a sequential model-based optimization algorithm, the result determined based at least in part on the output of the training.
7. The system of claim 6, wherein the sequential model-based optimization algorithm is a Bayesian optimization algorithm.
8. The system of claim 5, wherein: the first value of the one or more hyperparameters corresponds to an amount of computing resources to utilize to calculate outputs of the training; the second value of the one or more hyperparameters indicating a different amount of computing resources to utilize to calculate outputs of the training; and the instructions, which if performed by the one or more processors, further cause the system to allocate computing resources for the training of the machine-learning model, wherein, the computing resources are allocated in response to detecting a change in the one or more hyperparameters from the first value to the second value.
9. The system of claim 8, wherein the computing resources comprise virtual machine instances.
10. The system of claim 5, wherein the one or more hyperparameters comprises an optimization hyperparameter that controls at least part of the training of the machine-learning model.
11. The system of claim 10, wherein the optimization hyperparameter is a learning rate hyperparameter.
12. The system of claim 5, wherein: the instructions, which if performed by the one or more processors, further cause the system store a plurality of outputs of the training generated at least in part by using the one or more hyperparameters; and the instructions to change the one or more hyperparameters of the training to have the second value, which, if performed by the one or more processors, further causes the system to change the one or more hyperparameters based at least in part on the plurality of outputs.
13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to at least: select a first value for one or more hyperparameters to control training of a machine-learning model, the training to determine a set of parameters for the model; calculate an output of the training; and during the training, determine a second value from a plurality of values and change the one or more hyperparameters to have the second value determined based at least in part on the output.
14. The non-transitory computer-readable storage medium of claim 13, wherein the one or more hyperparameters comprises an optimization hyperparameter.
15. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to: select a plurality of values for a hyperparameter parameter of the one or more hyperparameters; for the plurality of values, calculate and store a respective output of the training; and determine the second value based at least in part on the respective outputs.
16. The non-transitory computer-readable storage medium of claim 15, wherein the instructions that cause the computer system to select the plurality of values further include instructions that cause the computer system to pseudo-randomly select the plurality of values.
17. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the system to apply a grid search algorithm to the output to generate the second value.
18. The non-transitory computer-readable storage medium of claim 13, wherein the one or more hyperparameters comprises information usable to determine an amount of computing resources to utilize to calculate the output.
19. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the system to: calculate a second output of the training, wherein second output is calculated using at least the second value of the one or more hyperparameters; and during the training, change the second value of the one or more hyperparameters to a third value based at least in part on the second output.
20. The non-transitory computer-readable storage medium of claim 13, wherein the machine-learning model comprises a linear regression model or a Bayesian network.
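Claims 13, 15, 16, and 19 describe pseudo-randomly selecting a plurality of candidate values, calculating and storing an output for each, and using those outputs to pick the next hyperparameter value, repeatedly during training. The following is a minimal sketch of that pattern under stated assumptions: the one-parameter quadratic objective, the helper names `loss`, `grad`, `select_second_value`, and `train`, the log-uniform sampling range, and the "lowest trial loss wins" rule are all hypothetical choices for illustration and are not taken from the patent.

```python
import random


def loss(w):
    """Toy training output: squared distance of parameter w from 3.0."""
    return (w - 3.0) ** 2


def grad(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)


def select_second_value(w, candidates):
    """Try one gradient step per candidate learning rate and store each output.

    Returns the candidate with the lowest trial loss and all stored outputs.
    """
    outputs = {lr: loss(w - lr * grad(w)) for lr in candidates}
    best = min(outputs, key=outputs.get)
    return best, outputs


def train(rounds=20, num_candidates=5, seed=0):
    """Gradient descent whose learning rate is re-selected after every iteration."""
    rng = random.Random(seed)  # pseudo-random source for candidate values
    w = 0.0
    learning_rate = 0.01       # first value of the hyperparameter
    schedule = [learning_rate]
    for _ in range(rounds):
        w -= learning_rate * grad(w)  # iteration using the current value
        # Sample candidates log-uniformly between 0.001 and just under 1.0.
        candidates = [10 ** rng.uniform(-3.0, -0.01) for _ in range(num_candidates)]
        learning_rate, _outputs = select_second_value(w, candidates)
        schedule.append(learning_rate)  # second, third, ... values in order
    return w, schedule


final_w, schedule = train()
print(final_w, schedule[:3])
```

Replacing the pseudo-random sampling with a fixed list of values would turn the same loop into a grid-style selection, which is the variant claim 17 refers to.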

Assignees

Amazon Tech Inc

Inventors

Classifications

  • G06N3/0985 (Primary)

    Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title

  • Supervised learning · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • using neural networks only · CPC title

  • G06N3/08 (Primary)

    Learning methods · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11397887B2 cover?
A system such as a service of a computing resource service provider includes executable code that, if executed by one or more processors, causes the one or more processors to initiate a training of a machine-learning model with a parameter for the training having a first value, the training to determine a set of parameters for the model, calculate output of the training, and change the paramete…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06N3/0985. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jul 26, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).