Optimization of Parameter Values for Machine-Learned Models
US-2020167691-A1 · May 28, 2020 · US
US11397887B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11397887-B2 |
| Application number | US-201715716417-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 26, 2017 |
| Priority date | Sep 26, 2017 |
| Publication date | Jul 26, 2022 |
| Grant date | Jul 26, 2022 |
Official abstract text for this publication.
A system such as a service of a computing resource service provider includes executable code that, if executed by one or more processors, causes the one or more processors to initiate a training of a machine-learning model with a parameter for the training having a first value, the training to determine a set of parameters for the model, calculate output of the training, and change the parameter of the training to have a second value during the training based at least in part on the output. Training parameters may, in some cases, also be referred to as hyperparameters.
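The behavior summarized in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the toy model (a one-weight linear fit), the loss-based rule for switching values, and all names (`mse`, `train`, `lr_candidates`) are assumptions chosen for illustration. The learning rate plays the role of the hyperparameter, the weight `w` is the model parameter being learned, and the per-iteration loss is the training output that triggers the change.

```python
def mse(w, xs, ys):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, first_lr, lr_candidates, steps=200):
    """Fit y = w * x by gradient descent, changing the learning rate mid-training.

    Assumed rule (illustrative only): if an iteration's loss is worse than
    the previous one, switch to the next smaller candidate value.
    """
    w = 0.0                      # model parameter determined by training
    lr = first_lr                # hyperparameter, initialized to a first value
    prev_loss = mse(w, xs, ys)
    history = []                 # (step, learning rate used, loss) per iteration

    for step in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= lr * grad
        loss = mse(w, xs, ys)    # output of this training iteration
        history.append((step, lr, loss))

        # Change the hyperparameter during training, based on the output.
        if loss > prev_loss:
            smaller = [c for c in lr_candidates if c < lr]
            if smaller:
                lr = max(smaller)  # second value, chosen from the candidates
        prev_loss = loss

    return w, history


if __name__ == "__main__":
    w, history = train([1, 2, 3, 4], [2, 4, 6, 8],
                       first_lr=0.2, lr_candidates=[0.2, 0.05, 0.01])
    print("learned weight:", w)
    print("learning rates used:", sorted({lr for _, lr, _ in history}))
```

With the data in the usage example, the first step at a learning rate of 0.2 overshoots (the loss rises from 30 to 120), so the sketch switches to 0.05 for the next iteration without restarting the training run.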
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising: initializing, by a computer system, a hyperparameter to a first value, wherein the hyperparameter controls at least part of a training process for generating a machine-learning model; calculating, by the computer system, a first output of an iteration of the training process based at least in part on applying a first value of the hyperparameter and training data to a machine-learning algorithm; selecting, based at least in part on the first output of the iteration of the training process, a second value for the hyperparameter from a plurality of hyperparameters values to modify, by the computer system, the hyperparameter; and calculating, by the computer system, a second output of a subsequent iteration of the training process, wherein the second output is calculated based at least in part on the second value of the hyperparameter.

2. The computer-implemented method of claim 1, further comprising calculating the second output based at least in part on applying a Bayesian optimization algorithm to the first output.

3. The computer-implemented method of claim 1, wherein the machine-learning model comprises a neural network.

4. The computer-implemented method of claim 1, further comprising: allocating, by the computer system, a set of computing resources based on detecting that the hyperparameter is modified to the second value; and wherein the calculating of the second output is performed using the set of computing resources.

5. A system, comprising: one or more processors; and memory that stores computer-executable instructions that, as a result of being executed, cause the one or more processors to: initiate a training of a machine-learning model with one or more hyperparameters for the training having at least a first value, the training to determine a set of parameters for the machine-learning model; calculate output of the training; and change the one or more hyperparameters of the training by applying a second value from a plurality of hyperparameter values during the training based at least in part on the output.

6. The system of claim 5, wherein the instructions to change the one or more hyperparameters of the training to have at least the second value, which if performed by the one or more processors, cause the system to compute the second value based at least in part on a result of a sequential model-based optimization algorithm, the result determined based at least in part on the output of the training.

7. The system of claim 6, wherein the sequential model-based optimization algorithm is a Bayesian optimization algorithm.

8. The system of claim 5, wherein: the first value of the one or more hyperparameters corresponds to an amount of computing resources to utilize to calculate outputs of the training; the second value of the one or more hyperparameters indicating a different amount of computing resources to utilize to calculate outputs of the training; and the instructions, which if performed by the one or more processors, further cause the system to allocate computing resources for the training of the machine-learning model, wherein, the computing resources are allocated in response to detecting a change in the one or more hyperparameters from the first value to the second value.

9. The system of claim 8, wherein the computing resources comprise virtual machine instances.

10. The system of claim 5, wherein the one or more hyperparameters comprises an optimization hyperparameter that controls at least part of the training of the machine-learning model.

11. The system of claim 10, wherein the optimization hyperparameter is a learning rate hyperparameter.

12. The system of claim 5, wherein: the instructions, which if performed by the one or more processors, further cause the system store a plurality of outputs of the training generated at least in part by using the one or more hyperparameters; and the instructions to change the one or more hyperparameters of the training to have the second value, which, if performed by the one or more processors, further causes the system to change the one or more hyperparameters based at least in part on the plurality of outputs.

13. A non-transitory computer-readable storage medium having stored thereon executable instructions that, as a result of being executed by one or more processors of a computer system, cause the computer system to at least: select a first value for one or more hyperparameters to control training of a machine-learning model, the training to determine a set of parameters for the model; calculate an output of the training; and during the training, determine a second value from a plurality of values and change the one or more hyperparameters to have the second value determined based at least in part on the output.

14. The non-transitory computer-readable storage medium of claim 13, wherein the one or more hyperparameters comprises an optimization hyperparameter.

15. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the computer system to: select a plurality of values for a hyperparameter parameter of the one or more hyperparameters; for the plurality of values, calculate and store a respective output of the training; and determine the second value based at least in part on the respective outputs.

16. The non-transitory computer-readable storage medium of claim 15, wherein the instructions that cause the computer system to select the plurality of values further include instructions that cause the computer system to pseudo-randomly select the plurality of values.

17. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the system to apply a grid search algorithm to the output to generate the second value.

18. The non-transitory computer-readable storage medium of claim 13, wherein the one or more hyperparameters comprises information usable to determine an amount of computing resources to utilize to calculate the output.

19. The non-transitory computer-readable storage medium of claim 13, wherein the instructions further comprise instructions that, as a result of being executed by the one or more processors, cause the system to: calculate a second output of the training, wherein second output is calculated using at least the second value of the one or more hyperparameters; and during the training, change the second value of the one or more hyperparameters to a third value based at least in part on the second output.

20. The non-transitory computer-readable storage medium of claim 13, wherein the machine-learning model comprises a linear regression model or a Bayesian network.
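Claims 15 and 16 describe selecting several candidate values pseudo-randomly, calculating and storing an output for each, and determining the second value from those outputs. The Python sketch below is one possible reading under stated assumptions, not the patented method: the one-weight linear model, the single trial step used to score each candidate, the log-uniform sampling range, and the names `gd_step` and `select_second_value` are all hypothetical.

```python
import math
import random


def mse(w, xs, ys):
    """Mean squared error of the one-weight model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gd_step(w, lr, xs, ys):
    """One gradient-descent update of w with learning rate lr."""
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    return w - lr * grad


def select_second_value(w, xs, ys, rng, n_candidates=8, low=1e-3, high=0.2):
    """Pseudo-randomly draw candidate learning rates and score each one.

    Each candidate is scored by the loss after a single trial step from the
    current weight (an assumed scoring rule). Returns the best candidate and
    the stored (candidate, output) pairs.
    """
    candidates = [
        math.exp(rng.uniform(math.log(low), math.log(high)))  # log-uniform draw
        for _ in range(n_candidates)
    ]
    # Calculate and store a respective output for every candidate value.
    trials = [(lr, mse(gd_step(w, lr, xs, ys), xs, ys)) for lr in candidates]
    best_lr, _ = min(trials, key=lambda t: t[1])
    return best_lr, trials


if __name__ == "__main__":
    xs, ys = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
    rng = random.Random(0)       # seeded so the pseudo-random draw is repeatable

    w, lr = 0.0, 0.001           # first value of the hyperparameter
    for _ in range(5):
        w = gd_step(w, lr, xs, ys)

    # During training, determine the second value from the stored outputs.
    lr, trials = select_second_value(w, xs, ys, rng)
    for _ in range(50):
        w = gd_step(w, lr, xs, ys)
    print("second learning rate:", lr, "final loss:", mse(w, xs, ys))
```

Replacing the random draw with a fixed, evenly spaced list of candidates would turn the same loop into a simple grid search, the approach named in claim 17.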
Hyperparameter optimisation; Meta-learning; Learning-to-learn · CPC title
Supervised learning · CPC title
Machine learning · CPC title
using neural networks only · CPC title
Learning methods · CPC title