Systems and methods for Bayesian optimization using non-linear mapping of input
US-10074054-B2 · Sep 11, 2018 · US
US11501192B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11501192-B2 |
| Application number | US-201816121195-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 4, 2018 |
| Priority date | May 30, 2013 |
| Publication date | Nov 15, 2022 |
| Grant date | Nov 15, 2022 |
Official abstract text for this publication.
Techniques for use in connection with performing optimization using an objective function that maps elements in a first domain to values in a range. The techniques include using at least one computer hardware processor to perform: identifying a first point at which to evaluate the objective function at least in part by using an acquisition utility function and a probabilistic model of the objective function, wherein the probabilistic model depends on a non-linear one-to-one mapping of elements in the first domain to elements in a second domain; evaluating the objective function at the identified first point to obtain a corresponding first value of the objective function; and updating the probabilistic model of the objective function using the first value to obtain an updated probabilistic model of the objective function.
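The identify, evaluate, update cycle in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the abstract does not name a specific mapping, model, or acquisition function, so the Kumaraswamy CDF used as the one-to-one non-linear mapping, the Gaussian process with a squared-exponential kernel used as the probabilistic model, the expected-improvement acquisition function, the fixed warp parameters `a` and `b`, and the names `warp`, `gp_posterior`, `bayes_opt`, and `toy_objective` are all hypothetical choices made for brevity.

```python
import math
import numpy as np

def warp(x, a, b):
    """Assumed mapping: Kumaraswamy CDF, one-to-one from [0, 1] onto [0, 1] for a, b > 0."""
    return 1.0 - (1.0 - x ** a) ** b

def rbf(u, v, length=0.2):
    """Squared-exponential kernel between two 1-D arrays (unit amplitude)."""
    d = u[:, None] - v[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_new, a, b, noise=1e-4):
    """Posterior mean and std of a GP whose kernel only sees warped inputs."""
    wo, wn = warp(x_obs, a, b), warp(x_new, a, b)
    K = rbf(wo, wo) + noise * np.eye(len(x_obs))
    Ks = rbf(wo, wn)
    L = np.linalg.cholesky(K)
    y_mean = y_obs.mean()                      # constant prior mean
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs - y_mean))
    mean = y_mean + Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(1.0 - np.sum(v ** 2, axis=0), 1e-12, None)
    return mean, np.sqrt(var)

def expected_improvement(mean, std, best):
    """Expected improvement over the best observed value (minimization)."""
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + np.vectorize(math.erf)(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, n_iter=15, a=0.5, b=1.0, seed=0):
    """Minimize a 1-D objective on [0, 1] with a warped-input GP surrogate."""
    rng = np.random.default_rng(seed)
    x_obs = rng.uniform(0.0, 1.0, size=3)      # a few initial evaluations
    y_obs = np.array([objective(x) for x in x_obs])
    grid = np.linspace(0.0, 1.0, 501)          # candidate points
    for _ in range(n_iter):
        mean, std = gp_posterior(x_obs, y_obs, grid, a, b)
        ei = expected_improvement(mean, std, y_obs.min())
        x_next = grid[np.argmax(ei)]           # identify the next point
        y_next = objective(x_next)             # evaluate the objective there
        x_obs = np.append(x_obs, x_next)       # update the model's data
        y_obs = np.append(y_obs, y_next)
    i = int(np.argmin(y_obs))
    return x_obs[i], y_obs[i]

def toy_objective(x):
    """Hypothetical objective that varies on a log scale, minimized near x = 0.009."""
    return (math.log10(x + 1e-3) + 2.0) ** 2

x_best, y_best = bayes_opt(toy_objective)
print("best x:", x_best, "objective:", y_best)
```

The surrogate here never sees raw inputs, only warped ones, which is one way a model can "depend on" the mapping. With `a = 0.5` and `b = 1.0` the warp reduces to a square root, which stretches the region near zero where the toy objective changes fastest.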
Opening claim text (preview).
What is claimed is:
1. A system for optimizing performance of a machine learning system, the system comprising: at least one processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function relating values of hyper-parameters of the machine learning system to performance of the machine learning system, the identifying performed at least in part by using a probabilistic model of the objective function, the probabilistic model of the objective function comprising a non-linear mapping of at least some of the values of the hyper-parameters from a first domain to a second domain; evaluating the objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the first set of hyper-parameter values.
2. The system of claim 1, wherein the machine learning system comprises a neural network for identifying objects in images and the objective function relates values of a plurality of hyper-parameters of the neural network to performance of the neural network in identifying the objects in the image.
3. The system of claim 2, wherein the plurality of hyper-parameters of the neural network comprises at least one of: a learning rate, a dropout rate, a weight norm, a hidden layer size, a convolutional kernel size, or a pooling size.
4. The system of claim 2, wherein evaluating the objective function at the identified first set of hyper-parameter values comprises: training the machine learning system configured with the first set of hyper-parameter values to obtain first learned parameter values, the training using training data of a dataset of images; processing the dataset of images using the machine learning system configured with the first set of hyper-parameters values and the first learned parameter values to obtain a first measure of generalization performance of the machine learning system in identifying objects in images when configured with the first set of hyper-parameter values.
5. The system of claim 2, wherein the instructions, when executed by the at least one processor, cause the at least one processor to perform: identifying an extremal value of the objective function based at least in part on a result of evaluating the objective function at the first set of hyper-parameter values; configuring the plurality of hyper-parameters of the neural network for identifying objects in images based on the extremal value of the objective function; and processing a dataset of images using the neural network with the plurality of hyper-parameters configured based on the extremal value of the objective function.
6. The system of claim 1, wherein the processor-executable instructions further cause the at least one processor to perform: determining at least one parameter of the non-linear mapping based on one or more evaluations of the objective function.
7. The system of claim 1, wherein the processor-executable instructions further cause the at least one processor to perform: identifying a second set of hyper-parameter values at which to evaluate the objective function; evaluating the objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the second set of hyper-parameter values.
8. The system of claim 1, wherein the non-linear mapping comprises a one-to-one mapping of values from the first domain to the second domain.
9. A method of optimizing performance of a machine learning system, the method comprising: identifying a first set of hyper-parameter values at which to evaluate an objective function relating values of hyper-parameters of the machine learning system to performance of the machine learning system, the identifying performed at least in part by using a probabilistic model of the objective function, the probabilistic model of the objective function comprising a non-linear mapping of at least some of the values of the hyper-parameters from a first domain to a second domain; evaluating the objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the first set of hyper-parameter values.
10. The method of claim 9, wherein the machine learning system comprises a neural network for identifying objects in images and the objective function relates values of a plurality of hyper-parameters of the neural network to performance of the neural network in identifying the objects in the image.
11. The method of claim 10, wherein evaluating the objective function at the identified first set of hyper-parameter values comprises: training the machine learning system configured with the first set of hyper-parameter values to obtain first learned parameter values, the training using training data of a dataset of images; processing the dataset of images using the machine learning system configured with the first set of hyper-parameters values and the first learned parameter values to obtain a first measure of generalization performance of the machine learning system in identifying objects in images when configured with the first set of hyper-parameter values.
12. The method of claim 10, further comprising: identifying an extremal value of the objective function based at least in part on a result of evaluating the objective function at the first set of hyper-parameter values; configuring the plurality of hyper-parameters of the neural network for identifying objects in images based on the extremal value of the objective function; and processing a dataset of images using the neural network with the plurality of hyper-parameters configured based on the extremal value of the objective function.
13. The method of claim 9, further comprising determining at least one parameter of the non-linear mapping based on one or more evaluations of the objective function.
14. The method of claim 9, further comprising: identifying a second set of hyper-parameter values at which to evaluate the objective function; evaluating the objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the second set of hyper-parameter values.
15. The method of claim 9, wherein the non-linear mapping comprises a one-to-one mapping of values from the first domain to the second domain.
16. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method for optimizing performance of a machine learning system, the method comprising: identifying a first set o
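Claims 6 and 13 recite determining at least one parameter of the non-linear mapping from evaluations of the objective function. One possible way to do this, shown purely as a sketch, is to score candidate warp parameters by the Gaussian process log marginal likelihood of the observed evaluations and keep the best one. The power warp `x ** a`, the candidate grid, the kernel settings, and the names `warp`, `log_marginal_likelihood`, and `fit_warp` are assumptions for illustration; the claim text shown here does not specify how the parameter is determined.

```python
import numpy as np

def warp(x, a):
    """Hypothetical one-parameter warp x -> x**a, one-to-one on [0, 1] for a > 0."""
    return x ** a

def log_marginal_likelihood(x, y, a, length=0.2, noise=1e-3):
    """GP log marginal likelihood of the evaluations y given warped inputs."""
    w = warp(x, a)
    d = w[:, None] - w[None, :]
    K = np.exp(-0.5 * (d / length) ** 2) + noise * np.eye(len(x))
    L = np.linalg.cholesky(K)
    r = y - y.mean()                           # constant prior mean
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r))
    return (-0.5 * (r @ alpha)
            - np.sum(np.log(np.diag(L)))
            - 0.5 * len(x) * np.log(2.0 * np.pi))

def fit_warp(x, y, candidates=(0.1, 0.25, 0.5, 1.0, 2.0, 4.0)):
    """Return the candidate warp parameter with the highest marginal likelihood."""
    scores = [log_marginal_likelihood(x, y, a) for a in candidates]
    return candidates[int(np.argmax(scores))]

# Hypothetical evaluations of an objective that is smooth in x**0.25, not in x.
x = np.linspace(0.01, 1.0, 20)
y = np.sin(6.0 * x ** 0.25)
a_hat = fit_warp(x, y)
print("selected warp parameter:", a_hat)
```

A grid search keeps the sketch short. Gradient-based maximization or averaging over several parameter settings would be alternative designs, and each new evaluation of the objective can be fed back in to re-select the parameter.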
Probabilistic graphical models, e.g. probabilistic networks · CPC title
using kernel methods, e.g. support vector machines [SVM] · CPC title
Arrangements for executing specific programs · CPC title
for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title