Systems and methods for Bayesian optimization using integrated acquisition functions
US-9864953-B2 · Jan 9, 2018 · US
US10346757B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10346757-B2 |
| Application number | US-201414291337-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 30, 2014 |
| Priority date | May 30, 2013 |
| Publication date | Jul 9, 2019 |
| Grant date | Jul 9, 2019 |
Official abstract text for this publication.
Techniques for use in connection with performing optimization using an objective function. The techniques include using at least one computer hardware processor to perform: beginning evaluation of the objective function at a first point; before evaluating the objective function at the first point is completed: identifying, based on likelihoods of potential outcomes of evaluating the objective function at the first point, a second point different from the first point at which to evaluate the objective function; and beginning evaluation of the objective function at the second point.
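One way to read "identifying, based on likelihoods of potential outcomes ... a second point" is to average an acquisition function over outcomes sampled for the still-running first point. The sketch below illustrates that idea and is not the patent's implementation. It assumes a one-dimensional zero-mean Gaussian process with a squared-exponential kernel, expected improvement for minimization, and a Monte Carlo average. All function names, parameters, and data values are hypothetical.

```python
import math
import random

def rbf(a, b, length_scale=0.3):
    """Squared-exponential kernel for scalar inputs (assumed model choice)."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def gp_posterior(X, y, x_star, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_star."""
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k_star = [rbf(a, x_star) for a in X]
    alpha = solve(K, y)
    v = solve(K, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = max(rbf(x_star, x_star) - sum(k * w for k, w in zip(k_star, v)), 1e-12)
    return mean, math.sqrt(var)

def expected_improvement(X, y, x_star):
    """Expected improvement over the best observed value (minimization)."""
    mean, std = gp_posterior(X, y, x_star)
    best = min(y)
    z = (best - mean) / std
    cdf = 0.5 * math.erfc(-z / math.sqrt(2.0))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return max(0.0, (best - mean) * cdf + std * pdf)

def integrated_ei(X, y, x_pending, x_star, n_samples=64, rng=None):
    """Average EI at x_star over sampled outcomes of the pending evaluation."""
    rng = rng or random.Random(0)
    mean_p, std_p = gp_posterior(X, y, x_pending)
    total = 0.0
    for _ in range(n_samples):
        y_fantasy = rng.gauss(mean_p, std_p)  # one potential outcome at x_pending
        total += expected_improvement(X + [x_pending], y + [y_fantasy], x_star)
    return total / n_samples

def pick_second_point(X, y, x_pending, candidates):
    """Choose the candidate (other than x_pending) with the highest averaged EI."""
    pool = [c for c in candidates if c != x_pending]
    # A fresh, identically seeded generator per candidate gives every candidate
    # the same sampled outcomes, so the comparison is not driven by sampling noise.
    return max(pool, key=lambda c: integrated_ei(X, y, x_pending, c,
                                                 rng=random.Random(0)))

# Hypothetical data: three finished evaluations and one still running at 0.25.
observed_x = [0.0, 0.5, 1.0]
observed_y = [0.8, 0.2, 0.9]
pending_x = 0.25
grid = [i / 20 for i in range(21)]
second_x = pick_second_point(observed_x, observed_y, pending_x, grid)
print("second point to start while the first is running:", second_x)
```

Because each sampled outcome is added to the data before the acquisition function is evaluated, the model's uncertainty near the pending point shrinks, which tends to steer the second evaluation away from the first. A real system would likely replace the grid search and the fixed kernel settings with something more careful.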
Opening claim text (preview).
What is claimed is:
1. A system for optimizing performance of a machine learning system having multiple hyper-parameters, the system comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function that relates hyper-parameter values of the machine learning system to values providing a measure of performance of the machine learning system; beginning evaluation of the objective function at the first set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the first set of hyper-parameter values to obtain a first value corresponding to the first set of hyper-parameter values; before evaluating the objective function at the first set of hyper-parameter values is completed: determining likelihoods of potential outcomes of evaluating the objective function at the first set of hyper-parameter values using a probabilistic model of the objective function; identifying, based on the likelihoods of the potential outcomes, a second set of hyper-parameter values different from the first set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at the second set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the second set of hyper-parameter values to obtain a second value corresponding to the second set of hyper-parameter values; selecting one of the first and second sets of hyper-parameter values based at least on the first value corresponding to the first set of hyper-parameter values and the second value corresponding to the second set of hyper-parameter values; and configuring the machine learning system with the selected set of hyper-parameter values that optimizes the machine learning system.
2. The system of claim 1, wherein the objective function relates values of a plurality of hyper-parameters of a neural network for identifying objects in images to respective values providing a measure of performance of the neural network in identifying the objects in the images.
3. The system of claim 1, wherein the at least one computer hardware processor comprises a first computer hardware processor and a second computer hardware processor different from the first computer hardware processor, and wherein the processor-executable instructions cause: at least the first computer hardware processor to perform evaluation of the objective function at the first set of hyper-parameter values; and at least the second computer hardware processor to perform evaluation of the objective function at the second set of hyper-parameter values.
4. The system of claim 1, wherein the identifying comprises using an acquisition utility function obtained at least in part by calculating an expected value of an initial acquisition utility function with respect to potential values of the objective function at the first set of hyper-parameter values.
5. The system of claim 1, wherein the likelihoods are obtained using the probabilistic model of the objective function, and wherein the processor-executable instructions further cause the at least one computer hardware processor to perform: updating the probabilistic model of the objective function using results of evaluating the objective function at the first set of hyper-parameter values and/or the second set of hyper-parameter values to obtain an updated probabilistic model of the objective function.
6. The system of claim 5, wherein the processor-executable instructions further cause the at least one computer hardware processor to perform: identifying, using the updated probabilistic model of the objective function, at least a third set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at least at the identified third set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the third set of hyper-parameter values to obtain a third value corresponding to the third set of hyper-parameter values.
7. The system of claim 5, wherein the probabilistic model of the objective function comprises a Gaussian process.
8. The system of claim 1, wherein the probabilistic model of the objective function comprises a neural network.
9. The system of claim 1, wherein identifying the first set of hyper-parameter values at which to evaluate the objective function comprises using an integrated acquisition utility function and the probabilistic model of the objective function, wherein the integrated acquisition function is obtained by integrating multiple acquisition functions.
10. A method for optimizing performance of a machine learning system having multiple hyper-parameters, the method comprising: using at least one computer hardware processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function that relates hyper-parameter values of the machine learning system to values providing a measure of performance of the machine learning system; beginning evaluation of the objective function at the first set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the first set of hyper-parameter values to obtain a first value corresponding to the first set of hyper-parameter values; before evaluating the objective function at the first set of hyper-parameter values is completed: determining likelihoods of potential outcomes of evaluating the objective function at the first set of hyper-parameter values using a probabilistic model of the objective function; identifying, based on likelihoods of potential outcomes of evaluating the objective function at the first point, a second set of hyper-parameter values different from the first set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at the second set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the second set of hyper-parameter values to obtain a second value corresponding to the second set of hyper-parameter values; selecting one of the first and second sets of hyper-parameter values based at least on the first value corresponding to the first set of hyper-parameter values and the second value corresponding to the second set of hyper-parameter values; and configuring the machine learning system with the selected set of hyper-parameter values that optimizes the machine learning system.
11. The method of claim 10, wherein the objective function relates values of a plurality of hyper-parameters of a neural network for identifying objects in images to respective values providing a measure of performance of the neural network in identifying the objects in the images.
12. The method of claim 10, wherein the at least one computer hardware processor comprises a first computer hardware processor and a second computer hardware processor different from the first computer hardware processor, and wherein the method comprises: using at least the first computer hardware processor to perform evaluation of the objective function at the first set of hyper-parameter values; and using
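The scheduling pattern recited in claims 1, 3, and 10 (start a first evaluation, start a second before the first finishes, then select the better result) can be sketched as follows. This is an illustrative sketch only. Worker threads stand in for the separate hardware processors of claims 3 and 12, and the `propose_second` callable is a hypothetical placeholder for the model-based identification step, which this sketch does not implement. All names are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def optimize_two_in_flight(objective, first, propose_second):
    """Run two objective evaluations concurrently and return the better one.

    objective:      callable mapping a point to a score (lower is better here).
    first:          the first point to evaluate.
    propose_second: hypothetical callable that picks a second point without
                    waiting for the first result.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Begin evaluation at the first point.
        first_future = pool.submit(objective, first)

        # The first evaluation may still be running here; the second point
        # is chosen without its result.
        second = propose_second(first)
        if second == first:
            raise ValueError("second point must differ from the first")

        # Begin evaluation at the second point.
        second_future = pool.submit(objective, second)

        # Wait for both values, then select the better point.
        results = [(first_future.result(), first),
                   (second_future.result(), second)]
    best_value, best_point = min(results, key=lambda pair: pair[0])
    return best_point, best_value
```

For CPU-bound training jobs, a process pool or a cluster scheduler would be a closer match to distinct processors than threads. The thread pool is used here only to keep the sketch short.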
Probabilistic graphical models, e.g. probabilistic networks · CPC title
using kernel methods, e.g. support vector machines [SVM] · CPC title
for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
Machine learning · CPC title
Fuzzy inferencing · CPC title