Systems and methods for parallelizing Bayesian optimization
US-2016328653-A1 · Nov 10, 2016 · US
US9858529B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9858529-B2 |
| Application number | US-201414291255-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 30, 2014 |
| Priority date | May 30, 2013 |
| Publication date | Jan 2, 2018 |
| Grant date | Jan 2, 2018 |
Official abstract text for this publication.
Techniques for use in connection with performing optimization using a plurality of objective functions associated with a respective plurality of tasks. The techniques include using at least one computer hardware processor to perform: identifying, based at least in part on a joint probabilistic model of the plurality of objective functions, a first point at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a first objective function in the plurality of objective functions to evaluate at the identified first point; evaluating the first objective function at the identified first point; and updating the joint probabilistic model based on results of the evaluation to obtain an updated joint probabilistic model.
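The four steps in the abstract (identify a point, select an objective, evaluate, update the model) can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the patented implementation: the two toy objectives, their costs, the task-correlation matrix `B`, and the function name `multi_task_bo` are all hypothetical. Two simpler rules stand in for whatever utility the full description uses: a lower-confidence-bound rule to pick the point, and a variance-reduction-per-cost rule to pick the task.

```python
import numpy as np

def rbf(a, b, length=0.3):
    """Squared-exponential kernel between 1-D input arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def joint_kernel(t1, x1, t2, x2, B):
    """Covariance between (task, point) pairs: task term times input term."""
    return B[np.ix_(t1, t2)] * rbf(x1, x2)

def posterior(tq, xq, T, X, y, B, noise=1e-4):
    """Posterior mean and covariance of the joint model at query pairs."""
    K = joint_kernel(T, X, T, X, B) + noise * np.eye(len(X))
    Ks = joint_kernel(tq, xq, T, X, B)
    Kss = joint_kernel(tq, xq, tq, xq, B)
    mean = Ks @ np.linalg.solve(K, y)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    return mean, cov

def multi_task_bo(n_iter=15, seed=0):
    rng = np.random.default_rng(seed)
    # Hypothetical objectives: task 0 is the expensive target, task 1 a cheap proxy.
    objectives = [lambda x: (x - 0.6) ** 2, lambda x: (x - 0.55) ** 2 + 0.05]
    costs = np.array([1.0, 0.1])             # assumed evaluation costs
    B = np.array([[1.0, 0.9], [0.9, 1.0]])   # assumed task correlation
    grid = np.linspace(0.0, 1.0, 101)

    # Start with one random observation per task.
    T = np.array([0, 1])
    X = rng.uniform(0.0, 1.0, 2)
    y = np.array([objectives[t](x) for t, x in zip(T, X)])

    for _ in range(n_iter):
        # 1. Identify a point: minimize a lower confidence bound on task 0.
        mean, cov = posterior(np.zeros(len(grid), int), grid, T, X, y, B)
        sd = np.sqrt(np.clip(np.diag(cov), 0.0, None))
        x_next = grid[np.argmin(mean - 2.0 * sd)]

        # 2. Select an objective: largest reduction in task-0 variance per unit cost.
        tq = np.arange(len(objectives))
        _, c = posterior(tq, np.full(len(tq), x_next), T, X, y, B)
        gain = c[0, :] ** 2 / (np.diag(c) + 1e-4)
        t_next = int(np.argmax(gain / costs))

        # 3. Evaluate the selected objective at the identified point.
        y_next = objectives[t_next](x_next)

        # 4. Update the joint model by appending the new observation.
        T = np.append(T, t_next)
        X = np.append(X, x_next)
        y = np.append(y, y_next)
    return T, X, y
```

Calling `multi_task_bo()` returns the task index, input, and observed value for every evaluation. With these assumed costs, the selection rule can favor the cheap proxy task whenever it is nearly as informative about the target task as a direct evaluation would be.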
Opening claim text (preview).
What is claimed is:
1. A system for optimizing performance of a machine learning system configured to perform a plurality of tasks, the system comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform: identifying, based at least in part on a joint probabilistic model that models correlation among a plurality of objective functions corresponding to the plurality of tasks, a first set of hyper-parameter values at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a first objective function in the plurality of objective functions to evaluate at the identified first set of hyper-parameter values, wherein the first objective function relates values of hyper-parameters of the machine learning system to values providing a measure of performance of the machine learning system; evaluating the first objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values, to obtain at least one first value providing a measure of performance of the machine learning system using the first set of hyper-parameter values; and updating the joint probabilistic model based on results of the evaluation to obtain an updated joint probabilistic model.
2. The system of claim 1, wherein the first objective function relates values of a plurality of hyper-parameters of a neural network for identifying objects in images to respective values providing a measure of performance of the neural network in identifying the objects in the images.
3. The system of claim 1, wherein the processor-executable instructions further cause the at least one computer hardware processor to perform: identifying, based at least in part on the updated joint probabilistic model of the plurality of objective functions, a second set of hyper-parameter values at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a second objective function in the plurality of objective functions to evaluate at the identified second set of hyper-parameter values, wherein the second objective function relates values of hyper-parameters of the machine learning system to values providing a measure of performance of the machine learning system; and evaluating the second objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values, to obtain at least one second value providing a measure of performance of the machine learning system using the second set of hyper-parameter values.
4. The system of claim 3, wherein the first objective function is different from the second objective function.
5. The system of claim 1, wherein the joint probabilistic model of the plurality of objective functions comprises a vector-valued Gaussian process.
6. The system of claim 1, wherein the joint probabilistic model comprises a covariance kernel obtained based, at least in part, on a first covariance kernel modeling correlation among tasks in the plurality of tasks and a second covariance kernel modeling correlation among hyper-parameter values at which objective functions in the plurality of objective functions may be evaluated.
7. The system of claim 1, wherein the identifying is performed further based on a cost-weighted entropy-search utility function.
8. A method for optimizing performance of a machine learning system configured to perform a plurality of tasks, the method comprising: using at least one computer hardware processor to perform: identifying, based at least in part on a joint probabilistic model that models correlation among a plurality of objective functions corresponding to the plurality of tasks, a first set of hyper-parameter values at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a first objective function in the plurality of objective functions to evaluate at the identified first set of hyper-parameter values, wherein the first objective function relates values of hyper-parameters of the machine learning system to values providing a measure of performance of the machine learning system; evaluating the first objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values, to obtain at least one first value providing a measure of performance of the machine learning system using the first set of hyper-parameter values; and updating the joint probabilistic model based on results of the evaluation to obtain an updated joint probabilistic model.
9. The method of claim 8, further comprising: identifying, based at least in part on the updated joint probabilistic model of the plurality of objective functions, a second set of hyper-parameter values at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a second objective function in the plurality of objective functions to evaluate at the identified second set of hyper-parameter values, wherein the second objective function relates values of hyper-parameters of the machine learning system to values providing a measure of performance of the machine learning system; and evaluating the second objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values, to obtain at least one second value providing a measure of performance of the machine learning system using the second set of hyper-parameter values.
10. The method of claim 8, wherein the joint probabilistic model of the plurality of objective functions comprises a vector-valued Gaussian process.
11. The method of claim 8, wherein the joint probabilistic model comprises a covariance kernel obtained based, at least in part, on a first covariance kernel modeling correlation among tasks in the plurality of tasks and a second covariance kernel modeling correlation among hyper-parameter values at which objective functions in the plurality of objective functions may be evaluated.
12. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform a method for optimizing performance of a machine learning system configured to perform a plurality of tasks, the method comprising: identifying, based at least in part on a joint probabilistic that models correlation among a plurality of objective functions corresponding to the plurality of tasks, a first set of hyper-parameter values at which to evaluate an objective function in the plurality of objective functions; selecting, based at least in part on the joint probabilistic model, a first objective function in the plurality of objective functions to evaluate at the identified first set of hyper-parameter values, wherein the first objective function relates values of hyper-parameters of the machine learning system
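Claims 6 and 11 describe a covariance kernel built from a task kernel and a hyper-parameter kernel. One common way to combine two such kernels is a Kronecker product; the claim text shown here does not say this is the construction used, so the sketch below is an assumption for illustration. The function names, the squared-exponential input kernel, and the low-rank factor `L` for the task kernel are hypothetical choices.

```python
import numpy as np

def input_kernel(X, length=1.0):
    """Squared-exponential kernel over hyper-parameter vectors (rows of X)."""
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-0.5 * sq / length ** 2)

def task_kernel(L):
    """Task covariance L @ L.T, positive semi-definite by construction."""
    return L @ L.T

def multitask_covariance(L, X):
    """Joint covariance over every (task, point) pair, ordered task-major.

    Entry ((t, i), (s, j)) equals task_kernel[t, s] * input_kernel[i, j].
    """
    return np.kron(task_kernel(L), input_kernel(X))

# Two tasks, three hyper-parameter settings in two dimensions (made-up values).
L = np.array([[1.0, 0.0],
              [0.8, 0.6]])
X = np.array([[0.1, 0.2],
              [0.4, 0.9],
              [0.7, 0.3]])
K = multitask_covariance(L, X)
```

Here `K` is a 6 x 6 matrix: the off-diagonal 3 x 3 blocks carry the cross-task covariance, scaled by the 0.8 correlation implied by `L`. A vector-valued Gaussian process of the kind named in claims 5 and 10 could use a matrix with this structure as its prior covariance.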
Probabilistic graphical models, e.g. probabilistic networks · CPC title
using kernel methods, e.g. support vector machines [SVM] · CPC title
for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title
Fuzzy inferencing · CPC title
Machine learning · CPC title