Systems and methods for parallelizing Bayesian optimization

US10346757B2 · US · B2

Patent metadata
Publication number: US-10346757-B2
Application number: US-201414291337-A
Country: US
Kind code: B2
Filing date: May 30, 2014
Priority date: May 30, 2013
Publication date: Jul 9, 2019
Grant date: Jul 9, 2019

Abstract

Official abstract text for this publication.

Techniques for use in connection with performing optimization using an objective function. The techniques include using at least one computer hardware processor to perform: beginning evaluation of the objective function at a first point; before evaluating the objective function at the first point is completed: identifying, based on likelihoods of potential outcomes of evaluating the objective function at the first point, a second point different from the first point at which to evaluate the objective function; and beginning evaluation of the objective function at the second point.
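The step described in the abstract, choosing a second point while the first evaluation is still running, can be illustrated with a small sketch. It is not the patented implementation. It assumes a zero-mean Gaussian process with a fixed squared-exponential kernel as the probabilistic model, expected improvement as the initial acquisition function, and Monte Carlo averaging over sampled ("fantasized") outcomes at the pending point. All function names, the kernel length scale, and the one-dimensional search space are hypothetical choices made for illustration.

```python
import math
import numpy as np

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def gp_posterior(x_obs, y_obs, x_query, noise=1e-6):
    """Posterior mean and standard deviation of a zero-mean GP at x_query."""
    K = rbf(x_obs, x_obs) + noise * np.eye(len(x_obs))
    Ks = rbf(x_obs, x_query)
    solved = np.linalg.solve(K, Ks)              # K^{-1} Ks
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(Ks * solved, axis=0)      # prior variance is 1
    return mean, np.sqrt(np.clip(var, 0.0, None))

_erf = np.vectorize(math.erf)

def expected_improvement(mean, std, best):
    """Expected improvement below `best` (the objective is minimized)."""
    std = np.maximum(std, 1e-12)
    z = (best - mean) / std
    cdf = 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))
    pdf = np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)
    return (best - mean) * cdf + std * pdf

def pick_second_point(x_obs, y_obs, x_pending, candidates,
                      n_fantasies=64, seed=0):
    """Choose a new point while the evaluation at x_pending is unfinished."""
    rng = np.random.default_rng(seed)
    # Likelihoods of potential outcomes at the pending point: the GP
    # predictive distribution there, represented by random samples.
    m, s = gp_posterior(x_obs, y_obs, np.array([x_pending]))
    fantasies = m[0] + s[0] * rng.standard_normal(n_fantasies)
    x_aug = np.append(x_obs, x_pending)
    avg_ei = np.zeros(len(candidates))
    for y_f in fantasies:
        # Pretend the pending evaluation returned y_f, then score candidates.
        y_aug = np.append(y_obs, y_f)
        mean, std = gp_posterior(x_aug, y_aug, candidates)
        avg_ei += expected_improvement(mean, std, y_aug.min())
    avg_ei /= n_fantasies                        # Monte Carlo expectation
    return candidates[int(np.argmax(avg_ei))], avg_ei
```

Calling `pick_second_point(x_obs, y_obs, 0.3, np.linspace(0.0, 1.0, 101))` with a few observed points returns a candidate together with the averaged acquisition values. Because every fantasized outcome removes most of the model's uncertainty at the pending point, the averaged acquisition is low there and the selected point lands elsewhere, which is the behavior the abstract describes.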

First claim

Opening claim text (preview).

What is claimed is:

1. A system for optimizing performance of a machine learning system having multiple hyper-parameters, the system comprising: at least one computer hardware processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one computer hardware processor, cause the at least one computer hardware processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function that relates hyper-parameter values of the machine learning system to values providing a measure of performance of the machine learning system; beginning evaluation of the objective function at the first set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the first set of hyper-parameter values to obtain a first value corresponding to the first set of hyper-parameter values; before evaluating the objective function at the first set of hyper-parameter values is completed: determining likelihoods of potential outcomes of evaluating the objective function at the first set of hyper-parameter values using a probabilistic model of the objective function; identifying, based on the likelihoods of the potential outcomes, a second set of hyper-parameter values different from the first set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at the second set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the second set of hyper-parameter values to obtain a second value corresponding to the second set of hyper-parameter values; selecting one of the first and second sets of hyper-parameter values based at least on the first value corresponding to the first set of hyper-parameter values and the second value corresponding to the second set of hyper-parameter values; and configuring the machine learning system with the selected set of hyper-parameter values that optimizes the machine learning system.

2. The system of claim 1, wherein the objective function relates values of a plurality of hyper-parameters of a neural network for identifying objects in images to respective values providing a measure of performance of the neural network in identifying the objects in the images.

3. The system of claim 1, wherein the at least one computer hardware processor comprises a first computer hardware processor and a second computer hardware processor different from the first computer hardware processor, and wherein the processor-executable instructions cause: at least the first computer hardware processor to perform evaluation of the objective function at the first set of hyper-parameter values; and at least the second computer hardware processor to perform evaluation of the objective function at the second set of hyper-parameter values.

4. The system of claim 1, wherein the identifying comprises using an acquisition utility function obtained at least in part by calculating an expected value of an initial acquisition utility function with respect to potential values of the objective function at the first set of hyper-parameter values.

5. The system of claim 1, wherein the likelihoods are obtained using the probabilistic model of the objective function, and wherein the processor-executable instructions further cause the at least one computer hardware processor to perform: updating the probabilistic model of the objective function using results of evaluating the objective function at the first set of hyper-parameter values and/or the second set of hyper-parameter values to obtain an updated probabilistic model of the objective function.

6. The system of claim 5, wherein the processor-executable instructions further cause the at least one computer hardware processor to perform: identifying, using the updated probabilistic model of the objective function, at least a third set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at least at the identified third set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the third set of hyper-parameter values to obtain a third value corresponding to the third set of hyper-parameter values.

7. The system of claim 5, wherein the probabilistic model of the objective function comprises a Gaussian process.

8. The system of claim 1, wherein the probabilistic model of the objective function comprises a neural network.

9. The system of claim 1, wherein identifying the first set of hyper-parameter values at which to evaluate the objective function comprises using an integrated acquisition utility function and the probabilistic model of the objective function, wherein the integrated acquisition function is obtained by integrating multiple acquisition functions.

10. A method for optimizing performance of a machine learning system having multiple hyper-parameters, the method comprising: using at least one computer hardware processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function that relates hyper-parameter values of the machine learning system to values providing a measure of performance of the machine learning system; beginning evaluation of the objective function at the first set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the first set of hyper-parameter values to obtain a first value corresponding to the first set of hyper-parameter values; before evaluating the objective function at the first set of hyper-parameter values is completed: determining likelihoods of potential outcomes of evaluating the objective function at the first set of hyper-parameter values using a probabilistic model of the objective function; identifying, based on likelihoods of potential outcomes of evaluating the objective function at the first point, a second set of hyper-parameter values different from the first set of hyper-parameter values at which to evaluate the objective function; and beginning evaluation of the objective function at the second set of hyper-parameter values at least in part by executing the machine learning system with the multiple hyper-parameters set to the second set of hyper-parameter values to obtain a second value corresponding to the second set of hyper-parameter values; selecting one of the first and second sets of hyper-parameter values based at least on the first value corresponding to the first set of hyper-parameter values and the second value corresponding to the second set of hyper-parameter values; and configuring the machine learning system with the selected set of hyper-parameter values that optimizes the machine learning system.

11. The method of claim 10, wherein the objective function relates values of a plurality of hyper-parameters of a neural network for identifying objects in images to respective values providing a measure of performance of the neural network in identifying the objects in the images.

12. The method of claim 10, wherein the at least one computer hardware processor comprises a first computer hardware processor and a second computer hardware processor different from the first computer hardware processor, and wherein the method comprises: using at least the first computer hardware processor to perform evaluation of the objective function at the first set of hyper-parameter values; and using
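The order of operations recited in claims 1, 3 and 10 (begin the first evaluation, identify and begin a second evaluation before the first completes, then select the better set) might be orchestrated roughly as follows. This is a sketch under stated assumptions, not the claimed system: two worker threads stand in for the first and second computer hardware processors, `objective` is a hypothetical function that returns a loss to minimize, and `propose_second` is a hypothetical callback standing in for the model-based identification step.

```python
from concurrent.futures import ThreadPoolExecutor

def run_two_point_search(objective, first, propose_second):
    """Evaluate two hyper-parameter sets concurrently and keep the better one.

    objective: hypothetical callable mapping a hyper-parameter dict to a loss.
    propose_second: hypothetical callable that returns a second
        hyper-parameter dict given the pending first one.
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Begin evaluation at the first set of hyper-parameter values.
        fut_first = pool.submit(objective, first)
        # Identify the second set without waiting for the first result.
        second = propose_second(first)
        overlapped = not fut_first.done()   # True if the first is still running
        # Begin evaluation at the second set on the other worker.
        fut_second = pool.submit(objective, second)
        outcomes = [(first, fut_first.result()),
                    (second, fut_second.result())]
    # Select the set with the lower loss.
    best_hp, best_value = min(outcomes, key=lambda pair: pair[1])
    return best_hp, best_value, overlapped
```

The returned `overlapped` flag only reports whether the second evaluation was launched while the first was still running. A real deployment would more likely dispatch each evaluation to a separate process or machine, since training a model is typically CPU- or GPU-bound rather than something threads parallelize well.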

Assignees

Inventors

Classifications

  • G06N7/01 (Primary)

    Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • G06N20/10 (Primary)

    using kernel methods, e.g. support vector machines [SVM] · CPC title

  • for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

  • Machine learning · CPC title

  • G06N5/048 (Primary)

    Fuzzy inferencing · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10346757B2 cover?
Techniques for use in connection with performing optimization using an objective function. The techniques include using at least one computer hardware processor to perform: beginning evaluation of the objective function at a first point; before evaluating the objective function at the first point is completed: identifying, based on likelihoods of potential outcomes of evaluating the objective f…
Who is the assignee on this patent?
Univ Sherbrooke, Harvard College, The Governing Council Of The Univ Of Toronto, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06N7/01. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 9, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).