Who is the assignee on this patent?

Harvard College, Governing Council Univ Toronto

What technology area does this patent fall under?

Primary CPC classification G06N7/01. Mapped technology areas include Physics.

When was this patent published?

Publication date Nov 15, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Systems and methods for Bayesian optimization using non-linear mapping of input

US11501192B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11501192-B2
Application number	US-201816121195-A
Country	US
Kind code	B2
Filing date	Sep 4, 2018
Priority date	May 30, 2013
Publication date	Nov 15, 2022
Grant date	Nov 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques for use in connection with performing optimization using an objective function that maps elements in a first domain to values in a range. The techniques include using at least one computer hardware processor to perform: identifying a first point at which to evaluate the objective function at least in part by using an acquisition utility function and a probabilistic model of the objective function, wherein the probabilistic model depends on a non-linear one-to-one mapping of elements in the first domain to elements in a second domain; evaluating the objective function at the identified first point to obtain a corresponding first value of the objective function; and updating the probabilistic model of the objective function using the first value to obtain an updated probabilistic model of the objective function.
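The identify, evaluate, update loop in the abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the patented implementation: the Kumaraswamy CDF stands in for the non-linear one-to-one mapping, the probabilistic model is a zero-mean Gaussian process with a squared-exponential kernel, the acquisition utility function is expected improvement maximized over a fixed grid, and the warp parameters `a` and `b` are held fixed. None of these choices, and none of the function names, come from this page.

```python
import math

def warp(x, a, b):
    """Kumaraswamy CDF: a non-linear, one-to-one map of [0, 1] onto [0, 1] (a, b > 0)."""
    return 1.0 - (1.0 - x ** a) ** b

def kernel(u, v, length=0.2):
    """Squared-exponential covariance between two warped inputs."""
    return math.exp(-0.5 * ((u - v) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for a symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, rhs):
    """Forward substitution: solve L y = rhs."""
    y = []
    for i, r in enumerate(rhs):
        y.append((r - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper(L, rhs):
    """Back substitution: solve L^T x = rhs."""
    n = len(rhs)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (rhs[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def fit_gp(xs, ys, a, b, jitter=1e-6):
    """Zero-mean GP whose kernel sees warp(x) instead of x; returns a predictor."""
    ws = [warp(x, a, b) for x in xs]
    n = len(ws)
    K = [[kernel(ws[i], ws[j]) + (jitter if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper(L, solve_lower(L, ys))

    def predict(x):
        w_new = warp(x, a, b)
        k_star = [kernel(w, w_new) for w in ws]
        mean = sum(k * c for k, c in zip(k_star, alpha))
        v = solve_lower(L, k_star)
        var = max(1.0 - sum(t * t for t in v), 1e-12)  # kernel(w, w) == 1
        return mean, math.sqrt(var)

    return predict

def expected_improvement(mean, std, best):
    """Expected improvement over the best value so far (minimization)."""
    z = (best - mean) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (best - mean) * cdf + std * pdf

def bayes_opt(objective, n_iter=8, a=0.5, b=1.0):
    """Identify a point, evaluate the objective there, update the model; repeat."""
    xs = [0.1, 0.9]                      # arbitrary initial design (assumption)
    ys = [objective(x) for x in xs]
    grid = [i / 100 for i in range(101)]
    for _ in range(n_iter):
        predict = fit_gp(xs, ys, a, b)   # model refit on all values seen so far
        best = min(ys)
        candidates = [x for x in grid if x not in xs]
        x_next = max(candidates,
                     key=lambda x: expected_improvement(*predict(x), best))
        xs.append(x_next)
        ys.append(objective(x_next))
    i = ys.index(min(ys))
    return xs[i], ys[i]
```

Calling `bayes_opt(lambda x: (x - 0.3) ** 2)` returns the best point found on [0, 1] and its value. Because the kernel compares `warp(x)` values rather than raw inputs, the model can treat some regions of the first domain as varying faster than others, which is the role the abstract assigns to the mapping into a second domain.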

First claim

Opening claim text (preview).

What is claimed is:

1. A system for optimizing performance of a machine learning system, the system comprising: at least one processor; and at least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by the at least one processor, cause the at least one processor to perform: identifying a first set of hyper-parameter values at which to evaluate an objective function relating values of hyper-parameters of the machine learning system to performance of the machine learning system, the identifying performed at least in part by using a probabilistic model of the objective function, the probabilistic model of the objective function comprising a non-linear mapping of at least some of the values of the hyper-parameters from a first domain to a second domain; evaluating the objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the first set of hyper-parameter values.

2. The system of claim 1, wherein the machine learning system comprises a neural network for identifying objects in images and the objective function relates values of a plurality of hyper-parameters of the neural network to performance of the neural network in identifying the objects in the image.

3. The system of claim 2, wherein the plurality of hyper-parameters of the neural network comprises at least one of: a learning rate, a dropout rate, a weight norm, a hidden layer size, a convolutional kernel size, or a pooling size.

4. The system of claim 2, wherein evaluating the objective function at the identified first set of hyper-parameter values comprises: training the machine learning system configured with the first set of hyper-parameter values to obtain first learned parameter values, the training using training data of a dataset of images; processing the dataset of images using the machine learning system configured with the first set of hyper-parameters values and the first learned parameter values to obtain a first measure of generalization performance of the machine learning system in identifying objects in images when configured with the first set of hyper-parameter values.

5. The system of claim 2, wherein the instructions, when executed by the at least one processor, cause the at least one processor to perform: identifying an extremal value of the objective function based at least in part on a result of evaluating the objective function at the first set of hyper-parameter values; configuring the plurality of hyper-parameters of the neural network for identifying objects in images based on the extremal value of the objective function; and processing a dataset of images using the neural network with the plurality of hyper-parameters configured based on the extremal value of the objective function.

6. The system of claim 1, wherein the processor-executable instructions further cause the at least one processor to perform: determining at least one parameter of the non-linear mapping based on one or more evaluations of the objective function.

7. The system of claim 1, wherein the processor-executable instructions further cause the at least one processor to perform: identifying a second set of hyper-parameter values at which to evaluate the objective function; evaluating the objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the second set of hyper-parameter values.

8. The system of claim 1, wherein the non-linear mapping comprises a one-to-one mapping of values from the first domain to the second domain.

9. A method of optimizing performance of a machine learning system, the method comprising: identifying a first set of hyper-parameter values at which to evaluate an objective function relating values of hyper-parameters of the machine learning system to performance of the machine learning system, the identifying performed at least in part by using a probabilistic model of the objective function, the probabilistic model of the objective function comprising a non-linear mapping of at least some of the values of the hyper-parameters from a first domain to a second domain; evaluating the objective function at the identified first set of hyper-parameter values, at least in part by executing the machine learning system when configured with the first set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the first set of hyper-parameter values.

10. The method of claim 9, wherein the machine learning system comprises a neural network for identifying objects in images and the objective function relates values of a plurality of hyper-parameters of the neural network to performance of the neural network in identifying the objects in the image.

11. The method of claim 10, wherein evaluating the objective function at the identified first set of hyper-parameter values comprises: training the machine learning system configured with the first set of hyper-parameter values to obtain first learned parameter values, the training using training data of a dataset of images; processing the dataset of images using the machine learning system configured with the first set of hyper-parameters values and the first learned parameter values to obtain a first measure of generalization performance of the machine learning system in identifying objects in images when configured with the first set of hyper-parameter values.

12. The method of claim 10, further comprising: identifying an extremal value of the objective function based at least in part on a result of evaluating the objective function at the first set of hyper-parameter values; configuring the plurality of hyper-parameters of the neural network for identifying objects in images based on the extremal value of the objective function; and processing a dataset of images using the neural network with the plurality of hyper-parameters configured based on the extremal value of the objective function.

13. The method of claim 9, further comprising determining at least one parameter of the non-linear mapping based on one or more evaluations of the objective function.

14. The method of claim 9, further comprising: identifying a second set of hyper-parameter values at which to evaluate the objective function; evaluating the objective function at the identified second set of hyper-parameter values, at least in part by executing the machine learning system when configured with the second set of hyper-parameter values; and updating the probabilistic model of the objective function based on the evaluation of the objective function at the second set of hyper-parameter values.

15. The method of claim 9, wherein the non-linear mapping comprises a one-to-one mapping of values from the first domain to the second domain.

16. At least one non-transitory computer-readable storage medium storing processor-executable instructions that, when executed by at least one processor, cause the at least one processor to perform a method for optimizing performance of a machine learning system, the method comprising: identifying a first set o
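Claims 8 and 15 require the non-linear mapping to be one-to-one. As a small illustration of what that property means in practice, the sketch below uses the Kumaraswamy CDF as a stand-in mapping (an assumption; the claim preview does not name a specific function) and checks that it is strictly increasing on a grid and can be inverted exactly. The names `warp` and `unwarp` and the parameter values are hypothetical.

```python
import math

def warp(x, a, b):
    """Map a value in [0, 1] (first domain) to [0, 1] (second domain); a, b > 0."""
    return 1.0 - (1.0 - x ** a) ** b

def unwarp(w, a, b):
    """Inverse of warp: recovers the original value from its image."""
    return (1.0 - (1.0 - w) ** (1.0 / b)) ** (1.0 / a)

# Example hyper-parameter values already rescaled to [0, 1].
points = [i / 20 for i in range(21)]
warped = [warp(x, 0.5, 2.0) for x in points]

# A strictly increasing map never sends two distinct inputs to the same output.
is_increasing = all(u < v for u, v in zip(warped, warped[1:]))

# Round trip through the inverse should reproduce the inputs up to rounding.
max_round_trip_error = max(
    abs(unwarp(w, 0.5, 2.0) - x) for x, w in zip(points, warped)
)
```

Because each warped value corresponds to exactly one original value, a point chosen in the second domain can always be translated back into a concrete hyper-parameter setting.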

Classifications

G06N7/01 (Primary): Probabilistic graphical models, e.g. probabilistic networks
G06N20/10 (Primary): using kernel methods, e.g. support vector machines [SVM]
G06F9/44: Arrangements for executing specific programs
G06F17/11: for solving equations {, e.g. nonlinear equations, general mathematical optimization problems (optimization specially adapted for a specific administrative, business or logistic context G06Q10/04)}
G06F17/18: for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)}

Patent family

Related publications grouped by family.

View patent family 51986294


Related patents

Systems and methods for Bayesian optimization using non-linear mapping of input

Systems and methods for Bayesian optimization using integrated acquisition functions

Systems and methods for multi-task Bayesian optimization

Systems and methods for parallelizing bayesian optimization

Systems and methods for multi-task bayesian optimization

Systems and methods for bayesian optimization using integrated acquisition functions