Who is the assignee on this patent?

Microsoft Technology Licensing, LLC

What technology area does this patent fall under?

Primary CPC classification G06N20/00. Mapped technology areas include Physics.

When was this patent published?

Publication date: Jul 19, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Large-scale automated hyperparameter tuning

US11392859B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11392859-B2
Application number	US-201916246403-A
Country	US
Kind code	B2
Filing date	Jan 11, 2019
Priority date	Jan 11, 2019
Publication date	Jul 19, 2022
Grant date	Jul 19, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods determine optimized hyperparameter values for one or more machine-learning models. A sample training data set from a larger corpus of training data is obtained. Initial hyperparameter values are then randomly selected. Using the sample training data set and the randomly chosen hyperparameter values, an initial set of performance metric values are obtained. Maximized hyperparameter values are then determined from the initial set of hyperparameter values based on the corresponding performance metric value. A larger corpus of training data is then evaluated using the maximized hyperparameter values and the corresponding machine-learning model, which yields another corresponding set of performance metric values. The maximized hyperparameter values and their corresponding set of performance metric values are then merged with the prior set of hyperparameter values. The foregoing operations are performed iteratively until it is determined that the hyperparameter values are converging to a particular value.
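The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the `evaluate` metric, the local random search used to pick the "maximized" values, the halving step size, and the convergence test are all assumptions made for the example, as are every function and parameter name.

```python
import random


def evaluate(value, data):
    """Hypothetical performance metric (higher is better).

    Stands in for training and scoring a real model: the score peaks
    when the hyperparameter value equals the mean of the data.
    """
    target = sum(data) / len(data)
    return -(value - target) ** 2


def best_value(observations):
    """Return the hyperparameter value with the highest recorded metric."""
    return max(observations, key=lambda pair: pair[1])[0]


def tune(corpus, candidates, n_initial=5, sample_size=20, n_propose=3,
         tol=1e-3, max_iters=50, seed=0):
    rng = random.Random(seed)
    low, high = min(candidates), max(candidates)

    # Obtain a sample training set from the larger corpus.
    sample = rng.sample(corpus, sample_size)

    # Randomly select the initial hyperparameter values.
    initial = rng.sample(candidates, n_initial)

    # Evaluate them on the sample to get the initial metric values.
    observations = [(h, evaluate(h, sample)) for h in initial]

    step = (high - low) / 4  # ASSUMED search radius, halved every round
    for _ in range(max_iters):
        # "Maximized" values: ASSUMPTION, a random local search around
        # the best value seen so far, clamped to the candidate range.
        centre = best_value(observations)
        proposals = [min(high, max(low, centre + rng.gauss(0, step)))
                     for _ in range(n_propose)]

        # Evaluate the proposals on the full corpus.
        scored = [(h, evaluate(h, corpus)) for h in proposals]

        # Merge the new values and metrics with the prior set.
        observations.extend(scored)

        # ASSUMED convergence test: the proposals agree to within tol.
        if max(proposals) - min(proposals) < tol:
            break
        step /= 2

    return best_value(observations), observations


if __name__ == "__main__":
    corpus = [i / 100 for i in range(100)]
    candidates = [i / 20 for i in range(21)]
    best, history = tune(corpus, candidates)
    print(best, len(history))
```

Note that the merged set mixes metrics measured on the sample with metrics measured on the full corpus. The sketch makes no attempt to reconcile the two.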

First claim

Opening claim text (preview).

We claim:

1. A system comprising: a computer-readable storage device storing computer-executable instructions; and at least one hardware processor communicatively coupled to the computer-readable storage device that, when the computer-executable instructions are executed, configures the system to: sample a predetermined amount of training data from a corpus of training data to obtain sampled training data; select a predetermined number of hyperparameter values from a predetermined set of hyperparameter values to obtain an initial plurality of hyperparameter values, wherein each hyperparameter value of the initial plurality of hyperparameter values corresponds to at least one hyperparameter that defines a machine-learning model; evaluate the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values; assign the initial plurality of hyperparameter values and their corresponding performance metric values as a baseline set of hyperparameter values; assign the baseline set of hyperparameter values as a current set of hyperparameter values; iteratively perform the following operations until the at least one hardware processor determines that at least one hyperparameter is converging on a particular hyperparameter value: determine a maximized plurality of hyperparameter values based on the current set of hyperparameter values; determine a corresponding performance metric value for each hyperparameter value of the maximized plurality of hyperparameter values based on applying the machine-learning model to the corpus of training data using at least one hyperparameter value of the maximized plurality of hyperparameter values; merge the maximized plurality of hyperparameter values and their corresponding performance metric values with the baseline set of values to obtain a merged set of hyperparameter values; and assign the merged set of hyperparameter values as the current set of hyperparameter values on which to perform the determination and merging operations; and transmit the maximized plurality of hyperparameter values as optimal hyperparameter values for corresponding hyperparameters of the machine-learning model.

2. The system of claim 1, wherein the at least one hardware processor is further configured to: determine a baseline bias value based on at least one performance metric value selected from the performance metric values associated with the baseline set of hyperparameter values; and wherein the assignment of the baseline set of hyperparameter values as a current set of hyperparameter values comprises discounting at least one hyperparameter value of the baseline set of hyperparameter values by the determined baseline bias value.

3. The system of claim 1, wherein the at least one hardware processor is further configured to: determine a current bias value based on a number of iterations performed and at least one performance metric value selected from the performance metric values associated with the maximized plurality of hyperparameter values; and wherein at least one hyperparameter of the maximized plurality of hyperparameter values is discounted by the determined current bias value.

4. The system of claim 1, wherein the at least one hardware processor is further configured to: estimate an integrated acquisition function based on at least one hyperparameter value selected from the current set of hyperparameter values and on at least one performance metric value associated with the at least one hyperparameter value; and wherein the maximized plurality of hyperparameter values are further determined based on the estimated integrated acquisition function.

5. The system of claim 1, wherein the at least one hardware processor is further configured to receive a selection of the at least one hyperparameter that defines the machine-learning model.

6. The system of claim 1, wherein: evaluation of the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values further comprises receiving a selection of a performance metric for which the plurality of corresponding performance metric values are obtained.

7. The system of claim 1, wherein: evaluation of the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values is performed in parallel within a distributed computing architecture.

8. A method comprising: sampling, by at least one hardware processor, a predetermined amount of training data from a corpus of training data to obtain sampled training data; selecting, by the at least one hardware processor, a predetermined number of hyperparameter values from a predetermined set of hyperparameter values to obtain an initial plurality of hyperparameter values, wherein each hyperparameter value of the initial plurality of hyperparameter values corresponds to at least one hyperparameter that defines a machine-learning model; evaluating, by the at least one hardware processor, the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values; assigning, by the at least one hardware processor, the initial plurality of hyperparameter values and their corresponding performance metric values as a baseline set of hyperparameter values; assigning, by the at least one hardware processor, the baseline set of hyperparameter values as a current set of hyperparameter values; iteratively performing the following operations until the at least one hardware processor determines that at least one hyperparameter is converging on a particular hyperparameter value: determining a maximized plurality of hyperparameter values based on the current set of hyperparameter values; determining a corresponding performance metric value for each hyperparameter value of the maximized plurality of hyperparameter values based on applying the machine-learning model to the corpus of training data using at least one hyperparameter value of the maximized plurality of hyperparameter values; merging the maximized plurality of hyperparameter values and their corresponding performance metric values with the baseline set of values to obtain a merged set of hyperparameter values; and assigning the merged set of hyperparameter values as the current set of hyperparameter values on which to perform the determination and merging operations; and transmitting the maximized plurality of hyperparameter values as optimal hyperparameter values for corresponding hyperparameters of the machine-learning model.

9. The method of claim 8, further comprising: determining a baseline bias value based on at least one performance metric value selected from the performance metric values associated with the baseline set of hyperparameter values; and wherein the assignment of the baseline set of hyperparameter values as a current set of hyperparameter values comprises discounting at least one hyperparameter value of the baseline set of hyperparameter values by the determined baseline bias value.

10. The method of claim 8, further comprising: determining a current bias value based on a number of iterations performed and at least one performance metric value selected from the performance metric values associated with the maximized plurality of hyperparameter values; and wherein at least one hyperparameter of the maximized plurality of hyperparameter values is discounted by the determined current bias value.
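Claim 7 recites evaluating the initial hyperparameter values in parallel within a distributed computing architecture. As a rough single-machine stand-in, the sketch below scores several values concurrently with a thread pool from the Python standard library. The `evaluate` metric and the `evaluate_in_parallel` helper are hypothetical names chosen for illustration. A real deployment would more likely use a process pool or a cluster scheduler, since threads in CPython do not speed up CPU-bound training.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(value, data):
    """Hypothetical performance metric (higher is better)."""
    target = sum(data) / len(data)
    return -(value - target) ** 2


def evaluate_in_parallel(values, data, max_workers=4):
    """Score every hyperparameter value concurrently.

    Returns (value, metric) pairs in the same order as `values`.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so metrics line up with values.
        metrics = list(pool.map(lambda v: evaluate(v, data), values))
    return list(zip(values, metrics))


if __name__ == "__main__":
    for value, metric in evaluate_in_parallel([0.0, 0.5, 1.0], [0.0, 1.0]):
        print(value, metric)
```

Because each evaluation is independent of the others, the initial round needs no coordination beyond collecting the results.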

Assignees

Microsoft Technology Licensing, LLC

Inventors

Classifications

G06N5/01
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
G06N7/01
Probabilistic graphical models, e.g. probabilistic networks · CPC title
G06N20/00 (Primary)
Machine learning · CPC title
G06F11/3466
Performance evaluation by tracing or monitoring · CPC title
G06F17/18
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title

Patent family

Related publications grouped by family.

View patent family 71516753
