Method and apparatus for managing machine learning process
US-2017372229-A1 · Dec 28, 2017 · US
US11392859B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11392859-B2 |
| Application number | US-201916246403-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 11, 2019 |
| Priority date | Jan 11, 2019 |
| Publication date | Jul 19, 2022 |
| Grant date | Jul 19, 2022 |
Official abstract text for this publication.
Systems and methods determine optimized hyperparameter values for one or more machine-learning models. A sample training data set from a larger corpus of training data is obtained. Initial hyperparameter values are then randomly selected. Using the sample training data set and the randomly chosen hyperparameter values, an initial set of performance metric values are obtained. Maximized hyperparameter values are then determined from the initial set of hyperparameter values based on the corresponding performance metric value. A larger corpus of training data is then evaluated using the maximized hyperparameter values and the corresponding machine-learning model, which yields another corresponding set of performance metric values. The maximized hyperparameter values and their corresponding set of performance metric values are then merged with the prior set of hyperparameter values. The foregoing operations are performed iteratively until it is determined that the hyperparameter values are converging to a particular value.
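The loop the abstract describes can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the patented implementation: `toy_metric`, `tune`, and every parameter name are hypothetical, the "model" is a stand-in that scores a single numeric hyperparameter against the data mean, and the step that determines maximized values is replaced here by a simple random perturbation of the best value seen so far.

```python
import random
import statistics


def toy_metric(value, data):
    # Hypothetical stand-in for "train the model and score it": the score
    # peaks when the hyperparameter equals the mean of the data.
    return -(value - statistics.fmean(data)) ** 2


def tune(corpus, candidates, sample_size=50, n_initial=5, n_propose=3,
         step=0.5, tol=1e-3, patience=5, max_iters=100, seed=0):
    rng = random.Random(seed)

    # 1. Sample part of the corpus and pick initial values at random.
    sample = rng.sample(corpus, sample_size)
    initial = rng.sample(candidates, n_initial)

    # 2. Score the initial values on the sample; this is the baseline set,
    #    which also serves as the first "current" set.
    current = [(h, toy_metric(h, sample)) for h in initial]
    best = max(current, key=lambda pair: pair[1])[0]

    stable = 0
    for _ in range(max_iters):
        # 3. Propose new values from the current set. Assumption: perturb
        #    the best value so far (a swapped-in, simplified proposal step).
        proposals = [best + rng.uniform(-step, step) for _ in range(n_propose)]

        # 4. Score the proposals on the full corpus.
        scored = [(h, toy_metric(h, corpus)) for h in proposals]

        # 5. Merge the new values and scores into the running set.
        current = current + scored

        # 6. Treat the search as converged once the best value has stopped
        #    moving for `patience` consecutive iterations.
        new_best = max(current, key=lambda pair: pair[1])[0]
        stable = stable + 1 if abs(new_best - best) < tol else 0
        best = new_best
        if stable >= patience:
            break

    return best, current
```

Calling `tune(corpus, candidates)` with a list of numbers as `corpus` and a list of candidate values returns the best value found together with every (value, score) pair evaluated along the way. The convergence rule and the perturbation step are design choices made for this sketch only.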
Opening claim text (preview).
We claim:

1. A system comprising: a computer-readable storage device storing computer-executable instructions; and at least one hardware processor communicatively coupled to the computer-readable storage device that, when the computer-executable instructions are executed, configures the system to: sample a predetermined amount of training data from a corpus of training data to obtain sampled training data; select a predetermined number of hyperparameter values from a predetermined set of hyperparameter values to obtain an initial plurality of hyperparameter values, wherein each hyperparameter value of the initial plurality of hyperparameter values corresponds to at least one hyperparameter that defines a machine-learning model; evaluate the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values; assign the initial plurality of hyperparameter values and their corresponding performance metric values as a baseline set of hyperparameter values; assign the baseline set of hyperparameter values as a current set of hyperparameter values; iteratively perform the following operations until the at least one hardware processor determines that at least one hyperparameter is converging on a particular hyperparameter value: determine a maximized plurality of hyperparameter values based on the current set of hyperparameter values; determine a corresponding performance metric value for each hyperparameter value of the maximized plurality of hyperparameter values based on applying the machine-learning model to the corpus of training data using at least one hyperparameter value of the maximized plurality of hyperparameter values; merge the maximized plurality of hyperparameter values and their corresponding performance metric values with the baseline set of values to obtain a merged set of hyperparameter values; and assign the merged set of hyperparameter values as the current set of hyperparameter values on which to perform the determination and merging operations; and transmit the maximized plurality of hyperparameter values as optimal hyperparameter values for corresponding hyperparameters of the machine-learning model.

2. The system of claim 1, wherein the at least one hardware processor is further configured to: determine a baseline bias value based on at least one performance metric value selected from the performance metric values associated with the baseline set of hyperparameter values; and wherein the assignment of the baseline set of hyperparameter values as a current set of hyperparameter values comprises discounting at least one hyperparameter value of the baseline set of hyperparameter values by the determined baseline bias value.

3. The system of claim 1, wherein the at least one hardware processor is further configured to: determine a current bias value based on a number of iterations performed and at least one performance metric value selected from the performance metric values associated with the maximized plurality of hyperparameter values; and wherein at least one hyperparameter of the maximized plurality of hyperparameter values is discounted by the determined current bias value.

4. The system of claim 1, wherein the at least one hardware processor is further configured to: estimate an integrated acquisition function based on at least one hyperparameter value selected from the current set of hyperparameter values and on at least one performance metric value associated with the at least one hyperparameter value; and wherein the maximized plurality of hyperparameter values are further determined based on the estimated integrated acquisition function.

5. The system of claim 1, wherein the at least one hardware processor is further configured to receive a selection of the at least one hyperparameter that defines the machine-learning model.

6. The system of claim 1, wherein: evaluation of the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values further comprises receiving a selection of a performance metric for which the plurality of corresponding performance metric values are obtained.

7. The system of claim 1, wherein: evaluation of the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values is performed in parallel within a distributed computing architecture.

8. A method comprising: sampling, by at least one hardware processor, a predetermined amount of training data from a corpus of training data to obtain sampled training data; selecting, by the at least one hardware processor, a predetermined number of hyperparameter values from a predetermined set of hyperparameter values to obtain an initial plurality of hyperparameter values, wherein each hyperparameter value of the initial plurality of hyperparameter values corresponds to at least one hyperparameter that defines a machine-learning model; evaluating, by the at least one hardware processor, the machine-learning model using the sampled training data and the initial plurality of hyperparameter values to obtain a plurality of corresponding performance metric values; assigning, by the at least one hardware processor, the initial plurality of hyperparameter values and their corresponding performance metric values as a baseline set of hyperparameter values; assigning, by the at least one hardware processor, the baseline set of hyperparameter values as a current set of hyperparameter values; iteratively performing the following operations until the at least one hardware processor determines that at least one hyperparameter is converging on a particular hyperparameter value: determining a maximized plurality of hyperparameter values based on the current set of hyperparameter values; determining a corresponding performance metric value for each hyperparameter value of the maximized plurality of hyperparameter values based on applying the machine-learning model to the corpus of training data using at least one hyperparameter value of the maximized plurality of hyperparameter values; merging the maximized plurality of hyperparameter values and their corresponding performance metric values with the baseline set of values to obtain a merged set of hyperparameter values; and assigning the merged set of hyperparameter values as the current set of hyperparameter values on which to perform the determination and merging operations; and transmitting the maximized plurality of hyperparameter values as optimal hyperparameter values for corresponding hyperparameters of the machine-learning model.

9. The method of claim 8, further comprising: determining a baseline bias value based on at least one performance metric value selected from the performance metric values associated with the baseline set of hyperparameter values; and wherein the assignment of the baseline set of hyperparameter values as a current set of hyperparameter values comprises discounting at least one hyperparameter value of the baseline set of hyperparameter values by the determined baseline bias value.

10. The method of claim 8, further comprising: determining a current bias value based on a number of iterations performed and at least one performance metric value selected from the performance metric values associated with the maximized plurality of hyperparameter values; and wherein at least one hyperparameter of the maximized plurality of hyperparameter values is discounted by the determined current bias value.
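Claim 7 recites evaluating the initial hyperparameter values in parallel. Because each candidate is scored independently, the idea can be illustrated with a small Python sketch. The assumptions are significant: a local thread pool stands in for the distributed computing architecture the claim mentions, and `evaluate`, `evaluate_in_parallel`, and the toy metric are hypothetical names invented for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def evaluate(value, data):
    # Hypothetical metric standing in for model training and scoring:
    # negative squared distance between the value and the data mean.
    mean = sum(data) / len(data)
    return -(value - mean) ** 2


def evaluate_in_parallel(values, data, max_workers=4):
    # Each candidate value is scored independently, so the calls can run
    # concurrently. Assumption: threads on one machine replace the
    # distributed workers a real deployment would use.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the inputs.
        scores = list(pool.map(lambda v: evaluate(v, data), values))
    return list(zip(values, scores))
```

The function returns (value, score) pairs in input order, which is the shape a baseline set of hyperparameter values and metrics would need before any iterative step begins.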
Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title
Probabilistic graphical models, e.g. probabilistic networks · CPC title
Machine learning · CPC title
Performance evaluation by tracing or monitoring · CPC title
for evaluating statistical data {, e.g. average values, frequency distributions, probability functions, regression analysis (forecasting specially adapted for a specific administrative, business or logistic context G06Q10/04)} · CPC title