Distributed hyperparameter tuning system for machine learning

US10360517B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10360517-B2
Application number: US-201715822462-A
Country: US
Kind code: B2
Filing date: Nov 27, 2017
Priority date: Feb 22, 2017
Publication date: Jul 23, 2019
Grant date: Jul 23, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing device automatically selects hyperparameter values based on objective criteria to train a predictive model. Each session of a plurality of sessions executes training and scoring of a model type using an input dataset in parallel with other sessions of the plurality of sessions. Unique hyperparameter configurations are determined using a search method and assigned to each session. For each session of the plurality of sessions, training of a model of the model type is requested using a training dataset and the assigned hyperparameter configuration, scoring of the trained model using a validation dataset and the assigned hyperparameter configuration is requested to compute an objective function value, and the received objective function value and the assigned hyperparameter configuration are stored. A best hyperparameter configuration is identified based on an extreme value of the stored objective function values.
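The loop the abstract describes (generate unique configurations, evaluate them in parallel sessions, store each objective value, pick the extreme) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the thread pool standing in for sessions, the grid search, and the toy objective in `train_and_score` are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def grid_configurations(values_by_name):
    """Enumerate every combination of the listed values (a plain grid search).

    Configurations are unique as long as each value list has no duplicates.
    """
    names = sorted(values_by_name)
    combos = product(*(values_by_name[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


def train_and_score(config):
    """Hypothetical stand-in for 'train on the training dataset, then score
    on the validation dataset'. Returns a toy objective value."""
    return (config["learning_rate"] - 0.1) ** 2 + (config["depth"] - 4) ** 2


def tune(values_by_name, num_sessions=4, minimize=True):
    """Evaluate all configurations with num_sessions parallel workers and
    return (best_pair, history), where each pair is (config, objective)."""
    configs = grid_configurations(values_by_name)
    # Each worker thread plays the role of one session (an assumption).
    with ThreadPoolExecutor(max_workers=num_sessions) as pool:
        scores = list(pool.map(train_and_score, configs))
    # Store every objective value alongside the configuration that produced it.
    history = list(zip(configs, scores))
    # "Extreme value" is the minimum or the maximum, depending on the objective.
    pick = min if minimize else max
    best = pick(history, key=lambda pair: pair[1])
    return best, history


if __name__ == "__main__":
    best, history = tune({"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]})
    print("evaluated", len(history), "configurations; best:", best)
```

Threads are used here only to keep the sketch self-contained; in the abstract, each session is backed by its own computing devices.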

First claim

Opening claim text (preview).

What is claimed is:

1. A non-transitory computer-readable medium having stored thereon computer-readable instructions that when executed by a first computing device cause the first computing device to: access a plurality of tuning evaluation parameters, wherein the plurality of tuning evaluation parameters include a model type, a search method type, and values to evaluate for each hyperparameter of a plurality of hyperparameters associated with the model type; determine a number of session computing devices allocated to each session of a plurality of sessions, wherein each session computing device of the number of session computing devices processes a subset of an input dataset, wherein the number of session computing devices is determined based on a number of rows and a number of columns of the input dataset; determine a number of the plurality of sessions, wherein each session of the plurality of sessions executes training and scoring of the model type using the input dataset in parallel with other sessions of the plurality of sessions, wherein the number of the plurality of sessions is determined based on the search method type; determine a plurality of hyperparameter configurations using a search method of the search method type, wherein a hyperparameter configuration includes a value for each hyperparameter of the plurality of hyperparameters, wherein each hyperparameter configuration of the plurality of hyperparameter configurations is unique; for each session of the plurality of sessions, assign a hyperparameter configuration to the session of the plurality of sessions; request training of a model of the model type by the session computing devices allocated to the session, wherein the model is trained using the assigned hyperparameter configuration and a training dataset that is a first portion of the input dataset; request scoring of the trained model by the session computing devices allocated to the session to compute an objective function value, wherein the trained model is scored using the assigned hyperparameter configuration and a validation dataset that is a second portion of the input dataset; receive the computed objective function value when the requested scoring is complete; and store the received objective function value and the assigned hyperparameter configuration; identify a best hyperparameter configuration based on an extreme value of the stored objective function values; and output the identified best hyperparameter configuration.

2. The non-transitory computer-readable medium of claim 1, wherein before receiving the computed objective function value, the computer-readable instructions further cause the first computing device to receive an indicator from the session that the scoring is complete.

3. The non-transitory computer-readable medium of claim 2, wherein when the requested scoring is complete and before identifying the best hyperparameter configuration, the computer-readable instructions further cause the first computing device to: (a) determine if the determined plurality of hyperparameter configurations includes another hyperparameter configuration that has not been assigned; (b) when the determined plurality of hyperparameter configurations includes another hyperparameter configuration that has not been assigned, assign the another hyperparameter configuration to the session from which the indicator was received; request training of the model of the model type by the session computing devices allocated to the session from which the indicator was received, wherein the model is trained using the training dataset and the assigned another hyperparameter configuration; request scoring of the trained model by the session computing devices allocated to the session from which the indicator was received to compute another objective function value using the validation dataset and the assigned another hyperparameter configuration; receive the computed another objective function value when the requested scoring is complete by the session computing devices allocated to the session from which the indicator was received; and store the received another objective function value and the assigned another hyperparameter configuration; and repeat (a) and (b) until all of the determined plurality of hyperparameter configurations have been assigned.

4. The non-transitory computer-readable medium of claim 3, wherein the best hyperparameter configuration is identified after storing the received another objective function value and the assigned another hyperparameter configuration for all of the determined plurality of hyperparameter configurations.

5. The non-transitory computer-readable medium of claim 3, wherein after storing the received another objective function value and the assigned another hyperparameter configuration for all of the determined plurality of hyperparameter configurations and before identifying the best hyperparameter configuration, the computer-readable instructions further cause the first computing device to: determine a second plurality of hyperparameter configurations using a second search method of a second search method type included in the tuning evaluation parameters, wherein each hyperparameter configuration of the second plurality of hyperparameter configurations is unique; for each session of the plurality of sessions, assign a second hyperparameter configuration to the session of the plurality of sessions; request second training of a second model of the model type by the session computing devices allocated to the session, wherein the second model is trained using the training dataset and the assigned second hyperparameter configuration; request second scoring of the trained second model by the session computing devices allocated to the session to compute a second objective function value, wherein the trained second model is scored using the validation dataset and the assigned second hyperparameter configuration; receive the computed second objective function value when the requested second scoring is complete; and store the received second objective function value and the assigned second hyperparameter configuration.

6. The non-transitory computer-readable medium of claim 5, wherein the search method type is different from the second search method type.

7. The non-transitory computer-readable medium of claim 5, wherein the second search method type is one of a plurality of search method types included in the tuning evaluation parameters, wherein the second plurality of hyperparameter configurations are determined using a second search method associated with each search method type of the plurality of search method types.

8. The non-transitory computer-readable medium of claim 5, wherein each received objective function value, each assigned hyperparameter configuration, each received another objective function value, and each assigned another hyperparameter configuration are stored in an evaluation cache.

9. The non-transitory computer-readable medium of claim 8, wherein before assigning the second hyperparameter configuration, the computer-readable instructions further cause the first computing device to remove any hyperparameter configuration from the determined second plurality of hyperparameter configurations that is within a predefined cache tolerance value of any hyperparameter configuration stored in the evaluation cache.

10. The non-transitory computer-readable medium of claim 1, wherein before determining the plurality of hyperparameter configurations, the computer-readable instructions further cause the first computing device to: define a baseline hyperparameter configuration; select a baseline session from the plurality of s
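Two mechanisms in the dependent claims lend themselves to a short illustration: handing the next unassigned configuration to whichever session finishes scoring (claim 3), and dropping candidates that lie within a cache tolerance of an already evaluated configuration (claims 8 and 9). The sketch below is a hedged approximation, not the claimed implementation. The names `run_sessions`, `drop_cached`, and `within_tolerance` are hypothetical, worker threads pulling from a shared queue stand in for the manager reassigning sessions, and the per-hyperparameter absolute-difference test is an assumed distance measure that the claim preview does not specify.

```python
import queue
import threading


def within_tolerance(a, b, tol):
    """Assumed closeness test: every numeric hyperparameter differs by <= tol."""
    return all(abs(a[name] - b[name]) <= tol for name in a)


def drop_cached(candidates, cache, tol):
    """Remove candidates that are within tol of any configuration in the cache.

    Cache keys are tuples of sorted (name, value) pairs, as built below.
    """
    return [
        cand for cand in candidates
        if not any(within_tolerance(cand, dict(seen), tol) for seen in cache)
    ]


def run_sessions(configs, objective, num_sessions, cache):
    """Evaluate configs with num_sessions workers; a worker that finishes
    scoring immediately takes the next unassigned configuration."""
    pending = queue.Queue()
    for config in configs:
        pending.put(config)
    lock = threading.Lock()

    def session():
        while True:
            try:
                config = pending.get_nowait()  # next unassigned configuration
            except queue.Empty:
                return  # nothing left to assign to this session
            value = objective(config)  # stands in for train + score
            with lock:
                # Store the objective value with its configuration.
                cache[tuple(sorted(config.items()))] = value

    workers = [threading.Thread(target=session) for _ in range(num_sessions)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return cache


if __name__ == "__main__":
    cache = {}
    toy = lambda c: (c["x"] - 1.0) ** 2  # hypothetical objective function
    run_sessions([{"x": 0.0}, {"x": 1.0}, {"x": 2.0}], toy, 2, cache)
    second_round = drop_cached([{"x": 1.0000001}, {"x": 3.0}], cache, 1e-3)
    run_sessions(second_round, toy, 2, cache)
    print("best configuration:", dict(min(cache, key=cache.get)))
```

Filtering the second batch against the cache before assignment avoids spending a training and scoring cycle on a configuration that is effectively a repeat.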

Assignees

  • Sas Inst Inc

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • G06N20/00 (Primary)

    Machine learning · CPC title

  • using ranking · CPC title

  • Evolutionary algorithms, e.g. genetic algorithms or genetic programming · CPC title

  • using kernel methods, e.g. support vector machines [SVM] · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10360517B2 cover?
A computing device automatically selects hyperparameter values based on objective criteria to train a predictive model. Each session of a plurality of sessions executes training and scoring of a model type using an input dataset in parallel with other sessions of the plurality of sessions. Unique hyperparameter configurations are determined using a search method and assigned to each session. Fo…
Who is the assignee on this patent?
Sas Inst Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 23, 2019 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).