Optimizing parameters for machine learning models

US2019102693A1 · US · A1

Patent metadata
Publication number: US-2019102693-A1
Application number: US-201715721189-A
Country: US
Kind code: A1
Filing date: Sep 29, 2017
Priority date: Sep 29, 2017
Publication date: Apr 4, 2019
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An online system determines candidate parameter values to be used by a machine learning algorithm to train a machine learning model by saving historical datasets that include historical parameter searches and the performance of prior machine learning models that were trained on the historical parameters. Using the historical datasets, the online system identifies parameter predictors associated with a relation between candidate parameter values and properties of the training dataset that will be used to train the machine learning model. The online system trains the machine learning models according to the candidate parameter values and validates that the machine learning model is performing as expected. If the online system detects that the machine learning model is performing outside of an acceptable range, the online system determines new candidate parameter values and re-trains the machine learning model.
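The flow in the abstract (store prior searches, pick the relevant ones, predict starting parameter values) can be illustrated with a minimal sketch. Everything named here is hypothetical: `HistoricalRecord`, `select_subset`, and `predict_candidates` are illustrative names, and the score-weighted average is an assumed stand-in for the prediction step, which the abstract does not specify. The sketch also assumes numeric parameters and evaluation scores where higher is better.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class HistoricalRecord:
    """One stored historical dataset (hypothetical layout)."""
    params: dict[str, float]   # parameter values used to train a prior model
    score: float               # evaluation score of that prior model (assumed: higher is better)
    metadata: dict[str, str]   # descriptive metadata, e.g. the model type


def select_subset(history, request_metadata):
    """Keep records whose metadata matches every key/value in the request."""
    return [
        r for r in history
        if all(r.metadata.get(k) == v for k, v in request_metadata.items())
    ]


def predict_candidates(history, request_metadata):
    """Assumed predictor: score-weighted mean of each parameter over the subset."""
    subset = select_subset(history, request_metadata)
    if not subset:
        raise ValueError("no matching historical datasets")
    total = sum(r.score for r in subset)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    # Only parameters present in every matching record can be averaged.
    names = set.intersection(*(set(r.params) for r in subset))
    return {
        name: sum(r.score * r.params[name] for r in subset) / total
        for name in sorted(names)
    }


history = [
    HistoricalRecord({"learning_rate": 0.1, "depth": 4.0}, 0.5, {"model_type": "gbdt"}),
    HistoricalRecord({"learning_rate": 0.3, "depth": 8.0}, 1.5, {"model_type": "gbdt"}),
    HistoricalRecord({"learning_rate": 0.01, "depth": 2.0}, 0.9, {"model_type": "linear"}),
]
candidates = predict_candidates(history, {"model_type": "gbdt"})
print(candidates)
```

With the made-up history above, the two matching records pull the candidates toward the better-scoring run: a learning rate of about 0.25 and a depth of 7.0. A real system would train with these values, evaluate the result, and append a new record to the history.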

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: storing, by an online system, a plurality of historical datasets, each historical dataset comprising historical parameter values used to train a prior machine learning model, an evaluation score representing a performance of the prior machine learning model, and associated metadata descriptive of the prior machine learning model; receiving a request to train a machine learning model; predicting candidate parameter values for training the machine learning model, the candidate parameter values predicted based on a subset of the plurality of historical datasets; receiving training data for training the machine learning model; and training the machine learning model using the received training data according to the predicted candidate parameter values.

2. The method of claim 1, wherein determining candidate parameter values comprises: identifying at least one parameter predictor associated with a relationship between one or more parameters and a training dataset property; and determining candidate parameters based on the at least one parameter predictor by applying a prediction model.

3. The method of claim 2, wherein each of the one or more training dataset properties is one of a total number of training examples, statistical properties of a distribution of training labels over training examples, attributes of a time series of training examples, attributes of an entity, attributes of past activity performed by the entity, attributes of the online system, and attributes of an event predicted by the machine learning model.

4. The method of claim 1, wherein determining candidate parameter values comprises: for each candidate parameter, assigning a weight to the candidate parameter, the weight representing an impact of the candidate parameter on the performance of the prior machine learning model; and determining a value for each candidate parameter based on the weight assigned to the candidate parameter and one or more evaluation scores in the subset of the plurality of historical datasets.

5. The method of claim 1, wherein the subset of the plurality of historical datasets is identified by comparing the associated metadata of the prior machine learning model to information describing the machine learning model.

6. The method of claim 1, wherein the machine learning model generates a predicted output, wherein the predicted output corresponds to a likelihood of occurrence of a user interaction performed by a user of the online system on a content item.

7. The method of claim 6, further comprising generating an evaluation score for the trained machine learning model based on a comparison between the predicted output from the prediction model and ground truth data from evaluation data.

8. A non-transitory computer-readable medium comprising computer program code, the computer program code when executed by a processor of a client device causes the processor to: store, by an online system, a plurality of historical datasets, each historical dataset comprising historical parameter values used to train a prior machine learning model, an evaluation score representing a performance of the prior machine learning model, and associated metadata descriptive of the prior machine learning model; receive a request to train a machine learning model; predict candidate parameter values for training the machine learning model, the candidate parameter values predicted based on a subset of the plurality of historical datasets; receive training data for training the machine learning model; and train the machine learning model using the received training data according to the predicted candidate parameter values.

9. The non-transitory medium of claim 8, wherein the computer program code to determine candidate parameters further comprises computer program code that when executed by the processor causes the processor to: identify at least one parameter predictor associated with a relationship between one or more parameters and a training dataset property; and determine candidate parameters based on the at least one parameter predictor by applying a prediction model.

10. The non-transitory medium of claim 9, wherein each of the one or more training dataset properties is one of a total number of training examples, statistical properties of a distribution of training labels over training examples, attributes of a time series of training examples, attributes of an entity, attributes of past activity performed by the entity, attributes of the online system, and attributes of an event predicted by the machine learning model.

11. The non-transitory medium of claim 8, wherein the computer program code to determine candidate parameters further comprises computer program code that when executed by the processor causes the processor to: for each candidate parameter, assign a weight to the candidate parameter, the weight representing an impact of the candidate parameter on the performance of the prior machine learning model; and determine a value for each candidate parameter based on the weight assigned to the candidate parameter and one or more evaluation scores in the subset of the plurality of historical datasets.

12. The non-transitory medium of claim 8, wherein the subset of the plurality of historical datasets is identified by comparing the associated metadata of the prior machine learning model to a type of the machine learning model.

13. The non-transitory medium of claim 8, wherein the machine learning model generates a predicted output, wherein the predicted output corresponds to a likelihood of occurrence of a user interaction performed by a user of the online system on a content item.

14. The non-transitory medium of claim 13, further comprising code that when executed by the processor of a client device causes the processor to: generate an evaluation score for the trained machine learning model based on a comparison between the predicted output from the prediction model and ground truth data from evaluation data.

15. A method comprising: determining an estimated performance score of a trained machine learning model that was trained using candidate parameter values predicted by a prediction model; generating a prediction error based on a difference between a predicted occurrence of an event obtained from the trained machine learning model and an actual output; determining that a difference between the estimated performance score and the generated prediction error exceeds a threshold error; and responsive to the determined difference being above the threshold error, triggering a corrective action for the trained prediction model.

16. The method of claim 15, wherein generating the prediction error comprises: applying features of a user of an online system and features of a content item as input to the trained machine learning model to obtain a predicted output; presenting the content item to the user of the online system based on the predicted output; responsive to presenting the content item, receiving the actual output indicating whether the event occurred; and comparing the predicted output of the trained machine learning model to the received actual output to generate a prediction error.

17. The method of claim 15, wherein the estimated performance score comprises an expected mean and expected standard deviation of an expected error, and wherein the threshold error is based on the expected standard deviation of the expected error.

18. The method
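The monitoring method of claims 15 to 17 compares an expected error against the error observed after deployment and triggers a corrective action when the gap is too large. The sketch below is one possible reading, not the patented implementation: the function names are hypothetical, mean absolute error is an assumed error measure, and the multiplier `k` is an assumed way of basing the threshold on the expected standard deviation (claim 17 only says the threshold is based on it).

```python
from statistics import fmean


def prediction_error(predicted, actual):
    """Assumed error measure: mean absolute difference between predicted
    likelihoods and observed outcomes (1 = event occurred, 0 = it did not)."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal length")
    return fmean([abs(p - a) for p, a in zip(predicted, actual)])


def needs_corrective_action(expected_mean, expected_std, observed_error, k=3.0):
    """True when the observed error is more than k expected standard
    deviations away from the expected mean error (k is an assumption)."""
    return abs(observed_error - expected_mean) > k * expected_std


predicted = [0.9, 0.2, 0.8, 0.1]

# Outcomes that agree with the predictions: error stays near the expected mean.
healthy_error = prediction_error(predicted, [1, 0, 1, 0])

# Outcomes that contradict the predictions: error moves far from the expected mean.
drifted_error = prediction_error(predicted, [0, 1, 0, 1])

for label, err in [("healthy", healthy_error), ("drifted", drifted_error)]:
    if needs_corrective_action(expected_mean=0.15, expected_std=0.02, observed_error=err):
        print(label, "-> trigger corrective action, e.g. re-predict parameters and re-train")
    else:
        print(label, "-> within expected range")
```

In this toy run only the drifted case crosses the threshold. The abstract describes the corrective action as determining new candidate parameter values and re-training the model.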

Assignees

Facebook Inc

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Activation functions · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • using kernel methods, e.g. support vector machines [SVM] · CPC title

  • G06N20/20 (Primary)

    Ensemble learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2019102693A1 cover?
An online system determines candidate parameter values to be used by a machine learning algorithm to train a machine learning model by saving historical datasets that include historical parameter searches and the performance of prior machine learning models that were trained on the historical parameters. Using the historical datasets, the online system identifies parameter predictors associated…
Who is the assignee on this patent?
Facebook Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/20 (Ensemble learning). Mapped technology areas include Physics.
When was this patent published?
Published Apr 4, 2019 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).