Regularization of machine learning models

US10600000B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10600000-B2
Application number: US-201615368447-A
Country: US
Kind code: B2
Filing date: Dec 2, 2016
Priority date: Dec 4, 2015
Publication date: Mar 24, 2020
Grant date: Mar 24, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for regularizing feature weights maintained by a machine learning model. The method includes actions of obtaining a set of training data that includes multiple training feature vectors, and training the machine learning model on each of the training feature vectors, comprising, for each feature vector and for each of a plurality of the features of the feature vector: determining a first loss for the feature vector with the feature, determining a second loss for the feature vector without the feature, and updating a current benefit score for the feature using the first loss and the second loss, wherein the benefit score for the feature is indicative of the usefulness of the feature in generating accurate predicted outcomes for training feature vectors.
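The per-feature loss comparison described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the logistic loss, the sparse dictionary representation, treating "without the feature" as dropping the feature from the vector, and accumulating the benefit score as a running sum of (second loss minus first loss) are all assumptions made for illustration. The function names `softplus`, `logistic_loss`, and `update_benefit_scores` are hypothetical.

```python
import math
from collections import defaultdict


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def logistic_loss(weights, features, label):
    """Logistic loss of a linear model on a sparse vector {name: value}.

    Assumption: binary labels in {0, 1}; the abstract does not fix a loss.
    """
    z = sum(weights.get(name, 0.0) * value for name, value in features.items())
    return softplus(-z) if label == 1 else softplus(z)


def update_benefit_scores(weights, benefit, features, label):
    """Update each feature's benefit score from one training feature vector."""
    # First loss: the prediction is made with the feature present.
    first_loss = logistic_loss(weights, features, label)
    for name in features:
        # Second loss: the same vector with this one feature left out
        # (assumed here to mean its contribution is removed entirely).
        without = {k: v for k, v in features.items() if k != name}
        second_loss = logistic_loss(weights, without, label)
        # Assumed sign convention: positive when the feature lowers the loss.
        benefit[name] += second_loss - first_loss
    return benefit


weights = {"a": 2.0, "b": 0.0}
benefit = defaultdict(float)
update_benefit_scores(weights, benefit, {"a": 1.0, "b": 1.0}, label=1)
```

In this toy run, feature `a` carries a weight that pushes the prediction toward the correct label, so its score rises, while `b` has zero weight and its score stays unchanged. A real system might use a decayed average or another update rule instead of a plain sum.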

First claim

Opening claim text (preview).

What is claimed is:

1. A method for training a machine learning model that is configured to receive as input a feature vector that includes a plurality of features and to generate a predicted output from the feature vector, the method comprising: obtaining a set of training data that includes multiple training feature vectors; and training the machine learning model on each of the training feature vectors, comprising, for each training feature vector: for each feature of a plurality of the features of the training feature vector: determining a first loss for the training feature vector with the feature; determining a second loss for the training feature vector without the feature; updating a current benefit score for the feature using the first loss and the second loss, wherein the current benefit score for the feature is indicative of a usefulness of the feature in generating accurate predicted outcomes for training feature vectors; and regularizing the machine learning model based on the current benefit score for the feature.

2. The method of claim 1, wherein updating the current benefit score for the feature comprises determining a difference between the first loss and the second loss and updating the current benefit score using the difference.

3. The method of claim 1, wherein determining the first loss for the training feature vector with the feature is based on the feature being associated with an unregularized feature weight for the feature that is determined in an immediately preceding training iteration.

4. The method of claim 1, wherein determining the second loss for the training feature vector without the feature is based on the feature being associated with a weight that reduces an impact of the feature on the outcome generated by the machine learning model.

5. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature satisfies the predetermined benefit threshold, scaling an unregularized weight associated with the feature based on the current benefit score.

6. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature does not satisfy the predetermined benefit threshold, scaling an unregularized weight associated with the feature to a value that eliminates the feature from consideration by the machine learning model when making predictions.

7. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and removing the feature from the machine learning model based on a determination that the current benefit score did not satisfy the predetermined benefit threshold.

8. The method of claim 1, the method further comprising: ranking each of the features based on the current benefit score associated with each respective feature.

9. The method of claim 8, the method further comprising: determining a predetermined number of features to include in the machine learning model; and selecting the predetermined number of features based on the ranking.

10. The method of claim 9, wherein determining the predetermined number of features to include in the learning model is based on an amount of available storage space to store the machine learning model.

11. The method of claim 1, wherein the machine learning model is an online learning model.

12. A system for training a machine learning model that is configured to receive as input a feature vector that includes a plurality of features and to generate a predicted output from the feature vector, the system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining a set of training data that includes multiple training feature vectors; and training the machine learning model on each of the training feature vectors, comprising, for each training feature vector: for each feature of a plurality of the features of the training feature vector: determining a first loss for the training feature vector with the feature; determining a second loss for the training feature vector without the feature; updating a current benefit score for the feature using the first loss and the second loss, wherein the current benefit score for the feature is indicative of a usefulness of the feature in generating accurate predicted outcomes for training feature vectors; and regularizing the machine learning model based on the current benefit score for the feature.

13. The system of claim 12, wherein updating the current benefit score for the feature comprises determining a difference between the first loss and the second loss and updating the current benefit score using the difference.

14. The system of claim 12, wherein determining the first loss for the training feature vector with the feature is based on the feature being associated with an unregularized feature weight for the feature that is determined in an immediately preceding training iteration.

15. The system of claim 12, wherein determining the second loss for the training feature vector without the feature is based on the feature being scaled by a weight that reduces an impact of the feature on the outcome generated by the machine learning model.

16. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature satisfies the predetermined benefit threshold, scaling an unregularized weight associated with the feature based on the current benefit score.

17. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature does not satisfy the predetermined benefit threshold, scaling an unregularized weight associated with the feature to a value that eliminates the feature from consideration by the machine learning model when making predictions.

18. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and removing the feature from the machine learning model based on a determination that the current benefit score did not satisfy the predetermined benefit threshold.

19. The system of claim 12, the operations further comprising: ranking each of the features based on the current benefit score associated with each respective feature.

20. T
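The threshold-based regularization in claims 5 to 7 and the ranking and selection in claims 8 to 10 can be sketched as follows. This is an illustrative reading under stated assumptions, not the claimed implementation: "satisfies the threshold" is taken to mean greater than or equal, the scaling factor `s / (1 + s)` is an arbitrary choice that grows with the score, and the feature budget is derived from a hypothetical fixed per-feature storage cost. The names `regularize` and `select_top_features` are hypothetical.

```python
def regularize(weights, benefit, threshold):
    """Return regularized weights derived from unregularized ones.

    Features whose benefit score meets the threshold keep a scaled weight;
    the rest are set to zero, which removes them from predictions.
    """
    regularized = {}
    for name, weight in weights.items():
        score = benefit.get(name, 0.0)
        if score >= threshold:  # assumed meaning of "satisfies"
            s = max(score, 0.0)
            # Assumed scaling: a multiplier in [0, 1) that grows with the score.
            regularized[name] = weight * (s / (1.0 + s))
        else:
            regularized[name] = 0.0
    return regularized


def select_top_features(benefit, storage_bytes, bytes_per_feature):
    """Rank features by benefit score and keep as many as storage allows.

    Assumption: every feature costs the same number of bytes to store.
    """
    budget = storage_bytes // bytes_per_feature
    ranked = sorted(benefit, key=benefit.get, reverse=True)
    return ranked[:budget]


weights = {"a": 2.0, "b": 1.0, "c": -3.0}
benefit = {"a": 1.0, "b": 0.0, "c": 3.0}
regularized = regularize(weights, benefit, threshold=0.5)
kept = select_top_features(benefit, storage_bytes=16, bytes_per_feature=8)
```

Here `b` falls below the threshold and is zeroed out, while `a` and `c` keep shrunken weights; with room for two features, the ranking keeps `c` and `a`.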

Assignees

Inventors

Classifications

  • G06N20/00 · Primary

    Machine learning · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10600000B2 cover?
Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for regularizing feature weights maintained by a machine learning model. The method includes actions of obtaining a set of training data that includes multiple training feature vectors, and training the machine learning model on each of the training feature vectors, comprising, for each feature vect…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 24, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).