Methods and systems for evaluating training objects by a machine learning algorithm
US-2019034830-A1 · Jan 31, 2019 · US
US10600000B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10600000-B2 |
| Application number | US-201615368447-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 2, 2016 |
| Priority date | Dec 4, 2015 |
| Publication date | Mar 24, 2020 |
| Grant date | Mar 24, 2020 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage medium, for regularizing feature weights maintained by a machine learning model. The method includes actions of obtaining a set of training data that includes multiple training feature vectors, and training the machine learning model on each of the training feature vectors, comprising, for each feature vector and for each of a plurality of the features of the feature vector: determining a first loss for the feature vector with the feature, determining a second loss for the feature vector without the feature, and updating a current benefit score for the feature using the first loss and the second loss, wherein the benefit score for the feature is indicative of the usefulness of the feature in generating accurate predicted outcomes for training feature vectors.
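The per-feature loss comparison described in the abstract can be illustrated with a minimal sketch. The code below is not taken from the patent. It assumes a logistic-regression model with sparse feature dictionaries, treats "without the feature" as dropping the feature from the input (equivalent to a zero weight), and accumulates the benefit score as a running sum of (loss without the feature minus loss with the feature). The names `log_loss` and `update_benefit_scores` are hypothetical.

```python
import math


def log_loss(weights, x, y):
    """Logistic loss for one example.

    weights and x are dicts mapping feature name -> float; y is 0 or 1.
    Assumption: a linear model with a sigmoid output (not specified by the source).
    """
    z = sum(weights.get(f, 0.0) * v for f, v in x.items())
    s = z if y == 1 else -z
    # Numerically stable form of log(1 + exp(-s)).
    return max(-s, 0.0) + math.log1p(math.exp(-abs(s)))


def update_benefit_scores(weights, benefit, x, y):
    """Update the running benefit score of every feature present in x.

    Assumed sign convention: a positive score means that removing the
    feature increased the loss, i.e. the feature helped on this example.
    """
    loss_with = log_loss(weights, x, y)  # first loss: feature present
    for f in x:
        x_without = {g: v for g, v in x.items() if g != f}
        loss_without = log_loss(weights, x_without, y)  # second loss: feature absent
        benefit[f] = benefit.get(f, 0.0) + (loss_without - loss_with)
    return benefit


# Example: feature "a" pushes the prediction toward the true label, "b" away from it.
weights = {"a": 2.0, "b": -1.0}
benefit = update_benefit_scores(weights, {}, {"a": 1.0, "b": 1.0}, 1)
```

In this example `benefit["a"]` comes out positive and `benefit["b"]` negative, which matches the intuition that the score tracks how much a feature contributes to accurate predictions.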
Opening claim text (preview).
What is claimed is:
1. A method for training a machine learning model that is configured to receive as input a feature vector that includes a plurality of features and to generate a predicted output from the feature vector, the method comprising: obtaining a set of training data that includes multiple training feature vectors; and training the machine learning model on each of the training feature vectors, comprising, for each training feature vector: for each feature of a plurality of the features of the training feature vector: determining a first loss for the training feature vector with the feature; determining a second loss for the training feature vector without the feature; updating a current benefit score for the feature using the first loss and the second loss, wherein the current benefit score for the feature is indicative of a usefulness of the feature in generating accurate predicted outcomes for training feature vectors; and regularizing the machine learning model based on the current benefit score for the feature.
2. The method of claim 1, wherein updating the current benefit score for the feature comprises determining a difference between the first loss and the second loss and updating the current benefit score using the difference.
3. The method of claim 1, wherein determining the first loss for the training feature vector with the feature is based on the feature being associated with an unregularized feature weight for the feature that is determined in an immediately preceding training iteration.
4. The method of claim 1, wherein determining the second loss for the training feature vector without the feature is based on the feature being associated with a weight that reduces an impact of the feature on the outcome generated by the machine learning model.
5. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature satisfies the predetermined benefit threshold, scaling an unregularized weight associated with the feature based on the current benefit score.
6. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature does not satisfy the predetermined benefit threshold, scaling an unregularized weight associated with the feature to a value that eliminates the feature from consideration by the machine learning model when making predictions.
7. The method of claim 1, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and removing the feature from the machine learning model based on a determination that the current benefit score did not satisfy the predetermined benefit threshold.
8. The method of claim 1, the method further comprising: ranking each of the features based on the current benefit score associated with each respective feature.
9. The method of claim 8, the method further comprising: determining a predetermined number of features to include in the machine learning model; and selecting the predetermined number of features based on the ranking.
10. The method of claim 9, wherein determining the predetermined number of features to include in the learning model is based on an amount of available storage space to store the machine learning model.
11. The method of claim 1, wherein the machine learning model is an online learning model.
12. A system for training a machine learning model that is configured to receive as input a feature vector that includes a plurality of features and to generate a predicted output from the feature vector, the system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining a set of training data that includes multiple training feature vectors; and training the machine learning model on each of the training feature vectors, comprising, for each training feature vector: for each feature of a plurality of the features of the training feature vector: determining a first loss for the training feature vector with the feature; determining a second loss for the training feature vector without the feature; updating a current benefit score for the feature using the first loss and the second loss, wherein the current benefit score for the feature is indicative of a usefulness of the feature in generating accurate predicted outcomes for training feature vectors; and regularizing the machine learning model based on the current benefit score for the feature.
13. The system of claim 12, wherein updating the current benefit score for the feature comprises determining a difference between the first loss and the second loss and updating the current benefit score using the difference.
14. The system of claim 12, wherein determining the first loss for the training feature vector with the feature is based on the feature being associated with an unregularized feature weight for the feature that is determined in an immediately preceding training iteration.
15. The system of claim 12, wherein determining the second loss for the training feature vector without the feature is based on the feature being scaled by a weight that reduces an impact of the feature on the outcome generated by the machine learning model.
16. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature satisfies the predetermined benefit threshold, scaling an unregularized weight associated with the feature based on the current benefit score.
17. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and in response to determining that the current benefit score for the feature does not satisfy the predetermined benefit threshold, scaling an unregularized weight associated with the feature to a value that eliminates the feature from consideration by the machine learning model when making predictions.
18. The system of claim 12, wherein regularizing the machine learning model based on the current benefit score for the feature, comprises: determining whether the current benefit score for the feature satisfies a predetermined benefit threshold; and removing the feature from the machine learning model based on a determination that the current benefit score did not satisfy the predetermined benefit threshold.
19. The system of claim 12, the operations further comprising: ranking each of the features based on the current benefit score associated with each respective feature.
20. T
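The threshold-based regularization and the ranking step recited in the dependent claims can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the claims do not say how a weight is scaled by its benefit score, so the shrink factor `score / (1 + score)` is a hypothetical choice, "satisfies the threshold" is assumed to mean "strictly greater than", and the threshold is assumed to be non-negative. The function names `regularize` and `select_top_features` are hypothetical.

```python
def regularize(weights, benefit, threshold=0.0):
    """Return regularized weights based on per-feature benefit scores.

    Features whose score exceeds the threshold keep a scaled weight;
    all others are set to 0.0, which removes them from predictions.
    Assumption: threshold >= 0, so the shrink factor stays in (0, 1).
    """
    if threshold < 0.0:
        raise ValueError("this sketch assumes a non-negative threshold")
    regularized = {}
    for f, w in weights.items():
        score = benefit.get(f, 0.0)
        if score > threshold:
            # Hypothetical scaling rule: higher benefit means less shrinkage.
            regularized[f] = w * score / (1.0 + score)
        else:
            regularized[f] = 0.0
    return regularized


def select_top_features(benefit, k):
    """Rank features by benefit score (highest first) and keep the top k."""
    return sorted(benefit, key=benefit.get, reverse=True)[:k]


weights = {"a": 2.0, "b": -1.0, "c": 0.5}
benefit = {"a": 1.0, "b": -0.2, "c": 3.0}
print(regularize(weights, benefit))      # {'a': 1.0, 'b': 0.0, 'c': 0.375}
print(select_top_features(benefit, 2))   # ['c', 'a']
```

In a storage-constrained setting, `k` could be derived from the space available for the model, in the spirit of the claim that ties the number of retained features to available storage; how that number is computed is not specified in the text above.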
Machine learning · CPC title