Automatic feature subset selection based on meta-learning

US11615265B2 · US · B2

Patent metadata
Publication number: US-11615265-B2
Application number: US-201916547312-A
Country: US
Kind code: B2
Filing date: Aug 21, 2019
Priority date: Apr 15, 2019
Publication date: Mar 28, 2023
Grant date: Mar 28, 2023

Abstract

Official abstract text for this publication.

The present invention relates to dimensionality reduction for machine learning (ML) models. Herein are techniques that individually rank features and combine features based on their rank to achieve an optimal combination of features that may accelerate training and/or inferencing, prevent overfitting, and/or provide insights into somewhat mysterious datasets. In an embodiment, a computer ranks features of datasets of a training corpus. For each dataset and for each landmark percentage, a target ML model is configured to receive only a highest ranking landmark percentage of features, and a landmark accuracy achieved by training the ML model with the dataset is measured. Based on the landmark accuracies and meta-features values of the dataset, a respective training tuple is generated for each dataset. Based on all of the training tuples, a regressor is trained to predict an optimal amount of features for training the target ML model.
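The workflow in the abstract can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not the patented implementation: the landmark percentages, the class-mean-gap ranking score, the nearest-centroid stand-in for the target ML model, the synthetic datasets, and the nearest-neighbour stand-in for the regressor are all hypothetical choices made only to keep the example self-contained. The abstract also does not say how the regressor's training label is obtained, so the sketch simply uses the best-scoring landmark percentage.

```python
import random
from statistics import mean

LANDMARKS = (0.25, 0.5, 0.75, 1.0)  # assumed landmark percentages


def make_dataset(rng, n_samples, n_informative, n_noise):
    """Synthetic two-class data: informative columns shift with the label, the rest are noise."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate labels so every split contains both classes
        row = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
        row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
        X.append(row)
        y.append(label)
    return X, y


def rank_features(X, y):
    """Return feature indices, best first, scored by the gap between class means (assumed scorer)."""
    def gap(j):
        ones = [row[j] for row, lab in zip(X, y) if lab == 1]
        zeros = [row[j] for row, lab in zip(X, y) if lab == 0]
        return abs(mean(ones) - mean(zeros))
    return sorted(range(len(X[0])), key=gap, reverse=True)


def centroid_accuracy(X_train, y_train, X_test, y_test, cols):
    """Train a nearest-centroid stand-in model on `cols` only and return its test accuracy."""
    centroids = {}
    for label in set(y_train):
        rows = [r for r, lab in zip(X_train, y_train) if lab == label]
        centroids[label] = [mean(r[j] for r in rows) for j in cols]

    def predict(row):
        return min(centroids,
                   key=lambda c: sum((row[j] - m) ** 2 for j, m in zip(cols, centroids[c])))

    return sum(predict(r) == lab for r, lab in zip(X_test, y_test)) / len(y_test)


def training_tuple(X, y, landmarks=LANDMARKS):
    """Build (meta-features + landmark accuracies, label) for one dataset."""
    half = len(X) // 2  # first half trains, second half measures accuracy
    ranking = rank_features(X[:half], y[:half])
    accs = []
    for p in landmarks:
        k = max(1, round(p * len(ranking)))  # keep only the top-ranked fraction p
        accs.append(centroid_accuracy(X[:half], y[:half], X[half:], y[half:], ranking[:k]))
    meta = [len(X[0]), len(X[0]) / len(X)]  # feature count, features-to-samples ratio
    label = landmarks[accs.index(max(accs))]  # assumed label: best-scoring landmark
    return meta + accs, label


def fit_nearest_neighbour(tuples):
    """Stand-in regressor: return the label of the closest training tuple."""
    def predict(vector):
        nearest = min(tuples, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], vector)))
        return nearest[1]
    return predict


rng = random.Random(0)
corpus = [make_dataset(rng, 60, n_inf, 8 - n_inf) for n_inf in (1, 2, 4, 6)]
tuples = [training_tuple(X, y) for X, y in corpus]
regressor = fit_nearest_neighbour(tuples)

# A new dataset is described the same way, then handed to the regressor.
new_vector, _ = training_tuple(*make_dataset(rng, 60, 3, 5))
predicted_fraction = regressor(new_vector)
```

Four synthetic datasets are far too few for a useful meta-learner (claim 8 mentions a corpus of more than one hundred datasets); the point here is only the shape of each training tuple and the order of the steps.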

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: ranking a plurality of features of a plurality of datasets; for each dataset of the plurality of datasets: for each percentage of a plurality of distinct percentages: configuring a machine learning (ML) model to receive only a highest ranking said percentage of the plurality of features, and measuring an accuracy, of a respective plurality of accuracies, achieved by training the ML model with the dataset; and generating, based on a plurality of meta-features values of the dataset, a respective tuple of a plurality of training tuples that contains the respective plurality of accuracies; and training, based on the plurality of training tuples that contain respective pluralities of accuracies, a regressor that predicts a count of features of the plurality of features to configure the ML model to be most accurate.

2. The method of claim 1 further comprising: receiving a new dataset; for each percentage of the plurality of distinct percentages: configuring the ML model to receive only a highest ranking said percentage of the plurality of features, and measuring a new accuracy, of a new plurality of accuracies, achieved by training the ML model with the new dataset; and generating, based on the new plurality of accuracies and a new plurality of meta-features values of the new dataset, a new tuple; predicting, by the regressor and based on the new tuple, a new count of features of the plurality of features.

3. The method of claim 2 further comprising: configuring the ML model to receive only a highest ranking said new count of features of the plurality of features; measuring an empirical accuracy achieved by training the ML model with the new dataset and said highest ranking said new count of features; selecting a subset of the plurality of features based on the empirical accuracy and the new plurality of accuracies.

4. The method of claim 3 wherein a size of the plurality of features does not affect complexity of said selecting the subset of the plurality of features.

5. The method of claim 1 wherein said predicts the count of features of the plurality of features comprises: predicts a percentage of the plurality of features, or predicts an optimal subset of the plurality of features.

6. The method of claim 1 wherein: said ranking the plurality of features comprises a plurality of rankings of the plurality of features; said for each dataset comprises for each dataset and each particular ranking of the plurality of rankings; said highest ranking said percentage of the plurality of features comprises said highest ranking by the particular ranking; said predicts said count of features of the plurality of features comprises predicts a respective count of features for each particular ranking of the plurality of rankings.

7. The method of claim 1 wherein the regressor comprises a random forest.

8. The method of claim 1 wherein a count of the plurality of datasets exceeds one hundred.

9. The method of claim 1 wherein the plurality of meta-features values of each dataset of the plurality of datasets comprises at least one selected from the group consisting of: a count of the plurality of features that are numeric, a count of the plurality of features that are not numeric, a ratio of the count of the plurality of features to the count of samples in the dataset, a count of classes of samples in the dataset that the ML model can recognize, a minority count of samples in the dataset of a least frequent class of said classes, a majority count of samples in the dataset of a most frequent class of said classes, and a ratio of the minority count to the majority count.

10. The method of claim 1 wherein the accuracy comprises: a precision, a recall, or an F score.

11. One or more non-transitory computer-readable storage media storing instructions that, when executed by one or more processors, cause: ranking a plurality of features of a plurality of datasets; for each dataset of the plurality of datasets: for each percentage of a plurality of distinct percentages: configuring a machine learning (ML) model to receive only a highest ranking said percentage of the plurality of features, and measuring an accuracy, of a respective plurality of accuracies, achieved by training the ML model with the dataset; and generating, based on a plurality of meta-features values of the dataset, a respective tuple of a plurality of training tuples that contains the respective plurality of accuracies; and training, based on the plurality of training tuples that contain respective pluralities of accuracies, a regressor that predicts a count of features of the plurality of features to configure the ML model to be most accurate.

12. The one or more non-transitory computer-readable storage media of claim 11 wherein the instructions further cause: receiving a new dataset; for each percentage of the plurality of distinct percentages: configuring the ML model to receive only a highest ranking said percentage of the plurality of features, and measuring a new accuracy, of a new plurality of accuracies, achieved by training the ML model with the new dataset; and generating, based on the new plurality of accuracies and a new plurality of meta-features values of the new dataset, a new tuple; predicting, by the regressor and based on the new tuple, a new count of features of the plurality of features.

13. The one or more non-transitory computer-readable storage media of claim 12 wherein the instructions further cause: configuring the ML model to receive only a highest ranking said new count of features of the plurality of features; measuring an empirical accuracy achieved by training the ML model with the new dataset and said highest ranking said new count of features; selecting a subset of the plurality of features based on the empirical accuracy and the new plurality of accuracies.

14. The one or more non-transitory computer-readable storage media of claim 13 wherein a size of the plurality of features does not affect complexity of said selecting the subset of the plurality of features.

15. The one or more non-transitory computer-readable storage media of claim 11 wherein said predicts the count of features of the plurality of features comprises: predicts a percentage of the plurality of features, or predicts an optimal subset of the plurality of features.

16. The one or more non-transitory computer-readable storage media of claim 11 wherein: said ranking the plurality of features comprises a plurality of rankings of the plurality of features; said for each dataset comprises for each dataset and each particular ranking of the plurality of rankings; said highest ranking said percentage of the plurality of features comprises said highest ranking by the particular ranking; said predicts said count of features of the plurality of features comprises predicts a respective count of features for each particular ranking of the plurality of rankings.

17. The one or more non-transitory computer-readable storage media of claim 11 wherein the regressor comprises a random forest.

18. The one or more non-transitory computer-readable storage media of claim 11 wherein a count of the plurality of datasets exceeds one hundred.

19. The one or more non-transitory computer-readable storage media of claim 11 wherein the plurality of meta-features values of each dataset of the plurality of datasets comprises at least one selected from the group consisting of
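Several dependent claims name concrete quantities that are easy to illustrate: the meta-features of claim 9, the accuracy metrics of claim 10, and the final selection step of claim 3. The sketch below is an assumption-laden illustration, not the claimed implementation. The function names are hypothetical, a feature is treated as numeric if its value in the first row is an int or float, the class count is taken from the labels present in the dataset, and the selection rule (highest measured accuracy, ties broken toward fewer features) is only one plausible reading of "based on the empirical accuracy and the new plurality of accuracies".

```python
from collections import Counter


def meta_features(rows, labels):
    """Compute the dataset-level meta-features listed in claim 9."""
    n_features = len(rows[0])
    # Assumption: inspect the first row only; bool is excluded from "numeric".
    numeric = sum(isinstance(v, (int, float)) and not isinstance(v, bool) for v in rows[0])
    counts = Counter(labels)  # samples per class
    minority, majority = min(counts.values()), max(counts.values())
    return {
        "numeric_features": numeric,
        "non_numeric_features": n_features - numeric,
        "features_to_samples": n_features / len(rows),
        "classes": len(counts),
        "minority_count": minority,
        "majority_count": majority,
        "minority_to_majority": minority / majority,
    }


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F1 score for one positive class (claim 10)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


def select_feature_count(landmark_counts, landmark_accs, predicted_count, empirical_acc):
    """Pick the best-scoring count among the landmarks and the regressor's prediction."""
    candidates = list(zip(landmark_accs, landmark_counts))
    candidates.append((empirical_acc, predicted_count))
    # Highest accuracy wins; on a tie, prefer the smaller feature count.
    _, best_count = max(candidates, key=lambda c: (c[0], -c[1]))
    return best_count


rows = [[1.0, "red", 3], [2.5, "blue", 1], [0.3, "red", 7], [4.1, "green", 2]]
labels = ["a", "a", "a", "b"]
features = meta_features(rows, labels)

scores = precision_recall_f1([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])

chosen = select_feature_count([2, 4, 6, 8], [0.70, 0.81, 0.80, 0.78],
                              predicted_count=5, empirical_acc=0.83)
```

Note that `select_feature_count` compares a fixed number of candidates (the landmarks plus one prediction), so its cost does not grow with the number of features, which is in the spirit of claims 4 and 14.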

Assignees

Oracle Int Corp

Inventors

Classifications

  • using classification, e.g. of video objects

  • Machine learning

  • using neural networks

  • characterised by the process organisation or structure, e.g. boosting cascade

  • by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11615265B2 cover?
The present invention relates to dimensionality reduction for machine learning (ML) models. Herein are techniques that individually rank features and combine features based on their rank to achieve an optimal combination of features that may accelerate training and/or inferencing, prevent overfitting, and/or provide insights into somewhat mysterious datasets. In an embodiment, a computer ranks …
Who is the assignee on this patent?
Oracle Int Corp
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 28, 2023 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).