Determining quality of machine learning model output

US12585998B2 · US · B2

Patent metadata
Publication number: US-12585998-B2
Application number: US-202318170679-A
Country: US
Kind code: B2
Filing date: Feb 17, 2023
Priority date: Feb 17, 2023
Publication date: Mar 24, 2026
Grant date: Mar 24, 2026

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some aspects, a computing system may generate uninformative features that may be added to a dataset of real features to use as a baseline for determining the quality of an explanation of model output. The uninformative features may be features that do not correlate with what a model is tasked with predicting (e.g., the uninformative features may be random values), and the real features may be informative and correlate with what the model is tasked with predicting (e.g., variables of a dataset sample). A machine learning model may be trained on a dataset that includes both the real features and the uninformative features. The computing system may generate feature attributions for model output, which may include feature attributions for the uninformative features and the real features in the dataset.
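The workflow in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the patent: the synthetic data, the function names (`add_uninformative_features`, `train_logistic`, `local_ranking`), the choice of Gaussian noise as the uninformative features, the small logistic regression, and the use of `weight * value` as a stand-in for a feature attribution method.

```python
import math
import random


def add_uninformative_features(rows, feature_names, n_noise, seed=0):
    """Append n_noise columns of Gaussian noise to every sample."""
    rng = random.Random(seed)
    noise_names = [f"noise_{i}" for i in range(n_noise)]
    augmented = [list(row) + [rng.gauss(0.0, 1.0) for _ in noise_names]
                 for row in rows]
    return augmented, list(feature_names) + noise_names, noise_names


def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights, bias = [0.0] * len(rows[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            z = bias + sum(w * v for w, v in zip(weights, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def local_ranking(weights, names, sample):
    """Rank features for one sample by |weight * value|, largest first."""
    scores = {n: w * v for n, w, v in zip(names, weights, sample)}
    return sorted(scores, key=lambda n: abs(scores[n]), reverse=True)


# Synthetic data (hypothetical): the label depends only on the real features.
rng = random.Random(42)
real_rows = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
labels = [1 if x0 + 0.5 * x1 > 0 else 0 for x0, x1 in real_rows]

# Combine real and uninformative features, train, then explain one sample.
rows, names, noise_names = add_uninformative_features(
    real_rows, ["x0", "x1"], n_noise=3)
weights, bias = train_logistic(rows, labels)
ranking = local_ranking(weights, names, rows[0])
print(ranking)
```

The printed ranking orders the combined features for the first sample. Where a `noise_*` column appears relative to `x0` and `x1` is the baseline signal the abstract describes: real features that fall below a noise column are the ones whose attribution is hard to distinguish from chance.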

First claim

Opening claim text (preview).

What is claimed is:

1. A system for facilitating efficacy of explanations of machine learning model output through use of uninformative features, the system comprising: one or more processors and one or more non-transitory media having instructions recorded thereon that, when executed by the one or more processors, cause operations comprising: obtaining a dataset comprising a set of real features and a set of samples, wherein each sample comprises a value for each feature in the set of real features; generating, based on the dataset, a set of uninformative features comprising one or more values for each sample in the set of samples, wherein the set of uninformative features are generated such that the uninformative features do not indicate correct classes of samples in the set of samples, wherein the set of uninformative features are combined with the set of real features to form a combined set of features; in connection with monitoring of a deployed model and detecting degraded performance of the deployed model, training, based on the dataset and the combined set of features, a machine learning model; generating a first local explanation associated with a first sample of the dataset and first output of the machine learning model, wherein the first local explanation indicates a first ranking of the combined set of features; determining, based on the first local explanation, that a first uninformative feature of the set of uninformative features is ranked higher than a first real feature of the set of real features, wherein the first uninformative feature ranking higher than the first real feature indicates that the first uninformative feature is more influential to the machine learning model; and based on the first uninformative feature ranking higher than the first real feature, removing the first real feature from the first local explanation to increase a trustworthiness of the first local explanation.

2. The system of claim 1, further comprising: generating a second local explanation using a second technique that is different from a first technique used to generate the first local explanation, wherein the second local explanation indicates a second ranking of the combined set of features; and based on the second ranking having fewer uninformative features within a threshold number of top ranked features of the second ranking, selecting the second local explanation for explaining the first output of the machine learning model.

3. The system of claim 1, wherein the instructions, when executed by the one or more processors, cause operations further comprising: based on the first uninformative feature ranking higher than the first real feature, generating a first weighting of the first real feature; based on the first uninformative feature ranking lower than a second real feature, generating a second weighting of the second real feature; and generating, based on the first weighting and the second weighting, a weighted metric for the first local explanation.

4. The system of claim 1, wherein the instructions, when executed by the one or more processors, cause operations further comprising: based on the first uninformative feature ranking higher than the first real feature, removing the first real feature from the dataset.

5. A method for facilitating efficacy of explanations of machine learning model output through use of uninformative features, the method comprising: obtaining a dataset comprising a set of real features and a set of samples, wherein each sample comprises a value for each feature in the set of real features; generating a set of uninformative features comprising one or more values for each sample in the set of samples, wherein the set of uninformative features are combined with the set of real features to form a combined set of features; in connection with monitoring of a deployed model and detecting degraded performance of the deployed model, training, based on the dataset and the combined set of features, a machine learning model; generating a first local explanation associated with a first sample of the dataset and first output of the machine learning model, wherein the first local explanation indicates a first ranking of the combined set of features; determining, based on the first local explanation, that a first uninformative feature of the set of uninformative features is ranked higher than a first real feature of the set of real features, wherein the first uninformative feature ranking higher than the first real feature indicates that the first uninformative feature is more influential to the machine learning model; and based on the first uninformative feature ranking higher than the first real feature, removing the first real feature from the first local explanation.

6. The method of claim 5, further comprising: generating a second local explanation using a second technique that is different from a first technique used to generate the first local explanation, wherein the second local explanation indicates a second ranking of the combined set of features; and based on the second ranking having fewer uninformative features within a threshold number of top ranked features of the second ranking, selecting the second local explanation for explaining the first output of the machine learning model.

7. The method of claim 5, further comprising: based on the first uninformative feature ranking higher than the first real feature, generating a first weighting of the first real feature; based on the first uninformative feature ranking lower than a second real feature, generating a second weighting of the second real feature; and generating, based on the first weighting and the second weighting, a weighted metric for the first local explanation.

8. The method of claim 5, further comprising: based on the first uninformative feature ranking higher than the first real feature, removing the first real feature from the dataset.

9. The method of claim 5, wherein the first uninformative feature is ranked higher than a lowest ranked real feature.

10. The method of claim 5, wherein the first real feature was generated using a feature transformation function, the method further comprising: based on the first uninformative feature ranking higher than the first real feature, inactivating the feature transformation function used to generate the first real feature.

11. The method of claim 5, wherein each value in the set of uninformative features comprises a randomly generated value.

12. The method of claim 5, wherein generating the set of uninformative features comprises: determining, based on a quantity of features in the set of real features, a threshold number of features; and generating a quantity of uninformative features that is less than or equal to the threshold number of features.

13. One or more non-transitory computer-readable media comprising instructions that, when executed by one or more processors, cause operations comprising: obtaining a dataset comprising a set of real features and a set of samples, wherein each sample comprises a value for each feature in the set of real features; generating a set of uninformative features comprising one or more values for each sample in the set of samples, wherein the set of uninformative features are combined with the set of real features to form a combined set of features; in connection with monitoring of a deployed model and detecting degraded performance of the deployed model, training, based on the dataset and the combined set of features, a machine learning model; generating a first local explanation associated with
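The ranking comparison in claims 1 and 5 and the explanation selection in claims 2 and 6 can be sketched as follows. This is one possible reading, not the patented implementation: the function names, the feature names (`income`, `age`, `zip`, `noise_0`, `noise_1`), and the technique labels are hypothetical, and the sketch assumes a local explanation is simply a list of feature names ordered from most to least influential. It also assumes the uninformative features themselves are dropped from the filtered explanation, which the claims do not state.

```python
def trusted_real_features(ranking, noise_features):
    """Keep only real features ranked above every uninformative feature.

    Any real feature that an uninformative feature outranks is removed,
    following the removal step recited in claims 1 and 5.
    """
    trusted = []
    for name in ranking:
        if name in noise_features:
            break  # everything after this point is outranked by noise
        trusted.append(name)
    return trusted


def noise_in_top_k(ranking, noise_features, k):
    """Count uninformative features among the k top-ranked features."""
    return sum(1 for name in ranking[:k] if name in noise_features)


def select_explanation(rankings, noise_features, k):
    """Return the technique whose top-k ranking contains the least noise."""
    return min(
        rankings,
        key=lambda technique: noise_in_top_k(
            rankings[technique], noise_features, k),
    )


# Hypothetical rankings of the same sample from two explanation techniques.
noise = {"noise_0", "noise_1"}
rankings = {
    "technique_a": ["income", "noise_0", "age", "noise_1", "zip"],
    "technique_b": ["income", "age", "noise_0", "zip", "noise_1"],
}

filtered = trusted_real_features(rankings["technique_a"], noise)
chosen = select_explanation(rankings, noise, k=2)
print(filtered)  # ['income']
print(chosen)    # technique_b
```

With `k=2`, `technique_a` has one uninformative feature among its top two and `technique_b` has none, so the second ranking is selected. The threshold `k` is a free parameter here; the claims refer only to "a threshold number of top ranked features".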

Assignees

Capital One Services LLC

Inventors

Classifications

  • G06N20/00 (Primary)

    Machine learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12585998B2 cover?
In some aspects, a computing system may generate uninformative features that may be added to a dataset of real features to use as a baseline for determining the quality of an explanation of model output. The uninformative features may be features that do not correlate with what a model is tasked with predicting (e.g., the uninformative features may be random values), and the real features may b…
Who is the assignee on this patent?
Capital One Services LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/00. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 24, 2026 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).