What technology area does this patent fall under?

Primary CPC classification G06V10/82. Mapped technology areas include Physics.

When was this patent published?

Publication date: Sep 19, 2024 (kind code A1). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

System and method for mitigating bias in classification scores generated by machine learning models

US2024312198A1 · US · A1

Patent metadata
Field	Value
Publication number	US-2024312198-A1
Application number	US-202418677329-A
Country	US
Kind code	A1
Filing date	May 29, 2024
Priority date	Jun 3, 2020
Publication date	Sep 19, 2024
Grant date	—

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection; read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing platform is configured to: (i) train a machine learning model by carrying out a machine learning process on a training data set, wherein the trained machine learning model is configured to (a) receive an input vector comprising respective values for a given set of input variables and (b) based on an evaluation of the received input vector, output a prediction of a given type, (ii) detect bias in the trained machine learning model, (iii) identify one or more input variable groups that contribute to the bias, (iv) mitigate the bias by producing a post-processed version of the trained machine learning model that comprises, for each respective input variable, a respective transformation in place of the respective input variable group, and (v) use the post-processed version of the trained machine learning model to output a given prediction of the given type for a given input vector.
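To make step (iv) of the abstract more concrete, the following is a minimal sketch of a post-processing wrapper, not the patented implementation. It assumes one transformation type mentioned in the dependent claims (compressing a variable toward a median value). The names `make_compressor`, `post_process`, `toy_model`, the `alpha` strength parameter, and the sample training values are all hypothetical and chosen only for illustration.

```python
from statistics import median
from typing import Callable, Dict, List

Vector = List[float]


def make_compressor(center: float, alpha: float) -> Callable[[float], float]:
    """Return a map that pulls a value toward `center`.

    alpha = 1.0 leaves the value unchanged; alpha = 0.0 replaces it with
    `center`, which fully neutralizes the variable. (`alpha` is an
    illustrative parameter, not a term from the patent.)
    """
    return lambda x: center + alpha * (x - center)


def post_process(
    model: Callable[[Vector], float],
    transforms: Dict[int, Callable[[float], float]],
) -> Callable[[Vector], float]:
    """Wrap `model` so that selected input positions are transformed first."""

    def wrapped(x: Vector) -> float:
        adjusted = [
            transforms[i](v) if i in transforms else v for i, v in enumerate(x)
        ]
        return model(adjusted)

    return wrapped


# Toy stand-in for a trained model: a fixed linear score (assumption).
def toy_model(x: Vector) -> float:
    return 0.5 * x[0] + 2.0 * x[1]


# Hypothetical training values for input variable 1, assumed to belong to a
# group that was flagged as contributing to bias.
training_values = [1.0, 2.0, 3.0, 4.0, 10.0]
center = median(training_values)  # 3.0

post_processed = post_process(toy_model, {1: make_compressor(center, alpha=0.5)})

print(toy_model([2.0, 10.0]))       # 21.0
print(post_processed([2.0, 10.0]))  # 14.0
```

In this sketch the original model is left untouched and only its inputs are adjusted, so the same wrapper would work with any callable model. How the compression strength is chosen in practice is not specified on this page.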

First claim

Opening claim text (preview).

What is claimed is:
1. A computing platform comprising: at least one processor; at least one non-transitory computer-readable medium; and program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the at least one processor, cause the computing platform to: train a machine learning model by carrying out a machine learning process on a training data set, wherein the trained machine learning model is configured to (i) receive an input vector comprising respective values for a given set of input variables and (ii) based on an evaluation of the received input vector, output a prediction of a given type; detect bias in the trained machine learning model; after detecting the bias in the trained machine learning model, identify one or more input variable groups that contribute to the bias; mitigate the bias in the trained machine learning model by producing a post-processed version of the trained machine learning model that comprises, for each respective input variable group of the identified one or more input variable groups, a respective transformation in place of the respective input variable group; and use the post-processed version of the trained machine learning model to output a given prediction of the given type for a given input vector.
2. The computing platform of claim 1, wherein the prediction of the given type comprises a score for use in rendering a classification decision.
3. The computing platform of claim 1, wherein the program instructions that, when executed by the at least one processor, cause the computing platform to detect the bias in the trained machine learning model comprise program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the at least one processor, cause the computing platform to: input a first set of input vectors associated with an unprotected class into the trained machine learning model and thereby produce a first set of scores associated with the unprotected class; input a second set of input vectors associated with a protected class into the trained machine learning model and thereby produce a second set of scores associated with the protected class; perform a comparison between the first set of scores and the second set of scores; and based on the comparison, determine that the trained machine learning model exhibits a threshold level of bias.
4. The computing platform of claim 3, wherein the program instructions that, when executed by the at least one processor, cause the computing platform to perform a comparison between the first set of scores and the second set of scores comprise program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the at least one processor, cause the computing platform to: perform a comparison between distributions of the first and second sets of scores based on a Wasserstein distance metric.
5. The computing platform of claim 1, wherein the program instructions that, when executed by the at least one processor, cause the computing platform to identify one or more input variable groups that contribute to the bias comprise program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the at least one processor, cause the computing platform to: divide the trained machine learning model's given set of input variables into a plurality of input variable groups, wherein each respective input variable group includes a respective subset of the given set of input variables; determine respective bias contribution values for the plurality of input variable groups using a score explainer function; and based on the respective bias contribution values that are determined for the plurality of input variable groups, identify the one or more input variable groups that contribute to the bias.
6. The computing platform of claim 5, wherein the program instructions that, when executed by the at least one processor, cause the computing platform to divide the trained machine learning model's given set of input variables into the plurality of input variable groups comprise program instructions stored on the at least one non-transitory computer-readable medium that, when executed by the at least one processor, cause the computing platform to: divide the trained machine learning model's given set of input variables into the plurality of input variable groups based on a clustering algorithm.
7. The computing platform of claim 1, wherein the respective transformation for at least one respective input variable group of the identified one or more input variable groups functions to at least partially neutralize the respective input variable group.
8. The computing platform of claim 1, wherein the respective transformation for at least one respective input variable group of the identified one or more input variable groups functions to compress each input variable in the respective input variable group.
9. The computing platform of claim 1, wherein the respective transformation for at least one respective input variable group of the identified one or more input variable groups functions to compress a distribution of each input variable in the respective input variable group towards a median value for the at least one input variable.
10. The computing platform of claim 1, wherein, for each respective input variable group of the identified one or more input variable groups, the respective transformation utilized in place of the respective input variable group comprises a variable-specific transformation for each input variable in the respective input variable group.
11. The computing platform of claim 1, wherein the bias comprises one or both of a positive bias component or a negative bias component.
12. A non-transitory computer-readable medium, wherein the non-transitory computer-readable medium is provisioned with program instructions that, when executed by at least one processor, cause a computing platform to: train a machine learning model by carrying out a machine learning process on a training data set, wherein the trained machine learning model is configured to (i) receive an input vector comprising respective values for a given set of input variables and (ii) based on an evaluation of the received input vector, output a prediction of a given type; detect bias in the trained machine learning model; after detecting the bias in the trained machine learning model, identify one or more input variable groups that contribute to the bias; mitigate the bias in the trained machine learning model by producing a post-processed version of the trained machine learning model that comprises, for each respective input variable group of the identified one or more input variable groups, a respective transformation in place of the respective input variable group; and use the post-processed version of the trained machine learning model to output a given prediction of the given type for a given input vector.
13. A method implemented by a computing platform, the method comprising: training a machine learning model by carrying out a machine learning process on a training data set, wherein the trained machine learning model is configured to (i) receive an input vector comprising respective values for a given set of input variables and (ii) based on an evaluation of the received input vector, output a prediction of a given type; detecting bias in the trained machine learning model; after detecting the bias in the trained machine learning model, identifying one or more in…
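Claims 3 and 4 describe detecting bias by comparing the score distributions of an unprotected class and a protected class using a Wasserstein distance metric. The sketch below illustrates that comparison for one-dimensional empirical score samples, computing the first-order Wasserstein distance as the area between the two empirical CDFs. It is an illustration only: the function names `wasserstein_1d` and `exhibits_bias`, the sample scores, and the threshold value of 0.1 are assumptions, and the patent text shown on this page does not specify how the threshold is set.

```python
from bisect import bisect_right
from typing import Sequence


def wasserstein_1d(u: Sequence[float], v: Sequence[float]) -> float:
    """First-order Wasserstein distance between two empirical samples.

    Computed as the integral of |F_u(x) - F_v(x)| over x, where F_u and F_v
    are the empirical CDFs of the two samples.
    """
    u_sorted, v_sorted = sorted(u), sorted(v)
    grid = sorted(set(u_sorted) | set(v_sorted))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both empirical CDFs are constant on [left, right).
        cdf_u = bisect_right(u_sorted, left) / len(u_sorted)
        cdf_v = bisect_right(v_sorted, left) / len(v_sorted)
        total += abs(cdf_u - cdf_v) * (right - left)
    return total


def exhibits_bias(
    unprotected_scores: Sequence[float],
    protected_scores: Sequence[float],
    threshold: float,
) -> bool:
    """Flag bias when the distance between score distributions reaches
    `threshold` (a hypothetical cutoff chosen by the analyst)."""
    return wasserstein_1d(unprotected_scores, protected_scores) >= threshold


# Made-up model scores for the two classes (illustrative data only).
unprotected = [0.25, 0.5, 0.75, 1.0]
protected = [0.0, 0.25, 0.5, 0.75]

print(wasserstein_1d(unprotected, protected))        # 0.25
print(exhibits_bias(unprotected, protected, 0.1))    # True
```

Where SciPy is available, `scipy.stats.wasserstein_distance` computes the same one-dimensional quantity and could replace the hand-written function.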

Assignees

Discover Financial Services

Classifications

G06V10/776 · Validation; Performance evaluation
G06V10/774 · Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
G06F18/23 · Clustering techniques
G06F17/16 · Matrix or vector computation {, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization (matrix transposition G06F7/78)}
G06N20/00 · Machine learning

Patent family

Related publications grouped by family.

View patent family 78817643

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2024312198A1 cover?: A computing platform is configured to: (i) train a machine learning model by carrying out a machine learning process on a training data set, wherein the trained machine learning model is configured to (a) receive an input vector comprising respective values for a given set of input variables and (b) based on an evaluation of the received input vector, output a prediction of a given type, (ii) d…
Who is the assignee on this patent?: Discover Financial Services