Machine learning models for evaluating differences between groups and methods thereof

US10839318B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10839318-B2
Application number  US-201816217808-A
Country             US
Kind code           B2
Filing date         Dec 12, 2018
Priority date       Dec 12, 2018
Publication date    Nov 17, 2020
Grant date          Nov 17, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems, methods, and computer readable media are disclosed for generating, modifying, and using machine learning models to predict and evaluate differences between groups. Methods disclosed herein may include identifying variables that characterize members of a first group, generating shift indicators using the identified variables, generating a machine learning model using the shift indicators and the first group, using the machine learning model and the group to predict shifts between the first group and a predicted second group, determining an aggregate population shift and an aggregate performance shift between the first group and an actual second group, and identifying an impact of one or more of the shift indicators on the aggregate population shift or performance shift. Systems and methods disclosed herein may be configured to receive requests to predict and evaluate differences between groups, and to return such predictions and evaluations to one or more users.
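The two kinds of shift named in the abstract can be illustrated with a small sketch. This is not the patented method: it assumes each group is a dictionary keyed by a member id, that a population shift can be summarized as members entering and leaving, and that a performance shift can be summarized as the mean change in each shift indicator for members present in both groups. The names `population_shift`, `performance_shift`, `loss_band`, and `delinquent` are hypothetical and the data is made up.

```python
from statistics import mean


def population_shift(first, second):
    """Summarize membership change between two groups keyed by member id."""
    entered = second.keys() - first.keys()
    exited = first.keys() - second.keys()
    return {
        "entered": len(entered),
        "exited": len(exited),
        "net": len(second) - len(first),
    }


def performance_shift(first, second):
    """Mean change per indicator for members who appear in both groups."""
    common = sorted(first.keys() & second.keys())
    if not common:
        return {}
    indicator_names = first[common[0]].keys()
    return {
        name: mean(second[m][name] - first[m][name] for m in common)
        for name in indicator_names
    }


# Made-up groups: member id -> shift indicator values (illustrative only).
first = {
    "m1": {"loss_band": 0, "delinquent": 0},
    "m2": {"loss_band": 1, "delinquent": 0},
    "m3": {"loss_band": 2, "delinquent": 1},
}
second = {
    "m2": {"loss_band": 2, "delinquent": 1},
    "m3": {"loss_band": 2, "delinquent": 1},
    "m4": {"loss_band": 0, "delinquent": 0},
    "m5": {"loss_band": 1, "delinquent": 0},
}

print(population_shift(first, second))
print(performance_shift(first, second))
```

In this toy data two members enter and one leaves, while the two members common to both groups move up by half a band on average. The patent text itself does not prescribe these particular summaries.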

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for generating a machine learning model to define differences between two groups, the method comprising:
    receiving a plurality of variable values that characterize members of a first group, wherein the members of the first group comprise a first population exhibiting a first loss of monetary assets over a first period of time;
    generating a plurality of shift indicators attributable to each of the members of the first group by sorting the received variable values into categories of similar values, and transforming each category of similar values into a simplified numerical value, wherein each simplified numerical value is a shift indicator of the plurality of shift indicators, and wherein there are fewer shift indicators than received variable values;
    generating a machine learning model using the plurality of shift indicators and the first group, wherein the machine learning model is configured to predict a second group comprising a predicted second population exhibiting a predicted second loss of monetary assets over a second period of time, a population shift representing a change in population between the first group and the predicted second group, and a performance shift representing a difference in one or more of the shift indicators, the difference representing a change in at least one variable value characterizing a member of the first group who is also a member of the predicted second group;
    modifying the machine learning model using a plurality of hyperparameters identified for the machine learning model to generate a modified machine learning model;
    using the modified machine learning model and the first group, generating the predicted second group, the predicted population shift, and the predicted performance shift;
    using one or more of the plurality of shift indicators, the first group, and an actual second group, the actual second group comprising an actual second population exhibiting an actual second loss of monetary assets over the second period of time, determining an actual population shift representing a change in population between the first group and the actual second group, and an actual performance shift representing a difference in one or more of the plurality of shift indicators, the difference representing a change in at least one variable value characterizing a member of the first group who is also a member of the actual second group;
    using a difference between the predicted population shift and the actual population shift, determining an aggregate population shift;
    using a difference between the predicted performance shift and the actual performance shift, determining an aggregate performance shift; and
    outputting the aggregate population shift and the aggregate performance shift.

2. The computer-implemented method of claim 1, wherein the plurality of variable values that characterize members of the first group comprises loss percentages associated with loss of monetary assets.

3. The computer-implemented method of claim 1, wherein generating the plurality of shift indicators attributable to the first group further comprises: transforming a received variable value into a binary value.

4. The computer-implemented method of claim 1, wherein the machine learning model is one of a gradient boosting model or a random forest model.

5. The computer-implemented method of claim 1, further comprising: estimating, for a first shift indicator of the plurality of shift indicators, a coefficient of impact on the aggregate population shift; and using the aggregate population shift and the estimated coefficient of impact for the first shift indicator, determining an impact of the first shift indicator on the aggregate population shift.

6. The computer-implemented method of claim 5, further comprising: comparing the impact of the first shift indicator on the aggregate population shift with an impact of at least one other shift indicator on the aggregate population shift; and if the impact of the first shift indicator is greater than the impact of the at least one other shift indicator, outputting the first shift indicator.

7. The computer-implemented method of claim 1, further comprising: identifying a sub-sample of the actual second group associated with a first shift indicator of the plurality of shift indicators, wherein the first shift indicator associated with each member of the sub-sample exhibits a change as compared to its corresponding value associated with a member of the first group, wherein the exhibited change is a partial performance change; and comparing the partial performance change to the aggregate performance shift to identify an impact of the first shift indicator on the aggregate performance shift.

8. The computer-implemented method of claim 7, further comprising: comparing the impact of the first shift indicator on the aggregate performance shift with an impact of at least one other shift indicator on the aggregate performance shift; and if the impact of the first shift indicator is greater than the impact of the at least one other shift indicator, outputting the first shift indicator.

9. The computer-implemented method of claim 1, wherein outputting the aggregate population shift and the aggregate performance shift comprises: automatically generating a waterfall chart on a display, the waterfall chart including representations of the aggregate population shift, the aggregate performance shift, and at least one shift indicator contributing to the aggregate population shift and/or the aggregate performance shift.

10. The computer-implemented method of claim 1, wherein the plurality of hyperparameters are generated using at least one of cross-validation or early stopping on the machine learning model.

11. A computer-implemented method for generating a machine learning model to define aggregate changes between two groups, the method comprising:
    identifying a plurality of variable values that characterize members of each of a first group and a second group, wherein members of the first group comprise a first population exhibiting a first loss of monetary assets over a first period of time;
    generating a plurality of shift indicators by sorting the plurality of variable values into categories of similar values, and transforming each category of similar values into a simplified numerical value, wherein each simplified numerical value is a shift indicator of the plurality of shift indicators, and wherein there are fewer shift indicators than received variable values;
    generating a machine learning model using the plurality of shift indicators and the first group, wherein the machine learning model is configured to predict a population shift representing a change in population between the first group and a predicted second group comprising a predicted second population exhibiting a predicted second loss of monetary assets over a second period of time, and a performance shift representing a difference in one or more of the plurality of shift indicators, the difference representing a change in at least one variable value characterizing a member of the first group who is also a member of the predicted second group;
    identifying a plurality of hyperparameters for the machine learning model;
    modifying the machine learning model using the plurality of hyperparameters to generate a modified machine learning model;
    using the modified machine learning model and the first group, predicting a population shift and a performance shift between the first group and the predicted second group;
    using one or more of the plurality of shift indicators, the first group, and an actual second group, the actual seco…
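Two steps recited in claim 1 lend themselves to a short illustration: sorting raw variable values into categories that become simplified numerical shift indicators, and taking the difference between a predicted and an actual shift. The sketch below is an assumption-laden example, not the claimed implementation. The bin edges in `LOSS_EDGES`, the function names, the threshold, the sample values, and the sign convention (predicted minus actual) are all hypothetical; the claims only say that "a difference" is used.

```python
from bisect import bisect_right

# Hypothetical bin edges for a loss percentage; not taken from the patent.
LOSS_EDGES = [5.0, 20.0, 50.0]


def to_shift_indicator(value, edges=LOSS_EDGES):
    """Sort a raw value into a category and return the category index."""
    return bisect_right(edges, value)


def to_binary_indicator(value, threshold):
    """Binary variant (compare claim 3): 1 if the value exceeds the threshold."""
    return 1 if value > threshold else 0


def aggregate_shift(predicted, actual):
    """Difference between a predicted shift and an actual shift.

    The sign convention (predicted minus actual) is an assumption.
    """
    return predicted - actual


# Made-up loss percentages for seven members of a first group.
loss_percentages = [1.2, 4.9, 7.5, 18.0, 33.3, 61.0, 75.4]
indicators = [to_shift_indicator(v) for v in loss_percentages]

print(indicators)
print(len(set(indicators)), "distinct indicators from", len(loss_percentages), "values")
print(aggregate_shift(predicted=120, actual=150))
```

Binning seven distinct values into four categories mirrors the claim language that there are fewer shift indicators than received variable values. Any real system would choose the categories and the model inputs according to its own data.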

Assignees

  • Capital One Services LLC

Inventors

Classifications

  • Drawing of charts or graphs · CPC title

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound · CPC title

  • G06N20/20 · Primary

    Ensemble learning · CPC title

  • G06N20/00 · Primary

    Machine learning · CPC title

  • Inference or reasoning models · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10839318B2 cover?
Systems, methods, and computer readable media are disclosed for generating, modifying, and using machine learning models to predict and evaluate differences between groups. Methods disclosed herein may include identifying variables that characterize members of a first group, generating shift indicators using the identified variables, generating a machine learning model using the shift indicator…
Who is the assignee on this patent?
Capital One Services LLC
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 17, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).