Database optimization concepts in fast response environments

US10586235B2 · US · B2

Patent metadata
Publication number: US-10586235-B2
Application number: US-201715592709-A
Country: US
Kind code: B2
Filing date: May 11, 2017
Priority date: Jun 22, 2016
Publication date: Mar 10, 2020
Grant date: Mar 10, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Rapidly handling large data sets can be a challenge, particularly in situations where there are millions or even hundreds of millions of database records. Sometimes, however, a service level agreement necessitates that a service return a response to a query in a small amount of time. Database organization techniques can be used that reduce potentially large datasets to smaller groups (neighbors) based on uncommon but shared attributes, in various instances. Using a limited set of related records, queries can be answered using a focused approximation based on characteristics of various identified clusters of records in the set of related records. A particular record may also be associated with an existing cluster of records based on that record's similarities to records in the cluster.
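
The neighbor-reduction idea in the abstract can be illustrated with a minimal Python sketch. It is not the patent's implementation: the inverted index, the function names `build_index` and `find_neighbors`, the field names, and the `max_bucket` cutoff are all assumptions chosen for illustration. The sketch picks the least commonly shared attribute value of a new record and returns only the records that share it, skipping values that match too many records to be useful.

```python
from collections import defaultdict


def build_index(records):
    """Map each (field, value) pair to the set of record ids that carry it."""
    index = defaultdict(set)
    for rec_id, rec in records.items():
        for field, value in rec.items():
            index[(field, value)].add(rec_id)
    return index


def find_neighbors(new_record, index, max_bucket=1000):
    """Return (field, ids) for the rarest shared attribute value.

    Values shared by more than max_bucket records are skipped
    (hypothetical size limit), since they barely narrow the search.
    """
    candidates = []
    for field, value in new_record.items():
        bucket = index.get((field, value))
        if not bucket or len(bucket) > max_bucket:
            continue
        candidates.append((len(bucket), field, bucket))
    if not candidates:
        return None, set()
    # Smallest bucket wins; ties are broken by field name for determinism.
    _, field, bucket = min(candidates, key=lambda c: (c[0], c[1]))
    return field, set(bucket)


# Illustrative, made-up records.
records = {
    "a1": {"email_domain": "example.com", "country": "US", "postal_code": "94043"},
    "a2": {"email_domain": "example.com", "country": "US", "postal_code": "10001"},
    "a3": {"email_domain": "rare-mail.test", "country": "US", "postal_code": "10001"},
    "a4": {"email_domain": "rare-mail.test", "country": "DE", "postal_code": "10115"},
}
index = build_index(records)
new_record = {"email_domain": "rare-mail.test", "country": "US", "postal_code": "60601"}
field, neighbors = find_neighbors(new_record, index, max_bucket=2)
```

With these sample records, the country value matches three records and is skipped by the size limit, so the lookup falls back to the email domain shared by two records. Any later analysis then runs over that small neighbor set rather than the full table.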

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: a processor; and a non-transitory computer-readable storage medium having instructions stored thereon that are executable by the processor to cause the system to perform operations comprising: receiving string data corresponding to a plurality of account characteristics for a new account for an electronic transaction service; determining a match exists between a particular piece of the string data for the new account and respective particular pieces of string data for a plurality of established accounts for the electronic transaction service; based on the match, analyzing the plurality of account characteristics for the new account relative to account characteristics for the plurality of established accounts; without using transaction history data for the new account, and based on the analyzing, assigning the new account to a particular account cluster based on similarities in the plurality of account characteristics to account characteristics of the established accounts; and using a machine learning model trained by account characteristics of historical transactions of the established accounts and fraud indications of the historical transactions, predicting a first fraud probability of a first new transaction attempted by the new account and a second fraud probability of a second new transaction attempted by the new account based on the assigned particular account cluster.

2. The system of claim 1, wherein the operations further comprise using a least commonly occurring value for the string data for the plurality of established accounts to determine the exact match exists.

3. The system of claim 1, wherein the operations further comprise: training the machine learning model based on the historical transactions histories of the plurality of established accounts, wherein the historical transactions include indications of whether or not particular past transactions were determined to be fraudulent.

4. The system of claim 1, wherein the operations further comprise determining to not attempt to match a different piece of the string data to pieces of the string data for the plurality of established accounts when potential matches in the pieces of the string data of the plurality of established accounts for the different piece of string data exceed a threshold size limit.

5. The system of claim 1, wherein the operations further comprise: receiving an application for the new account; and wherein predicting the first or the second fraud probability is performed in less than one minute after receiving the application for the new account.

6. The system of claim 1, wherein the operations further comprise cleaning the string data for the new account and cleaning corresponding string data for the plurality of established accounts prior to determining the match exists.

7. The system of claim 1, wherein determining the exact match is based on one of a group of factors comprising: email address domain name, country code, postal code.

8. A method, comprising: receiving, at an analysis computer system, new account information for a new account corresponding to an electronic transaction service; analyzing, by the analysis computer system, a plurality of account characteristics included in the new account information; prior to receiving any transaction details regarding any electronic payment transactions made with the new account, assigning the new account to a particular account cluster based on similarities in the plurality of account characteristics to corresponding account characteristics of other accounts in the particular account cluster; and using a machine learning model trained by account characteristics of historical transactions of the established accounts and fraud indications of the historical transactions, predicting a first fraud probability to a first new transaction attempted by the new account and a second fraud probability to a second new transaction attempted by the new account based on the assigned particular account cluster.

9. The method of claim 8, wherein the new account has been requested for creation by a user, but has not yet been used to complete an electronic payment transaction.

10. The method of claim 8, further comprising assigning a network ID to each of a plurality of account clusters, including the particular account cluster, wherein each network ID assigned uniquely identifies a respective account cluster.

11. The method of claim 8, wherein assigning the new account to the particular account cluster comprises: calculating a total of similarities for each of a plurality of account clusters, including the particular account cluster, to the new user account based on the plurality of account characteristics; wherein the assigning is done based on the particular account cluster having a highest aggregate score as indicated by calculating the total similarities.

12. The method of claim 11, wherein calculating the total of similarities comprises assigning a weight to an account characteristic for accounts having an identical value for the account characteristic to a value for the corresponding account characteristic for the new user account, and assigning no weight to the account characteristic when the value is not identical.

13. The method of claim 8, further comprising approving or denying the first or the second new transaction involving the new account based on the predicted first or second fraud probability.

14. The method of claim 8, wherein the machine learning model is configured to provide fraud-related scores based on being trained on historical data associated with accounts not known to have engaged in fraud and accounts known to have previously engaged in fraud.

15. The method of claim 8, wherein the electronic transaction service is an electronic payment service allowing electronic payments to be made between different accounts maintained by the electronic transaction service, including the new account.

16. A non-transitory computer-readable medium having instructions stored thereon that are executable by a computer system to cause the computer system to perform operations comprising: receiving an indication that a new account corresponding to an electronic transaction service has initiated a transaction; analyzing a plurality of account characteristics included in new account information for the new account; prior to receiving any transaction details regarding any electronic payment transactions made with the new account, assigning the new account to a particular account cluster based on similarities in the plurality of account characteristics to corresponding account characteristics of other accounts in the particular account cluster; using a machine learning model trained by account characteristics of historical transactions of the established accounts and fraud indications of the historical transactions, assigning a first fraud probability to a first new transaction attempted by the new account and a second fraud probability to a second new transaction based on the assigned particular account cluster; and determining whether to approve or deny the first or the second new transaction based on the first or the second predicted fraud probability.

17. The non-transitory computer-readable medium of claim 16, wherein the operations further comprise using network information and transaction detail information for the transaction to determine whether to approve the transaction, wherein the transaction detail information includes at least an amount of the transaction.
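
The cluster-assignment step described in claims 10 through 12 can be sketched in Python as follows. This is a hedged illustration, not the claimed implementation: the weight values, the field names, the network IDs, and the function names `similarity` and `assign_cluster` are hypothetical. Each neighbor account contributes a weight for every characteristic whose value is identical to the new account's value and nothing otherwise; the totals are aggregated per network ID, and the cluster with the highest aggregate score is chosen. The fraud-probability model from the independent claims is outside the scope of this sketch.

```python
from collections import defaultdict

# Hypothetical per-characteristic weights; the patent does not specify values.
WEIGHTS = {"email_domain": 3.0, "postal_code": 2.0, "country": 0.5}


def similarity(new_account, other, weights=WEIGHTS):
    """Sum the weights of characteristics with identical values.

    A characteristic that differs (or is missing on either side)
    contributes no weight.
    """
    return sum(
        weight
        for field, weight in weights.items()
        if field in new_account
        and field in other
        and new_account[field] == other[field]
    )


def assign_cluster(new_account, neighbors, weights=WEIGHTS):
    """Pick the network ID whose member accounts have the highest total.

    neighbors: iterable of (network_id, account_dict) pairs.
    Returns (best_network_id, totals_by_network_id).
    """
    totals = defaultdict(float)
    for network_id, account in neighbors:
        totals[network_id] += similarity(new_account, account, weights)
    if not totals:
        return None, {}
    # Iterating over sorted ids makes ties resolve to the smallest id.
    best = max(sorted(totals), key=totals.get)
    return best, dict(totals)


# Illustrative, made-up neighbor accounts already assigned to clusters.
neighbors = [
    ("net-1", {"email_domain": "rare-mail.test", "postal_code": "10001", "country": "US"}),
    ("net-1", {"email_domain": "rare-mail.test", "postal_code": "94043", "country": "US"}),
    ("net-2", {"email_domain": "rare-mail.test", "postal_code": "10115", "country": "DE"}),
]
new_account = {"email_domain": "rare-mail.test", "postal_code": "10001", "country": "US"}
best, totals = assign_cluster(new_account, neighbors)
```

In this made-up data, the two `net-1` accounts score 5.5 and 3.5 and the single `net-2` account scores 3.0, so the new account would be assigned to `net-1` without any transaction history being consulted.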

Classifications

  • specially adapted for electronic funds transfer [EFT] systems; specially adapted for home banking systems

  • Clustering or classification

  • involving remote charge determination or related payment systems

  • involving fraud or risk level assessment in transaction processing

  • Machine learning

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Paypal Inc
What technology area does this patent fall under?
Primary CPC classification G06Q20/4016. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 10, 2020 (B2). Legal status and post-grant events are not shown on this page.