Systems and methods for compressing behavior data using semi-parametric or non-parametric models

US11188917B2 · US · B2

Patent metadata
Publication number: US-11188917-B2
Application number: US-201815940930-A
Country: US
Kind code: B2
Filing date: Mar 29, 2018
Priority date: Mar 29, 2018
Publication date: Nov 30, 2021
Grant date: Nov 30, 2021

Abstract

Official abstract text for this publication.

Methods and systems for efficiently compressing information comprising a plurality of data points along a particular dimension are presented. In some embodiments, a model may be generated using a semi-parametric modeling technique or a non-parametric modeling technique to represent the plurality of data points. The model may include a set of parameters that is less in size than the plurality of data points. Once the model is generated, the set of parameters may be stored and subsequently used to represent the information, with a significant reduction in storage space over the original data. In response to a request to analyze the information, the set of parameters may be analyzed to produce an outcome. Since the set of parameters have less cardinality than the plurality of data points in the original information, the efficiency of the analysis tool is enhanced.
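The compression idea in the abstract can be illustrated with a small sketch. The abstract does not say which semi-parametric or non-parametric technique is used, so the example below assumes a simple piecewise-constant fit (equal-width bin averages) as a stand-in. The function names `compress`, `reconstruct`, and `deviation`, and the sample data, are hypothetical.

```python
from math import sqrt
from statistics import mean


def compress(points, k):
    """Summarize `points` with k bin means (assumed piecewise-constant model)."""
    n = len(points)
    # Bin i covers indices [i*n//k, (i+1)*n//k); bins are non-empty when k <= n.
    return [mean(points[i * n // k:(i + 1) * n // k]) for i in range(k)]


def reconstruct(params, n):
    """Expand the k stored parameters back into an n-point approximation."""
    k = len(params)
    approx = []
    for i in range(k):
        width = (i + 1) * n // k - i * n // k
        approx.extend([params[i]] * width)
    return approx


def deviation(points, params):
    """Root-mean-square error between the data and its compressed form."""
    approx = reconstruct(params, len(points))
    return sqrt(mean([(p - a) ** 2 for p, a in zip(points, approx)]))


# Hypothetical behavior data: 12 data points along a time dimension.
daily_logins = [2, 2, 2, 2, 5, 5, 5, 5, 1, 1, 1, 1]
params = compress(daily_logins, 3)          # 3 parameters instead of 12 points
error = deviation(daily_logins, params)
```

Only `params` (and, if desired, `error`) would need to be stored; any later analysis then works on 3 values rather than 12, which is the efficiency gain the abstract describes.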

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, by one or more hardware processors, a request for initiating a machine learning model configured to analyze data over a time dimension; obtaining, by the one or more hardware processors, first behavior data generated over a first period of time for a first user account associated with a first user, wherein the first behavior data comprises a first plurality of data points having a particular cardinality and representing behavior of the first user over the first period of time; generating, by the one or more hardware processors, a plurality of models based on the first plurality of data points, wherein each model in the plurality of models is associated with a different set of parameters having a different cardinality that is less than the particular cardinality for representing the first plurality of data points; determining, for each model in the plurality of models, a deviation between the model and the first plurality of data points; selecting, from the plurality of models, a first model associated with a first set of parameters having a first cardinality for representing the first plurality of data points based on differences in deviations between the first model and other models in the plurality of models; configuring the machine learning model to accept input values corresponding to the first set of parameters and a deviation, and to produce an output based on the input values; training, by the one or more hardware processors, the machine learning model based on the first set of parameters and a first deviation between the first model and the plurality of data points; receiving, by the one or more hardware processors, a transaction request for processing a transaction for a second user account associated with a second user; obtaining, by the one or more hardware processors, second behavior data generated over a second period of time for the second user account, wherein the second behavior data comprises a second plurality of data points having the particular cardinality and representing behavior of the second user over the second period of time; generating, by the one or more hardware processors, a second model based on the second plurality of data points, wherein the second model comprises a second set of parameters having the first cardinality; determining, by the one or more hardware processors, a second deviation between the second plurality of data points and the second model; determining, by the one or more hardware processors, a risk associated with the transaction request based on feeding the second set of parameters and the second deviation as input values to the trained machine learning model; and processing, by the one or more hardware processors, the transaction request based on the determined risk.

2. The method of claim 1, further comprising determining a threshold using an elbow method based on the deviations determined for the plurality of models, wherein the first model is selected from the plurality of models based on determining that the differences in deviations exceed the threshold.

3. The method of claim 1, wherein the generated plurality of models comprises at least one of a semi-parametric model or a non-parametric model.

4. The method of claim 1, wherein the first behavior data represents electronic transactions associated with the first user account conducted over the first period of time.

5. The method of claim 4, wherein the electronic transactions comprises at least one of a login transaction, an electronic payment transaction, or a fund withdrawal transaction.

6. The method of claim 1, wherein the first behavior data represents an account balance of the first user account over the first period of time.

7. The method of claim 1, wherein the first behavior data comprises web browsing behavior of the first user.

8. The method of claim 1, further comprising: determining whether the transaction request is a fraudulent request based on an output value from the trained machine learning model; and in response to determining that the transaction request is not a fraudulent request, authorizing a transaction associated with the transaction request.

9. The method of claim 1, wherein the machine learning model is trained to produce a likelihood of whether the transaction request is a fraudulent request based on the second set of parameters and the second deviation.

10. A system comprising: a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations comprising: accessing a machine learning model configured to analyze data associated with a dimension; obtaining first information along the dimension, wherein the first information comprises a first plurality of data points along the dimension and having a first cardinality; accessing a plurality of models that model the first plurality of data points, wherein each model in the plurality of models is associated with a different set of parameters having a different cardinality that is less than the first cardinality; computing, for each model in the plurality of models, a deviation between the model and the plurality of data points; determining, for each model in one or more of the plurality of models, an improvement index representing differences in deviations with respect to a predecessor model a successor model, respectively; selecting, from the plurality of models, a first model based on the improvement indices determined for the one or more of the plurality of models, wherein the first model is associated with a first set of parameters having a second cardinality for representing the first plurality of data points; configuring the machine learning model to accept input values corresponding to the first set of parameters and a deviation, and to produce an output based on the input values; training the machine learning model based on the first set of parameters and a first deviation between the first model and the first plurality of data points; receiving a transaction request for processing a transaction for a second user account associated with a second user; obtaining a second plurality of data points associated with the second user account, wherein the second plurality of data points has the first cardinality; generating a second model based on the second plurality of data points, wherein the second model comprises a second set of parameters having the second cardinality for representing the second plurality of data points; determining a second deviation between the second plurality of data points and the second model; generating an output value associated with the transaction request based on feeding the second set of parameters and the second deviation as input values to the trained machine learning model; and processing the transaction request based on the output value.

11. The system of claim 10, wherein the dimension is one of a time dimension or a geographical location dimension.

12. The system of claim 11, wherein the dimension is the geographical location dimension, and the first plurality of data points represent a number of times the first user has visited corresponding locations along the geographical location dimension.

13. The system of claim 10, wherein the generated plurality of models comprises at least one of a semi-parametric model or a non-parametric model.

14. The system of claim 10, wherein the transaction request is related to a pending transaction, and wherein the machine learning model is traine
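The model-selection steps recited in the claims (fit several models of different cardinality, measure each model's deviation, pick one from the differences in deviations, then feed its parameters plus its deviation to a machine learning model) can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the claims do not give a formula for the improvement index or the elbow threshold, so the sketch assumes the index is the drop in deviation from the predecessor minus the drop to the successor. The bin-average model and all names (`fit_bins`, `rmse`, `select_cardinality`, `features`) are hypothetical, and the downstream machine learning model is omitted.

```python
from math import sqrt


def fit_bins(points, k):
    """Assumed model: k equal-width bin averages over the data points."""
    n = len(points)
    params = []
    for i in range(k):
        chunk = points[i * n // k:(i + 1) * n // k]
        params.append(sum(chunk) / len(chunk))
    return params


def rmse(points, params):
    """Deviation between the data points and a fitted bin-average model."""
    n, k = len(points), len(params)
    total = 0.0
    for i in range(k):
        for p in points[i * n // k:(i + 1) * n // k]:
            total += (p - params[i]) ** 2
    return sqrt(total / n)


def select_cardinality(points, candidates):
    """Pick the candidate where the deviation curve bends most sharply.

    Assumed improvement index: gain over the predecessor model minus
    gain achieved by the successor model.
    """
    devs = [rmse(points, fit_bins(points, k)) for k in candidates]
    best_k, best_index = None, float("-inf")
    for j in range(1, len(candidates) - 1):
        gain_from_predecessor = devs[j - 1] - devs[j]
        gain_to_successor = devs[j] - devs[j + 1]
        index = gain_from_predecessor - gain_to_successor
        if index > best_index:
            best_k, best_index = candidates[j], index
    return best_k


def features(points, k):
    """Input vector for a downstream model: the parameters plus the deviation."""
    params = fit_bins(points, k)
    return params + [rmse(points, params)]


# Hypothetical behavior data with 12 points; candidate cardinalities are all < 12.
history = [1, 1, 1, 1, 9, 9, 9, 9, 1, 1, 1, 1]
k = select_cardinality(history, [1, 2, 3, 4, 6])
x = features(history, k)
```

Because every account is compressed to the same selected cardinality, the resulting vectors have a fixed length, which is what lets one trained model score both the first and the second user's data in the claims.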

Assignees

Paypal Inc

Inventors

Classifications

  • Quantised networks; Sparse networks; Compressed networks · CPC title

  • Feedforward networks · CPC title

  • Supervised learning · CPC title

  • G06F21/316 · Primary

    by observing the pattern of computer usage, e.g. typical user behaviour · CPC title

  • involving fraud or risk level assessment in transaction processing · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11188917B2 cover?
Methods and systems for efficiently compressing information comprising a plurality of data points along a particular dimension are presented. In some embodiments, a model may be generated using a semi-parametric modeling technique or a non-parametric modeling technique to represent the plurality of data points. The model may include a set of parameters that is less in size than the plurality of…
Who is the assignee on this patent?
Paypal Inc
What technology area does this patent fall under?
Primary CPC classification G06F21/316. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 30, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).