US11727311B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11727311-B2 |
| Application number | US-202217870733-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 21, 2022 |
| Priority date | Jul 27, 2015 |
| Publication date | Aug 15, 2023 |
| Grant date | Aug 15, 2023 |
Official abstract text for this publication.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying user behavior as anomalous. One of the methods includes obtaining user behavior data representing behavior of a user in a subject system. An initial model is generated from training data, the initial model having first characteristic features of the training data. A resampling model is generated from the training data and from multiple instances of the first representation for a test time period. A difference between the initial model and the resampling model is computed. The user behavior in the test time period is classified as anomalous based on the difference between the initial model and the resampling model.
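The comparison described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation. It assumes that a "model" is summarized by the leading right singular direction of a period-by-topic matrix, that this direction is approximated by power iteration rather than a full SVD, and that the "difference" is the angle between the two directions. The names `leading_direction`, `model_difference`, `copies`, and `threshold` are hypothetical.

```python
import math


def leading_direction(rows, iters=200):
    """Approximate the top right singular vector of `rows` (list of equal-length
    lists) by power iteration on A^T A. Stand-in for a full SVD (assumption)."""
    dim = len(rows[0])
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iters):
        # Compute w = A^T (A v) without forming A^T A explicitly.
        av = [sum(r[j] * v[j] for j in range(dim)) for r in rows]
        w = [sum(rows[i][j] * av[i] for i in range(len(rows))) for j in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return v
        v = [x / norm for x in w]
    return v


def model_difference(training, test_row, copies=5):
    """Angle (radians) between the direction fitted on the training periods and
    the direction fitted on training plus `copies` instances of the test row."""
    initial = leading_direction(training)
    resampled = leading_direction(training + [test_row] * copies)
    cos = abs(sum(a * b for a, b in zip(initial, resampled)))
    return math.acos(min(1.0, cos))


def is_anomalous(training, test_row, threshold=0.5, copies=5):
    """Hypothetical decision rule: flag the period if the models diverge."""
    return model_difference(training, test_row, copies) > threshold
```

Each row here is a user's topic representation for one time period. Repeating the test row several times gives it enough weight to shift the fitted direction when it departs from the user's history, while a typical row leaves the direction almost unchanged.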
Opening claim text (preview).
What is claimed is:
1. A method comprising: obtaining a plurality of topics, each topic being data representing a plurality of file types that frequently co-occur in user behavior data of individual users; obtaining user behavior data representing behavior of a user in a subject system, wherein the user behavior data indicates file types of files accessed by the user in the subject system and when the file was accessed by the user; generating test data from the user behavior data, the test data comprising a first representation of which topics the user accessed during a test time period according to the file types of the user behavior data; generating training data from the user behavior data, the training data comprising respective second representations of which topics the user accessed in each of multiple time periods prior to the test time period; generating an initial SVD model from the test data; generating a resampling model from the training data from multiple instances of the first representation of which topics the user accessed during the test time period; computing a difference between the initial model and the resampling model; and classifying the user behavior in the test time period as anomalous based on the difference between the initial model and the resampling model.
2. The method of claim 1, further comprising generating the plurality of topics from file types of files accessed by multiple users in the subject system.
3. The method of claim 2, further comprising: generating the topics using a topic modeling process including defining each user to be a document and each file type accessed by each user to be a term in the corresponding document.
4. The method of claim 3, wherein generating the topics using the topic modeling process comprises generating a predetermined number K of topics.
5. The method of claim 4, wherein generating the K topics comprises generating a probability distribution for each of the K topics that assigns a likelihood to a particular file type being accessed by a user who accesses file types assigned to the topic.
6. The method of claim 3, further comprising: iterating over a plurality of candidate values of K; and selecting a particular candidate value of K as the predetermined number K.
7. The method of claim 1, wherein computing the difference between the initial model and the resampling model comprises comparing the initial model and the resampling model using singular value decomposition.
8. A system comprising: one or more computers and one or more storage devices storing instructions that are operable, when executed by the one or more computers, to cause the one or more computers to perform operations comprising: obtaining a plurality of topics, each topic being data representing a plurality of file types that frequently co-occur in user behavior data of individual users; obtaining user behavior data representing behavior of a user in a subject system, wherein the user behavior data indicates file types of files accessed by the user in the subject system and when the file was accessed by the user; generating test data from the user behavior data, the test data comprising a first representation of which topics the user accessed during a test time period according to the file types of the user behavior data; generating training data from the user behavior data, the training data comprising respective second representations of which topics the user accessed in each of multiple time periods prior to the test time period; generating an initial SVD model from the test data; generating a resampling model from the training data from multiple instances of the first representation of which topics the user accessed during the test time period; computing a difference between the initial model and the resampling model; and classifying the user behavior in the test time period as anomalous based on the difference between the initial model and the resampling model.
9. The system of claim 8, wherein the operations further comprise the plurality of topics from file types of files accessed by multiple users in the subject system.
10. The system of claim 9, wherein the operations further comprise: generating the topics using a topic modeling process including defining each user to be a document and each file type accessed by each user to be a term in the corresponding document.
11. The system of claim 10, wherein generating the topics using the topic modeling process comprises generating a predetermined number K of topics.
12. The system of claim 11, wherein generating the K topics comprises generating a probability distribution for each of the K topics that assigns a likelihood to a particular file type being accessed by a user who accesses file types assigned to the topic.
13. The system of claim 10, wherein the operations further comprise: iterating over a plurality of candidate values of K; and selecting a particular candidate value of K as the predetermined number K.
14. The system of claim 8, wherein computing the difference between the initial model and the resampling model comprises comparing the initial model and the resampling model using singular value decomposition.
15. One or more non-transitory computer storage media encoded with computer program instructions that when executed by one or more computers cause the one or more computers to perform operations comprising: obtaining a plurality of topics, each topic being data representing a plurality of file types that frequently co-occur in user behavior data of individual users; obtaining user behavior data representing behavior of a user in a subject system, wherein the user behavior data indicates file types of files accessed by the user in the subject system and when the file was accessed by the user; generating test data from the user behavior data, the test data comprising a first representation of which topics the user accessed during a test time period according to the file types of the user behavior data; generating training data from the user behavior data, the training data comprising respective second representations of which topics the user accessed in each of multiple time periods prior to the test time period; generating an initial SVD model from the test data; generating a resampling model from the training data from multiple instances of the first representation of which topics the user accessed during the test time period; computing a difference between the initial model and the resampling model; and classifying the user behavior in the test time period as anomalous based on the difference between the initial model and the resampling model.
16. The non-transitory computer storage media of claim 15, wherein the operations further comprise generating the plurality of topics from file types of files accessed by multiple users in the subject system.
17. The non-transitory computer storage media of claim 16, wherein the operations further comprise: generating the topics using a topic modeling process including defining each user to be a document and each file type accessed by each user to be a term in the corresponding document.
18. The non-transitory computer storage media of claim 17, wherein generating the topics using the topic modeling process comprises generating a predetermined number K of topics.
19. The non-transitory computer storage media of claim 18, wherein generating the K topics comprises generating a probability distribution for each of the K topics that assigns
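The claims describe users as documents, file types as terms, and topics as probability distributions over file types. The following hedged sketch shows how such data might be arranged. It does not fit a topic model; it assumes the K topics are already available as dictionaries mapping file type to likelihood, and it assumes a uniform prior over topics when attributing an access to a topic. The access-log layout `(user, file_type, period)` and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def build_documents(access_log):
    """Treat each user as a document whose terms are the file types accessed.
    `access_log` is an iterable of (user, file_type, period) tuples (assumed layout)."""
    docs = defaultdict(Counter)
    for user, file_type, _period in access_log:
        docs[user][file_type] += 1
    return docs


def topic_representation(access_log, user, period, topics):
    """Share of a user's accesses in `period` attributed to each topic.
    `topics` is a list of dicts mapping file type -> likelihood under that topic."""
    weights = [0.0] * len(topics)
    for u, file_type, p in access_log:
        if u != user or p != period:
            continue
        likelihoods = [t.get(file_type, 0.0) for t in topics]
        total = sum(likelihoods)
        if total == 0.0:
            continue  # file type not covered by any topic
        # Split this access across topics in proportion to their likelihoods.
        for k, lk in enumerate(likelihoods):
            weights[k] += lk / total
    norm = sum(weights)
    return [w / norm for w in weights] if norm else weights
```

The output of `build_documents` is the kind of input a standard topic modeling process would consume, and one vector from `topic_representation` per time period gives the rows of the test and training data.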
Machine learning · CPC title
Clustering or classification · CPC title
by observing the pattern of computer usage, e.g. typical user behaviour · CPC title
involving long-term monitoring or reporting · CPC title
Traffic logging, e.g. anomaly detection · CPC title