Rarity analysis in network security anomaly/threat detection

US10038707B2 · US · B2

Patent metadata
Publication number: US-10038707-B2
Application number: US-201514929204-A
Country: US
Kind code: B2
Filing date: Oct 30, 2015
Priority date: Aug 31, 2015
Publication date: Jul 31, 2018
Grant date: Jul 31, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A security platform employs a variety of techniques and mechanisms to detect security related anomalies and threats in a computer network environment. The security platform is “big data” driven and employs machine learning to perform security analytics. The security platform performs user/entity behavioral analytics (UEBA) to detect the security related anomalies and threats, regardless of whether such anomalies/threats were previously known. The security platform can include both real-time and batch paths/modes for detecting anomalies and threats. By visually presenting analytical results scored with risk ratings and supporting evidence, the security platform enables network security administrators to respond to a detected anomaly or threat, and to take action promptly.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: analyzing, by a computer system, event data representative of data traffic associated with a computer network to identify a feature of the data traffic, the data traffic including a plurality of occurrences of the feature, each occurrence of the feature having one of a plurality of values of the feature; identifying, by the computer system, a set of the values whose probability of occurrence does not exceed a probability of occurrence of a particular value of the plurality of values, the set of the values being those values of the feature that have occurred not more than the number of times of the particular value; determining, by the computer system, a rarity score for the particular value as a function of the probability of occurrence of the set of the values; detecting, by the computer system, that activity of an entity on the computer network is anomalous in a security context, by determining that an occurrence of the particular value corresponds to an anomaly, based on the rarity score, wherein said determining the rarity score of the particular value and said determining that the occurrence of the particular value corresponds to an anomaly comprise executing a machine learning model at the computer system; and enabling, by the computer system, a targeted response to a security threat indicated by the activity by causing a display of an indication that the activity is anomalous.

2. The method of claim 1, wherein the feature is an attribute of the data traffic that can assume one of a finite number of values, and wherein the feature includes at least one of (i) an Internet Protocol (IP) address, (ii) a port, (iii) a username of a user, (iv) a device identification (ID), (v) an application name, or (vi) a geo location of a device and/or user.

3. The method of claim 1, wherein determining the rarity score of the particular value is performed as part of execution of a machine learning model executing at the computer system.

4. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly is performed as part of execution of a machine learning model executing at the computer system.

5. The method of claim 1, wherein analyzing the data traffic of the device includes analyzing the data traffic in real-time.

6. The method of claim 1, wherein determining the rarity score of the particular value includes determining the rarity score as a function of a number of occurrences of the particular value, the set of the values and a total number of occurrences of the feature.

7. The method of claim 1, wherein determining the rarity score of the particular value includes: determining a rarity of the particular value as a function of a number of occurrences of the particular value and the set of the values and a total number of occurrences of the feature, and determining the rarity score for the particular value based on a confidence interval for the rarity.

8. The method of claim 1, wherein determining the rarity score of the particular value includes: determining a rarity of the particular value as a sum of a probability of occurrence of the particular value and the set of the values, and determining the rarity score of the particular value based on a confidence interval for the rarity.

9. The method of claim 1, wherein the rarity score is a tuple including a score threshold and a count threshold, the count threshold indicative of a number of times the particular value can be indicated as an anomaly; the method further comprising: using the count threshold to determine whether the particular value is an anomaly.

10. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly includes incrementing a count of a number of times the particular value is indicated as an anomaly.

11. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly includes determining that the rarity score of the particular value is less than a score threshold and that a count of a number of times the particular value is indicated as an anomaly is less than a count threshold.

12. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly includes: determining that the occurrence of the particular value is not an anomaly if a count of a number of times the particular value is indicated as an anomaly exceeds a count threshold.

13. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly includes determining the particular value as an anomaly based on a score threshold and a count threshold; the method further comprising: dynamically adjusting the score threshold and the count threshold based on a number of times the particular value is identified as an anomaly in a predefined period.

14. The method of claim 1, wherein analyzing the data traffic includes: obtaining information regarding the data traffic from a log, the log representing a plurality of events of the data traffic, each of the events including at least one feature.

15. The method of claim 1, wherein the feature is one of a plurality of features occurring in an event of the data traffic, and wherein determining the rarity score of the particular value of the feature includes determining the rarity score of the particular value when a first feature of the features occurs at a first value in the event.

16. The method of claim 1, wherein determining the rarity score includes: determining the rarity score for a feature pair, the feature pair including a first feature and a second feature, wherein the first feature is the feature, the determining including determining the rarity score for the occurrence of the particular value of the first feature when the second feature occurs at a first value.

17. The method of claim 1 further comprising: determining that an event of which the feature is a part as an anomaly based on a rarity score for at least some features in the event and a number of the features whose rarity scores do not satisfy a score threshold for the event.

18. The method of claim 1 further comprising: determining the rarity score for each of a set of features in an event of which the feature is a part, determining a number of the set of features whose rarity scores do not satisfy a score threshold for the event, and determining the event as an anomaly if the number of set of features satisfy a feature count threshold for the event.

19. The method of claim 1 further comprising: determining an event of which the feature is a part as an anomaly based on whether a particular feature of a plurality of features of the event is determined as anomalous.

20. The method of claim 1 further comprising: determining an event of which the feature is a part as an anomaly based on whether a particular feature pair of a plurality of features of the event is determined as anomalous.

21. The method of claim 1, wherein determining the occurrence of the particular value as an anomaly includes: determining an event of which the feature is a part as an anomaly based on at least one of (i) a number of a plurality of features of the event that are determined as anomalous, (ii) whether a particular feature of the features is determined as anomalous, or (iii) a number of times the event has been identified as an anomaly.

22. The method of claim 1 further comprising: determining an event of which the feat
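
The rarity calculation and the threshold logic in the claims can be easier to follow with a small worked example. The Python sketch below is an illustration under stated assumptions, not the patented implementation: it reads the rarity of a value as the summed probability of every value seen no more often than it (one reading of claims 1, 6 and 8), flags a value when its rarity is below a score threshold and it has been flagged fewer times than a count threshold (claims 10 and 11), and marks an event anomalous when enough of its features are rare (claim 18). The confidence-interval step of claims 7 and 8 and the machine learning model of claim 1 are left out. All names (`rarity`, `RarityDetector`, `event_is_anomalous`) and the default threshold values are hypothetical.

```python
from collections import Counter


def rarity(counts: Counter, value: str) -> float:
    """Summed probability of every value seen no more often than `value`.

    The set includes `value` itself, so the least frequent values get a
    small number and the most frequent value gets 1.0.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    limit = counts[value]  # a Counter returns 0 for a value never seen
    return sum(c for c in counts.values() if c <= limit) / total


class RarityDetector:
    """Tracks occurrences of one feature and flags rare values (hypothetical)."""

    def __init__(self, score_threshold: float = 0.05, count_threshold: int = 3):
        self.score_threshold = score_threshold  # assumed default
        self.count_threshold = count_threshold  # assumed default
        self.counts = Counter()   # occurrences of each value of the feature
        self.flagged = Counter()  # times each value was indicated as an anomaly

    def observe(self, value: str) -> bool:
        """Record one occurrence and report whether it is treated as anomalous."""
        self.counts[value] += 1
        score = rarity(self.counts, value)
        is_anomaly = (
            score < self.score_threshold
            and self.flagged[value] < self.count_threshold
        )
        if is_anomaly:
            self.flagged[value] += 1
        return is_anomaly


def event_is_anomalous(feature_scores: dict, score_threshold: float,
                       feature_count_threshold: int) -> bool:
    """True when enough features of an event have a rarity below the threshold."""
    rare = sum(1 for s in feature_scores.values() if s < score_threshold)
    return rare >= feature_count_threshold


if __name__ == "__main__":
    ports = Counter({"443": 90, "80": 8, "22": 1, "6667": 1})
    for port in ("443", "80", "6667"):
        print(port, rarity(ports, port))
```

In this sketch the count threshold acts as a cap: once a value has been flagged that many times, further occurrences are no longer reported, which is one way to read the behavior described in claim 12.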

Assignees

Splunk Inc

Inventors

Classifications

  • G06N20/20 (Primary)

    Ensemble learning · CPC title

  • Event detection, e.g. attack signature detection · CPC title

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • Hyperlinking · CPC title

  • for supporting key management in a packet data network (cryptographic mechanisms or cryptographic arrangements for key management H04L9/08) · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10038707B2 cover?
A security platform employs a variety of techniques and mechanisms to detect security related anomalies and threats in a computer network environment. The security platform is “big data” driven and employs machine learning to perform security analytics. The security platform performs user/entity behavioral analytics (UEBA) to detect the security related anomalies and threats, regardless of whether…
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification G06N20/20. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 31, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).