Classification of log data

US11250043B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11250043-B2
Application number  US-201716306077-A
Country             US
Kind code           B2
Filing date         Jun 2, 2017
Priority date       Jun 3, 2016
Publication date    Feb 15, 2022
Grant date          Feb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

This disclosure relates to analyzing log data of a system. This comprises evaluating a first graph model with multiple log events in the log data. The first graph model comprises a first set of nodes connected by a first set of edges representing a first behaviour. A processor determines a first correspondence value based on the first graph model and indicative of a correspondence between the multiple log events and the first behaviour. The processor repeats the steps of evaluating the first graph model for one or more further graph models representing one or more further behaviors and determining the first correspondence value to determine one or more further correspondence values. The processor finally determines a classification of the multiple log events as representing one of the behaviors based on the correspondence values. The use of multiple graph models allows a more granular classification than binary intrusion detection.
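The abstract's classification loop can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `GraphModel` class, the example behaviors and event names, and the scoring rule (fraction of trace steps that follow an edge of the model) are hypothetical and are not taken from the patent, which leaves the exact correspondence measure open.

```python
from dataclasses import dataclass, field


@dataclass
class GraphModel:
    """Hypothetical behavior model: nodes are event names, edges are allowed transitions."""
    behavior: str
    start: str
    edges: dict = field(default_factory=dict)  # event name -> set of allowed next events


def correspondence(model, events):
    """Assumed scoring rule: fraction of trace steps that match the model."""
    if not events:
        return 0.0
    # The first event matches if it is the model's start node.
    matches = 1 if events[0] == model.start else 0
    # Walk the trace and the model together, one transition at a time.
    for prev, cur in zip(events, events[1:]):
        if cur in model.edges.get(prev, set()):
            matches += 1
    return matches / len(events)


def classify(models, events):
    """Score the trace against every model and pick the highest correspondence value."""
    scores = {m.behavior: correspondence(m, events) for m in models}
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative models and trace (invented for this sketch).
MODELS = [
    GraphModel("browse", "login", {
        "login": {"search"},
        "search": {"search", "view"},
        "view": {"search", "logout"},
    }),
    GraphModel("purchase", "login", {
        "login": {"search"},
        "search": {"view"},
        "view": {"add_to_cart"},
        "add_to_cart": {"checkout"},
    }),
    GraphModel("scraping", "login", {
        "login": {"view"},
        "view": {"view"},
    }),
]
EVENTS = ["login", "search", "view", "add_to_cart", "checkout"]

label, scores = classify(MODELS, EVENTS)
print(label, scores)
```

Because every model yields its own score, the result is a choice among several behaviors rather than a single normal/intrusion flag, which is the granularity the abstract refers to.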

First claim

Opening claim text (preview).

The invention claimed is:

1. An internet server comprising: a user input port connected to the internet to create user events based on user input received via the internet; a logging module to create log data comprising multiple log events representing the user events and to store the log data on a database, the log events representing one execution of a process; a data store to store a first graph model comprising data records representing a first set of nodes and data records representing a first set of edges, the first set of nodes being connected by the first set of edges, in the first graph model, to represent a first user behavior, which corresponds to historical log data of the first user behavior, and to store further graph models representing further user behaviors, each of the further user behaviors being different than the first user behavior; a processor to perform the steps of creating a log graph model that represents the multiple log events, evaluating the first graph model with the multiple log events, representing the one execution of the process, by performing multiple steps that simultaneously traverse the log graph model and the first graph model to determine at each step of the multiple steps a match or mismatch between a node of the log graph model and the first graph model, determining a first correspondence value based on the match or mismatch, the first correspondence value being indicative of a correspondence between the multiple log events and the first user behavior, repeating (i) the step of evaluating to evaluate one or more of the further graph models representing corresponding one or more of the further user behaviors, and (ii) the step of determining to determine one or more further correspondence values each indicative of a correspondence between the multiple log events, representing the one execution of the process, and a respective one of the further user behaviors, and determining a classification of the multiple log events as representing one of the user behaviors by selecting, amongst the first user behavior represented by the first graph model and the further user behaviors represented by the corresponding further graph models, the one of the user behaviors that has the highest correspondence value indicating the closest match between the respective graph model and the multiple log events; and an output module to generate an output signal that causes the internet server to perform an action depending on the classification.

2. A method for analyzing log data of a computer system, the method comprising the steps of: creating a log graph model that represents multiple log events, evaluating, by a computer processor, a first graph model with the multiple log events in the log data, the log events representing one execution of a process, wherein the first graph model comprises a first set of nodes connected by a first set of edges, the first graph model representing a first behavior, said evaluating including performing multiple steps that simultaneously traverse the log graph model and the first graph model to determine at each step of the multiple steps a match or mismatch between a node of the log graph model and the first graph model; determining a first correspondence value based on the match or mismatch, the first correspondence value being indicative of a correspondence between the multiple log events and the first behavior; repeating, by the computer processor, (i) the step of evaluating to evaluate one or more further graph models representing one or more further behaviors, each of the one or more further behaviors being different than the first behavior, and (ii) the step of determining to determine one or more further correspondence values each indicative of a correspondence between the multiple log events, representing the one execution of the process, and a respective one of the further behaviors; and determining, by the computer processor, a classification of the multiple log events as representing one of the behaviors by selecting, amongst the first behavior represented by the first graph model and the further behaviors represented by the corresponding further graph models, the one of the behaviors that has the highest correspondence value indicating the closest match between the respective graph model and the multiple log events.

3. The method of claim 2, wherein evaluating the first graph model comprises determining a first path according to the multiple log events between nodes within the first set of nodes along edges in the first set of edges; and determining the first correspondence value is based on the first path.

4. The method of claim 3, wherein the first correspondence value is indicative of a correspondence between the order of the multiple log events and the order of the nodes on the first path, or is based on a branching probability associated with each edge along the first path.

5. The method of claim 2, wherein determining the first correspondence value comprises determining a matching set of nodes that match the multiple log events and the correspondence value is based on a number of elements in the matching set of nodes.

6. The method of claim 2, wherein determining the first correspondence value comprises determining a mismatching set of nodes that do not match the multiple log events and the correspondence value is based on a number of elements in the mismatching set of nodes.

7. The method of claim 2, wherein determining the first correspondence value is based on subtraces of consecutive events.

8. The method of claim 2, wherein the first correspondence value is based on timing information associated with one or more nodes or edges of the first graph model.

9. The method of claim 8, wherein the timing information is a probability distribution of the time associated with the event.

10. The method of claim 2, wherein the first correspondence value is based on an alignment value associated with each edge or node of the first graph model, wherein the alignment value is indicative of a level of alignment with each of the further graph models.

11. The method of claim 2, further comprising generating, by the computer processor, an alert based on the determined classification.

12. The method of claim 2, wherein the first behavior is a malicious behavior.

13. The method of claim 2, wherein the step of determining the classification comprises determining the classification before creating a further log event.

14. The method of claim 2, wherein the step of determining the classification comprises determining the classification before reaching a terminal node of the first graph model.

15. The method of claim 2, further comprising: receiving, by the computer processor, historical log data and associated behavior labels; and determining, by the computer processor, the nodes and edges of the first graph model based on the historical log data and associated behavior labels; and determining for each edge or for each node one or more of: a branching probability; alignment profiles; and transition timing information.

16. The method of claim 2, wherein the first graph model is an event structure.

17. A non-transitory computer readable medium with program code stored thereon that, when executed by a computer, causes the computer to perform the method of claim 2.

18. A computer system for analyzing log data, the computer system comprising: an input port to receive log data comprising multiple log events, the log events representing one execution of a process; a memory t
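Claims 4, 14 and 15 mention branching probabilities, classification before a terminal node is reached, and models derived from labeled historical log data. The following is a rough sketch of how those ideas could fit together, not the patented method: the function names, the sample history, the use of summed log-probabilities as the correspondence value, and the `floor` penalty for unseen transitions are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def learn_branching(traces):
    """Estimate P(next event | current event) from traces of a single behavior."""
    counts = defaultdict(Counter)
    for trace in traces:
        for prev, cur in zip(trace, trace[1:]):
            counts[prev][cur] += 1
    return {
        prev: {cur: n / sum(nexts.values()) for cur, n in nexts.items()}
        for prev, nexts in counts.items()
    }


def learn_models(labelled_traces):
    """Group historical traces by behavior label and learn one model per label."""
    by_label = defaultdict(list)
    for label, trace in labelled_traces:
        by_label[label].append(trace)
    return {label: learn_branching(traces) for label, traces in by_label.items()}


def path_score(model, events, floor=1e-6):
    """Assumed correspondence value: summed log branching probability along the path.

    Transitions never seen in the historical data get the small probability `floor`.
    """
    return sum(
        math.log(model.get(prev, {}).get(cur, floor))
        for prev, cur in zip(events, events[1:])
    )


# Invented historical log data with behavior labels.
HISTORY = [
    ("normal", ["login", "search", "view", "logout"]),
    ("normal", ["login", "search", "search", "view", "logout"]),
    ("normal", ["login", "view", "logout"]),
    ("scraping", ["login", "view", "view", "view", "view"]),
    ("scraping", ["login", "view", "view", "view"]),
]
models = learn_models(HISTORY)

# A partial trace: the process has not finished, yet it can already be scored.
partial = ["login", "view", "view"]
scores = {label: path_score(m, partial) for label, m in models.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

Scoring a prefix of the trace is what makes an early decision possible: the comparison can be rerun after each new log event, and an alert (claim 11) could be raised as soon as a malicious model clearly leads.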

Assignees

Nat Ict Australia Ltd

Inventors

Classifications

  • G06F21/552 · Primary

    involving long-term monitoring or reporting · CPC title

  • monitoring of user actions (tracking the activity of the user H04L67/535) · CPC title

  • G06F16/35 · Primary

    Clustering; Classification · CPC title

  • Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title

  • where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11250043B2 cover?
This disclosure relates to analyzing log data of a system. This comprises evaluating a first graph model with multiple log events in the log data. The first graph model comprises a first set of nodes connected by a first set of edges representing a first behaviour. A processor determines a first correspondence value based on the first graph model and indicative of a correspondence between the m…
Who is the assignee on this patent?
Nat Ict Australia Ltd
What technology area does this patent fall under?
Primary CPC classification G06F21/552. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 15, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).