Methods and systems to detect anomalies in computer system behavior based on log-file sampling

US10116675B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10116675-B2
Application number: US-201514963100-A
Country: US
Kind code: B2
Filing date: Dec 8, 2015
Priority date: Dec 8, 2015
Publication date: Oct 30, 2018
Grant date: Oct 30, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and systems that detect computer system anomalies based on log file sampling are described. Computer systems generate log files that record various types of operating system and software run events in event messages. For each computer system, a sample of event messages are collected in a first time interval and a sample of event messages are collected in a recent second time interval. Methods calculate a difference between the event messages collected in the first and second time intervals. When the difference is greater than a threshold, an alert is generated. The process of repeatedly collecting a sample of event messages in a recent time interval, calculating a difference between the event messages collected in the recent and previous time intervals, comparing the difference to the threshold, and generating an alert when the threshold is violated may be executed for each computer system of a cluster of computer systems.
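
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the `(timestamp, event_type)` tuple representation, and the use of total variation distance as the "difference" are all hypothetical choices, since the abstract does not name a specific measure.

```python
from collections import Counter, defaultdict


def assign_to_intervals(events, interval_seconds):
    """Group (timestamp, event_type) pairs by the index of their time interval."""
    buckets = defaultdict(list)
    for timestamp, event_type in events:
        buckets[int(timestamp // interval_seconds)].append(event_type)
    return buckets


def total_variation(previous, recent):
    """Assumed stand-in difference: half the L1 distance between
    the event-type frequencies of two non-empty samples."""
    p, q = Counter(previous), Counter(recent)
    n_p, n_q = sum(p.values()), sum(q.values())
    return 0.5 * sum(abs(p[t] / n_p - q[t] / n_q) for t in set(p) | set(q))


def detect_anomalies(events, interval_seconds, threshold, difference=total_variation):
    """Compare each interval with the preceding non-empty interval and
    return (interval_index, difference) for every threshold violation."""
    buckets = assign_to_intervals(events, interval_seconds)
    indices = sorted(buckets)  # intervals with no events are skipped
    alerts = []
    for prev_idx, recent_idx in zip(indices, indices[1:]):
        d = difference(buckets[prev_idx], buckets[recent_idx])
        if d > threshold:
            alerts.append((recent_idx, d))
    return alerts


# Hypothetical event stream: two similar minutes, then a burst of errors.
events = [
    (0, "login"), (10, "login"), (20, "disk"),
    (60, "login"), (70, "login"), (80, "disk"),
    (120, "error"), (130, "error"), (140, "error"),
]
print(detect_anomalies(events, interval_seconds=60, threshold=0.5))
```

Running the same function once per host would correspond to the per-computer-system execution the abstract mentions for a cluster.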

First claim

Opening claim text (preview).

The invention claimed is:

1. A process stored in one or more data-storage devices and executed using one or more processors of a computer system to detect anomalies in behavior of a computer system of a distributed computing system, the method comprising: assigning each event message generated by the computer system to a time interval of a series of time intervals, each event message having a time stamp in the time interval the event message is assigned to; and when a most recent time interval of the series of time intervals has elapsed, calculating a difference between a set of event messages with time stamps in the most recent time interval and a set of event messages with time stamps in a previous time interval of the series of time intervals that precede the most recent time interval, and when the difference is greater than a threshold, generating an alert on an administrative computer console that indicates the computer system exhibits anomalous behavior and migrating one or more virtual machines from the computer system to another computer system within the distributing computing system.

2. The process of claim 1, wherein the series of time intervals are adjacent time intervals.

3. The process of claim 1, wherein calculating the difference further comprises: retrieving a first event-type probability distribution of the set of event messages with time stamps in the previous time interval; calculating a second event-type probability distribution of the set of event messages with time stamps in the most recent time interval; calculating a Jensen-Shannon divergence between the first and second event-type probability distributions, the Jensen-Shannon divergence is the difference; and replacing the first event-type probability distribution with the second event-type probability distribution in preparation for a subsequent time interval.

4. The process of claim 3, wherein calculating the second event-type probability distribution further comprises: for each event type of the set of event messages with time stamps in the most recent time interval, incrementing an event-type counter associated with each event type; for each event type, dividing the event-type counter by a total number of event types with time stamps in the recently elapsed time to calculate an event-type probability; and collecting the event-type probabilities to form an event-type probability distribution for the recently elapsed time interval.

5. The process of claim 1, further comprising: for each set of event messages with time stamps in a first time interval of the series of one or more time intervals, incrementing an event-type counter associated with each event type of the event messages; for each event type, dividing the event-type counter by a total number of event types with time stamps in the first time interval to calculate an event-type probability; and collecting the event-type probabilities to form an event-type probability distribution for the first time interval.

6. The process of claim 1, further comprising calculating the threshold as a first positive standard deviation of a number of the most recently generated Jensen-Shannon divergence.

7. The process of claim 1, further comprises applying the method of claim 1 to each computer system of a cluster of computer systems.

8. A system to detect anomalies in behavior of a computer system of a distributed computing system, the system comprising: one or more processors; one or more data-storage devices; and machine-readable instructions stored in the one or more data-storage devices that when executed using the one or more processors controls the system to carry out receiving event messages generated by event-message sources of the computer system; assigning each event message to a time interval of a series of time intervals, each event message having a time stamp in the time interval the event message is assigned to; and when a most recent time interval of the series of time intervals has elapsed, calculating a difference between a set of event messages with time stamps in the most recent time interval and a set of event messages with time stamps in a previous time interval of the series of time intervals that precede the most recent time interval, and when the difference is greater than a threshold, generating an alert on an administrative computer console that indicates the computer system exhibits anomalous behavior and migrating one or more virtual machines from the computer system to another computer system within the distributing computing system.

9. The system of claim 8, wherein the series of time intervals are adjacent time intervals.

10. The system of claim 8, wherein calculating the difference further comprises: retrieving a first event-type probability distribution of the set of event messages with time stamps in the previous time interval; calculating a second event-type probability distribution of the set of event messages with time stamps in the most elapsed recent time interval; calculating a Jensen-Shannon divergence between the first and second event-type probability distributions, the Jensen-Shannon divergence is the difference; and replacing the first event-type probability distribution with the second event-type probability distribution in preparation for a subsequent time interval.

11. The system of claim 10, wherein calculating the second event-type probability distribution further comprises: for each event type of the set of event messages with time stamps in the recently elapsed time interval, incrementing an event-type counter associated with each event type; for each event type, dividing the event-type counter by a total number of event types with time stamps in the recently elapsed time to calculate an event-type probability; and collecting the event-type probabilities to form an event-type probability distribution for the recently elapsed time interval.

12. The system of claim 8, further comprising: for each set of event messages with time stamps in a first time interval of the series of one or more time intervals, incrementing an event-type counter associated with each event type of the event messages; for each event type, dividing the event-type counter by a total number of event types with time stamps in the first time interval to calculate an event-type probability; and collecting the event-type probabilities to form an event-type probability distribution for the first time interval.

13. The system of claim 8, further comprising calculating the threshold as a first positive standard deviation of a number of the most recently generated Jensen-Shannon divergence.

14. The system of claim 8, further comprises applying the method of claim 1 to each computer system of a cluster of computer systems.

15. A non-transitory computer-readable medium encoded with machine-readable instructions that implement a method carried out by one or more processors of a computer system to perform the operations of assigning each event message generated by the computer system to a time interval of a series of time intervals, each event message having a time stamp in the time interval the event message is assigned to; and when a most recent time interval of the series of time intervals has elapsed, calculating a difference between a set of event messages with time stamps in the most recent time interval and a set of event messages with time stamps in a previous time interval of the series of time intervals that precede the most recent time interval, and when the difference is greater than a threshold, generating an aler
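
The dependent claims on event-type probability distributions, Jensen-Shannon divergence, and the threshold can be illustrated with a short sketch. It is an interpretation under stated assumptions, not the claimed implementation: the function names are hypothetical, base-2 logarithms are assumed (which bounds the divergence to [0, 1]), and "first positive standard deviation" is read here as the mean plus one population standard deviation of recent divergences.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def event_type_distribution(event_types):
    """Count each event type and divide by the total number of events."""
    counts = Counter(event_types)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def jensen_shannon_divergence(p, q):
    """JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M), with M = (P + Q) / 2.
    Uses base-2 logs (an assumption), so the result lies in [0, 1]."""
    m = {t: 0.5 * (p.get(t, 0.0) + q.get(t, 0.0)) for t in set(p) | set(q)}

    def kl(a):
        # Terms with zero probability contribute nothing to the sum.
        return sum(a[t] * math.log2(a[t] / m[t]) for t in a if a[t] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def threshold_from_history(recent_divergences):
    """Assumed reading of the threshold claim: mean plus one standard deviation."""
    return mean(recent_divergences) + pstdev(recent_divergences)


# Hypothetical samples from a previous and a most recent time interval.
previous = event_type_distribution(["login", "login", "disk", "error"])
recent = event_type_distribution(["error", "error", "error", "disk"])
divergence = jensen_shannon_divergence(previous, recent)
if divergence > threshold_from_history([0.1, 0.2, 0.3]):
    print("alert: anomalous behavior, divergence =", divergence)
previous = recent  # the second distribution replaces the first for the next interval
```

The last line mirrors the claimed replacement step: only the most recent distribution needs to be kept between intervals, not the raw event messages.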

Assignees

Inventors

Classifications

  • using time frame reporting

  • Processing captured monitoring data, e.g. for logfile generation

  • Inference or reasoning models

  • Probabilistic graphical models, e.g. probabilistic networks

  • using filtering, e.g. reduction of information by using priority, element types, position or time

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10116675B2 cover?
Methods and systems that detect computer system anomalies based on log file sampling are described. Computer systems generate log files that record various types of operating system and software run events in event messages. For each computer system, a sample of event messages are collected in a first time interval and a sample of event messages are collected in a recent second time interval. …
Who is the assignee on this patent?
VMware Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/1425. Mapped technology areas include Electricity.
When was this patent published?
Publication date Oct 30, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).