Real-time event storm detection in a cloud environment

US8949676B2 · US · B2

Patent metadata
Publication number: US-8949676-B2
Application number: US-201213469468-A
Country: US
Kind code: B2
Filing date: May 11, 2012
Priority date: May 11, 2012
Publication date: Feb 3, 2015
Grant date: Feb 3, 2015

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, an apparatus and an article of manufacture for detecting an event storm in a networked environment. The method includes receiving a plurality of events via a plurality of probes in a networked environment, each of the plurality of probes monitoring a monitored information technology (IT) element, aggregating the plurality of events received into an event set, and correlating the plurality of events in the event set to determine whether the plurality of events are part of an event storm by determining if the plurality of events in the event set meet one or more event storm criteria.
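
The receive, aggregate, and correlate steps in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Event` fields, the grouping by event type, and the `min_elements` criterion are all assumptions introduced here for illustration (the criterion loosely resembles the type 1 event storm described in claim 6).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; field names are illustrative only."""
    probe_id: str    # probe that reported the event
    element: str     # monitored IT element, e.g. a VM or hypervisor
    event_type: str  # e.g. "cpu_high"
    cycle: int       # polling cycle in which the event was observed


def aggregate(events, n_cycles, current_cycle):
    """Group events from the last n_cycles polling cycles by event type."""
    event_sets = defaultdict(list)
    for ev in events:
        if current_cycle - n_cycles < ev.cycle <= current_cycle:
            event_sets[ev.event_type].append(ev)
    return dict(event_sets)


def is_storm(event_set, min_elements):
    """Assumed criterion: the same event appears on at least
    min_elements distinct monitored elements within the window."""
    return len({ev.element for ev in event_set}) >= min_elements


events = [
    Event("p1", "vm-1", "cpu_high", 8),
    Event("p2", "vm-2", "cpu_high", 9),
    Event("p3", "vm-3", "cpu_high", 10),
    Event("p4", "vm-4", "disk_full", 10),
    Event("p1", "vm-1", "cpu_high", 2),  # outside the 3-cycle window
]
event_sets = aggregate(events, n_cycles=3, current_cycle=10)
storms = sorted(t for t, s in event_sets.items() if is_storm(s, min_elements=3))
print(storms)
```

Here only `cpu_high` is flagged, because it is the only event type seen on three distinct elements within the last three polling cycles. A real deployment would presumably choose the window and threshold from operational data.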

First claim

Opening claim text (preview).

What is claimed is:

1. A method for detecting an event storm in a networked environment, wherein the method comprises: receiving a plurality of events via a plurality of probes in a networked environment, each of the plurality of probes monitoring a monitored information technology (IT) element; aggregating the plurality of events received into an event set; and correlating the plurality of events in the event set via a linear regression technique to determine whether the plurality of events are part of an event storm by determining if the plurality of events in the event set meet one or more event storm criteria via: using a determined amount of growth from the linear regression technique to determine a difference between a potential event storm and an outage by comparing a lower bound defining a particular value characterizing an outage with a measured growth; wherein at least one of the steps is carried out by a computer device.

2. The method of claim 1, further comprising: taking a single corrective action to respond to the plurality of events in the event set corresponding to the event storm.

3. The method of claim 2, wherein the single corrective action comprises one or more of a provision for more system or computing resources, and creating a ticket.

4. The method of claim 1, wherein a monitored IT element comprises one of one or more virtual machine memories, one or more virtual central processing units, one or more hypervisors, server hardware, an operating system and storage.

5. The method of claim 1, wherein an event storm comprises one of a type 1 event storm, an action storm and a type 2 event storm.

6. The method of claim 5, wherein the event storm comprises a type 1 event storm, and wherein a type 1 event storm includes a same event that occurs on multiple system elements over a period of time, wherein the period of time is n polling cycles.

7. The method of claim 5, wherein an action storm is a case of a type 1 event storm wherein events in a set occurring on multiple system elements require a same action to be taken over a period of time, where the period of time is n polling cycles.

8. The method of claim 5, wherein a type 2 event storm includes multiple events in an event set that occur on multiple system elements, indicating a planned or an unplanned outage.

9. The method of claim 1, further comprising: using machine learning to predict an event storm.

10. The method of claim 9, wherein using machine learning to predict an event storm comprises: using stochastic modeling to estimate a probability distribution of one or more potential outcomes by allowing random variation of event occurrence over multiple polling cycles, wherein the random variation is based on fluctuations observed in historical data for a selected period, and wherein the probability distribution of one or more potential outcomes are derived from multiple simulations which reflect the random variation as an input.

11. An article of manufacture comprising a computer readable storage medium having computer readable instructions tangibly embodied thereon which, when implemented, cause a computer to carry out a plurality of method steps comprising: receiving a plurality of events via a plurality of probes in a networked environment, each of the plurality of probes monitoring a monitored IT element; aggregating the plurality of events received into an event set; and correlating the plurality of events in the event set via a linear regression technique to determine whether the plurality of events are part of an event storm by determining if the plurality of events in the event set meet one or more event storm criteria via: using a determined amount of growth from the linear regression technique to determine a difference between a potential event storm and an outage by comparing a lower bound defining a particular value characterizing an outage with a measured growth.

12. The article of manufacture of claim 11, wherein the computer readable instructions which, when implemented, further cause a computer to carry out a method step comprising: taking a single corrective action to respond to the plurality of events in the event set corresponding to the event storm.

13. The article of manufacture of claim 11, wherein a monitored IT element comprises one of one or more virtual machine memories, one or more virtual central processing units, one or more hypervisors, server hardware, an operating system and storage.

14. The article of manufacture of claim 11, wherein an event storm comprises one of a type 1 event storm, an action storm and a type 2 event storm.

15. A system for detecting an event storm in a networked environment, comprising: at least one distinct software module, each distinct software module being embodied on a tangible computer-readable medium; a memory; and at least one processor coupled to the memory and operative for: receiving a plurality of events via a plurality of probes in a networked environment, each of the plurality of probes monitoring a monitored IT element; aggregating the plurality of events received into an event set; and correlating the plurality of events in the event set via a linear regression technique to determine whether the plurality of events are part of an event storm by determining if the plurality of events in the event set meet one or more event storm criteria via: using a determined amount of growth from the linear regression technique to determine a difference between a potential event storm and an outage by comparing a lower bound defining a particular value characterizing an outage with a measured growth.

16. The system of claim 15, wherein the at least one processor coupled to the memory is further operative for: taking a single corrective action to respond to the plurality of events in the event set corresponding to the event storm.

17. The system of claim 15, wherein a monitored IT element comprises one of one or more virtual machine memories, one or more virtual central processing units, one or more hypervisors, server hardware, an operating system and storage.

18. The system of claim 15, wherein an event storm comprises one of a type 1 event storm, an action storm and a type 2 event storm.
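
The linear regression step recited in the independent claims can be sketched as follows. This is one possible reading, not the claimed implementation: it assumes the "measured growth" is the least-squares slope of event counts per polling cycle, and that growth at or above the outage lower bound indicates an outage while smaller but still significant growth indicates a potential event storm. The function names and both threshold parameters (`storm_growth`, `outage_lower_bound`) are hypothetical.

```python
def growth_rate(counts):
    """Least-squares slope of event count versus polling-cycle index."""
    n = len(counts)
    if n < 2:
        raise ValueError("need at least two polling cycles")
    mean_x = (n - 1) / 2
    mean_y = sum(counts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(counts))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def classify(counts, storm_growth, outage_lower_bound):
    """Compare the measured growth with the assumed thresholds.

    outage_lower_bound: smallest growth treated as an outage (assumption).
    storm_growth: smallest growth treated as a potential storm (assumption).
    """
    slope = growth_rate(counts)
    if slope >= outage_lower_bound:
        return "outage"
    if slope >= storm_growth:
        return "potential event storm"
    return "normal"


# Event counts observed over four consecutive polling cycles.
print(classify([2, 4, 6, 8], storm_growth=1.0, outage_lower_bound=10.0))
print(classify([0, 20, 40, 60], storm_growth=1.0, outage_lower_bound=10.0))
print(classify([3, 3, 3, 3], storm_growth=1.0, outage_lower_bound=10.0))
```

With these illustrative thresholds, a slope of 2 events per cycle is labeled a potential event storm, a slope of 20 is labeled an outage, and a flat series is labeled normal. The claims do not state threshold values, so the numbers above are placeholders.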

Assignees

Inventors

Classifications

  • the data filtering being achieved by aggregating or compressing the monitored data · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • Monitoring arrangements for monitoring the status of the computing system or of the computing system component, e.g. monitoring if the computing system is on, off, available, not available (error or fault processing without redundancy G06F11/0703; error detection or correction by redundancy in data representation G06F11/08; error detection or correction of the data by redundancy in operations G06F11/14; error detection or correction by redundancy in hardware G06F11/16) · CPC title

  • for performance assessment · CPC title

  • Threshold · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US8949676B2 cover?
A method, an apparatus and an article of manufacture for detecting an event storm in a networked environment. The method includes receiving a plurality of events via a plurality of probes in a networked environment, each of the plurality of probes monitoring a monitored information technology (IT) element, aggregating the plurality of events received into an event set, and correlating the plura…
Who is the assignee on this patent?
Behrendt Michael Man, Hosn Rafah A, Mahindru Ruchi, and 5 more
What technology area does this patent fall under?
Primary CPC classification G06F11/3082. Mapped technology areas include Physics.
When was this patent published?
Publication date Feb 3, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).