Methods and apparatus for systems determining a probable set of problems to explain symptoms

US10176071B1 · US · B1

Patent metadata
Publication number: US-10176071-B1
Application number: US-201514674134-A
Country: US
Kind code: B1
Filing date: Mar 31, 2015
Priority date: Mar 31, 2015
Publication date: Jan 8, 2019
Grant date: Jan 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus for performing event correlation using codebook processing including determining a most probable set of problems for observed symptoms in a system. In embodiments, a correlation matrix is received which has managed objects. Hypotheses are defined as a subset of problems having observed symptoms based on the correlation matrix and evaluated.
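As a rough illustration of the abstract, the sketch below represents a correlation matrix as a mapping from problems to the symptoms they can cause, then enumerates hypotheses as subsets of the problems that touch at least one observed symptom. The problem and symptom names, the probabilities, the size cap, and the helper `candidate_hypotheses` are all hypothetical; none of them come from the patent.

```python
from itertools import combinations

# Hypothetical correlation matrix: problem -> {symptom: causal probability}.
CORRELATION = {
    "disk_failure": {"io_errors": 0.9, "high_latency": 0.6},
    "link_down": {"high_latency": 0.8, "packet_loss": 0.7},
    "fan_failure": {"over_temp": 0.95},
}

def candidate_hypotheses(correlation, observed, max_size=2):
    """Return non-empty problem subsets (up to max_size) drawn from the
    problems that can cause at least one observed symptom."""
    relevant = sorted(
        problem
        for problem, edges in correlation.items()
        if any(symptom in observed for symptom in edges)
    )
    return [
        frozenset(combo)
        for size in range(1, max_size + 1)
        for combo in combinations(relevant, size)
    ]

# Illustrative observation set (an assumption, not patent data).
hypotheses = candidate_hypotheses(CORRELATION, {"high_latency", "packet_loss"})
```

Here `fan_failure` is excluded because none of its symptoms were observed, which is one plausible reading of "a subset of problems having observed symptoms".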

First claim

Opening claim text (preview).

What is claimed is:

1. A method of performing event correlation using codebook processing including determining a most probable set of problems for observed symptoms in a system comprising an interconnected plurality of devices, each respective interconnected device providing respective data storage that is shareable with the other interconnected devices, the method comprising; receiving, in real time, a correlation matrix for the system; defining hypotheses as a subset of problems having observed symptoms in the system based on the correlation matrix, wherein there is a causal probability from problem set to symptoms; determining a partial relative probability for each of the problems by parallel vertex processing including: for each incoming edge for a problem vertex, determining the partial relative probability from the symptom and the problem for the problem vertex; sending first messages that include the determined partial relative probability; and combining first messages for the same problem vertex by multiplication; partitioning the correlation matrix into a plurality of partitions according to the observed symptoms; performing a best first search on each partition of the system, wherein determining an expansion of visited hypothesis nodes and their relative probabilities and upper bounds is calculated using parallel vertex processing including: for outgoing edges for a problem vertex in the hypothesis, determining a sum of 1 and the additive inverse of the causal probability, sending second messages that include the determined sum, and combining second messages for the same vertex using multiplication; for each outgoing edge for a symptom, determining a probability of that symptom's occurred value given the hypothesis, divided by the probability given the hypothesis unioned with a set containing the problem attached to an outgoing edge, divided by the probability given the problem attached to the outgoing edge, and determining the upper bound on those probabilities for the same; sending third messages of the compute probabilities and upper bounds; and combining third messages for the same problem vertex using multiplication; for each problem receiving an incoming first, second or third message, combining the respective incoming first, second, or third message with the partial probability and upper bounds for that problem and the probability and upper bound for the hypothesis using multiplication to form the probability and upper bound for a child of the hypothesis determining, based at least in part on the hypothesis and on the probability and upper bound for a child of the hypothesis, a real-time operational state of the system, the real-time operational state including information indicating, in real-time, the impacts of detected problems in the system; and dynamically and automatically rebalancing the shareable data storage among the interconnected plurality of devices of the system, based at least in part on the real-time operational state, to compensate for the impacts of the detected problems.

2. The method according to claim 1, wherein the partial relative probability for each of the problems is updated incrementally as the set of occurred symptoms change.

3. A method according to claim 1, wherein the correlation matrix is partitioned according to the observed set of symptoms and calculation of the relative probability of a hypothesis in each partition is performed in parallel.

4. The method of claim 1, wherein the system comprises a plurality of hosts interconnected over a network.

5. The method of claim 1, wherein the determination of the probable set of problems occurs in real-time, and wherein each symptom comprises information relating to a real-time operational state of the system.

6. The method of claim 1 wherein the correlation matrix is created based at least in part on a set of dynamic, real time data, the set of dynamic, real-time data comprising topological data representing the plurality of hosts and a set of telemetry data from the plurality of hosts.

7. The method of claim 1, further comprising monitoring one or more real-time states of the system based at least in part on one or more of the first messages, second messages, third messages, and the hypotheses.

8. A system for performing event correlation using codebook processing including determining a most probable set of problems for observed symptoms in a system comprising a plurality of interconnected devices, each respective interconnected device providing respective data storage that is shareable with the other interconnected devices, the system comprising: a memory and a processor configured to: receive, in real time, a correlation matrix for the system; define hypotheses as a subset of problems having observed symptoms in the system based on the correlation matrix, wherein there is a causal probability from problem set to symptoms; determine a partial relative probability for each of the problems by parallel vertex processing including: for each incoming edge for a problem vertex, determine the partial relative probability from the symptom and the problem for the problem vertex; send first messages that include the determined partial relative probability; and combine first messages for the same problem vertex by multiplication; partition the correlation matrix into a plurality of partitions according to the observed symptoms; perform a best first search on each partition of the system, wherein determining an expansion of visited hypothesis nodes and their relative probabilities and upper bounds is calculated using parallel vertex processing including: for outgoing edges for a problem vertex in the hypothesis, determine a sum of 1 and the additive inverse of the causal probability, sending second messages that include the determined sum, and combining second messages for the same vertex using multiplication; for each outgoing edge for a symptom, determine a probability of that symptom's occurred value given the hypothesis, divided by the probability given the hypothesis unioned with a set containing the problem attached to an outgoing edge, divided by the probability given the problem attached to the outgoing edge, and determining the upper bound on those probabilities for the same; send third messages of the compute probabilities and upper bounds; and combine third messages for the same problem vertex using multiplication; for each problem receiving an incoming first, second or third message, combine the respective incoming first, second, or third message with the partial probability and upper bounds for that problem and the probability and upper bound for the hypothesis using multiplication to form the probability and upper bound for a child of the hypothesis determine, based at least in part on the hypothesis and on the probability and upper bound for a child of the hypothesis, a real-time operational state of the system, the real-time operational state including information indicating, in real-time, the impacts of detected problems in the system; and dynamically and automatically rebalance the shareable data storage among the interconnected plurality of devices of the system, based at least in part on the real-time operational state, to compensate for the impacts of the detected problems.

9. The system according to claim 8, wherein the partial relative probability for each of the problems is updated incrementally as the set of occurred symptoms change.

10. A system according to claim 8, wherein the correlation matrix is partitioned according to the observed set of symptoms and calculation of the relative probability of a hypothesis in each partition is performed in parallel.
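The "second messages" step in claims 1 and 8 (send 1 plus the additive inverse of the causal probability along each outgoing edge, then multiply messages arriving at the same vertex) can be sketched as follows. This is a minimal single-process illustration, not the patented parallel implementation. Reading the product as the probability that a symptom is not triggered under a noisy-OR model is an assumption, and the function names and example probabilities are hypothetical.

```python
from collections import defaultdict
from math import prod

def symptom_miss_probabilities(correlation, hypothesis):
    """Each problem in the hypothesis sends (1 - p) along each outgoing edge;
    messages addressed to the same symptom vertex are combined by multiplication."""
    inbox = defaultdict(list)
    for problem in hypothesis:
        for symptom, p in correlation[problem].items():
            inbox[symptom].append(1.0 - p)  # sum of 1 and the additive inverse of p
    return {symptom: prod(messages) for symptom, messages in inbox.items()}

def hypothesis_likelihood(correlation, hypothesis, observed):
    """ASSUMPTION: noisy-OR model with independent causes and no spurious
    symptoms. Returns P(observed symptom pattern | hypothesis)."""
    miss = symptom_miss_probabilities(correlation, hypothesis)
    all_symptoms = {s for edges in correlation.values() for s in edges}
    likelihood = 1.0
    for symptom in all_symptoms:
        q = miss.get(symptom, 1.0)  # chance the symptom does NOT occur
        likelihood *= (1.0 - q) if symptom in observed else q
    return likelihood

# Hypothetical two-problem example (values are illustrative only).
example = {"A": {"s1": 0.9, "s2": 0.5}, "B": {"s2": 0.5}}
miss = symptom_miss_probabilities(example, {"A", "B"})
score = hypothesis_likelihood(example, {"A", "B"}, {"s1", "s2"})
```

In a vertex-centric framework the `inbox` dictionary would be replaced by per-vertex message queues processed in parallel; multiplication is commutative and associative, so the order in which messages arrive does not matter.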

Assignees

EMC Corp, EMC IP Holding Co LLC

Inventors

Classifications

  • where the computing system component is a storage system, e.g. DASD based or network based (digital input from or digital output to record carriers G06F3/06; digital recording or reproducing G11B20/18; for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS], H04L67/1097)

  • Performance evaluation by statistical analysis

  • for performance assessment

  • Real-time

  • Performance evaluation by modeling

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10176071B1 cover?
Methods and apparatus for performing event correlation using codebook processing including determining a most probable set of problems for observed symptoms in a system. In embodiments, a correlation matrix is received which has managed objects. Hypotheses are defined as a subset of problems having observed symptoms based on the correlation matrix and evaluated.
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/3034. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 8, 2019 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).