Adaptive fault diagnosis

US9672085B2 · US · B2

Patent metadata
Field · Value
Publication number · US-9672085-B2
Application number · US-201615050008-A
Country · US
Kind code · B2
Filing date · Feb 22, 2016
Priority date · Dec 4, 2012
Publication date · Jun 6, 2017
Grant date · Jun 6, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to an example, an adaptive fault diagnosis system may include a memory storing machine readable instructions to receive metrics and events from an enterprise system, and use a substitution graph to determine if a received metric or a received event belongs to a cluster that includes one or more correlated metrics and/or events grouped based on similarity. If the received metric or the received event belongs to the cluster, the memory may further store machine readable instructions to use a detection graph to determine if the received metric or the received event is identifiable to form a fault pattern by traversing a fault path of the detection graph. Further, the memory may further store machine readable instructions to diagnose a fault based on the traversal of the fault path of the detection graph. The system may include a processor to implement the machine readable instructions.
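
The two-stage flow in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the patented implementation: the `DetectionNode` class, the `cluster_of` dictionary standing in for the substitution graph, the signal names, and the rule of skipping signals that belong to no cluster are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class DetectionNode:
    """One step on a fault path (hypothetical structure, not from the patent)."""
    children: Dict[str, "DetectionNode"] = field(default_factory=dict)
    fault: Optional[str] = None  # set when the path ends in a known fault


def diagnose(signals: Iterable[str],
             cluster_of: Dict[str, str],
             root: DetectionNode) -> Optional[str]:
    """Return a fault label if the signals trace a known fault path, else None."""
    node = root
    for signal in signals:
        # Stage 1: substitution-graph lookup, reduced here to a dict.
        cluster = cluster_of.get(signal)
        if cluster is None:
            continue  # assumption: signals outside every cluster are ignored
        # Stage 2: try to extend the fault path in the detection graph.
        nxt = node.children.get(cluster)
        if nxt is None:
            return None  # the path cannot be expanded, so no fault is diagnosed
        node = nxt
        if node.fault is not None:
            return node.fault  # traversal matched a known fault pattern
    return None  # signals ran out before a pattern was completed


# Toy data: correlated signals share a cluster, so either one advances the path.
cluster_of = {
    "heap_used": "memory_pressure",
    "gc_pause": "memory_pressure",
    "oom_kill": "oom",
}
root = DetectionNode(children={
    "memory_pressure": DetectionNode(children={
        "oom": DetectionNode(fault="memory_leak"),
    }),
})
```

Because `heap_used` and `gc_pause` map to the same cluster, `diagnose(["gc_pause", "oom_kill"], cluster_of, root)` follows the same path as `diagnose(["heap_used", "oom_kill"], cluster_of, root)`, which is the point of grouping correlated signals before matching.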

First claim

Opening claim text (preview).

What is claimed is:

1. An adaptive fault diagnosis system comprising: a processor; and a memory storing machine readable instructions that when executed by the processor cause the processor to: access metrics and events that are to be used to diagnose a fault; determine whether a metric of the accessed metrics or an event of the accessed events belongs to a cluster of a plurality of clusters, wherein the cluster of the plurality of clusters includes at least one of one or more correlated metrics and events, and wherein the at least one of the one or more correlated metrics and events is grouped based on similarity; in response to a determination that the metric or the event belongs to the cluster, determine whether the metric or the event is identifiable to form a fault pattern; and diagnose the fault based on identification of the metric or the event as forming the fault pattern.

2. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: generate a substitution graph to determine whether the metric or the event belongs to the cluster by: collecting metrics and events created by injection of a plurality of labeled faults in a training enterprise system; using the collected metrics and events to generate the substitution graph to group at least one of one or more collected metrics and one or more collected events into the plurality of clusters such that at least one of the one or more collected metrics and events grouped in one cluster are more strongly related to at least one of the one or more collected metrics and events grouped in the one cluster as compared to at least one of the one or more collected metrics and events in other clusters; and scoring each cluster based on how at least one of the one or more collected metrics and events in the scored cluster originated.

3. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: generate a detection graph to determine whether the metric or the event is identifiable to form the fault pattern by: collecting metrics and events created by injection of a plurality of labeled faults in a training enterprise system; and using the collected metrics and events to generate the detection graph by: ordering and connecting at least one of one or more collected metrics and events based on respective timestamps.

4. The adaptive fault diagnosis system of claim 3, wherein using the collected metrics and events to generate the detection graph further comprises machine readable instructions to: select at least one of one or more collected metrics and events critical to a fault to form a fault pattern by using an EDGERANK process.

5. The adaptive fault diagnosis system of claim 3, wherein using the collected metrics and events to generate the detection graph further comprises machine readable instructions to: select at least one of one or more collected metrics and events critical to a fault to form a fault pattern based on affinity, weight, and time decay related to at least one of the one or more collected metrics and events.

6. The adaptive fault diagnosis system of claim 3, wherein using the collected metrics and events to generate the detection graph further comprises machine readable instructions to: rank at least one of the one or more collected metrics and events based on contribution to fault identification; and select at least one of one or more ranked metrics and events critical to a fault to form a fault pattern.

7. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: monitor a subset of the accessed metrics and events based on previously detected fault patterns.

8. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: update at least one of a substitution graph to determine whether the metric or the event belongs to the cluster and a detection graph to determine whether the metric or the event is identifiable to form the fault pattern based on a new detected fault.

9. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: utilize the fault pattern as a template to diagnose a new fault that includes at least one of different events and different metrics compared to at least one of events and metrics of the fault pattern.

10. The adaptive fault diagnosis system of claim 1, wherein a substitution graph to determine whether the metric or the event belongs to the cluster includes a metric A correlated to a metric B if the metric A is a function of the metric B.

11. The adaptive fault diagnosis system of claim 1, wherein a substitution graph to determine whether the metric or the event belongs to the cluster includes an event A correlated to an event B if the event A and the event B always appear simultaneously or with a fixed order.

12. The adaptive fault diagnosis system of claim 1, wherein a substitution graph to determine whether the metric or the event belongs to the cluster includes an event A correlated to a metric B if the event A occurs after the metric B reaches a threshold, or if the event A includes the metric B.

13. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: diagnose the fault based on traversal of a fault path of a detection graph; in response to a determination that the fault path cannot be expanded, diagnose no fault; in response to a determination that no additional metrics or events on the fault path match with known fault patterns, diagnose no fault; and in response to a determination that traversal of the fault path matches the fault pattern, diagnose the fault.

14. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: diagnose the fault based on the traversal of a fault path of a detection graph; and estimate a probability to determine if the fault path leads to a known fault.

15. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: diagnose the fault by determining a probability of detecting an unknown fault.

16. The adaptive fault diagnosis system of claim 1, further comprising machine readable instructions to: adjust a threshold related to the fault pattern based on a ratio of applicability of a training enterprise system to an enterprise system associated with the accessed metrics and events.

17. The adaptive fault diagnosis system of claim 16, wherein the enterprise system associated with the accessed metrics and events is a cloud-based enterprise system.

18. A method for adaptive fault diagnosis, the method comprising: accessing, by at least one processor, metrics and events that are to be used to diagnose a fault; determining, by the at least one processor, whether a metric of the accessed metrics or an event of the accessed events belongs to a cluster of a plurality of clusters, wherein the cluster of the plurality of clusters includes at least one of one or more correlated metrics and events, and wherein the at least one of the one or more correlated metrics and events is grouped based on similarity; in response to a determination that the metric or the event belongs to the cluster, determining, by the at least one processor, whether the metric or the event is identifiable to form a fault pattern; diagnosing, by the at least one processor, the fault based on identification of the metric or the event as forming the fault pattern; and adjusting
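
The claims on building a detection graph mention ordering collected metrics and events by timestamp and selecting the critical ones based on affinity, weight, and time decay. The sketch below shows one way such a selection could look. It is an assumption-laden illustration: the claim preview gives no formula, so the score here is simply the product of the three factors with an exponential decay, and the `Signal` fields, `decay_rate`, `top_k`, and the sample values are all hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Signal:
    """A collected metric or event (hypothetical fields for illustration)."""
    name: str
    timestamp: float  # assumed unit: seconds on a shared clock
    affinity: float   # assumed: how strongly the signal relates to the fault
    weight: float     # assumed: importance of this kind of signal


def score(sig: Signal, fault_time: float, decay_rate: float = 0.01) -> float:
    """Affinity x weight x time decay; the product form is an assumption."""
    age = abs(fault_time - sig.timestamp)
    return sig.affinity * sig.weight * math.exp(-decay_rate * age)


def fault_pattern(signals: List[Signal], fault_time: float,
                  top_k: int = 3) -> List[str]:
    """Keep the top_k highest-scoring signals, then order them by timestamp."""
    ranked = sorted(signals, key=lambda s: score(s, fault_time), reverse=True)
    selected = ranked[:top_k]
    return [s.name for s in sorted(selected, key=lambda s: s.timestamp)]


# Toy training data from one injected, labeled fault at t = 200 (made up).
collected = [
    Signal("oom_kill", 200.0, affinity=1.0, weight=2.0),
    Signal("cron_log", 170.0, affinity=0.1, weight=0.5),
    Signal("heap_used", 100.0, affinity=0.9, weight=1.0),
    Signal("gc_pause", 160.0, affinity=0.8, weight=1.0),
]
pattern = fault_pattern(collected, fault_time=200.0)
```

With these sample values the low-affinity `cron_log` entry is dropped and the remaining signals come out in time order, which gives a candidate path for a detection graph. A real system would need to calibrate the decay rate and the cut-off against its own training data.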

Assignees

Accenture Global Services Ltd

Inventors

Classifications

  • G06F11/079 · Primary

    Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

  • Performance analysis of employees; Performance analysis of enterprise or organisation operations · CPC title

  • Administration; Management · CPC title

  • in a distributed system consisting of a plurality of standalone computer nodes, e.g. clusters, client-server systems · CPC title

  • Error or fault processing not based on redundancy, i.e. by taking additional measures to deal with the error or fault not making use of redundancy in operation, in hardware, or in data representation · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9672085B2 cover?
According to an example, an adaptive fault diagnosis system may include a memory storing machine readable instructions to receive metrics and events from an enterprise system, and use a substitution graph to determine if a received metric or a received event belongs to a cluster that includes one or more correlated metrics and/or events grouped based on similarity. If the received metric or the…
Who is the assignee on this patent?
Accenture Global Services Ltd
What technology area does this patent fall under?
Primary CPC classification G06F11/079. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 6, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).