Who is the assignee on this patent?

Huawei Cloud Computing Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06F11/079. Mapped technology areas include Physics.

When was this patent published?

Publication date May 9, 2023 (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Automated root-cause analysis for distributed systems using tracing-data

US11645141B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11645141-B2
Application number	US-202117462955-A
Country	US
Kind code	B2
Filing date	Aug 31, 2021
Priority date	Mar 4, 2019
Publication date	May 9, 2023
Grant date	May 9, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system for identifying root cause of anomalies in execution of an application comprising a plurality of operations is provided. The system comprising a preprocessing module configured to receive tracing data comprising a plurality of tracing spans each documenting, for a corresponding operation of the application, a plurality of properties and corresponding values, a signal splitting module configured to group the plurality of tracing spans in a plurality of groups such that each of the plurality of groups comprises operations with identical properties and corresponding values, an anomaly detection module configured to determine anomalous operations for each of the plurality of tracing data spans, a scoring module configured to calculate a plurality of anomaly scores each indicating a level of anomaly within each of the plurality of groups and a root cause identification module configured to analyze the anomaly scores and identify root cause of the detected anomalies according to the analysis.
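The grouping and scoring steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The span layout (`properties`, `duration_ms`, `error`), the fixed-threshold detector in `is_anomalous`, and the "share of anomalous spans" score are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict


def group_spans(spans, property_names):
    """Group spans whose listed properties carry identical values."""
    groups = defaultdict(list)
    for span in spans:
        key = tuple((name, span["properties"].get(name)) for name in property_names)
        groups[key].append(span)
    return dict(groups)


def is_anomalous(span, max_duration_ms):
    # Assumed detector: a span is anomalous if it is slower than a fixed
    # threshold or is explicitly marked as an error.
    return span["duration_ms"] > max_duration_ms or span.get("error", False)


def group_anomaly_scores(groups, max_duration_ms):
    # Assumed score: fraction of anomalous spans in each group (0.0 to 1.0).
    return {
        key: sum(is_anomalous(s, max_duration_ms) for s in members) / len(members)
        for key, members in groups.items()
    }


# Hypothetical tracing data: one span per operation.
spans = [
    {"properties": {"service": "auth", "host": "h1"}, "duration_ms": 20},
    {"properties": {"service": "auth", "host": "h2"}, "duration_ms": 900},
    {"properties": {"service": "auth", "host": "h2"}, "duration_ms": 850},
    {"properties": {"service": "cart", "host": "h1"}, "duration_ms": 25},
]
groups = group_spans(spans, ["service", "host"])
scores = group_anomaly_scores(groups, max_duration_ms=500)
```

In this toy data the two slow spans share the same property values, so they fall into one group and that group receives the highest score, which is the signal the later root cause step would analyze.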

First claim

Opening claim text (preview).

What is claimed is:

1. A system for identifying a root cause of anomalies in executing an application, the application comprising a plurality of operations, the system comprising: a processing circuitry and a non-transitory storage, wherein the non-transitory storage is configured to store code, and the processing circuitry is configured to execute the code to: receive tracing data, comprising a plurality of tracing spans, each tracing span documenting for a corresponding operation of the application a plurality of properties and corresponding values; group the plurality of tracing spans in a plurality of groups such that each of the plurality of groups comprises operations with identical properties and corresponding values; determine anomalous operations for each of the plurality of tracing data spans; calculate a plurality of anomaly scores each anomaly score indicating a level of anomaly within each one of the plurality of groups; and analyze the plurality of anomaly scores and to identify a root cause of the detected anomalies according to the analysis of the plurality of anomaly scores.

2. The system of claim 1, wherein the processing circuitry is further configured to execute the code to generate the tracing spans, wherein the system further comprises a tracing library, and a plurality of tracing servers, the tracing library being configured to: generate one tracing span for each operation processed by the application; and transmit the tracing span to at least one of the plurality of tracing servers, wherein the tracing servers are configured to collect the received tracing spans and to transmit them to the processing circuitry.

3. The system of claim 1, wherein the processing circuitry is configured to execute the code to: for each of the plurality of properties: compute a plurality of aggregated anomaly scores, for each property value, over the anomalies of all groups with the same property value; compute a standard deviation in the plurality of aggregated anomaly scores; and select at least one property with the standard deviation exceeding a threshold; wherein the root cause is identified according to the selected at least one property and the property value having the maximal aggregated anomaly score.

4. The system of claim 1, wherein the processing circuitry is configured to execute the code to: compute a distance metric indicating distances between all pairs of property values; apply a clustering algorithm on the plurality of property values using the distance metric to obtain a plurality of clusters of property values; compute a plurality of anomaly scores each for one of the plurality of clusters; and select at least one cluster according to the plurality of anomaly scores and a threshold; wherein the root cause is identified according to the at least one selected cluster.

5. The system of claim 4, wherein the processing circuitry is configured to execute the code to compute the distance metric by: constructing a graph G=(V, E), with vertices V and edges E, wherein V comprises vertices representing the plurality of property values and the plurality of groups of tracing spans, and E comprising edges between vertices representing the plurality of property values and the plurality of groups of tracing spans and edge capacities based on the group anomaly scores; computing a plurality of maximum flow values between pairs of vertices representing the plurality of property values; computing a plurality of distances between all vertices representing the plurality of property values from the plurality of maximum flow values; and obtaining the distance metric from the plurality of distances.

6. The system of claim 4, wherein each distance of the plurality of distances of pairs of property values is one of: inverse proportional to the anomaly score of a corresponding group of tracing spans, zero when both property values are the same, or inverse proportional to the number of groups of tracing spans with non-zero anomaly score.

7. A method for identifying a root cause of a fault in executing an application, comprising: receiving tracing data, comprising a plurality of tracing spans, each tracing span documenting for a corresponding operation of the application a plurality of properties and corresponding values; grouping the plurality of tracing spans in a plurality of groups such that each of the plurality of groups comprises operations with identical properties and corresponding values; determining anomalous operations for each of the plurality of tracing data spans; calculating a plurality of anomaly scores, each anomaly score indicating a level of anomaly within each one of the plurality of groups; and analyzing the plurality of anomaly scores and to identify a root cause of the detected anomalies according to the analysis of the plurality of anomaly scores.

8. The method of claim 7, wherein further comprising: generating one tracing span for each operation processed by the application.

9. The method of claim 7, wherein for each property, comprising: computing a plurality of aggregated anomaly scores, for each property value, over the anomalies of all groups with the same property value; computing a standard deviation in the plurality of aggregated anomaly scores; and selecting at least one property with the standard deviation exceeding a threshold; wherein the root cause is identified according to the selected at least one property and the property value having the maximal aggregated anomaly score.

10. The method of claim 7, wherein comprising: computing a distance metric indicating distances between all pairs of property values; applying a clustering algorithm on the plurality of property values using the distance metric to obtain a plurality of clusters of property values; computing a plurality of anomaly scores each for one of the plurality of clusters; and selecting at least one cluster according to the plurality of anomaly scores and a threshold; wherein the root cause is identified according to the at least one selected cluster.

11. The method of claim 10, wherein the computing the distance metric comprising: constructing a graph G=(V, E), with vertices V and edges E, wherein V comprises vertices representing the plurality of property values and the plurality of groups of tracing spans, and E comprising edges between vertices representing the plurality of property values and the plurality of groups of tracing spans and edge capacities based on the group anomaly scores; computing a plurality of maximum flow values between pairs of vertices representing the plurality of property values; computing a plurality of distances between all vertices representing the plurality of property values from the plurality of maximum flow values; and obtaining the distance metric from the plurality of distances.

12. A non-transitory computer readable storage medium comprising computer program code instructions, being executable by a computer, for performing the following steps: receive tracing data, comprising a plurality of tracing spans, each tracing span documenting for a corresponding operation of the application a plurality of properties and corresponding values; group the plurality of tracing spans in a plurality of groups such that each of the plurality of groups comprises operations with identical properties and corresponding values; determine anomalous operations for each of the plurality of tracing data spans; calculate a plurality of anomaly scores, each anomaly score indicating a level of anomaly within each one of the plurality of groups; and analyze the plurality of anomaly score
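The property selection recited in claims 3 and 9 (aggregate anomaly scores per property value, compute the standard deviation per property, keep properties above a threshold, report the value with the maximal aggregated score) might look roughly like the Python sketch below. This is one reading of the claim language, not the patentee's code. The function name `select_root_causes`, the use of a sum as the aggregation, the population standard deviation, and the sample scores are all assumptions.

```python
from collections import defaultdict
from statistics import pstdev


def select_root_causes(group_scores, threshold):
    """Return (property, value, spread) for each property whose aggregated
    anomaly scores vary by more than `threshold` across its values.

    group_scores maps a group key, a tuple of (property, value) pairs,
    to that group's anomaly score.
    """
    aggregated = defaultdict(lambda: defaultdict(float))
    for key, score in group_scores.items():
        for prop, value in key:
            aggregated[prop][value] += score  # assumed aggregation: sum

    candidates = []
    for prop, per_value in aggregated.items():
        spread = pstdev(list(per_value.values()))  # assumed: population std dev
        if spread > threshold:
            worst_value = max(per_value, key=per_value.get)
            candidates.append((prop, worst_value, spread))
    return candidates


# Hypothetical group anomaly scores for two properties.
group_scores = {
    (("service", "auth"), ("host", "h1")): 0.1,
    (("service", "auth"), ("host", "h2")): 0.9,
    (("service", "cart"), ("host", "h1")): 0.0,
    (("service", "cart"), ("host", "h2")): 0.8,
}
candidates = select_root_causes(group_scores, threshold=0.5)
```

With these sample scores the anomalies are spread almost evenly across services but concentrated on one host, so only the `host` property passes the threshold and `h2` is reported as the value with the maximal aggregated score.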

Assignees

Huawei Cloud Computing Tech Co Ltd

Inventors

Classifications

G06F11/3072
where the reporting involves data filtering, e.g. pattern matching, time or event triggered, adaptive or policy-based reporting · CPC title
G06F16/9024
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
G06F11/3466
Performance evaluation by tracing or monitoring · CPC title
G06F11/079 (Primary)
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
G06F2201/865
Monitoring of software · CPC title

Patent family

Related publications grouped by family.

View patent family 65802035

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11645141B2 cover?: A system for identifying root cause of anomalies in execution of an application comprising a plurality of operations is provided. The system comprising a preprocessing module configured to receive tracing data comprising a plurality of tracing spans each documenting, for a corresponding operation of the application, a plurality of properties and corresponding values, a signal splitting module c…

Related patents

Automated root cause detection using data flow analysis

Analyzing OpenManage Integration for Troubleshooting Log to Determine Root Cause

Root cause analysis for service degradation in computer networks