Trace backtracking in distributed systems
US-9450849-B1 · Sep 20, 2016 · US
US10263833B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10263833-B2 |
| Application number | US-201514956137-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 1, 2015 |
| Priority date | Dec 1, 2015 |
| Publication date | Apr 16, 2019 |
| Grant date | Apr 16, 2019 |
Official abstract text for this publication.
The disclosed embodiments provide a system for processing data. During operation, the system obtains a set of components of a time-series performance metric associated with an anomaly in a performance of one or more monitored systems. For each component in the set of components, the system performs a statistical hypothesis test on the component to assess a deviation of the component from a baseline value of the component. When the statistical hypothesis test identifies a statistically significant deviation of the component from the baseline value, the system outputs an alert comprising a root cause of the anomaly that is represented by the statistically significant deviation of the component from the baseline value.
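The per-component test described in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes a one-sided sign test (the claims name a sign test as one option), a scalar baseline per component, and a significance level of 0.05. The function names, the `ROOT_CAUSES` mapping keys, and the sample values are hypothetical; only the pairings of connection time with a network issue and client render time with a client issue come from the claims.

```python
from math import comb

# Mapping from component to root cause. Only these two pairings appear in
# the claims; the key names themselves are hypothetical.
ROOT_CAUSES = {
    "connection_time": "network issue",
    "client_render_time": "client issue",
}


def sign_test_p_value(samples, baseline_value):
    """One-sided sign test: p-value for `samples` lying above `baseline_value`."""
    # Samples equal to the baseline carry no sign and are dropped.
    signs = [s > baseline_value for s in samples if s != baseline_value]
    n = len(signs)
    if n == 0:
        return 1.0
    k = sum(signs)  # number of samples above the baseline
    # P(X >= k) for X ~ Binomial(n, 0.5), computed exactly.
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def find_root_causes(components, baselines, alpha=0.05):
    """Test every component and return one alert per significant deviation."""
    alerts = []
    for name, samples in components.items():
        p = sign_test_p_value(samples, baselines[name])
        if p < alpha:
            alerts.append({
                "component": name,
                "p_value": p,
                "root_cause": ROOT_CAUSES.get(name, "unclassified"),
            })
    return alerts


# Made-up page-load components in milliseconds.
components = {
    "connection_time": [120, 130, 125, 140, 135, 128, 132, 138],
    "first_byte_time": [50, 48, 52, 49, 51, 47, 53, 50],
}
baselines = {"connection_time": 100, "first_byte_time": 50}
alerts = find_root_causes(components, baselines)
```

With this sample data, every connection-time sample exceeds its baseline, so only that component is flagged; the first byte time scatters evenly around its baseline and produces no alert.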
Opening claim text (preview).
What is claimed is:
1. A method, comprising: obtaining a set of components of a time-series performance metric associated with an anomaly in a performance of one or more monitored systems, wherein the set of components includes a connection time; for each component in the set of components, performing, by a computer system, a statistical hypothesis test on the component to assess a deviation of the component from a corresponding baseline value of the component; and when the statistical hypothesis test identifies a statistically significant deviation of a component from the corresponding baseline value, outputting an alert comprising a root cause of the anomaly that is represented by the statistically significant deviation of the component from the baseline value, wherein the root cause of the anomaly comprises a network issue; analyzing additional data associated with the statistically significant deviation to identify a source of the anomaly by: aggregating the connection time by one or more dimension values of a dimension associated with the time-series performance metric; and for each dimension value in the one or more dimension values, comparing the aggregated connection time for the dimension value with a baseline connection time associated with the dimension value to assess the deviation of the aggregated connection time from the baseline connection time; and when the statistically significant deviation is found in the aggregated connection time, associating the dimension value with the source of the anomaly; and including the source in the outputted alert.
2. The method of claim 1, wherein analyzing the additional data associated with the statistically significant deviation to identify the source of the anomaly further comprises: obtaining one or more additional dimension values that are related to the dimension value; and including the one or more additional dimension values in the one or more dimension values by which the connection time is aggregated.
3. The method of claim 1, wherein the statistical hypothesis test is used to compare the aggregated connection time for the dimension value with the baseline connection time associated with the dimension value.
4. The method of claim 1, wherein the dimension is at least one of: a data center; a point of presence (PoP); an autonomous system number (ASN); a page; and a country.
5. The method of claim 1, wherein outputting the alert of the anomaly represented by the deviation comprises: matching one or more attributes of the anomaly to the alert; and grouping the anomaly with one or more additional anomalies into the alert.
6. The method of claim 1, wherein the statistical hypothesis test comprises a sign test.
7. The method of claim 1, wherein: the set of components further includes a client render time, and the root cause of the anomaly further comprises a client issue.
8. The method of claim 1, wherein the time-series performance metric includes a page-load time.
9. The method of claim 1, wherein the set of components of the time-series performance metric includes: the connection time; a first byte time; a content download time; and a client rendering time.
10. An apparatus, comprising: one or more processors; and memory storing instructions that, when executed by the one or more processors, cause the apparatus to: obtain a set of components of a time-series performance metric associated with an anomaly in a performance of one or more monitored systems, wherein the set of components includes a connection time; perform a statistical hypothesis test on each component in the set of components to assess a deviation of the component from a corresponding baseline value of the component; when the statistical hypothesis test identifies a statistically significant deviation of a component from the corresponding baseline value, output an alert comprising a root cause of the anomaly that is represented by the statistically significant deviation of the component from the baseline value, wherein the root cause of the anomaly comprises a network issue; analyze additional data associated with the statistically significant deviation to identify a source of the anomaly by: aggregating the connection time by one or more dimension values of a dimension associated with the time-series performance metric; and for each dimension value in the one or more dimension values, comparing the aggregated connection time for the dimension value with a baseline connection time associated with the dimension value to assess the deviation of the aggregated connection time from the baseline connection time; and when the statistically significant deviation is found in the aggregated connection time, associating the dimension value with the source of the anomaly; and include the source in the outputted alert.
11. The apparatus of claim 10, wherein the set of dimensions comprises at least one of: a data center; a point of presence; an autonomous system number (ASN); a page; and a country.
12. The apparatus of claim 10, wherein the statistical hypothesis test is used to compare the aggregated connection time for the dimension value with the baseline connection time associated with the dimension value.
13. The apparatus of claim 10, wherein: the set of components further includes a client render time, and the root cause of the anomaly further comprises a client issue.
14. The apparatus of claim 10, wherein outputting the alert of the anomaly represented by the deviation comprises: matching one or more attributes of the anomaly to the alert; and grouping the anomaly with one or more additional anomalies into the alert.
15. The apparatus of claim 10, wherein the time-series performance metric includes a page-load time.
16. The apparatus of claim 10, wherein the set of components of the time-series performance metric includes: the connection time; a first byte time; a content download time; and a client rendering time.
17. A system, comprising: an analysis module comprising a non-transitory computer-readable medium comprising instructions that, when executed by one or more processors, cause the system to: obtain a set of components of a time-series performance metric associated with an anomaly in a performance of one or more monitored systems, wherein the set of components includes a connection time; perform a statistical hypothesis test on each component in the set of components to assess a deviation of the component from a corresponding baseline value of the component; analyze additional data associated with the statistically significant deviation to identify a source of the anomaly by: aggregating the connection time by one or more dimension values of a dimension associated with the time-series performance metric; and for each dimension value in the one or more dimension values, comparing the aggregated connection time for the dimension value with a baseline connection time associated with the dimension value to assess the deviation of the aggregated connection time from the baseline connection time; and when the statistically significant deviation is found in the aggregated connection time, associating the dimension value with the source of the anomaly; and include the source in an outputted alert; and a management module comprising a non-transitory computer-readable medium comprising instructions that, when executed by the one or more processors, cause the system to output the alert comprising a root cause of the anomaly that is represented by a statist
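
The source-identification step recited in the independent claims (aggregate the connection time by dimension value, compare each aggregate with that value's baseline, and attach any significantly deviating value to the alert) might look like the sketch below. It assumes the dimension is a point of presence, reuses a one-sided sign test for the comparison, and treats each record as a plain dictionary. The record layout, function names, PoP labels, and the 0.05 threshold are all hypothetical.

```python
from collections import defaultdict
from math import comb


def sign_test_p_value(samples, baseline_value):
    """One-sided sign test: p-value for `samples` lying above `baseline_value`."""
    signs = [s > baseline_value for s in samples if s != baseline_value]
    n = len(signs)
    if n == 0:
        return 1.0
    k = sum(signs)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def locate_sources(records, dimension, baselines, alpha=0.05):
    """Return the dimension values whose connection time deviates significantly."""
    # Aggregate connection times by dimension value (for example, by PoP).
    grouped = defaultdict(list)
    for record in records:
        grouped[record[dimension]].append(record["connection_time"])

    # Compare each aggregate with the baseline for the same dimension value.
    sources = []
    for value, times in grouped.items():
        if sign_test_p_value(times, baselines[value]) < alpha:
            sources.append(value)
    return sorted(sources)


# Made-up measurements: "pop-a" is consistently slow, "pop-b" is normal.
records = (
    [{"pop": "pop-a", "connection_time": t} for t in (140, 150, 145, 160, 155, 148)]
    + [{"pop": "pop-b", "connection_time": t} for t in (98, 103, 99, 101, 97, 102)]
)
baselines = {"pop-a": 100, "pop-b": 100}

alert = {
    "root_cause": "network issue",
    "sources": locate_sources(records, "pop", baselines),
}
```

Here only `pop-a` ends up in the alert, because all six of its samples sit above the baseline while the `pop-b` samples split evenly around theirs.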
Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters · CPC title
involving time analysis · CPC title