Hybrid anomaly detection for response-time-based events in a managed network
US-2020236015-A1 · Jul 23, 2020 · US
US11093360B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11093360-B2 |
| Application number | US-201916520377-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 24, 2019 |
| Priority date | Jul 24, 2019 |
| Publication date | Aug 17, 2021 |
| Grant date | Aug 17, 2021 |
Official abstract text for this publication.
A method, computerized apparatus and a computer program product for anomaly detection in a distributed system. The method comprises obtaining measurements of metrics of the distributed system within a timeframe. Each measurement comprises a time-series of values to a metric associated with an action of a component of the distributed system that was measured within the timeframe. A set of percentiles of the measurements is computed, whereby a dimensionality of the sets of percentiles is larger than a dimensionality of the metrics. A multivariate anomaly detection is performed based on the weights of the percentiles to determine an anomaly in the sets of percentiles. In response to detecting an anomaly, a source of the anomaly is identified based on a subset of the percentiles having weights above a threshold, by determining common components or actions that are common to at least a portion of the subset of the percentiles.
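The first and last steps of the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the `(component, action)` keying of measurements, the particular percentile list, and the use of absolute weight values are all assumptions made for illustration. The sketch expands each metric into several percentile features (so the feature space is larger than the metric space) and then counts which components and actions recur among the features whose weight exceeds a threshold.

```python
from collections import Counter

# Illustrative percentile list (an assumption, not prescribed by the abstract).
PERCENTILES = (1, 10, 25, 50, 75, 90, 99)


def percentile(values, p):
    """Linear-interpolation percentile (p in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return float(ordered[0])
    rank = (p / 100) * (len(ordered) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def percentile_features(measurements, percentiles=PERCENTILES):
    """Map {(component, action): [values]} to {(component, action, p): value}.

    Each metric yields len(percentiles) features, so the dimensionality of
    the percentile sets exceeds the dimensionality of the metrics.
    """
    return {
        (component, action, p): percentile(values, p)
        for (component, action), values in measurements.items()
        for p in percentiles
    }


def common_sources(weights, threshold):
    """Count components and actions shared by the heavily weighted features.

    Using abs(weight) is an assumption; the text only says "above a threshold".
    """
    heavy = [key for key, w in weights.items() if abs(w) > threshold]
    components = Counter(component for component, _, _ in heavy)
    actions = Counter(action for _, action, _ in heavy)
    return components.most_common(), actions.most_common()
```

For example, if the weights of `("db", "query", 90)` and `("db", "insert", 99)` both exceed the threshold, `common_sources` reports `db` as the component common to both, which is the kind of shared source the abstract describes.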
Opening claim text (preview).
What is claimed is:

1. A method for anomaly detection in a distributed system, wherein the distributed system comprises a plurality of components located on different networked devices, the method comprising: obtaining a plurality of time-series measurements of a plurality of metrics of the distributed system within a timeframe, wherein each time-series measurement comprises a time-series of values to a metric associated with an action of a component of the distributed system that was measured within the timeframe; computing, for each metric, a set of percentiles of the plurality of time-series measurements corresponding to the each metric, wherein the set of percentiles comprises a plurality of percentiles of the each metric for each timeframe, whereby a dimensionality of a plurality of the sets of percentiles is larger than a dimensionality of the plurality of metrics; performing a multivariate anomaly detection to determine an anomaly in the plurality of the sets of percentiles, wherein said performing comprises computing a weight for each percentile in the plurality of the sets of percentiles, wherein the multivariate anomaly detection is based on the weights of the percentiles; and in response to detecting an anomaly, identifying a source of the anomaly based on a subset of the percentiles, wherein each percentile in the subset has a weight above a threshold, wherein said identifying comprises determining one or more common components or actions that are common to at least a portion of the subset of the percentiles.

2. The method of claim 1, wherein the timeframe has a duration; wherein said performing the multivariate anomaly detection comprises: obtaining a plurality of reference time-series measurements of the plurality of metrics of the distributed system within a reference timeframe, wherein the reference timeframe comprises a plurality of sub-timeframes having the duration; computing, for each metric and for each sub-timeframe, a set of reference percentiles of the plurality of reference time-series measurements corresponding to the each metric in the sub-timeframe; computing an anomaly score for the distributed system at the timeframe based on a plurality of sets of reference percentiles and based on the plurality of the sets of percentiles.

3. The method of claim 1, wherein said performing the multivariate anomaly detection is performed based on z-scores of the set of percentiles, wherein the weights are z-scores of the percentiles.

4. The method of claim 1, wherein the set of percentiles comprises at least a first percentile, a second percentile and a third percentile, wherein the first percentile is an approximation of a minimal value, wherein the second percentile is an approximation of a mean value, wherein the third percentile is an approximation of a maximal value.

5. The method of claim 1, wherein the set of percentiles comprise at least one of: a 1% percentile; a 10% percentile; a 25% percentile; a 75% percentile; a 90% percentile; and a 99% percentile.

6. The method of claim 1, wherein the set of percentiles comprise at least five different percentiles.

7. A computerized apparatus having a processor, the processor being adapted to perform the steps of: obtaining a plurality of time-series measurements of a plurality of metrics of the distributed system within a timeframe, wherein each time-series measurement comprises a time-series of values to a metric associated with an action of a component of the distributed system that was measured within the timeframe; computing, for each metric, a set of percentiles of the plurality of time-series measurements corresponding to the each metric, wherein the set of percentiles comprises a plurality of percentiles of the each metric for each timeframe, whereby a dimensionality of a plurality of the sets of percentiles is larger than a dimensionality of the plurality of metrics; performing a multivariate anomaly detection to determine an anomaly in the plurality of the sets of percentiles, wherein said performing comprises computing a weight for each percentile in the plurality of the sets of percentiles, wherein the multivariate anomaly detection is based on the weights of the percentiles; and in response to detecting an anomaly, identifying a source of the anomaly based on a subset of the percentiles, wherein each percentile in the subset has a weight above a threshold, wherein said identifying comprises determining one or more common components or actions that are common to at least a portion of the subset of the percentiles.

8. The computerized apparatus of claim 7, wherein the timeframe has a duration; wherein said performing the multivariate anomaly detection comprises: obtaining a plurality of reference time-series measurements of the plurality of metrics of the distributed system within a reference timeframe, wherein the reference timeframe comprises a plurality of sub-timeframes having the duration; computing, for each metric and for each sub-timeframe, a set of reference percentiles of the plurality of reference time-series measurements corresponding to the each metric in the sub-timeframe; computing an anomaly score for the distributed system at the timeframe based on a plurality of sets of reference percentiles and based on the plurality of the sets of percentiles.

9. The computerized apparatus of claim 7, wherein said performing the multivariate anomaly detection is performed based on z-scores of the set of percentiles, wherein the weights are z-scores of the percentiles.

10. The computerized apparatus of claim 7, wherein the set of percentiles comprises at least a first percentile, a second percentile and a third percentile, wherein the first percentile is an approximation of a minimal value, wherein the second percentile is an approximation of a mean value, wherein the third percentile is an approximation of a maximal value.

11. The computerized apparatus of claim 7, wherein the set of percentiles comprise at least one of: a 1% percentile; a 10% percentile; a 25% percentile; a 75% percentile; a 90% percentile; and a 99% percentile.

12. The computerized apparatus of claim 7, wherein the set of percentiles comprise at least five different percentiles.

13. A computer program product comprising a non-transitory computer readable storage medium retaining program instructions, which program instructions when read by a processor, cause the processor to perform a method comprising: obtaining a plurality of time-series measurements of a plurality of metrics of the distributed system within a timeframe, wherein each time-series measurement comprises a time-series of values to a metric associated with an action of a component of the distributed system that was measured within the timeframe; computing, for each metric, a set of percentiles of the plurality of time-series measurements corresponding to the each metric, wherein the set of percentiles comprises a plurality of percentiles of the each metric for each timeframe, whereby a dimensionality of a plurality of the sets of percentiles is larger than a dimensionality of the plurality of metrics; performing a multivariate anomaly detection to determine an anomaly in the plurality of the sets of percentiles, wherein said performing comprises computing a weight for each percentile in the plurality of the sets of percentiles, wherein the multivariate anomaly detection is based on the weights of the percentiles; and in response to detecting an anomaly, identifying a source of the anomaly based on a subset of the percentiles, wherein each percentile in the subset has a weight above a threshold, where
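The reference-timeframe scoring of the dependent claims (reference sub-timeframes of the same duration, z-scores used as weights) could be sketched in Python roughly as follows. This is a hedged illustration for a single metric, not the claimed implementation: the function names, the five-percentile set, measuring the duration in samples, and the root-mean-square aggregation into one anomaly score are all assumptions, since the claim text does not say how the weights are combined.

```python
import math
import statistics

# Illustrative set of five percentiles (an assumption).
PERCENTILES = (1, 25, 50, 75, 99)


def percentile_set(values):
    """Percentile set of one (sub-)timeframe; needs at least two samples."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return [cuts[p - 1] for p in PERCENTILES]


def reference_sets(reference_values, duration):
    """One reference percentile set per full sub-timeframe of `duration` samples."""
    return [
        percentile_set(reference_values[i:i + duration])
        for i in range(0, len(reference_values) - duration + 1, duration)
    ]


def zscore_weights(current_set, ref_sets):
    """Weight of each percentile = its z-score against the reference sub-timeframes."""
    weights = []
    for idx, value in enumerate(current_set):
        history = [ref[idx] for ref in ref_sets]
        mean = statistics.fmean(history)
        spread = statistics.pstdev(history)
        # A constant reference gives no scale; treat the weight as 0 (assumption).
        weights.append((value - mean) / spread if spread > 0 else 0.0)
    return weights


def anomaly_score(weights):
    """Root-mean-square of the weights (an assumed aggregation, not from the claims)."""
    return math.sqrt(sum(w * w for w in weights) / len(weights))
```

A timeframe whose percentiles sit far outside the spread seen across the reference sub-timeframes receives large z-score weights and therefore a high score; the per-percentile weights can then be compared against a threshold to select the subset used for source identification.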
Performance evaluation by statistical analysis · CPC title
for systems · CPC title
Data logging (G06F11/14, G06F11/2205 take precedence) · CPC title
Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title
where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title