Anomaly detection for non-stationary data

US2017372207A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2017372207-A1
Application number: US-201715640191-A
Country: US
Kind code: A1
Filing date: Jun 30, 2017
Priority date: Dec 31, 2014
Publication date: Dec 28, 2017
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method of detecting anomalies in a time series is disclosed. A training time series corresponding to a process is extracted from an initial time series corresponding to the process, the training time series including a subset of the initial time series. Outlier data points in the training time series are modified based on predetermined acceptability criteria. A plurality of prediction methods are trained using the training time series. An actual data point corresponding to the initial time series is received. The plurality of prediction methods are used to determine a set of predicted data points corresponding to the actual data point. It is determined whether the actual data point is anomalous based on a calculation of whether each of the set of predicted data points is statistically different from the actual data point.
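The steps in the abstract can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the method disclosed in the patent: the function names, the three toy predictors, the median/MAD clipping rule standing in for the "predetermined acceptability criteria", and the z-score threshold standing in for the statistical-difference test are all hypothetical choices made for this example.

```python
from statistics import mean, median, stdev

MIN_HISTORY = 10  # assumed minimum history a predictor needs (illustrative)

def clip_outliers(series, k=3.0):
    """Pull points beyond k robust deviations of the median back to the bound.
    The median/MAD rule is an assumed stand-in for the acceptability criteria."""
    med = median(series)
    mad = median(abs(x - med) for x in series) or 1e-9
    lo, hi = med - k * 1.4826 * mad, med + k * 1.4826 * mad
    return [min(max(x, lo), hi) for x in series]

def naive(history):
    """Predict that the next value equals the last observed value."""
    return history[-1]

def moving_average(history, n=5):
    """Predict the mean of the last n values."""
    return mean(history[-n:])

def linear_trend(history, n=10):
    """Fit a least-squares line to the last n values and extrapolate one step."""
    ys = history[-n:]
    xs = range(len(ys))
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar + slope * (len(ys) - x_bar)

# Hypothetical ensemble; the abstract only says "a plurality of prediction methods".
PREDICTORS = {"naive": naive, "moving_average": moving_average,
              "linear_trend": linear_trend}

def detect(initial, actual, window=30, z=3.0):
    """Return (is_anomalous, per-method votes) for one new actual data point."""
    # 1. Extract a recent training window and soften its outliers.
    train = clip_outliers(initial[-window:])
    votes = {}
    for name, predict in PREDICTORS.items():
        # 2. "Train": backtest one-step-ahead errors to estimate their spread.
        errors = [train[t] - predict(train[:t])
                  for t in range(MIN_HISTORY, len(train))]
        sigma = stdev(errors) or 1e-9
        # 3. Vote: is the actual point far from this method's prediction?
        votes[name] = abs(actual - predict(train)) > z * sigma
    # 4. Flag only if every prediction differs from the actual point
    #    (one reading of "each of the set of predicted data points").
    return all(votes.values()), votes
```

Calling `detect(series, new_value)` on a history of at least `MIN_HISTORY + 2` points returns the overall flag together with the individual votes, so a caller can see which methods disagreed with the new point.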

First claim

Opening claim text (preview).

What is claimed is:

1. A system comprising: one or more computer processors; one or more computer memories; one or more modules incorporated into the one or more computer memories, the one or more modules configuring the one or more computer processors to perform operations, the operations comprising: extracting a training time series corresponding to a process from an initial time series corresponding to the process; modifying outlier data points in the training time series based on predetermined acceptability criteria; training a plurality of prediction methods using the training time series; receiving an actual data point corresponding to the initial time series; using the plurality of prediction methods to determine a set of predicted data points corresponding to the actual data point of the initial time series; determining whether the actual data point is anomalous based on a calculation of whether each of the set of predicted data points is statistically different from the actual data point; and receiving an additional actual data point corresponding to the initial time series and extracting an additional training time series from the initial time series based on the additional actual data point.

2. The system of claim 1, wherein the calculation of whether each of the set of predicted data points is statistically different from the actual data point includes a determination that the Mahalanobis distance between the prediction error and the fitted multivariate normal joint probability distribution of each of the set of predicted data points is within a specified range.

3. The system of claim 1, wherein the additional actual data point corresponds to the initial time series and the operations further comprise extracting an additional training time series having the length offset by an additional index prior to a last data point of the initial time series the additional index reflecting a relative position of the actual data point to the additional actual data point.

4. The system of claim 1, further comprising selecting the combination of each of the plurality of prediction methods to minimize a number of false anomaly detections.

5. The system of claim 1, further comprising representing the determination of whether the actual data point is anomalous in a graphical user interface, the representing including providing a strength of the determination.

6. The system of claim 5, wherein the strength of the determination is based on a number of the plurality of prediction methods that indicate an anomaly with respect to the data point.

7. The system of claim 1, wherein the training time series represents a window of the initial time series that is recent in relation to the actual data point.

8. A method comprising: extracting a training time series corresponding to a process from an initial time series corresponding to the process; modifying outlier data points in the training time series based on predetermined acceptability criteria; training a plurality of prediction methods using the training time series; receiving an actual data point corresponding to the initial time series; using the plurality of prediction methods to determine a set of predicted data points corresponding to the actual data point of the initial time series; determining whether the actual data point is anomalous based on a calculation of whether each of the set of predicted data points is statistically different from the actual data point; and receiving an additional actual data point corresponding to the initial time series and extracting an additional training time series from the initial time series based on the additional actual data point.

9. The method of claim 8, wherein the calculation of whether each of the set of predicted data points is statistically different from the actual data point includes a determination that the Mahalanobis distance between the prediction error and the fitted multivariate normal joint probability distribution of each of the set of predicted data points is within a specified range.

10. The method of claim 8, wherein additional actual data point corresponds to the initial time series and the method further comprises extracting an additional training time series having the length offset by an additional index prior to a last data point of the initial time series the additional index reflecting a relative position of the actual data point to the additional actual data point.

11. The method of claim 8, further comprising selecting the combination of each of the plurality of prediction methods to minimize a number of false anomaly detections.

12. The method of claim 8, further comprising representing the determination of whether the actual data point is anomalous in a graphical user interface, the representing including providing a strength of the determination.

13. The method of claim 12, wherein the strength of the determination is based on a number of the plurality of prediction methods that indicate an anomaly with respect to the data point.

14. The method of claim 8, wherein the training time series represents a window of the initial time series that is recent in relation to the actual data point.

15. A non-transitory machine readable medium comprising a set of instructions that, when executed by a processor, causes the processor to perform operations, the operations comprising: extracting a training time series corresponding to a process from an initial time series corresponding to the process; modifying outlier data points in the training time series based on predetermined acceptability criteria; training a plurality of prediction methods using the training time series; receiving an actual data point corresponding to the initial time series; using the plurality of prediction methods to determine a set of predicted data points corresponding to the actual data point of the initial time series; determining whether the actual data point is anomalous based on a calculation of whether each of the set of predicted data points is statistically different from the actual data point; and receiving an additional actual data point corresponding to the initial time series and extracting an additional training time series from the initial time series based on the additional actual data point.

16. The non-transitory machine readable medium of claim 15, wherein the calculation of whether each of the set of predicted data points is statistically different from the actual data point includes a determination that the Mahalanobis distance between the prediction error and the fitted multivariate normal joint probability distribution of each of the set of predicted data points is within a specified range.

17. The non-transitory machine readable medium of claim 15, wherein the additional actual data point corresponds to the initial time series and the operations further comprise extracting an additional training time series having the length offset by an additional index prior to a last data point of the initial time series the additional index reflecting a relative position of the actual data point to the additional actual data point.

18. The non-transitory machine readable medium of claim 15, the operations further comprising selecting the combination of each of the plurality of prediction methods to minimize a number of false anomaly detections.

19. The non-transitory machine readable medium of claim 15, the operations further comprising representing the de
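Claims 2, 9, and 16 refer to a Mahalanobis distance between the prediction error and a fitted multivariate normal distribution, and claims 6 and 13 base the strength of a determination on how many prediction methods indicate an anomaly. The sketch below illustrates both ideas for the simplest case of two prediction methods. It is a hypothetical example: the function names, the two-method restriction, and the sample-covariance fit are assumptions made here, and the claims do not specify any of them.

```python
from math import sqrt
from statistics import mean

def fit_bivariate_normal(errors):
    """Fit a mean vector and sample covariance to historical prediction errors.
    `errors` is a list of (e1, e2) pairs, one component per prediction method."""
    m1 = mean(e[0] for e in errors)
    m2 = mean(e[1] for e in errors)
    n = len(errors) - 1  # unbiased sample covariance
    s11 = sum((a - m1) ** 2 for a, _ in errors) / n
    s22 = sum((b - m2) ** 2 for _, b in errors) / n
    s12 = sum((a - m1) * (b - m2) for a, b in errors) / n
    return (m1, m2), (s11, s12, s22)

def mahalanobis(error, mu, cov):
    """Mahalanobis distance of a 2-D error vector from the fitted distribution."""
    d1, d2 = error[0] - mu[0], error[1] - mu[1]
    s11, s12, s22 = cov
    det = s11 * s22 - s12 ** 2  # must be non-zero for the inverse to exist
    # Quadratic form d^T * inverse(cov) * d, with the 2x2 inverse written out.
    q = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return sqrt(q)

def anomaly_strength(votes):
    """Number of prediction methods that flagged the data point (claims 6 and 13)."""
    return sum(1 for flagged in votes.values() if flagged)
```

A caller would compare the returned distance with a chosen range (for example, a cutoff derived from a chi-square quantile, which is an assumption here) and could display `anomaly_strength` alongside the flag in a user interface.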

Assignees

Ebay Inc

Inventors

Classifications

Primary CPC classification: G06N5/04

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017372207A1 cover?
A method of detecting anomalies in a time series is disclosed. A training time series corresponding to a process is extracted from an initial time series corresponding to the process, the training time series including a subset of the initial time series. Outlier data points in the training time series are modified based on predetermined acceptability criteria. A plurality of prediction methods…
Who is the assignee on this patent?
Ebay Inc
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 28, 2017 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).