Detection of anomalies in a time series using values of a different time series

US10855712B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10855712-B2
Application number  US-201916446300-A
Country             US
Kind code           B2
Filing date         Jun 19, 2019
Priority date       Jan 31, 2017
Publication date    Dec 1, 2020
Grant date          Dec 1, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some implementations, sequences of time series values determined from machine data are obtained. Each sequence corresponds to a respective time series. A plurality of predictive models is generated for a first time series from the sequences of time series values. Each predictive model is to generate predicted values associated with the first time series using values of a second time series. For each of the plurality of predictive models, an error is determined between the corresponding predicted values and values associated with the first time series. A predictive model is selected for anomaly detection based on the determined error of the predictive model. Transmission is caused of an indication of an anomaly detected using the selected predictive model.
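
As an illustration only, the selection step in the abstract can be sketched as follows: several candidate models are fitted to predict the first time series from the second, each is scored by its error, and the one with the lowest error is kept. The candidate types (a linear and a quadratic fit), the RMSE error measure, the train/evaluation split, and every function name here are assumptions made for this sketch; the patent does not prescribe them.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y ~ a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def make_model(transform, xs, ys):
    """Fit y ~ a + b*transform(x) and return it as a callable."""
    a, b = fit_line([transform(x) for x in xs], ys)
    return lambda x: a + b * transform(x)


def rmse(model, xs, ys):
    """Root mean squared error between predictions and observed values."""
    return (sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def select_model(second, first, split):
    """Fit each candidate on the first `split` points, score it on the
    remaining points, and return (name, model, error) of the best one."""
    # Hypothetical candidate set: two feature transforms of the second series.
    candidates = {"linear": lambda x: x, "quadratic": lambda x: x * x}
    train_x, train_y = second[:split], first[:split]
    eval_x, eval_y = second[split:], first[split:]
    scored = {}
    for name, transform in candidates.items():
        model = make_model(transform, train_x, train_y)
        scored[name] = (rmse(model, eval_x, eval_y), model)
    best = min(scored, key=lambda n: scored[n][0])
    return best, scored[best][1], scored[best][0]


# Toy data: the first series depends quadratically on the second.
second_series = [1, 2, 3, 4, 5, 6, 7, 8]
first_series = [x * x + 1 for x in second_series]
best_name, best_model, best_error = select_model(second_series, first_series, split=5)
print(best_name, best_error)
```

On this toy data the quadratic candidate reproduces the held-out points and is selected. A real system would likely compare many more model types and second time series.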

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method comprising: obtaining sequences of time series values determined from raw machine data, each sequence corresponding to a respective time series, wherein the raw machine data is produced by one or more components within an information technology or security environment and reflects activity within the information technology or security environment; identifying a predictive model for a first time series based on the sequences of time series values, the predictive model trained over a training period to generate predicted values associated with the first time series using time series values corresponding to a second time series; evaluating one or more characteristics of the predictive model; and selecting the predictive model for anomaly detection based on the evaluating of the one or more characteristics.

2. The method of claim 1, further comprising applying the selected predictive model to subsequently received time series values of the second time series to detect an anomaly.

3. The method of claim 1, wherein the evaluating comprises determining an error between the predicted values and values associated with the first time series, wherein the selecting of the predictive model is based on the error of the predictive model.

4. The method of claim 1, wherein the one or more characteristics correspond to residuals between the predicted values of the predictive model and time series values associated with the first time series.

5. The method of claim 1 comprising: further training the predictive model over a subsequent training period based on the evaluating of the one or more characteristics; and further evaluating the predictive model based on the further training, wherein the predictive model is selected for the anomaly detection based on the further evaluating.

6. The method of claim 1 comprising: training the predictive model using first portions of the sequences of time series values corresponding to the training period; and analyzing second portions of the sequences of time series values corresponding to a prediction period, wherein the one or more characteristics are based on the analyzed second portions.

7. The method of claim 1, further comprising clustering the sequences of time series values into a plurality of clusters, wherein the first time series corresponds to a representative time series of a first of the plurality of clusters and the second time series corresponds to at least one time series in a second cluster of the plurality of clusters.

8. The method of claim 1, wherein the obtaining of the sequences of time series values is responsive to a user interaction associated with the sequences of the series values.

9. The method of claim 1, wherein the sequences of time series values correspond to at least one of performance metrics or security-related metrics.

10. The method of claim 1, wherein the predictive model comprises one or more of a polynomial model, a neural network, or a decision tree model.

11. The method of claim 1, wherein the obtaining of the sequences of time series values is from one or more streams of time series data.

12. The method of claim 1, wherein the one or more characteristics are based on a first explanatory value associated with a model type of the predictive model.

13. The method of claim 1, wherein the values of the second time series correspond to events, each event comprising a time stamp and a portion of raw data.

14. The method of claim 1, wherein each data point of the second time series is associated with a respective time stamp of a respective event.

15. The method of claim 1, comprising generating the time series values from event data using a late-binding schema.

16. The method of claim 1, wherein the predicted values are associated with later times than the values of the second time series used to generate the predictive model.

17. The method of claim 1, further comprising causing an explanatory message to be presented based on the anomaly detection, the explanatory message indicating a predicted relationship corresponding to the predictive model and an observed relationship associated with an anomaly.

18. The method of claim 1, wherein the selecting is of multiple models including the predictive model based on the evaluating of the one or more characteristics, and the anomaly is detected using the multiple models.

19. One or more non-transitory computer-readable storage media having instructions stored thereon, wherein the instructions, when executed by one or more processors, cause the one or more processors to perform a computer-implemented method comprising: obtaining sequences of time series values determined from raw machine data, each sequence corresponding to a respective time series, wherein the raw machine data is produced by one or more components within an information technology or security environment and reflects activity within the information technology or security environment; identifying a predictive model for a first time series based on the sequences of time series values, the predictive model trained over a training period to generate predicted values associated with the first time series using time series values corresponding to a second time series; evaluating one or more characteristics of the predictive model; and selecting the predictive model for anomaly detection based on the evaluating of the one or more characteristics.

20. The one or more computer-readable storage media of claim 19, wherein the method further comprises applying the selected predictive model to subsequently received time series values of the second time series to detect an anomaly.

21. The one or more computer-readable storage media of claim 19, wherein the evaluating comprises determining an error between the predicted values and values associated with the first time series, wherein the selecting of the predictive model is based on the error of the predictive model.

22. The one or more computer-readable storage media of claim 19, wherein the one or more characteristics correspond to residuals between the predicted values of the predictive model and time series values associated with the first time series.

23. The one or more computer-readable storage media of claim 19, wherein the method further comprises: further training the predictive model over a subsequent training period based on the evaluating of the one or more characteristics; and further evaluating the predictive model based on the further training, wherein the predictive model is selected for the anomaly detection based on the further evaluating.

24. The one or more computer-readable storage media of claim 19, wherein the method further comprises: training the predictive model using first portions of the sequences of time series values corresponding to the training period; and analyzing second portions of the sequences of time series values corresponding to a prediction period, wherein the one or more characteristics are based on the analyzed second portions.

25. A computer-implemented system comprising: one or more hardware processors; one or more computer-readable storage media having instructions stored thereon, wherein the instructions, when executed by the one or more processors, cause the one or more processors to perform a method comprising: obtaining sequences of time series values determined from raw machine
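
Claims 2 and 4 refer to applying a selected model to later values of the second time series and to residuals between predicted and observed values of the first time series. One possible way to turn residuals into anomaly flags is sketched below. The k-sigma threshold rule, the comparison against observed first-series values, the hard-coded model, and all names are assumptions chosen for illustration; the claims do not specify a particular detection rule.

```python
from statistics import mean, pstdev


def residuals(model, second, first):
    """Observed first-series value minus the value predicted from the second series."""
    return [y - model(x) for x, y in zip(second, first)]


def detect_anomalies(model, train_second, train_first, new_second, new_first, k=3.0):
    """Return indices of new points whose residual deviates from the
    training-period residual mean by more than k standard deviations."""
    train_res = residuals(model, train_second, train_first)
    center = mean(train_res)
    threshold = k * pstdev(train_res)  # assumed rule: k-sigma band
    flagged = []
    for i, r in enumerate(residuals(model, new_second, new_first)):
        if abs(r - center) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical already-selected model: first ~ 2 * second + 1.
selected_model = lambda x: 2.0 * x + 1.0

# Training period: the relationship holds up to small noise.
train_second = [1, 2, 3, 4, 5, 6]
train_first = [3.1, 4.9, 7.2, 8.8, 11.1, 12.9]

# Subsequently received values: the middle point breaks the relationship.
new_second = [7, 8, 9]
new_first = [15.1, 25.0, 18.9]

anomalies = detect_anomalies(
    selected_model, train_second, train_first, new_second, new_first
)
print(anomalies)
```

Only the middle point is flagged here, since its observed value is far from what the second series predicts while its neighbours stay within the training-period noise.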

Assignees

Splunk Inc

Inventors

Classifications

  • Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

  • Supervised learning

  • Feedforward networks

  • Traffic logging, e.g. anomaly detection

  • Learning methods

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10855712B2 cover?
In some implementations, sequences of time series values determined from machine data are obtained. Each sequence corresponds to a respective time series. A plurality of predictive models is generated for a first time series from the sequences of time series values. Each predictive model is to generate predicted values associated with the first time series using values of a second time series. …
Who is the assignee on this patent?
Splunk Inc
What technology area does this patent fall under?
Primary CPC classification H04L63/1425. Mapped technology areas include Electricity.
When was this patent published?
Publication date Dec 1, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).