Methods and apparatus for real-time anomaly detection over sets of time-series data

US12159237B1 · US · B1

Patent metadata
Field: Value
Publication number: US-12159237-B1
Application number: US-201815884768-A
Country: US
Kind code: B1
Filing date: Jan 31, 2018
Priority date: Jan 31, 2018
Publication date: Dec 3, 2024
Grant date: Dec 3, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Methods and apparatus are provided for real-time anomaly detection over sets of time-series data. One method comprises: obtaining a state-space representation of a plurality of states and transitions between said states based on sets of historical time-series data; obtaining an anomaly detection model trained using a supervised learning technique, wherein the anomaly detection model associates sequences of states in the state-space representation with annotated anomalies in the sets of historical time-series data and assigns a probability to said sequences of states; and, for incoming real-time time-series data, determining a likelihood of a current state belonging to a plurality of possible states in the state-space representation; and determining a probability of incurring said annotated anomalies based on a plurality of likely current state sequences that satisfy a predefined likelihood criteria. Anomalous behavior is optionally distinguished from previously unknown behavior based on a predefined likelihood threshold.
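The online step described in the abstract (estimating how likely the current state is, and separating previously unknown behavior from known behavior with a likelihood threshold) can be illustrated with a minimal sketch. It assumes a discrete-state hidden Markov model with one-dimensional Gaussian emissions; the two-state model, the function names `forward_step` and `classify_observation`, and the threshold value are hypothetical and do not come from the patent text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian; the emission model here is an assumption."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def forward_step(prior, transition, emissions, observation):
    """One filtering step over the state space.

    Returns (posterior over states, likelihood of the observation).
    """
    n = len(prior)
    # Predict: probability of each next state given the prior state distribution.
    predicted = [sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)]
    # Update: weight each predicted state by how well it explains the measurement.
    weighted = [predicted[j] * gaussian_pdf(observation, *emissions[j]) for j in range(n)]
    evidence = sum(weighted)
    if evidence == 0.0:
        # No known state explains the measurement; fall back to the prediction.
        return predicted, 0.0
    return [w / evidence for w in weighted], evidence


def classify_observation(evidence, unknown_threshold):
    """Label an observation 'unknown' when its likelihood misses the threshold."""
    return "unknown" if evidence < unknown_threshold else "known"


# Hypothetical two-state model: (mean, std) of the measurement in each state.
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emissions = [(0.0, 1.0), (5.0, 1.0)]

posterior, evidence = forward_step(prior, transition, emissions, 4.8)
far_posterior, far_evidence = forward_step(prior, transition, emissions, 50.0)
```

In this sketch a measurement near 5 is attributed almost entirely to the second state, while a measurement of 50 has negligible likelihood under every state and would be flagged as previously unknown behavior rather than scored against known anomalies.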

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: obtaining a state-space representation of a plurality of states of a given system and transitions between said states based on sets of historical time-series data, wherein the sets of historical time-series data comprise time-series data from a plurality of executions of one or more processes of the given system; obtaining a plurality of event-anomaly models for respective ones of a plurality of different anomalies, wherein each event-anomaly model is separately learned using a supervised learning technique, wherein a given event-anomaly model of the plurality of event-anomaly models: (i) comprises a support for one or more sequences of states that lead to a respective annotated anomaly, wherein the respective annotated anomaly is annotated in one or more of the sets of historical time-series data by a domain expert, wherein the support for a given sequence of states that leads to the respective annotated anomaly is based at least in part on a number of occurrences of instances of the given sequence of states that leads to the respective annotated anomaly in the sets of historical time-series data, (ii) processes states derived from the sets of historical time-series data, (iii) is separately learned, from the other event-anomaly models, to predict events comprising one or more sequences of states in the obtained state-space representation that lead to the respective annotated anomaly in the sets of historical time-series data, and (iv) assigns a probability of occurrence to said events, wherein the probability of occurrence of a given event comprises an aggregation of the support for the one or more sequences of states of the given event, wherein a learning of the given event-anomaly model evaluates a length of the sequences of states that lead to the respective annotated anomaly and updates the given event-anomaly model with the support of the sequences of states that lead to the respective annotated anomaly; performing the following steps for real-time time-series data, for two or more of the plurality of event-anomaly models: mapping a current time window of the real-time time-series data to at least one state in the real-time time-series data, wherein the real-time time-series data is obtained at least in part from one or more sensors; determining a likelihood of one or more events in the real-time time-series data that comprise the at least one state; and determining a probability of an instance of said respective anomaly associated with the respective event-anomaly model occurring in the real-time time-series data based at least in part on the determined likelihoods of the one or more events in the real-time time-series data that satisfy one or more predefined likelihood criteria; and performing at least one automated action for a given instance of a particular anomaly detected based at least in part on the determined probabilities for respective ones of the plurality of different anomalies.

2. The method of claim 1, wherein the step of determining said likelihood of said one or more events further comprises determining a likelihood of a next state based on probabilities of a prior state and a current time-series measurement.

3. The method of claim 1, wherein the step of determining said probability of said instance of said respective anomaly associated with the event-anomaly model occurring in the real-time time-series data further comprises identifying a set of most likely state sequences based on a likelihood of said state sequences; determining a likelihood of incurring one or more of said annotated anomalies for each of said most likely state sequences from the event-anomaly model; and calculating said probability of said one or more states in the real-time time-series data incurring one or more of said annotated anomalies based on said likelihood of said most likely state sequences in said set with said likelihoods for each of said most likely state sequences from the event-anomaly model.

4. The method of claim 1, wherein a particular observation from a set of time series in the state-space representation corresponds to a state originally observed in a different set of time-series data.

5. The method of claim 1, wherein the supervised learning technique comprises a process mining technique.

6. The method of claim 1, wherein said obtaining said state-space representation comprises extracting a Hidden Markov Model (HMM), wherein said transitions are weighted based on a probability of the respective transition.

7. The method of claim 6, wherein the HMM is extracted using a sticky Hierarchical Dirichlet Processes Hidden Markov Model formulation, where a knowledge of the cardinality of a set of states of said Hidden Markov Model is not required.

8. The method of claim 1, further comprising the step of distinguishing between anomalous behavior and previously unknown behavior based on a predefined likelihood threshold.

9. The method of claim 8, wherein an observation in the real-time time-series data that does not satisfy the predefined likelihood threshold comprises a previously unknown state.

10. The method of claim 9, wherein said previously unknown state is classified based on one or more of a substantially complete event-anomaly model and domain knowledge.

11. The method of claim 8, further comprising the step of relearning said event-anomaly model in response to one or more states classified as previously unknown behavior.

12. A computer program product, comprising a non-transitory machine-readable storage medium having encoded therein executable code of one or more software programs, wherein the one or more software programs when executed by at least one processing device perform the following steps: obtaining a state-space representation of a plurality of states of a given system and transitions between said states based on sets of historical time-series data, wherein the sets of historical time-series data comprise time-series data from a plurality of executions of one or more processes of the given system; obtaining a plurality of event-anomaly models for respective ones of a plurality of different anomalies, wherein each event-anomaly model is separately learned using a supervised learning technique, wherein a given event-anomaly model of the plurality of event-anomaly models: (i) comprises a support for one or more sequences of states that lead to a respective annotated anomaly, wherein the respective annotated anomaly is annotated in one or more of the sets of historical time-series data by a domain expert, wherein the support for a given sequence of states that leads to the respective annotated anomaly is based at least in part on a number of occurrences of instances of the given sequence of states that leads to the respective annotated anomaly in the sets of historical time-series data, (ii) processes states derived from the sets of historical time-series data, (iii) is separately learned, from the other event-anomaly models, to predict events comprising one or more sequences of states in the obtained state-space representation that lead to the respective annotated anomaly in the sets of historical time-series data, and (iv) assigns a probability of occurrence to said events, wherein the probability of occurrence of a given event comprises an aggregation of the support for the one or more sequences of states of the given event, wherein a learning of the given event-anomaly model evaluates a length of the sequences of states that lead to the respective annotated anomaly and updates the given event-anomaly model with the support of the sequences of states that lead to the respective
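The ideas of "support" for state sequences that lead to an annotated anomaly (claim 1) and of combining sequence likelihoods with the event-anomaly model (claim 3) can be illustrated with a rough sketch. Everything specific here is an assumption made for illustration: support is taken to be the relative frequency with which a sequence ended at an annotated anomaly, the aggregation is a likelihood-weighted average, and the names `learn_support`, `anomaly_probability`, the state labels, and the "overheat" anomaly are hypothetical. The claims do not prescribe these formulas.

```python
from collections import Counter


def learn_support(runs, anomaly_label, max_length):
    """Estimate support for state sequences that end at an annotated anomaly.

    runs: list of (states, annotations) pairs, where annotations[t] is the
    expert label attached to position t (or None when nothing was annotated).
    Sequences of length 1..max_length ending at each position are counted.
    """
    leading = Counter()  # occurrences that ended at the annotated anomaly
    total = Counter()    # all occurrences of the sequence
    for states, annotations in runs:
        for end in range(1, len(states) + 1):
            for length in range(1, min(max_length, end) + 1):
                seq = tuple(states[end - length:end])
                total[seq] += 1
                if annotations[end - 1] == anomaly_label:
                    leading[seq] += 1
    # Assumed definition: support is the fraction of occurrences that led to the anomaly.
    return {seq: leading[seq] / total[seq] for seq in leading}


def anomaly_probability(candidates, support, min_likelihood):
    """Combine likely current state sequences with the learned support.

    candidates: list of (sequence, likelihood) pairs from the state estimator.
    Only sequences meeting the likelihood criterion contribute.
    """
    kept = [(tuple(seq), p) for seq, p in candidates if p >= min_likelihood]
    norm = sum(p for _, p in kept)
    if norm == 0:
        return 0.0
    return sum(p * support.get(seq, 0.0) for seq, p in kept) / norm


# Hypothetical historical runs with one expert-annotated anomaly.
runs = [
    (["A", "B", "C"], [None, None, "overheat"]),
    (["A", "B", "C"], [None, None, None]),
    (["A", "B", "D"], [None, None, None]),
]
support = learn_support(runs, "overheat", max_length=2)

# Hypothetical likelihoods of candidate current state sequences.
candidates = [(("B", "C"), 0.6), (("B", "D"), 0.3), (("A", "B"), 0.05)]
probability = anomaly_probability(candidates, support, min_likelihood=0.1)
```

In this toy data the sequence B, C preceded the annotated anomaly in one of its two occurrences, so its support is 0.5; the low-likelihood candidate A, B is filtered out before aggregation. One such model would be learned per anomaly type, which keeps each model independent of the others.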

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • Probabilistic graphical models, e.g. probabilistic networks · CPC title

  • using data annotations, e.g. user-defined metadata · CPC title

  • Machine learning · CPC title

  • G06N5/04 · Primary

    Inference or reasoning models · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12159237B1 cover?
Methods and apparatus are provided for real-time anomaly detection over sets of time-series data. One method comprises: obtaining a state-space representation of a plurality of states and transitions between said states based on sets of historical time-series data; obtaining an anomaly detection model trained using a supervised learning technique, wherein the anomaly detection model associates …
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 3, 2024 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 9 related publications on this page (citations in our corpus or others sharing the same primary CPC).