Computer system and method of defining a set of anomaly thresholds for an anomaly detection model

US11181894B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11181894-B2
Application number  US-201816161003-A
Country             US
Kind code           B2
Filing date         Oct 15, 2018
Priority date       Oct 15, 2018
Publication date    Nov 23, 2021
Grant date          Nov 23, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computing system may create an anomaly detection model to detect anomalies in multivariate data originating from a given data source by extracting a model object for the anomaly detection model using a first set of training data originating from the given data source, establishing starting values of a set of anomaly thresholds for the anomaly detection model using the extracted model object and a second set of training data originating from the given data source, and refining the starting values of the set of anomaly thresholds for at least a subset of the variables included in the multivariate data using the extracted model object and a set of test data. In turn, the computing system may use the anomaly detection model to monitor for anomalies in observation data originating from the given data source.
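The first two stages in the abstract (extracting a model object from one training set, then deriving starting thresholds from a second training set) can be illustrated with a minimal sketch. It is not the patented implementation. The "model object" here is a hypothetical stand-in (a per-variable mean and standard deviation) rather than the PCA projection matrix mentioned in claim 10, and the 0.95 quantile, the function names, and the sample data are all assumptions made for illustration.

```python
from statistics import mean, stdev


def extract_model_object(train_a):
    """Stand-in model object (assumption): per-variable (mean, std) from the first training set."""
    columns = list(zip(*train_a))  # transpose rows into one tuple per variable
    return [(mean(col), stdev(col)) for col in columns]


def score(model_object, row):
    """Per-variable score: absolute deviation from the mean, in units of std."""
    return [abs(x - mu) / sd for x, (mu, sd) in zip(row, model_object)]


def starting_thresholds(model_object, train_b, quantile=0.95):
    """One starting threshold per variable, read off the score distribution
    of the second training set. The 0.95 quantile is an assumed choice."""
    scored = [score(model_object, row) for row in train_b]
    thresholds = []
    for col in zip(*scored):
        ordered = sorted(col)
        idx = min(len(ordered) - 1, int(quantile * len(ordered)))
        thresholds.append(ordered[idx])
    return thresholds


def exceedances(model_object, thresholds, row):
    """Per-variable flags: does this row's score exceed that variable's threshold?"""
    return [s > t for s, t in zip(score(model_object, row), thresholds)]


# Hypothetical two-variable readings (for example, a temperature and a pressure).
train_a = [(20.0, 1.00), (21.0, 1.10), (19.0, 0.90), (20.5, 1.05), (19.5, 0.95)]
train_b = [(20.2, 1.02), (19.8, 0.98), (21.2, 1.12), (18.9, 0.88), (20.0, 1.00)]

model_object = extract_model_object(train_a)
thresholds = starting_thresholds(model_object, train_b)
flags = exceedances(model_object, thresholds, (30.0, 1.0))
print(flags)
```

Keeping the two training sets separate means the thresholds are set on scores the model object was not fitted to, which is the split the abstract describes. The third stage, refining the thresholds against test data, is not shown here.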

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system comprising: a communication interface; at least one processor; a non-transitory computer-readable medium; and program instructions stored on the non-transitory computer-readable medium that are executable by the at least one processor to cause the computing system to perform functions including: creating an anomaly detection model that is configured to detect anomalies in multivariate data originating from a given data source by: extracting a model object for the anomaly detection model using a first set of training data originating from the given data source; establishing starting values of a set of anomaly thresholds for the anomaly detection model using the extracted model object and a second set of training data originating from the given data source, wherein the set of anomaly thresholds includes at least one respective anomaly threshold for each of a given set of variables included in the multivariate data originating from the given data source; refining the starting values of at least a subset of anomaly thresholds from the set of anomaly thresholds using the extracted model object and a set of test data originating from the given data source by: for each respective multiplier value in a set of multiplier values, (a) adjusting the starting values of the subset of anomaly thresholds using the respective multiplier value and thereby producing adjusted values of the subset of anomaly thresholds that correspond to the respective multiplier value, and (b) determining a respective extent of multivariate anomalies detected in the set of test data when evaluated using the extracted model object and the adjusted values of the subset of anomaly thresholds that correspond to the respective multiplier value, thereby producing a dataset that comprises the respective extent of multivariate anomalies determined for each respective multiplier value in the set of multiplier values; based on an evaluation of the dataset, selecting a given multiplier value from the set of multiplier values; and using the adjusted values of the subset of anomaly thresholds that correspond to the given multiplier value as a basis for updating the starting values of the subset of anomaly thresholds; and using the anomaly detection model to monitor for anomalies in observation data originating from the given data source.

2. The computing system of claim 1, wherein selecting the given multiplier value from the set of multiplier values based on the evaluation of the dataset comprises selecting whichever multiplier value in the set of multiplier values is associated with an elbow of the dataset.

3. The computing system of claim 1, wherein determining the respective extent of multivariate anomalies detected in the set of test data when evaluated using the extracted model object and the adjusted values of the subset of anomaly thresholds that correspond to the respective multiplier value comprises determining that a threshold extent of multivariate exceedances has occurred within a given window.

4. The computing system of claim 2, wherein selecting the given multiplier value from the set of multiplier values to be whichever multiplier value is associated with the elbow of the dataset comprises: identifying a data point within the dataset that is furthest away from a straight line drawn between a data point corresponding to a smallest multiplier value within the set of multiplier values and a data point corresponding to a largest multiplier value within the set of multiplier values; and identifying the multiplier value corresponding to the identified data point.

5. The computing system of claim 1, wherein the subset of anomaly thresholds includes any anomaly thresholds in the set of anomaly thresholds that are for anomalous variables in the given set of variables and excludes any anomaly thresholds in the set of anomaly thresholds that are for non-anomalous variables in the given set of variables, and wherein the computing system further comprises program instructions stored on the non-transitory computer-readable medium that are executable by the at least one processor to cause the computing system to perform functions including: before refining the starting values of the subset of anomaly thresholds, differentiating between the non-anomalous variables and the anomalous variables in the given set of variables using the extracted model object, the starting values for the set of anomaly thresholds, and the set of test data; and deciding to use the starting values of the set of anomaly thresholds for the non-anomalous variables without further refinement.

6. The computing system of claim 5, wherein differentiating between the non-anomalous variables and the anomalous variables in the given set of variables using the extracted model object, the starting values for the set of anomaly thresholds, and the set of test data comprises, for each variable in the given set of variables: tightening the starting value of each of the at least one respective anomaly threshold for the variable by a given amount; determining an extent of univariate anomalies detected in the set of test data for the variable when evaluated using the extracted model object and the tightened value of each of the at least one respective anomaly threshold for the variable; comparing the determined extent of univariate anomalies to a threshold value that serves as a dividing line between a de minimis extent of anomalies and a meaningful extent of anomalies; and identifying the variable as either: (i) non-anomalous if the extent of univariate anomalies detected in the set of test data for the variable is below the threshold or (ii) anomalous if the extent of univariate anomalies detected in the set of test data for the variable is not below the threshold.

7. The computing system of claim 1, wherein extracting the model object for the anomaly detection model using the first set of training data originating from the given data source comprises: deriving a set of training metrics from the first set of training data; and using the derived set of training metrics to extract the model object for the anomaly detection model.

8. The computing system of claim 1, wherein establishing starting values of the set of anomaly thresholds for the anomaly detection model using the extracted model object and the second set of training data comprises: scoring the second set of training data using the extracted model object for the anomaly detection model; and using a distribution of score values for each of the given set of variables included in the multivariate data originating from the given data source to determine at least one respective anomaly threshold for each of the given set of variables.

9. The computing system of claim 1, wherein the given data source comprises an asset, and wherein the multivariate data originating from the given data source comprises multivariate sensor data.

10. The computing system of claim 1, wherein the anomaly detection model comprises a regression model based on principal component analysis (PCA), and wherein the model object comprises a projection matrix.

11. The computing system of claim 1, wherein using the anomaly detection model to monitor for anomalies in observation data originating from the given data source comprises: using the extracted model object for the anomaly detection model to score each of plurality of multivariate data points originating from the given data source; evaluating whether a threshold extent of multivariate data points within a given window of time violate the set of anomaly thresholds for the anomaly detection model; and determine that an an
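The threshold refinement in claim 1 (sweep a set of multipliers, record the extent of anomalies for each) and the elbow selection in claims 2 and 4 (pick the point furthest from the straight line joining the smallest- and largest-multiplier points) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation. The function names, the definition of "extent" as a count of rows with at least `min_vars` exceedances, and the sample scores are all hypothetical.

```python
import math


def extent_of_anomalies(scored_rows, thresholds, min_vars=2):
    """Assumed definition of 'extent': number of rows in which at least
    `min_vars` variables have a score above their threshold."""
    count = 0
    for row in scored_rows:
        exceeded = sum(1 for s, t in zip(row, thresholds) if s > t)
        if exceeded >= min_vars:
            count += 1
    return count


def sweep_multipliers(scored_rows, start_thresholds, multipliers):
    """Build the dataset of (multiplier, extent) pairs, one per multiplier."""
    dataset = []
    for m in multipliers:
        adjusted = [t * m for t in start_thresholds]
        dataset.append((m, extent_of_anomalies(scored_rows, adjusted)))
    return dataset


def elbow_multiplier(dataset):
    """Return the multiplier whose point lies furthest from the straight line
    joining the smallest-multiplier and largest-multiplier points."""
    points = sorted(dataset)
    (x1, y1), (x2, y2) = points[0], points[-1]
    length = math.hypot(x2 - x1, y2 - y1)

    def distance(point):
        x0, y0 = point
        # Perpendicular distance from (x0, y0) to the line through the endpoints.
        return abs((y2 - y1) * x0 - (x2 - x1) * y0 + x2 * y1 - y2 * x1) / length

    return max(points, key=distance)[0]


# Hypothetical per-variable scores for five test rows and two variables.
test_scores = [(0.5, 0.4), (1.2, 1.1), (2.5, 2.4), (3.1, 0.2), (1.6, 1.7)]
start_thresholds = [1.0, 1.0]
multipliers = [0.5, 1.0, 1.5, 2.0, 3.0]

dataset = sweep_multipliers(test_scores, start_thresholds, multipliers)
chosen = elbow_multiplier(dataset)
refined_thresholds = [t * chosen for t in start_thresholds]
print(dataset, chosen, refined_thresholds)
```

One caveat on this sketch: the distance is computed on raw axis values, so the selected point depends on the relative scale of the multiplier and extent axes. Rescaling both axes to a common range before measuring distance is one possible adjustment, though the claim text shown here does not say whether that is done.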

Assignees

Uptake Tech Inc

Inventors

Classifications

  • Generating training patterns; Bootstrap methods, e.g. bagging or boosting · CPC title

  • Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection · CPC title

  • G05B23/024 · Primary

    Quantitative history assessment, e.g. mathematical relationships between available data; Functions therefor; Principal component analysis [PCA]; Partial least square [PLS]; Statistical classifiers, e.g. Bayesian networks, linear regression or correlation analysis; Neural networks · CPC title

  • Indexing scheme relating to error detection, to error correction, and to monitoring · CPC title

  • model based detection method, e.g. first-principles knowledge model · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11181894B2 cover?
A computing system may create an anomaly detection model to detect anomalies in multivariate data originating from a given data source by extracting a model object for the anomaly detection model using a first set of training data originating from the given data source, establishing starting values of a set of anomaly thresholds for the anomaly detection model using the extracted model object a…
Who is the assignee on this patent?
Uptake Tech Inc
What technology area does this patent fall under?
Primary CPC classification G05B23/024. Mapped technology areas include Physics.
When was this patent published?
Publication date Nov 23, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 11 related publications on this page (citations in our corpus or others sharing the same primary CPC).