Scalable predictive early warning system for data backup event log

US9804909B1 · US · B1

Patent metadata
Publication number: US-9804909-B1
Application number: US-201514675177-A
Country: US
Kind code: B1
Filing date: Mar 31, 2015
Priority date: Jan 23, 2015
Publication date: Oct 31, 2017
Grant date: Oct 31, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Techniques to detect backup-related anomalies are disclosed. In various embodiments, a processor is used to generate based at least in part on backup log data associated with a training period a predictive model. The predictive model is to detect, using the processor, anomalies in corresponding backup log data associated with a detection period.
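The training-period and detection-period split described in the abstract can be illustrated with a small sketch. This is not the patented implementation: it assumes a single feature (backup size), a Gaussian model fitted with Python's standard `statistics` module, and invented sample values. The names `fit_gaussian` and `tail_probability` are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def fit_gaussian(training_sizes):
    """Fit a normal distribution to backup sizes logged during the training period."""
    return NormalDist(mu=mean(training_sizes), sigma=stdev(training_sizes))


def tail_probability(model, observed):
    """Two-sided probability of a value at least this far from the training mean."""
    z = abs(observed - model.mean) / model.stdev
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical nightly backup sizes in GB (illustrative values, not from the patent).
training = [100.0, 102.0, 98.0, 101.0, 99.0, 100.5, 99.5]
model = fit_gaussian(training)

# Detection period: one ordinary backup and one that shrank sharply.
normal_p = tail_probability(model, 100.8)
anomalous_p = tail_probability(model, 40.0)

print(f"ordinary backup:  p = {normal_p:.3f}")
print(f"shrunken backup:  p = {anomalous_p:.3g}")
```

A low probability for a detection-period backup suggests that it does not resemble what was seen during training, which is the kind of signal the abstract refers to as an anomaly.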

First claim

Opening claim text (preview).

What is claimed is:

1. A method of detecting backup related anomalies, comprising: filtering backup log data associated with one or more backup clients during a training period into one or more sets of backup log data based on one or more attributes; extracting from a set of the one or more sets of backup log data a prescribed set of features; generating, using a processor, based at least in part on the extracted prescribed set of features, a predictive model; using, by the processor, the predictive model to detect anomalies in backup log data associated with the one or more backup clients during a detection period, wherein the anomalies at least includes data being erroneously deleted; computing a score for a detected anomaly; and performing one or more responsive actions based at least in part on a comparison between the computed score and a detection threshold, wherein a responsive action includes a processor using backup data stored at a prior time to restore the erroneously deleted data.

2. The method of claim 1, further comprising receiving via a network communication interface said backup log data associated with the training period.

3. The method of claim 2, further comprising receiving via the network communication interface said backup log data associated with the one or more backup clients during the detection period.

4. The method of claim 1, wherein said one or more attributes include one or more of the following: backup type; backup schedule; backup size; number of objects backed up; source system type; and other source system attribute.

5. The method of claim 1, wherein the prescribed set of features include one or more of the following: backup size; number of objects backed up; and amount of change in backup data size.

6. The method of claim 1, wherein the predictive model is associated with one or more of the following model types: Gaussian hypothesis testing; KS test; and Kernel Density Estimation.

7. The method of claim 1, wherein the predictive model is configured to be used to predict for a given set of extracted features associated with a backup performed during the detection period a corresponding statistical probability of occurrence of said given set of features.

8. The method of claim 1, further comprising ranking detected anomalies based at least in part on their respective scores.

9. The method of claim 1, wherein the one or more responsive actions are performed based at least in part on a determination that the computed score exceeds a detection threshold.

10. A system to detect backup related anomalies, comprising: a communication interface; and a processor coupled to the communication interface and configured to: receive via the communication interface backup log data associated with one or more backup clients during a training period; filter the backup log data associated with the one or more backup clients during the training period into one or more sets of backup log data based on one or more attributes; extracting from a set of the one or more sets of backup log data a prescribed set of features; generate a predictive model based at least in part on the extracted prescribed set of features; use the predictive model to detect anomalies in backup log data associated with the one or more clients during a detection period, wherein the anomalies at least includes data being erroneously deleted; compute a score for a detected anomaly; and perform one or more responsive actions based at least in part on a comparison between the computed score and a detection threshold, wherein a responsive action includes a processor using backup data stored at a prior time to restore the erroneously deleted data.

11. The system of claim 10, wherein said one or more attributes include one or more of the following: backup type; backup schedule; backup size; number of objects backed up; source system type; and other source system attribute.

12. The system of claim 10, wherein the prescribed set of features include one or more of the following: backup size; number of objects backed up; and amount of change in backup data size.

13. A computer program product to detect backup related anomalies, the computer program product being embodied in a non-transitory computer readable storage medium and comprising computer instructions for: filtering backup log data associated with one or more backup clients during a training period into one or more sets of backup log data based on one or more attributes; extracting from a set of the one or more sets of backup log data a prescribed set of features; generating, using a processor, based at least in part on the extracted prescribed set of features, a predictive model; using, by the processor, the predictive model to detect anomalies in backup log data associated with the one or more backup clients during a detection period, wherein the anomalies at least includes data being erroneously deleted; computing a score for a detected anomaly; and performing one or more responsive actions based at least in part on a comparison between the computed score and a detection threshold, wherein a responsive action includes a processor using backup data stored at a prior time to restore the erroneously deleted data.
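The steps recited in the independent claims (filter by attribute, extract features, build a model, score, compare with a threshold, rank) can be sketched roughly as follows. This is an illustrative reading only, not the claimed implementation. The record layout, the function names, the choice of a per-feature z-score as the anomaly score, and the threshold of 3.0 are all assumptions made for the example. The restore action and the "amount of change in backup data size" feature are left out.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical log record layout: (client, backup_type, size_gb, object_count)


def filter_by_attribute(records, key_index=1):
    """Group log records into sets by one attribute (here: backup type)."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[key_index]].append(rec)
    return groups


def extract_features(rec):
    """Feature set assumed for this sketch: backup size and object count."""
    return (rec[2], rec[3])


def train(records):
    """Store a (mean, standard deviation) pair per feature for each group."""
    model = {}
    for group, recs in filter_by_attribute(records).items():
        columns = list(zip(*(extract_features(r) for r in recs)))
        model[group] = [(mean(c), stdev(c)) for c in columns]
    return model


def score(model, rec):
    """Anomaly score: largest absolute z-score across the features."""
    stats = model[rec[1]]
    feats = extract_features(rec)
    # max(sigma, 1e-9) guards against a feature that never varied in training.
    return max(abs(x - mu) / max(sigma, 1e-9) for x, (mu, sigma) in zip(feats, stats))


def detect(model, records, threshold=3.0):
    """Return (score, record) pairs above the threshold, highest score first."""
    hits = [(score(model, r), r) for r in records]
    return sorted((h for h in hits if h[0] > threshold),
                  key=lambda h: h[0], reverse=True)


# Illustrative data only; none of these values come from the patent.
training = [
    ("db01", "full", 500.0, 10000), ("db01", "full", 510.0, 10100),
    ("db01", "full", 495.0, 9900), ("db01", "full", 505.0, 10050),
    ("db01", "incr", 20.0, 400), ("db01", "incr", 22.0, 420),
    ("db01", "incr", 19.0, 390), ("db01", "incr", 21.0, 410),
]
detection = [
    ("db01", "full", 502.0, 10020),  # looks like the training data
    ("db01", "full", 120.0, 2300),   # size and object count both collapse
    ("db01", "incr", 21.5, 30),      # object count collapses
]

alerts = detect(train(training), detection)
for s, rec in alerts:
    print(f"score={s:.1f} record={rec}")
```

In this sketch the ranked `alerts` list is where a responsive action would be triggered. The claims name restoring erroneously deleted data from an earlier backup as one such action.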

Assignees

EMC Corp, EMC IP Holding Co LLC

Inventors

Classifications

  • by exceeding a count or rate limit, e.g. word- or bit count limit

  • Threshold

  • Using snapshots, i.e. a logical point-in-time copy of the data

  • Error or fault detection not based on redundancy (power supply failures G06F1/30; network fault management H04L41/06)

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9804909B1 cover?
Techniques to detect backup-related anomalies are disclosed. In various embodiments, a processor is used to generate based at least in part on backup log data associated with a training period a predictive model. The predictive model is to detect, using the processor, anomalies in corresponding backup log data associated with a detection period.
Who is the assignee on this patent?
EMC Corp, EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F11/0751. Mapped technology areas include Physics.
When was this patent published?
Published Oct 31, 2017 (kind code B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).