Automatic anomaly detection in computer processing pipelines

US2020201650A1 · US · A1

Patent metadata
Publication number: US-2020201650-A1
Application number: US-201816228663-A
Country: US
Kind code: A1
Filing date: Dec 20, 2018
Priority date: Dec 20, 2018
Publication date: Jun 25, 2020
Grant date: not listed

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer processing pipeline is automatically computer monitored. The computer processing pipeline includes a plurality of ordered computer stages. At least one computer stage is configured to receive an input data set and perform one or more computer processing operations on the input data set to produce an output data set. The output data set is provided as input to another computer stage of the computer processing pipeline. A historical expected schedule is automatically computer generated for compliant execution of the at least one computer stage. The output data set is automatically computer sampled at a designated time dictated by the historical expected schedule. The sampled output data set is automatically computer tested for compliance with one or more detection rules. An anomaly alert that identifies one or more anomalies is automatically computer issued based on non-compliance of the output data set with the one or more detection rules.
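The testing step in the abstract can be illustrated with a minimal sketch. It is not the patent's implementation: the `DetectionRule` type, the rule constructors, `find_anomalies`, and the representation of a sampled output data set as a list of row dictionaries (or `None` when nothing could be sampled) are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

# Assumption: a sampled output data set is a list of row dicts,
# or None when the output was unavailable at the designated time.
Sample = Optional[List[Dict[str, float]]]


@dataclass(frozen=True)
class DetectionRule:
    """Hypothetical rule: a name plus a predicate that is True on compliance."""
    name: str
    complies: Callable[[Sample], bool]


def availability_rule() -> DetectionRule:
    # Non-compliant when nothing could be sampled.
    return DetectionRule("output_available", lambda sample: sample is not None)


def range_rule(field: str, low: float, high: float) -> DetectionRule:
    # Non-compliant when any present value of `field` is outside [low, high].
    def complies(sample: Sample) -> bool:
        if sample is None:
            return True  # unavailability is reported by availability_rule
        return all(low <= row[field] <= high for row in sample if field in row)
    return DetectionRule(f"{field}_in_range", complies)


def format_rule(expected_fields: Set[str]) -> DetectionRule:
    # Non-compliant when any row's field names differ from the expected set.
    def complies(sample: Sample) -> bool:
        if sample is None:
            return True
        return all(set(row) == expected_fields for row in sample)
    return DetectionRule("expected_format", complies)


def find_anomalies(stage: str, sample: Sample,
                   rules: List[DetectionRule]) -> List[str]:
    """Return one description per rule that the sampled output fails."""
    return [f"{stage}: {rule.name}" for rule in rules
            if not rule.complies(sample)]
```

A caller would run `find_anomalies` once per stage at that stage's designated time and issue an alert whenever the returned list is non-empty.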

First claim

Opening claim text (preview).

1. A method of detecting computer anomalies, comprising: automatically computer monitoring a computer processing pipeline executed on a distributed computer system and including a plurality of ordered computer stages, at least one computer stage configured to receive an input data set from one or more storage machines of the distributed computer system and perform one or more computer processing operations on the input data set to produce an output data set, wherein the output data set is stored on the one or more storage machines and provided as input to a different computer stage of the computer processing pipeline; automatically computer generating a historical expected schedule for compliant execution of the at least one computer stage; automatically computer sampling the output data set from the one or more storage machines at a designated time dictated by the historical expected schedule; automatically computer testing the sampled output data set for compliance with one or more detection rules; and automatically computer issuing an anomaly alert identifying one or more anomalies based on non-compliance of the output data set with the one or more detection rules.

2. The method of claim 1, wherein the steps of automatically computer recognizing, automatically computer sampling, automatically computer testing, and automatically computer issuing are performed for each of a plurality of different computer stages of the computer processing pipeline.

3. The method of claim 1, wherein the steps of automatically computer recognizing, automatically computer sampling, automatically computer testing, and automatically computer issuing are repeated at a regular interval for the at least one computer stage.

4. The method of claim 3, further comprising: at each interval repeat, for each of one or more previously-identified anomalies, automatically computer re-testing a re-sampled output data set that triggered the previously-identified anomaly for compliance with the one or more detection rules; and automatically computer resolving the previously-identified anomaly based on the re-sampled data set complying with the one or more detection rules.

5. The method of claim 1, wherein the one or more detection rules specify that an anomaly is generated based on the output data set being unavailable for sampling at the designated time.

6. The method of claim 1, wherein the one or more detection rules specify that an anomaly is generated based on the output data set being unavailable to be provided as input to the different computer stage of the computer processing pipeline.

7. The method of claim 1, wherein the one or more detection rules specify that an anomaly is generated based on a value in the output data set being outside of an expected value range.

8. The method of claim 1, wherein the one or more detection rules specify that an anomaly is generated based on a format of the output data set being different than an expected format.

9. The method of claim 1, further comprising: automatically computer assigning a priority level to each of the one or more anomalies based on one or more priority rules.

10. The method of claim 9, wherein different anomaly alerts are issued for different priority levels of the one or more anomalies.

11. The method of claim 1, wherein issuing the anomaly alert includes presenting, via a display, a graphical user interface including visual representations of the one or more anomalies.

12. The method of claim 1, wherein issuing the anomaly alert includes sending an alert message identifying the one or more anomalies.

13. The method of claim 12, wherein sending an alert message includes sending an email to an administer computer system.

14. The method of claim 12, wherein sending an alert message includes sending a text message to a telephone.

15. The method of claim 1, wherein the historical expected schedule is generated based on data sets and associated historical processing metrics for performing processing operations on the data sets parsed from processing workflow configuration files for the computer processing pipeline.

16. The method of claim 15, wherein the historical processing metrics are determined based on observation of actual operation of the one or more computer stages performing processing operations.

17. A computing system, comprising: one or more logic machines; and one or more storage machines holding instructions executable by the one or more logic machines to: automatically computer monitor a computer processing pipeline executed on a distributed computer system and including a plurality of ordered computer stages, at least one computer stage configured to receive an input data set from one or more storage machines of the distributed computer system and perform one or more computer processing operations on the input data set to produce an output data set, wherein the output data set is stored on the one or more storage machines and provided as input to a different computer stage of the computer processing pipeline; automatically computer generate a historical expected schedule for compliant execution of the at least one computer stage; automatically computer sample the output data set from the one or more storage machines at a designated time dictated by the historical expected schedule; automatically computer test the sampled output data set for compliance with one or more detection rules; and automatically computer issue an anomaly alert identifying one or more anomalies based on non-compliance of the output data set with the one or more detection rules.

18. The computing system of claim 17, wherein the steps of automatically computer recognizing, automatically computer sampling, automatically computer testing, and automatically computer issuing are repeated at a regular interval for the at least one computer stage.

19. The computing system of claim 18, wherein the one or more storage machines hold instructions executable by the one or more logic machines to: at each interval repeat, for each of one or more previously-identified anomalies, automatically computer re-test a re-sampled output data set that triggered the previously-identified anomaly for compliance with the one or more detection rules; and automatically computer resolve the previously-identified anomaly based on the re-sampled data set complying with the one or more detection rules.

20. A method of detecting computer anomalies, comprising: automatically computer monitoring a computer processing pipeline executed on a distributed computer system and including a plurality of ordered computer stages, at least one computer stage configured to receive an input data set from one or more storage machines of the distributed computer system and perform one or more computer processing operations on the input data set to produce an output data set, wherein the output data set is stored on the one or more storage machines and provided as input to a different computer stage of the computer processing pipeline; automatically computer generate a historical expected schedule for compliant execution of the at least one computer stage; automatically computer sampling the output data set from the one or more storage machines at a designated time dictated by the historical expected schedule; automatically computer testing the sampled output data set for compliance with one or more detection rules; and automatically computer identifying one or more anomal
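Two of the claimed steps lend themselves to a short illustration: deriving a designated sampling time from historical processing metrics (claims 15 and 16) and re-testing previously identified anomalies at each interval (claims 4 and 19). The sketch below is only one possible reading. The "mean plus a slack margin" heuristic, the function names, and the dictionary of open anomalies keyed by a string are assumptions, not details taken from the claims.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List


def designated_sample_time(completion_minutes: List[float],
                           slack_stdevs: float = 2.0) -> float:
    """Hypothetical heuristic for a historical expected schedule.

    Takes observed completion times of a stage (minutes after the
    pipeline run started) and returns the mean plus a slack margin
    of `slack_stdevs` population standard deviations.
    """
    return mean(completion_minutes) + slack_stdevs * pstdev(completion_minutes)


def retest_open_anomalies(
        open_anomalies: Dict[str, Callable[[], bool]]) -> List[str]:
    """Re-test each open anomaly and resolve the ones that now comply.

    `open_anomalies` maps an anomaly key to a zero-argument callable that
    re-samples the output data set and returns True when it complies.
    Resolved keys are removed from the dict and returned.
    """
    resolved = [key for key, complies_now in open_anomalies.items()
                if complies_now()]
    for key in resolved:
        del open_anomalies[key]
    return resolved
```

A monitoring loop would call `retest_open_anomalies` at every interval before sampling again, so that an alert raised for a late output clears once the output appears and passes the rules.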

Assignees

Microsoft Technology Licensing LLC

Inventors

Classifications

  • G06F9/3861 · Primary

    Recovery, e.g. branch miss-prediction, exception handling (error detection or correction G06F11/00) · CPC title

  • Monitoring · CPC title

  • Root cause analysis, i.e. error or fault diagnosis (in a hardware test environment G06F11/22; in a software test environment G06F11/36) · CPC title

  • where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems (multiprogramming arrangements G06F9/46; allocation of resources G06F9/50) · CPC title

  • by assessing time · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2020201650A1 cover?
A computer processing pipeline is automatically computer monitored. The computer processing pipeline includes a plurality of ordered computer stages. At least one computer stage is configured to receive an input data set and perform one or more computer processing operations on the input data set to produce an output data set. The output data set is provided as input to another computer stage o…
Who is the assignee on this patent?
Microsoft Technology Licensing LLC
What technology area does this patent fall under?
Primary CPC classification G06F9/3861. Mapped technology areas include Physics.
When was this patent published?
Published Jun 25, 2020 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).