Data stream quality management for analytic environments

US2017286462A1 · US · A1

Patent metadata
Publication number: US-2017286462-A1
Application number: US-201715624295-A
Country: US
Kind code: A1
Filing date: Jun 15, 2017
Priority date: May 4, 2012
Publication date: Oct 5, 2017
Grant date: (not listed)

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Data stream quality management for analytic environments includes deploying, into a runtime environment upstream from an analytic module, an ingress quality specification (IQS) module. The IQS module receives a data stream and analyzes a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module. The subset of data is annotated to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module. The data stream is output with the annotated subset of data to the analytic module, and the analytic module analyzes the data stream to assess an operating characteristic of an upstream device.
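The flow described in the abstract can be illustrated with a minimal Python sketch. This is not the patented implementation: the function names (`iqs_module`, `analytic_module`), the record layout, the `quality` field name, and the temperature range check are all hypothetical choices made for illustration. The sketch shows an upstream stage that annotates selected records with a quality status and passes the whole stream on, and a downstream analytic that uses the annotation when assessing a device reading.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Record = Dict[str, object]


def iqs_module(
    stream: Iterable[Record],
    selects: Callable[[Record], bool],
    meets_expectation: Callable[[Record], bool],
    field: str = "quality",  # hypothetical name for the annotation field
) -> Iterator[Record]:
    """Annotate selected records with a quality status; pass every record on."""
    for record in stream:
        out = dict(record)  # copy so the caller's data is not mutated
        if selects(out):
            out[field] = "pass" if meets_expectation(out) else "fail"
        yield out


def analytic_module(stream: Iterable[Record], field: str = "quality") -> Optional[float]:
    """Toy analytic: mean temperature over records not annotated as failing."""
    values: List[float] = [
        float(r["temp_c"])
        for r in stream
        if "temp_c" in r and r.get(field) != "fail"
    ]
    return sum(values) / len(values) if values else None


# Hypothetical readings from an upstream device; -999.0 mimics a sensor glitch.
readings = [
    {"sensor": "pump-1", "temp_c": 71.0},
    {"sensor": "pump-1", "temp_c": -999.0},
    {"sensor": "pump-1", "temp_c": 73.0},
]

annotated = list(
    iqs_module(
        readings,
        selects=lambda r: "temp_c" in r,
        meets_expectation=lambda r: -50.0 <= float(r["temp_c"]) <= 150.0,
    )
)
mean_temp = analytic_module(annotated)
```

Note that the quality stage does not drop the failing record. It only annotates it, and the analytic module decides what to do with the annotation, which mirrors the separation of roles in the abstract.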

First claim

Full claim text as published (claims 1 through 20); claim 1 is the first independent claim.

What is claimed is:

1. A method, comprising: deploying, into a runtime environment upstream from an analytic module, an ingress quality specification (IQS) module; receiving, by the IQS module, a data stream; analyzing, by the IQS module, a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotating the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; and outputting the data stream with the annotated subset of data to the analytic module, the analytic module analyzing the data stream to assess an operating characteristic of an upstream device.

2. The method of claim 1, further comprising receiving a selection of the IQS module from a plurality of IQS modules associated with the analytic module.

3. The method of claim 1, further comprising: selecting, by the IQS module, a portion of the data stream as the subset of data; and applying, by the IQS module, a predicate to the subset of data.

4. The method of claim 3, further comprising flagging data of the subset passing the predicate.

5. The method of claim 4, further comprising flagging data of the subset failing the predicate.

6. The method of claim 1, further comprising determining, by the IQS module, whether the selected subset of data of the data stream includes a minimum quantity of data samples based on the quality expectation of the analytic module.

7. The method of claim 1, wherein annotating comprises annotating the subset of data via an additional tuple field of the data.

8. A method, comprising: deploying, into a runtime environment upstream from an analytic module, an ingress quality specification (IQS) module; receiving, by the IQS module, a data stream; selecting, by the IQS module, a subset of data of the data stream; determining if the subset of data meets a quality expectation of the analytic module by analyzing whether the subset of data includes a minimum quantity of data samples based on the quality expectation of the analytic module; responsive to determining that the subset of data meets the quality expectation, annotating the subset of data to indicate a quality status; and outputting the data stream with the annotated subset of data to the analytic module.

9. The method of claim 8, wherein annotating comprises flagging data of the subset meeting the quality expectation.

10. The method of claim 9, wherein annotating comprises flagging data of the subset failing to meet the quality expectation.

11. The method of claim 8, further comprising receiving, via an interface, a selection of the IQS modules to deploy with the analytic module.

12. The method of claim 8, further comprising: independently deploying the IQS module relative to the analytic module; and linking the IQS module to the analytic module.

13. The method of claim 9, wherein the data stream comprises a first data stream received from a first source and second data stream received from a second source.

14. A method, comprising: receiving, by a first ingress quality specification (IQS) module deployed in a runtime environment, a data stream; applying, by the first IQS module, a first predicate to a subset of data of the data stream to determine if the subset of data meets a first quality expectation of a downstream analytic module; making a first annotation in the subset of data to indicate a quality status based on a result of applying the first predicate; outputting by the first IQS module the data stream to a second IQS module; applying, by the second IQS module, a second predicate to the subset of data to determine if the subset of data meets a second quality expectation of the analytic module; making a second annotation in the subset of data to indicate a quality status based on a result of applying the second predicate; receiving, by the analytic module, the data stream from the second IQS module and analyzing the subset of data to assess a performance level of a monitored object, the analytic module discerning between the first annotation and the second annotation to determine whether to include the subset of data in the assessment of the monitored object.

15. The method of claim 14, wherein making the first and second annotation comprises flagging data of the subset meeting the respective first and second predicates.

16. The method of claim 15, wherein making the first and second annotation further comprises flagging data of the subset failing the respective first and second predicates.

17. The method of claim 14, further comprising omitting, by the analytic module, the subset of data from an analysis of the data stream in response to the subset failing the first or second predicate.

18. The method of claim 14, wherein making the first and second annotation comprises annotating the subset of data via an additional tuple field of the data.

19. The method of claim 14, further comprising receiving, via an interface, a selection of the first and second IQS modules to deploy with the analytic module.

20. The method of claim 14, further comprising: independently deploying the first and second IQS modules relative to the analytic module; and linking the first and second IQS modules to the analytic module.
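As a rough illustration of the chained arrangement in claim 14, combined with the minimum-sample check of claims 6 and 8 and the omission step of claim 17, the following Python sketch wires two quality stages in series ahead of a toy analytic. It is only a sketch under stated assumptions: the helper `make_iqs`, the annotation field names `iqs_min_count` and `iqs_in_range`, the window layout, and the thresholds are hypothetical and do not come from the patent text.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Optional

Window = Dict[str, object]  # one "subset of data": a list of samples plus annotations


def make_iqs(
    name: str, predicate: Callable[[List[float]], bool]
) -> Callable[[Iterable[Window]], Iterator[Window]]:
    """Build a quality stage that records its predicate result in its own field."""

    def module(windows: Iterable[Window]) -> Iterator[Window]:
        for window in windows:
            out = dict(window)  # copy, then add this stage's annotation field
            out[name] = bool(predicate(out["samples"]))
            yield out

    return module


# First stage: does the subset hold a minimum quantity of samples?
min_count = make_iqs("iqs_min_count", lambda s: len(s) >= 3)
# Second stage: are all samples inside a plausible range?
in_range = make_iqs("iqs_in_range", lambda s: all(0.0 <= x <= 100.0 for x in s))


def analytic_module(windows: Iterable[Window]) -> Optional[float]:
    """Toy analytic: mean over subsets that passed both stages.

    The two annotations stay in separate fields, so the analytic can tell
    which check failed; here it omits a subset if either one failed.
    """
    used: List[float] = []
    for window in windows:
        if window["iqs_min_count"] and window["iqs_in_range"]:
            used.extend(window["samples"])
    return sum(used) / len(used) if used else None


windows = [
    {"samples": [40.0, 42.0, 44.0]},   # passes both checks
    {"samples": [41.0]},               # too few samples
    {"samples": [43.0, 500.0, 45.0]},  # one sample out of range
]

# The first stage feeds the second; the second feeds the analytic.
annotated = list(in_range(min_count(windows)))
level = analytic_module(annotated)
```

Keeping one annotation field per stage is a design choice in this sketch: it lets a different analytic apply a looser policy, for example accepting short windows while still rejecting out-of-range ones, without changing the quality stages.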

Assignees

Inventors

Classifications

  • Ensuring data consistency and integrity · CPC title

  • Annotation, e.g. comment data or footnotes · CPC title

  • G06F16/22 · Primary

    Indexing; Data structures therefor; Storage structures · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • Data stream processing; Continuous queries · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017286462A1 cover?
It covers data stream quality management for analytic environments. An ingress quality specification (IQS) module is deployed into a runtime environment upstream from an analytic module. The IQS module analyzes a subset of a data stream against the analytic module's quality expectation, annotates that subset with a quality status, and outputs the annotated stream to the analytic module, which uses it to assess an operating characteristic of an upstream device.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/22. Mapped technology areas include Physics.
When was this patent published?
Publication date: Oct 5, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).