Data stream quality management for analytic environments

US9690812B2 · US · B2

Patent metadata

Publication number: US-9690812-B2
Application number: US-201313764696-A
Country: US
Kind code: B2
Filing date: Feb 11, 2013
Priority date: May 4, 2012
Publication date: Jun 27, 2017
Grant date: Jun 27, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to one aspect of the present disclosure, a method and technique for data quality management is disclosed. The method includes: deploying, into a runtime environment with a data stream analytic module, an ingress quality specification (IQS) module associated with the analytic module; receiving, by the IQS module, the data stream; analyzing, by the IQS module, a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotating the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; and outputting the data stream to the analytic module.
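The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the record layout, the `quality_ok` field name, the `temp_c` key, and the temperature range are all hypothetical, and each record is treated as its own "subset of data" for simplicity.

```python
from typing import Any, Callable, Dict, Iterable, Iterator

Record = Dict[str, Any]


def iqs_module(
    stream: Iterable[Record],
    predicate: Callable[[Record], bool],
    status_field: str = "quality_ok",  # hypothetical annotation field name
) -> Iterator[Record]:
    """Annotate every record with a quality status and pass it downstream.

    Nothing is dropped here: the stream is forwarded in full, and the
    decision about what to do with flagged data is left to the consumer.
    """
    for record in stream:
        annotated = dict(record)  # copy so the caller's record is not mutated
        annotated[status_field] = predicate(record)
        yield annotated


def temperature_in_range(record: Record) -> bool:
    """Hypothetical quality expectation: a reading exists and is plausible."""
    value = record.get("temp_c")
    return value is not None and -40.0 <= value <= 125.0


readings = [{"temp_c": 21.5}, {"temp_c": None}, {"temp_c": 900.0}]
annotated = list(iqs_module(readings, temperature_in_range))
```

Writing the module as a generator keeps it usable on unbounded streams, since records are annotated one at a time rather than collected first.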

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: deploying, into a runtime environment upstream from an analytic module, an ingress quality specification (IQS) module; receiving, by the IQS module, a data stream; analyzing, by the IQS module, a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotating the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; outputting the data stream with the annotated subset of data to the analytic module; receiving, by the analytic module, the data stream, the analytic module analyzing the data stream to assess an operating characteristic of an upstream device; determining, by the analytic module, the quality status of the subset of data; and responsive to determining that the quality status indicates that the subset of data does not meet the quality expectation, omitting the subset of data from an analysis of the data stream by the analytic module.

2. The method of claim 1, further comprising receiving a selection of the IQS module from a plurality of IQS modules associated with the analytic module.

3. The method of claim 1, further comprising: selecting, by the IQS module, a portion of the data stream as the subset of data; and applying, by the IQS module, a predicate to the subset of data.

4. The method of claim 3, further comprising flagging data of the subset passing the predicate.

5. The method of claim 4, further comprising flagging data of the subset failing the predicate.

6. The method of claim 1, further comprising determining, by the IQS module, whether the selected subset of data of the data stream includes a minimum quantity of data samples based on the quality expectation of the analytic module.

7. A method, comprising: deploying, into a runtime environment an analytic module, the analytic module configured to analyze a data stream associated with an upstream object and output an analysis of the object; deploying, into the runtime environment upstream from the analytic module, an ingress quality specification (IQS) module; receiving, by the IQS module, the data stream; analyzing, by the IQS module, a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotating the subset of data via an additional field of the data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; outputting the data stream with the annotated subset of data to the analytic module; receiving, by the analytic module, the data stream; determining, by the analytic module, the quality status of the subset of data; and responsive to determining that the quality status indicates that the subset of data does not meet the quality expectation, omitting the subset of data from an analysis of the data stream by the analytic module.

8. The method of claim 7, further comprising receiving a selection of the IQS module from a plurality of IQS modules associated with the analytic module.

9. The method of claim 7, further comprising: selecting, by the IQS module, a portion of the data stream as the subset of data; and applying, by the IQS module, a predicate to the subset of data.

10. The method of claim 9, further comprising flagging data of the subset passing the predicate.

11. The method of claim 10, further comprising flagging data of the subset failing the predicate.

12. A method, comprising: deploying, into a runtime environment with a data stream analytic module, an ingress quality specification (IQS) module associated with the analytic module; receiving, by the IQS module, the data stream; analyzing, by the IQS module, a subset of data of the data stream to determine if the subset of data meets a quality expectation of the analytic module; annotating the subset of data to indicate a quality status based on whether the subset of data meets the quality expectation of the analytic module; outputting the data stream with the annotated subset of data to the analytic module; receiving, by the analytic module, the data stream from the IQS module; identifying, by the analytic module, data not meeting the quality expectation based on the annotations; and omitting, by the analytic module, from its analysis of the subset data the data not meeting the quality expectation.

13. The method of claim 12, further comprising receiving a selection of the IQS module from a plurality of IQS modules associated with the analytic module.

14. The method of claim 12, further comprising: selecting, by the IQS module, a portion of the data stream as the subset of data; and applying, by the IQS module, a predicate to the subset of data.

15. The method of claim 14, further comprising flagging data of the subset passing the predicate.

16. The method of claim 15, further comprising flagging data of the subset failing the predicate.
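The consuming side of the claims (the analytic module omitting data flagged as not meeting the quality expectation) and the minimum-sample check of claim 6 might look roughly like the sketch below. It is an illustration under assumptions, not the claimed implementation: windows of records stand in for the "subset of data", and the `value` key, the `quality_ok` field, and the averaging analysis are hypothetical.

```python
from statistics import mean
from typing import Any, Dict, Iterable, List, Optional

Record = Dict[str, Any]


def annotate_window(
    window: List[Record],
    min_samples: int,
    status_field: str = "quality_ok",  # hypothetical annotation field name
) -> List[Record]:
    """IQS-side check: a record passes only if its window holds at least
    min_samples records and the record itself carries a value."""
    enough = len(window) >= min_samples
    return [
        {**r, status_field: enough and r.get("value") is not None}
        for r in window
    ]


def analytic_module(
    stream: Iterable[Record],
    status_field: str = "quality_ok",
) -> Optional[float]:
    """Average only the records annotated as meeting the expectation.

    Records flagged as failing (or lacking the annotation) are omitted
    from the analysis rather than raising an error.
    """
    usable = [r["value"] for r in stream if r.get(status_field)]
    return mean(usable) if usable else None


full = annotate_window(
    [{"value": 10.0}, {"value": None}, {"value": 20.0}], min_samples=3
)
short = annotate_window([{"value": 99.0}], min_samples=3)
result = analytic_module(full + short)
```

Here the lone record in `short` is excluded because its window is too small, and the missing reading in `full` is excluded on its own, so only the two valid readings contribute to the average.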

Assignees

IBM

Inventors

Classifications

  • Ensuring data consistency and integrity · CPC title

  • Data stream processing; Continuous queries · CPC title

  • G06F16/22 (Primary)

    Indexing; Data structures therefor; Storage structures · CPC title

  • Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors · CPC title

  • Annotation, e.g. comment data or footnotes · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/24568. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 27, 2017 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).