Automated data exploration and validation

US11176148B2 · US · B2

Patent metadata
Field	Value
Publication number	US-11176148-B2
Application number	US-201916537351-A
Country	US
Kind code	B2
Filing date	Aug 9, 2019
Priority date	Jan 13, 2017
Publication date	Nov 16, 2021
Grant date	Nov 16, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments for automated data exploration and validation by a processor. One or more optimal data flows are provided in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph of a plurality of data flows between one or more heterogeneous data sources relating to the query. An analytical flow is provided for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected, and two or more of the one or more of the plurality of data flows are aggregated or disaggregated for the one or more heterogeneous data sources that are nested within the knowledge graph. One or more criteria is received from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
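
As an illustration only, and not the patented method, the idea of "a plurality of data flows between heterogeneous data sources" on a knowledge graph can be sketched by treating the graph as an adjacency map and listing every simple path between two sources as a candidate flow. The source names and the `candidate_flows` helper below are hypothetical; the patent text shown here does not specify any data structure or search algorithm.

```python
from typing import Dict, List

# Hypothetical knowledge graph: each data source maps to the sources it feeds.
GRAPH: Dict[str, List[str]] = {
    "crm_db": ["sales_csv", "warehouse"],
    "sales_csv": ["warehouse"],
    "warehouse": ["report_api"],
    "report_api": [],
}


def candidate_flows(graph: Dict[str, List[str]], start: str, goal: str) -> List[List[str]]:
    """Return every simple (cycle-free) path from start to goal, sorted."""
    flows: List[List[str]] = []
    stack: List[List[str]] = [[start]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            flows.append(path)
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip sources already on this path
                stack.append(path + [nxt])
    return sorted(flows)


flows = candidate_flows(GRAPH, "crm_db", "report_api")
for flow in flows:
    print(" -> ".join(flow))
```

Each returned path is one candidate data flow that could answer a query spanning `crm_db` and `report_api`; choosing among them is a separate step.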

First claim

Opening claim text (preview).

The invention claimed is:
1. A method, by a processor, for automated data exploration and validation, comprising: generating one or more optimal data flows in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph of a plurality of data flows between one or more heterogeneous data sources relating to the query; providing an analytical flow for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected; aggregating or disaggregating two or more of the one or more of the plurality of data flows for the one or more heterogeneous data sources that are nested within the knowledge graph; and receiving one or more criteria from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
2. The method of claim 1, further including: measuring one or more key performance indicators (KPIs) of each of the plurality of data flows that answer the query; and assigning a confidence score to each of the plurality of data flows for each of the plurality of data flows based on the KPIs.
3. The method of claim 2, further including ranking each of the plurality of data flows according to the confidence score.
4. The method of claim 1, further including receiving user feedback relating to a confidence score such that the inference model is updated based on the user feedback.
5. The method of claim 1, further including selecting, as the one or more optimal data flows, at least one of the plurality of data flows having a highest confidence score as compared to those of the plurality of data flows having a lower confidence score in relation to each other according to the inference model.
6. The method of claim 1, further including providing a mapping between plurality of data flows and the one or more heterogeneous data sources on the knowledge graph that satisfy the query.
7. A system for automated data exploration and validation, comprising: one or more computers with executable instructions that when executed cause the system to: generate one or more optimal data flows in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph of a plurality of data flows between one or more heterogeneous data sources relating to the query; provide an analytical flow for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected; aggregate or disaggregate two or more of the one or more of the plurality of data flows for the one or more heterogeneous data sources that are nested within the knowledge graph; and receive one or more criteria from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
8. The system of claim 7, wherein the executable instructions: measure one or more key performance indicators (KPIs) of each of the plurality of data flows that answer the query; and assign a confidence score to each of the plurality of data flows for each of the plurality of data flows based on the KPIs.
9. The system of claim 7, wherein the executable instructions rank each of the plurality of data flows according to the confidence score.
10. The system of claim 7, wherein the executable instructions receive user feedback relating to a confidence score such that the inference model is updated based on the user feedback.
11. The system of claim 7, wherein the executable instructions select, as the one or more optimal data flows, at least one of the plurality of data flows having a highest confidence score as compared to those of the plurality of data flows having a lower confidence score in relation to each other according to the inference model.
12. The system of claim 7, wherein the executable instructions provide a mapping between plurality of data flows and the one or more heterogeneous data sources on the knowledge graph that satisfy the query.
13. A computer program product for, by a processor, automated data exploration and validation, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that generates one or more optimal data flows in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph of a plurality of data flows between one or more heterogeneous data sources relating to the query; an executable portion that provides an analytical flow for one or more of the plurality of data flows for those of the one or more heterogeneous data sources that are undetected; an executable portion that aggregates or disaggregates two or more of the one or more of the plurality of data flows for the one or more heterogeneous data sources that are nested within the knowledge graph; and an executable portion that receives one or more criteria from a user via an interactive graphical user interface (GUI) to use for defining the one or more optimal data flows.
14. The computer program product of claim 13, further including an executable portion that: measures one or more key performance indicators (KPIs) of each of the plurality of data flows that answer the query; and assigns a confidence score to each of the plurality of data flows for each of the plurality of data flows based on the KPIs.
15. The computer program product of claim 13, further including an executable portion that ranks each of the plurality of data flows according to the confidence score.
16. The computer program product of claim 13, further including an executable portion that receives user feedback relating to a confidence score such that the inference model is updated based on the user feedback.
17. The computer program product of claim 13, further including an executable portion that selects, as the one or more optimal data flows, at least one of the plurality of data flows having a highest confidence score as compared to those of the plurality of data flows having a lower confidence score in relation to each other according to the inference model.
18. The computer program product of claim 13, further including an executable portion that provides a mapping between plurality of data flows and the one or more heterogeneous data sources on the knowledge graph that satisfy the query.
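
The dependent claims on KPIs, confidence scores, ranking, and selection (claims 2, 3, and 5) describe a pipeline that can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the KPI names, the weights, and the weighted-sum scoring rule are all hypothetical, since the claims do not say how a confidence score is derived from KPIs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DataFlow:
    name: str
    kpis: Dict[str, float]  # hypothetical KPIs, each normalized to [0, 1]
    confidence: float = 0.0


# Assumed weights; the claims do not specify how KPIs are combined.
WEIGHTS: Dict[str, float] = {"completeness": 0.5, "freshness": 0.3, "speed": 0.2}


def score(flow: DataFlow, weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine a flow's KPIs into one confidence score (weighted sum)."""
    return sum(w * flow.kpis.get(kpi, 0.0) for kpi, w in weights.items())


def rank(flows: List[DataFlow]) -> List[DataFlow]:
    """Assign a confidence score to each flow, then sort highest first."""
    for flow in flows:
        flow.confidence = score(flow)
    return sorted(flows, key=lambda f: f.confidence, reverse=True)


flows = [
    DataFlow("via_csv", {"completeness": 0.9, "freshness": 0.4, "speed": 0.5}),
    DataFlow("direct", {"completeness": 0.8, "freshness": 0.9, "speed": 0.7}),
]
ranked = rank(flows)
best = ranked[0]  # the flow selected as "optimal" under these assumptions
print(best.name, round(best.confidence, 2))
```

User feedback (claim 4) could be modeled by adjusting `WEIGHTS` after each review, but that step is left out here because the source gives no detail on how the inference model is updated.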

Assignees

IBM

Inventors

Classifications

G06F30/20
Design optimisation, verification or simulation (optimisation, verification or simulation of circuit designs G06F30/30) · CPC title
G06F16/2458 · Primary
Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries · CPC title
G06F16/27
Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor · CPC title
G06F16/24578 · Primary
using ranking · CPC title
G06F16/248
Presentation of query results · CPC title

Patent family

Related publications grouped by family.

View patent family 62840850

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11176148B2 cover?: Embodiments for automated data exploration and validation by a processor. One or more optimal data flows are provided in response to a query for one or more heterogeneous data sources according to an inference model based on a knowledge graph of a plurality of data flows between one or more heterogeneous data sources relating to the query. An analytical flow is provided for one or more of the plur…
Who is the assignee on this patent?: IBM
What technology area does this patent fall under?: Primary CPC classification G06F16/2458. Mapped technology areas include Physics.
When was this patent published?: Publication date Nov 16, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Related patents

Personal knowledge graph population from declarative user utterances

Business intelligence (bi) query and answering using full text search and keyword semantics

System for linking diverse data systems

Traceability in a modeling environment

Method and system for retrieving information from knowledge-based assistive network to assist users intent

Unsupervised Relation Detection Model Training
