Online data fusion

US9348891B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9348891-B2
Application number: US-201113311034-A
Country: US
Kind code: B2
Filing date: Dec 5, 2011
Priority date: Dec 5, 2011
Publication date: May 24, 2016
Grant date: May 24, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An online data fusion system receives a query, probes a first source for an answer to the query, returns the answer from the first source, refreshes the answer while probing an additional source, and applies fusion techniques on data associated with an answer that is retrieved from the additional source. For each retrieved answer, the online data fusion system computes the probability that the answer is correct and stops retrieving data for the answer after gaining enough confidence that data retrieved from the unprocessed sources are unlikely to change the answer. The online data fusion system returns correct answers and terminates probing additional sources in an expeditious manner without sacrificing the quality of the answers.
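The probe-and-stop loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented method: the log-odds vote weight, the `n_false` parameter, and the function names `vote_weight` and `fuse_online` are all hypothetical, and each source is assumed to have a known accuracy strictly between 0 and 1. Copying between sources is ignored here.

```python
import math


def vote_weight(accuracy, n_false=10):
    """Weight of one source's vote (assumed log-odds form, not from the patent)."""
    return math.log(n_false * accuracy / (1.0 - accuracy))


def fuse_online(sources):
    """Probe sources in order and stop once the leading answer cannot be overtaken.

    sources: ordered list of (accuracy, answer) pairs, one per source.
    Returns (best_answer, sources_probed, history), where history holds the
    leading answer and its current probability after each probe.
    """
    if not sources:
        raise ValueError("at least one source is required")
    weights = [vote_weight(acc) for acc, _ in sources]
    votes = {}
    history = []
    best = None
    for i, (_, answer) in enumerate(sources):
        # Refresh the running vote total with the newly probed source.
        votes[answer] = votes.get(answer, 0.0) + weights[i]
        best = max(votes, key=votes.get)
        total = sum(math.exp(v) for v in votes.values())
        history.append((best, math.exp(votes[best]) / total))
        # Worst case for the leader: every unprobed source backs its
        # strongest rival (an unseen answer starts with zero votes).
        remaining = sum(weights[i + 1:])
        rival = max((v for a, v in votes.items() if a != best), default=0.0)
        if votes[best] > rival + remaining:
            return best, i + 1, history
    return best, len(sources), history


answer, probed, history = fuse_online(
    [(0.9, "A"), (0.8, "A"), (0.6, "B"), (0.5, "A"), (0.5, "B")]
)
print(answer, probed)
```

With these illustrative inputs the loop stops after two of the five sources, because the three unprobed sources together carry less weight than the lead already held by answer "A".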

First claim

Opening claim text (preview).

We claim:

1. A method comprising: receiving, by an online data fusion system comprising a processor, answers to a query from at least two probed sources in response to probing the at least two probed sources; computing, by the processor of the online data fusion system, a probability that each answer of the answers is correct based, at least in part, upon a copying relationship between at least two of the at least two probed sources, wherein computing the probability that each answer of the answers is correct comprises computing, by the processor, an expected probability, a maximum probability, and a minimum probability that each answer of the answers is correct, and refreshing, by the processor, the expected probability, the maximum probability, and the minimum probability of a first answer of the answers from a first probed source of the at least two probed sources based, at least in part, on a second answer received from a second probed source of the at least two probed sources as the second answer is received from the second probed source of the at least two probed sources; when the online data fusion system gains enough confidence that, based upon the probability that each answer of the answers is correct, probing an additional source is unlikely to change the first answer, terminating, by processor, probing without probing the additional source; and providing, by the processor of the online data fusion system, the first answer of the answers in response to the query.

2. The method of claim 1, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon a source quality of the first probed source of the at least two probed sources.

3. The method of claim 2, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon a coverage of the first probed source of the at least two probed sources.

4. The method of claim 1, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon a coverage of the first probed source of the at least two probed sources.

5. The method of claim 1, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon an answer expected from the additional source that is unprobed considering a copying relationship between the additional source and the first probed source of the at least two probed sources.

6. The method of claim 1, further comprising receiving, using the online data fusion system, an ordered list of sources specifying an order in which to probe the at least two probed sources.

7. The method of claim 1, wherein terminating probing without probing the additional source when the online data fusion system gains enough confidence that probing the additional source is unlikely to change the answers comprises terminating probing without probing the additional source when a termination condition is satisfied.

8. A computer storage medium comprising computer-executable instructions which, when executed by a computer, cause the computer to perform operations comprising: receiving, in response to probing at least two probed sources, answers to a query from the at least two probed sources; computing a probability that each answer of the answers is correct based, at least in part, upon a copying relationship between at least two of the at least two probed sources, wherein computing the probability that each answer of the answers is correct comprises computing an expected probability, a maximum probability, and a minimum probability that each answer of the answers is correct, and refreshing the expected probability, the maximum probability, and the minimum probability of a first answer of the answers from a first probed source of the at least two probed sources based, at least in part, on a second answer received from a second probed source of the at least two probed sources as the second answer is received from the second probed source of the at least two probed sources; when the online data fusion system gains enough confidence that, based upon the probability that each answer of the answers is correct, probing an additional source is unlikely to change the first answer, terminating probing without probing the additional source; and providing the first answer of the answers in response to the query.

9. The computer storage medium of claim 8, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon at least one of the following: a source quality of the first probed source of the at least two probed sources; or a coverage of the first probed source of the at least two probed sources.

10. The computer storage medium of claim 8, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct based upon an answer expected from the additional source that is unprobed considering a copying relationship between the additional source and the first probed source of the at least two probed sources.

11. The computer storage medium of claim 8, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct without knowledge of a quality of a probed source of the at least two probed sources.

12. The computer storage medium of claim 8, wherein the operations further comprise receiving an ordered list of sources specifying an order in which to probe the at least two probed sources.

13. The computer storage medium of claim 8, wherein terminating probing without probing the additional source when the online data fusion system gains enough confidence that probing the additional source is unlikely to change the answers comprises terminating probing without probing the additional source when a termination condition is satisfied.

14. The computer storage medium of claim 8, wherein computing the probability that each answer of the answers is correct comprises computing the probability that each answer of the answers is correct further based upon a source quality of the first probed source of the at least two probed sources and a coverage of the first probed source of the at least two probed sources.

15. An online data fusion system comprising: a processor; and a memory that stores instructions which, when executed by the processor, cause the processor to perform operations comprising receiving, in response to probing at least two probed sources, answers to a query from the at least two probed sources, computing a probability that each answer of the answer is correct based, at least in part, upon a copying relationship between at least two of the at least two probed sources, wherein computing the probability that each answer of the answers is correct comprises computing an expected probability, a maximum probability, and a minimum probability that each answer of the answers is correct, and refreshing the expected probability, the maximum probability, and the minimum probability of a first answer of the answers from a first probed source of the at least two probed sources based, at least in part, on a second answer received from a second probed source o
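The independent claims recite an expected, a maximum, and a minimum probability per answer, computed with regard to copying between sources. The toy model below shows one way such a triple could be derived; it is a sketch under assumptions, not the claimed computation. The names `discounted_weight`, `probability_bounds`, and `UNSEEN` are hypothetical, votes are assumed to be log-scale weights, the current normalized share stands in for the expected probability, and a possibly copied vote is simply scaled by the chance that it was provided independently.

```python
import math

UNSEEN = "<unseen>"  # placeholder for an answer no probed source has given yet


def discounted_weight(weight, copy_prob, agrees_with_original):
    """Shrink a vote that may have been copied rather than independently provided."""
    if agrees_with_original:
        return weight * (1.0 - copy_prob)
    return weight


def probability_bounds(votes, answer, remaining_weight):
    """Return (minimum, expected, maximum) probability of `answer`.

    votes: dict mapping each seen answer to its accumulated vote weight.
    remaining_weight: total vote weight of the sources not yet probed.
    """
    def share(v):
        total = sum(math.exp(x) for x in v.values())
        return math.exp(v[answer]) / total

    expected = share(votes)
    if remaining_weight == 0:
        # Nothing left to probe, so the bounds collapse onto the current value.
        return expected, expected, expected

    # Best case: every unprobed source votes for this answer.
    best_case = dict(votes)
    best_case[answer] += remaining_weight

    # Worst case: every unprobed source votes for one rival, seen or unseen.
    worst = []
    for rival in [a for a in votes if a != answer] + [UNSEEN]:
        scenario = dict(votes)
        scenario[rival] = scenario.get(rival, 0.0) + remaining_weight
        worst.append(share(scenario))

    return min(worst), expected, share(best_case)


low, mid, high = probability_bounds({"A": 2.0, "B": 1.0}, "A", remaining_weight=1.5)
print(round(low, 3), round(mid, 3), round(high, 3))
```

As more sources are probed, `remaining_weight` shrinks and the interval between the minimum and maximum narrows, which is the kind of signal a termination condition could test.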

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9348891B2 cover?
An online data fusion system receives a query, probes a first source for an answer to the query, returns the answer from the first source, refreshes the answer while probing an additional source, and applies fusion techniques on data associated with an answer that is retrieved from the additional source. For each retrieved answer, the online data fusion system computes the probability that the …
Who is the assignee on this patent?
Srivastava Divesh, Dong Xin, Liu Xuan, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F16/33. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 24, 2016 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).