Selecting queries for execution on a stream of real-time data

US10657134B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10657134-B2
Application number: US-201514818895-A
Country: US
Kind code: B2
Filing date: Aug 5, 2015
Priority date: Aug 5, 2015
Publication date: May 19, 2020
Grant date: May 19, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A computer-implemented method for executing a query on data items located at different places in a stream of near real-time data to provide near-real time intermediate results for the query, as the query is being executed, the method including: from time to time, executing, by one or more computer systems, the query on two or more of the data items located at different places in the stream, with the two or more data items being accessed in near real-time with respect to each of the two or more data items; generating information indicative of results of executing the query; and as the query continues being executed, generating intermediate results of query execution by aggregating the results with prior results of executing the query on data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results of query execution, prior to completion of execution of the query.
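The loop the abstract describes (execute the query on newly arrived items, aggregate with prior results, send intermediate results before the query finishes) can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented implementation: the function `run_incremental_query`, its `predicate` and `key` parameters, the fixed `batch_size` trigger, and the sample `events` are all hypothetical, and a plain generator stands in for both the dataflow graph and the transmission to client devices.

```python
from collections import Counter
from typing import Callable, Iterable, Iterator


def run_incremental_query(
    stream: Iterable[dict],
    predicate: Callable[[dict], bool],
    key: Callable[[dict], str],
    batch_size: int = 2,
) -> Iterator[dict]:
    """Yield aggregated match counts while the stream is still being read."""
    totals = Counter()  # prior results, carried across executions
    batch = []          # items that arrived since the last execution
    for item in stream:
        batch.append(item)
        if len(batch) >= batch_size:
            # One intermittent execution: run the query on the newest items ...
            results = Counter(key(i) for i in batch if predicate(i))
            # ... and aggregate them with the results of prior executions.
            totals.update(results)
            batch.clear()
            # Yielding stands in for transmitting intermediate results.
            yield dict(totals)
    if batch:
        # Stream ended with a partial batch: one last execution.
        totals.update(Counter(key(i) for i in batch if predicate(i)))
        yield dict(totals)


# Hypothetical sample data: count purchases over 100 per user.
events = [
    {"user": "a", "amount": 120},
    {"user": "b", "amount": 30},
    {"user": "a", "amount": 500},
    {"user": "c", "amount": 250},
    {"user": "a", "amount": 10},
]
snapshots = list(
    run_incremental_query(events, lambda e: e["amount"] > 100, lambda e: e["user"])
)
```

Each element of `snapshots` is a running total that a client could display before the stream ends. The last snapshot plays the role of a final result only because this sample stream is finite; the patent also covers streams whose end is unknown when execution starts.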

First claim

Opening claim text (preview).

What is claimed is:

1. A computer-implemented method for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the method including: receiving a stream of near real-time data having data items located in different places in the stream; between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query; generating, during execution of the dataflow graph, one or more query results that satisfy the query; generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more query results with one or more prior query results of one or more prior executions of the dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.

2. A system for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the system including: one or more processing devices; and one or more machine-readable hardware storage devices storing instructions that are executable by the one or more processing devices to perform operations including: receiving a stream of near real-time data having data items located in different places in the stream; between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query; generating, during execution of the dataflow graph, one or more query results that satisfy the query; generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more results with one or more prior query results of one or more prior executions of the query dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.

3. One or more machine-readable hardware storages for executing a dataflow graph that represents a query on data items in a stream of near real-time data to provide, as the dataflow graph is being executed, intermediate results for the query, the one or more machine-readable hardware storages storing instructions that are executable by one or more processing devices to perform operations including: receiving a stream of near real-time data having data items located in different places in the stream; between a first time and a second time, intermittently executing the dataflow graph that represents the query multiple times, with the dataflow graph being executed by one or more computer systems in near real-time with respect to receipt of the stream of near real-time data and being executed upon the stream of near real-time data for two or more of the data items, with the dataflow graph including computer code to implement the query, and with the dataflow graph receiving as input query specifications for the query; generating, during execution of the dataflow graph, one or more query results that satisfy the query; generating intermediate results from the one or more query results, as the dataflow graph intermittently executes between the first time and the second time, by aggregating the one or more query results with one or more prior query results of one or more prior executions of the dataflow graph on the stream of near real-time data for the two or more of the data items that previously appeared in the stream of near real-time data; and transmitting to one or more client devices the intermediate results during intermittent execution of the dataflow graph, prior to completion of execution of the dataflow graph.

4. The computer-implemented method of claim 1, further including: at a subsequent point in time, aggregating the intermediate results with results for executing the dataflow graph at the subsequent point in time to generate final results.

5. The computer-implemented method of claim 1, wherein intermittently executing the dataflow graph includes: executing the dataflow graph on a first one of the data items in the stream of near real-time data located in a first portion of the stream of near real-time data, and executing the dataflow graph on a second one of the data items in the stream of near real-time data located in a second portion of the stream.

6. The computer-implemented method of claim 1, wherein the dataflow graph includes components that represent operations to be performed in execution of the query, and wherein the method further includes: for a component: performing a checkpoint operation that saves a local state of the component to enable recoverability of a state of the dataflow graph.

7. The computer-implemented method of claim 1, wherein the dataflow graph is executed on data items that appear in the stream of near real-time data during a period of time the end of which is unknown at a start of executing the dataflow graph.

8. The computer-implemented method of claim 1, wherein an amount of data items in the stream of near real-time data on which the dataflow graph is executed is unknown at a start of executing the dataflow graph.

9. The computer-implemented method of claim 1, further including: generating, based on the aggregated results, a near real-time alert to alert a user of detection of a pre-defined condition.

10. The computer-implemented method of claim 1, wherein the stream of near real-time data includes a stream of near real-time data in which data items are (i) periodically received at different times or (ii) continuously received at different times.

11. The computer-implemented method of claim 1, further including receiving the stream of near real-time data from a data queue, a data repository, or a data feed.

12. The computer-implemented method of claim 1, wherein the dataflow graph is a first dataflow graph, with the method further including: selecting a second dataflow graph for execution on two or more of the data items that appear at different locations in the stream of near real-time data; and executing the first and second dataflow graphs in near real-time with respect to the data items of the stream of near real-time data.

13. The computer-implemented method of
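Two of the dependent claims lend themselves to a short illustration: claim 6 (a component saves its local state in a checkpoint so the graph can be recovered) and claim 9 (an alert is raised from the aggregated results when a pre-defined condition is met). The Python sketch below is a hedged example only. The class `CountingComponent`, its `threshold` field, the JSON file format, and the alert wording are assumptions made for illustration; the patent text shown here does not specify any of them.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field


@dataclass
class CountingComponent:
    """Hypothetical graph component that counts query matches per key."""

    threshold: int  # pre-defined condition: alert when a count reaches this
    counts: dict = field(default_factory=dict)

    def process(self, key: str) -> list:
        """Aggregate one query result and return any alerts it triggers."""
        self.counts[key] = self.counts.get(key, 0) + 1
        if self.counts[key] == self.threshold:
            return [f"ALERT: {key} reached {self.threshold} matches"]
        return []

    def checkpoint(self, path: str) -> None:
        """Save local state; write to a temp file, then rename atomically."""
        tmp = path + ".tmp"
        with open(tmp, "w") as fh:
            json.dump(self.counts, fh)
        os.replace(tmp, path)

    @classmethod
    def restore(cls, path: str, threshold: int) -> "CountingComponent":
        """Rebuild a component from the most recent checkpoint file."""
        with open(path) as fh:
            return cls(threshold=threshold, counts=json.load(fh))


# Process a few results, checkpoint, then recover as if after a failure.
path = os.path.join(tempfile.mkdtemp(), "component_state.json")
component = CountingComponent(threshold=2)
alerts = []
for key in ["a", "b", "a"]:
    alerts += component.process(key)
component.checkpoint(path)
recovered = CountingComponent.restore(path, threshold=2)
```

After `restore`, the recovered component continues counting from the saved totals instead of starting from zero, which is the property the checkpoint is meant to provide.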

Assignees

Ab Initio Technology LLC

Inventors

Classifications

  • Data stream processing; Continuous queries

  • Graphs; Linked lists (G06F16/9027 takes precedence)

  • Query predicate definition using graphical user interfaces, including menus and forms (G06F16/2423 takes precedence)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10657134B2 cover?
A computer-implemented method for executing a query on data items located at different places in a stream of near real-time data to provide near-real time intermediate results for the query, as the query is being executed, the method including: from time to time, executing, by one or more computer systems, the query on two or more of the data items located at different places in the stream, wit…
Who is the assignee on this patent?
Ab Initio Technology LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/24568. Mapped technology areas include Physics.
When was this patent published?
Publication date May 19, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).