Managing real time data stream processing

US11520796B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11520796-B2
Application number  US-202016848833-A
Country             US
Kind code           B2
Filing date         Apr 14, 2020
Priority date       Apr 14, 2020
Publication date    Dec 6, 2022
Grant date          Dec 6, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for managing data processing includes receiving, from a user of a data query system, a data query for data stored in a data store in communication with the data query system. The method also includes receiving a staleness parameter indicating an upper time boundary for the data query. The upper time boundary limits a query response to data within the data store that is older than the upper time boundary. The method further includes determining whether the data stored within the data store satisfies the staleness parameter. When a portion of the data within the data store fails to satisfy the staleness parameter, the method includes generating the query response that excludes the portion of the data that fails to satisfy the staleness parameter.
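The filtering step in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Row` type, the `written_at` field, and the function name `query_with_staleness` are hypothetical, and the sketch assumes the staleness parameter is a duration measured back from the query time, so that only rows strictly older than `now - staleness` are returned.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Row:
    # Hypothetical row shape: when the row was written, plus its content.
    written_at: datetime
    payload: str


def query_with_staleness(rows, staleness, now):
    """Return only rows older than the upper time boundary.

    Assumption: the boundary is the instant `now - staleness`. Rows written
    more recently than that instant are excluded from the response.
    """
    cutoff = now - staleness
    return [row for row in rows if row.written_at < cutoff]


now = datetime(2020, 4, 14, 12, 0, tzinfo=timezone.utc)
rows = [
    Row(now - timedelta(minutes=30), "old"),
    Row(now - timedelta(minutes=5), "recent"),
    Row(now - timedelta(seconds=10), "fresh"),
]

# With a 10-minute staleness bound, only the 30-minute-old row qualifies.
result = query_with_staleness(rows, timedelta(minutes=10), now)
print([row.payload for row in result])  # ['old']
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the sketch deterministic and easy to test.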

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising: receiving, at data processing hardware, from a user of a data query system, a data query for data stored in a data store in communication with the data query system; determining, by the data processing hardware, an upper time boundary for a staleness parameter, the upper time boundary limiting a query response to data within the data store that is older than the upper time boundary by: receiving, at the data processing hardware, user data from the user; ingesting, by the data processing hardware, the received user data into the data store to form one or more log files at a first time; converting, by the data processing hardware, the one or more log files into a columnar data format at a second time, the columnar data format optimized for a respective query; determining, by the data processing hardware, a time difference between the first time and the second time; and assigning, by the data processing hardware, the time difference to the upper time boundary for the staleness parameter; receiving, at the data processing hardware, the upper time boundary for the staleness parameter; determining, by the data processing hardware, whether the data stored within the data store is older than the upper time boundary for the staleness parameter; and when a portion of the data within the data store has been written to the data store more recently than the upper time boundary for the staleness parameter, generating, by the data processing hardware, the query response that excludes the portion of the data that has been written to the data store more recently than the upper time boundary of the staleness parameter.

2. The method of claim 1, wherein receiving the staleness parameter comprises receiving the staleness parameter from the user of the data query system.

3. The method of claim 1, further comprising: identifying, by the data processing hardware, log files for the data stored within the data store, each log file comprising a plurality of rows of data, each row of data of the plurality of rows of data comprising a timestamp; and determining, by the data processing hardware, a respective timestamp within the log files that most closely matches the upper time boundary of the staleness parameter; and for the query response, reading, by the data processing hardware, data within the log files that is older than the timestamp that most closely matches the upper time boundary of the staleness parameter.

4. The method of claim 3, wherein the timestamp indicates a time that the data store generated the respective row of data in a respective log file.

5. The method of claim 1, further comprising: receiving, at the data processing hardware, a set of data from the user; ingesting, by the data processing hardware, the set of data into the data store; and generating, by the data processing hardware, one or more log files for the set of data, each log file comprising rows of data corresponding to data from the set of data, and wherein generating the one or more log files generates a timestamp for each row of data within a respective log file.

6. A system comprising: data processing hardware; and memory hardware in communication with the data processing hardware, the memory hardware storing instructions that when executed on the data processing hardware cause the data processing hardware to perform operations comprising: receiving from a user of a data query system, a data query for data stored in a data store in communication with the data query system; determining an upper time boundary for a staleness parameter, the upper time boundary limiting a query response to data within the data store that is older than the upper time boundary by: receiving user data from the user; ingesting the received user data into the data store to form one or more log files at a first time; converting the one or more log files into a columnar data format at a second time, the columnar data format optimized for a respective query; determining a time difference between the first time and the second time; and assigning the time difference to the upper time boundary for the staleness parameter; receiving the upper time boundary for the staleness parameter indicating an upper time boundary for the data query, the upper time boundary limiting a query response to data within the data store that is older than the upper time boundary; determining whether the data stored within the data store is older than the upper time boundary for satisfies the staleness parameter; and when a portion of the data within the data store has been written to the data store more recently than the upper time boundary for the staleness parameter, generating the query response that excludes the portion of the data that has been written to the data store more recently than the upper time boundary of the staleness parameter.

7. The system of claim 6, wherein receiving the staleness parameter comprises receiving the staleness parameter from the user of the data query system.

8. The system of claim 6, further comprising: identifying log files for the data stored within the data store, each log file comprising a plurality of rows of data, each row of data of the plurality of rows of data comprising a timestamp; and determining a respective timestamp within the log files that most closely matches the upper time boundary of the staleness parameter; and for the query response, reading data within the log files that is older than the timestamp that most closely matches the upper time boundary of the staleness parameter.

9. The system of claim 8, wherein the timestamp indicates a time that the data store generated the respective row of data in a respective log file.

10. The system of claim 6, further comprising: receiving a set of data from the user; ingesting the set of data into the data store; and generating one or more log files for the set of data, each log file comprising rows of data corresponding to data from the set of data, and wherein generating the one or more log files generates a timestamp for each row of data within a respective log file.
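Two steps in the claims lend themselves to a short illustration: deriving the staleness bound as the gap between log ingestion and columnar conversion (claims 1 and 6), and reading only the log rows older than the timestamp closest to that bound (claims 3 and 8). The Python sketch below is one possible reading, not the patented implementation. All function names are hypothetical, log rows are assumed to be `(timestamp, value)` pairs sorted by timestamp, and the bound is assumed to be applied as `now - staleness`.

```python
from bisect import bisect_left
from datetime import datetime, timedelta, timezone


def derive_staleness(ingested_at, converted_at):
    """Time difference between the first time (log files formed) and the
    second time (columnar conversion), used as the staleness bound."""
    return converted_at - ingested_at


def closest_timestamp(timestamps, target):
    """Return the entry of a sorted, non-empty list closest to `target`.

    Ties resolve to the earlier timestamp (an arbitrary choice made here).
    """
    if not timestamps:
        raise ValueError("no timestamps to match against")
    pos = bisect_left(timestamps, target)
    if pos == 0:
        return timestamps[0]
    if pos == len(timestamps):
        return timestamps[-1]
    before, after = timestamps[pos - 1], timestamps[pos]
    return after if (after - target) < (target - before) else before


def read_older_rows(log_rows, cutoff):
    """Read values whose timestamp is older than the closest match."""
    match = closest_timestamp([ts for ts, _ in log_rows], cutoff)
    return [value for ts, value in log_rows if ts < match]


t0 = datetime(2020, 4, 14, 12, 0, tzinfo=timezone.utc)

# Logs formed at t0 and converted 15 minutes later: a 15-minute bound.
staleness = derive_staleness(t0, t0 + timedelta(minutes=15))

# One log row every 10 minutes; the query arrives 41 minutes after t0.
log_rows = [(t0 + timedelta(minutes=m), f"row-{m}") for m in (0, 10, 20, 30, 40)]
now = t0 + timedelta(minutes=41)

# Cutoff is t0 + 26 min; the closest row timestamp is t0 + 30 min.
older = read_older_rows(log_rows, now - staleness)
print(older)  # ['row-0', 'row-10', 'row-20']
```

Binary search via `bisect_left` is a design choice of this sketch that relies on the rows being sorted by timestamp. The claims do not state how the closest timestamp is located.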

Assignees

Google LLC

Inventors

Classifications

  • using data annotations, e.g. user-defined metadata

  • using cached or materialised query results

  • Temporal data queries

  • Updating

  • Presentation of query results

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11520796B2 cover?
A method for managing data processing includes receiving, from a user of a data query system, a data query for data stored in a data store in communication with the data query system. The method also includes receiving a staleness parameter indicating an upper time boundary for the data query. The upper time boundary limits a query response to data within the data store that is older than the u…
Who is the assignee on this patent?
Google LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/24573. Mapped technology areas include Physics.
When was this patent published?
Published Dec 6, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).