Selecting between hydration-based scanning and stateless scale-out scanning to improve query performance

US12197437B2 · US · B2

Patent metadata
Publication number: US-12197437-B2
Application number: US-202318171245-A
Country: US
Kind code: B2
Filing date: Feb 17, 2023
Priority date: Sep 29, 2021
Publication date: Jan 14, 2025
Grant date: Jan 14, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

When a query is received by a stateful data processing service, the service determines, for each table scan (and associated operations) of a query, whether to select the table scan for execution by a stateless data processing service. The selected table scans are sent to the stateless data processing service for execution, and results are received by the stateful data processing service. The stateful data processing service may also execute other table scans of the query locally, against a local data cache. If the data is not present in the local data cache, then the stateful data processing service will copy the table data into the local data cache before executing the table scan. A query result based on the remote and/or local table scans may then be returned to the client.
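The flow in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: `TableScan`, `StatefulService`, `remote_scan`, and the `offload` flag are all hypothetical names, the selection decision is taken as given, and tables are modeled as plain lists held in a shared store that both services can read.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TableScan:
    table: str
    offload: bool  # outcome of the selection step (criteria are not modeled here)


@dataclass
class StatefulService:
    # Hypothetical shared storage readable by both services.
    shared_storage: Dict[str, List[int]]
    # Stand-in for a call to the stateless scale-out service.
    remote_scan: Callable[[str], List[int]]
    # Local data cache of the stateful service.
    cache: Dict[str, List[int]] = field(default_factory=dict)
    # Record of which tables were hydrated, for inspection only.
    hydrations: List[str] = field(default_factory=list)

    def run_scan(self, scan: TableScan) -> List[int]:
        if scan.offload:
            # Selected scans are sent to the stateless service.
            return self.remote_scan(scan.table)
        if scan.table not in self.cache:
            # Hydration: copy the table into the local cache before scanning.
            self.cache[scan.table] = list(self.shared_storage[scan.table])
            self.hydrations.append(scan.table)
        return list(self.cache[scan.table])

    def run_query(self, scans: List[TableScan]) -> Dict[str, List[int]]:
        # The query result combines remote and local scan results.
        return {scan.table: self.run_scan(scan) for scan in scans}


storage = {"orders": [1, 2, 3], "users": [10, 20]}
service = StatefulService(storage, remote_scan=lambda t: list(storage[t]))
result = service.run_query(
    [TableScan("orders", offload=True), TableScan("users", offload=False)]
)
```

In this sketch only the locally executed scan of `users` triggers hydration; the offloaded scan of `orders` never touches the local cache.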

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: one or more processors; and one or more memories, wherein the one or more memories have stored thereon instructions, which when executed by the one or more processors of a provider network, cause the one or more processors to implement a stateful data processing service, wherein the stateful data processing service is configured to: receive, from a client, a query for a database, wherein the query indicates table operations to be performed on one or more tables of a database; select by the stateful data processing service, from among the table operations of the query to be performed and based on one or more criteria, at least one table operation of the query to be performed on at least one table of the database by a stateless data processing service instead of the stateful data processing service, wherein at least one remaining table operation of the received query is performed by the stateful data processing service, and wherein data of the at least one table is accessible to both the stateless data processing service and the stateful data processing service; send an indication of the at least one table operation to a stateless data processing service; receive one or more results from the stateless data processing service, wherein the one or more results are based on performance, by the stateless data processing service, of the at least one table operation on the at least one of the tables of the database; and generate, for the client, a query result based at least on the one or more results.

2. The system as recited in claim 1, wherein to select, based on one or more criteria, at least one table operation from among the table operations to be performed, the stateful data processing service is further configured to: determine that an amount of time to perform the at least one table operation by the stateless data processing service will be less than an amount of time to perform the at least one table operation by the stateful data processing service.

3. The system as recited in claim 1, wherein to select, based on one or more criteria, at least one table operation from among the table operations to be performed, the stateful data processing service is further configured to determine one or more of: a size of a table to be scanned by the at least one table operation is above a threshold size, or a number of requests to be made by the stateless data processing service to perform the at least one table operation is above a threshold number.

4. The system as recited in claim 1, wherein to select, based on one or more criteria, at least one table operation from among the table operations to be performed, the stateful data processing service is further configured to determine that: no data of a table to be scanned by the at least one table operation is stored by the stateful data processing service, or an amount of data of the table to be scanned by the at least one table operation that is stored by the stateful data processing service is less than a threshold amount.

5. The system as recited in claim 1, wherein to select, based on one or more criteria, at least one table operation from among the table operations to be performed, the stateful data processing service is further configured to determine that: an amount of data returned by the at least one table operation is less than a threshold amount.

6. The system as recited in claim 1, wherein to select, based on one or more criteria, at least one table operation from among the table operations to be performed, the stateful data processing service is further configured to: determine that compute resource usage to perform the at least one table operation is above a threshold amount.

7. The system as recited in claim 1, wherein at least a portion of the tables is stored by a data cache of the stateful data processing service, and wherein the stateful data processing service is further configured to: perform another table operation on the data cache to generate another result, and wherein to generate the query result, the stateful data processing service is configured to generate the query result based at least on the one or more results and the other result.

8. A method, comprising: performing, by a stateful data processing service implemented by a plurality of computing devices: receiving, from a client, a query for a database, wherein the query indicates table operations to be performed on one or more tables of a database; selecting by the stateful data processing service, from among the table operations of the query to be performed and based on one or more criteria, at least one table operation of the query to be performed on at least one table of the database by a stateless data processing service instead of the stateful data processing service, wherein at least one remaining table operation of the received query is performed by the stateful data processing service, and wherein data of the at least one table is accessible to both the stateless data processing service and the stateful data processing service; sending an indication of the at least one table operation to a stateless data processing service; receiving one or more results from the stateless data processing service, wherein the one or more results are based on performance, by the stateless data processing service, of the at least one table operation on the at least one of the tables of the database; and generating, for the client, a query result based at least on the one or more results.

9. The method as recited in claim 8, wherein selecting, based on one or more criteria, at least one table operation from among the table operations to be performed comprises: determining that an amount of time to perform the at least one table operation by the stateless data processing service will be less than an amount of time to perform the at least one table operation by the stateful data processing service.

10. The method as recited in claim 8, wherein selecting, based on one or more criteria, at least one table operation from among the table operations to be performed comprises determining one or more of: a size of a table to be scanned by the at least one table operation is above a threshold size, or a number of requests to be made by the stateless data processing service to perform the at least one table operation is above a threshold number.

11. The method as recited in claim 8, wherein selecting, based on one or more criteria, at least one table operation from among the table operations to be performed comprises determining that: no data of a table to be scanned by the at least one table operation is stored by the stateful data processing service, or an amount of data of the table to be scanned by the at least one table operation that is stored by the stateful data processing service is less than a threshold amount.

12. The method as recited in claim 8, wherein selecting, based on one or more criteria, at least one table operation from among the table operations to be performed comprises determining that: an amount of data returned by the at least one table operation is less than a threshold amount.

13. The method as recited in claim 8, wherein selecting, based on one or more criteria, at least one table operation from among the table operations to be performed comprises: determining that compute resource usage to perform the at least one table operation is above a threshold amount.

14. The method as recited in claim 8, wherein at least a portion of the tables is stored by a data cache of the stateful data processing service, and further co
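The selection criteria in dependent claims 2 through 6 (mirrored in claims 9 through 13) can be illustrated as a set of independent checks. The sketch below is only an illustration under stated assumptions: the claims give no threshold values and do not say how criteria are combined, so `ScanEstimate`, `Thresholds`, `offload_reasons`, the reason labels, and every default number here are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ScanEstimate:
    remote_seconds: float   # estimated time on the stateless service
    local_seconds: float    # estimated time on the stateful service
    table_bytes: int        # size of the table to be scanned
    remote_requests: int    # requests the stateless service would make
    cached_bytes: int       # table data already held by the stateful service
    result_bytes: int       # data the operation is expected to return
    compute_units: float    # estimated compute resource usage


@dataclass
class Thresholds:
    # Placeholder values; the patent text does not specify any.
    table_bytes: int = 1_000_000_000
    remote_requests: int = 100
    cached_bytes: int = 1_000_000
    result_bytes: int = 10_000_000
    compute_units: float = 8.0


def offload_reasons(e: ScanEstimate, t: Thresholds) -> List[str]:
    """Return a label for each claimed criterion that the estimate satisfies."""
    reasons = []
    if e.remote_seconds < e.local_seconds:                       # claims 2, 9
        reasons.append("faster_remote")
    if e.table_bytes > t.table_bytes or e.remote_requests > t.remote_requests:
        reasons.append("large_scan")                             # claims 3, 10
    if e.cached_bytes == 0 or e.cached_bytes < t.cached_bytes:   # claims 4, 11
        reasons.append("cold_cache")
    if e.result_bytes < t.result_bytes:                          # claims 5, 12
        reasons.append("small_result")
    if e.compute_units > t.compute_units:                        # claims 6, 13
        reasons.append("compute_heavy")
    return reasons


big_cold = ScanEstimate(4.0, 30.0, 5_000_000_000, 40, 0, 2_000_000, 2.0)
hot_small = ScanEstimate(3.0, 0.5, 50_000_000, 5, 50_000_000, 40_000_000, 1.0)
big_cold_reasons = offload_reasons(big_cold, Thresholds())
hot_small_reasons = offload_reasons(hot_small, Thresholds())
```

A caller could treat any non-empty list as a reason to offload, or require several criteria at once; that policy choice is an assumption of the sketch and is not taken from the claims.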

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12197437B2 cover?
When a query is received by a stateful data processing service, the service determines, for each table scan (and associated operations) of a query, whether to select the table scan for execution by a stateless data processing service. The selected table scans are sent to the stateless data processing service for execution, and results are received by the stateful data processing service. The st…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24537. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 14, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).