Storage level parallel query processing

US11615083B1 · US · B1

Patent metadata
Field: Value
Publication number: US-11615083-B1
Application number: US-201815918965-A
Country: US
Kind code: B1
Filing date: Mar 12, 2018
Priority date: Nov 22, 2017
Publication date: Mar 28, 2023
Grant date: Mar 28, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Storage level query processing may be implemented for processing database queries. Nodes that can access a database may perform parallel processing for at least a portion of a database query. An indication may be received that specifies parallel processing for the database query. The nodes can then be caused to perform the portion of the query as part of providing a result in response to the database query instead of a node, such as a query engine node, that received the database query.
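The idea in the abstract can be illustrated with a minimal sketch, which is not the patented implementation. It assumes three in-memory lists standing in for storage nodes, a hypothetical filter-and-sum query, and a thread pool standing in for network calls; the names STORAGE_NODES, storage_node_scan, and query_engine_node are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for storage nodes: each holds a different slice of
# the table's rows. The abstract does not specify a data layout.
STORAGE_NODES = [
    [{"id": 1, "amount": 10}, {"id": 2, "amount": 250}],
    [{"id": 3, "amount": 400}, {"id": 4, "amount": 5}],
    [{"id": 5, "amount": 120}],
]


def storage_node_scan(rows, min_amount):
    """Parallel portion: filter and partially aggregate at one storage node."""
    matching = [r["amount"] for r in rows if r["amount"] >= min_amount]
    return {"count": len(matching), "total": sum(matching)}


def query_engine_node(min_amount):
    """Fan the parallel portion out to every node, then merge the partials."""
    with ThreadPoolExecutor(max_workers=len(STORAGE_NODES)) as pool:
        partials = list(
            pool.map(lambda rows: storage_node_scan(rows, min_amount), STORAGE_NODES)
        )
    # Only small partial results reach the query engine node, not raw rows.
    return {
        "count": sum(p["count"] for p in partials),
        "total": sum(p["total"] for p in partials),
    }


result = query_engine_node(min_amount=100)
print(result)
```

In this sketch the query engine node never scans rows itself; it only combines the per-node partial results, which is the division of work the abstract describes.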

First claim

Opening claim text (preview).

What is claimed is:

1. A system, comprising: at least one processor; and a memory to store program instructions that when executed by the at least one processor, cause the at least one processor to implement a query engine node, configured to: receive a database query to a database; generate a plurality of query plans to perform the database query, wherein the plurality of query plans includes a particular query plan to perform the database query, wherein to generate the particular query plan, the query engine node is configured to determine how to divide processing of the database query into a parallel portion of the database query and a query engine node portion of the database query; wherein the particular query plan comprises operations to: assign the parallel portion of the database query to a plurality of storage nodes that are configured to perform the parallel portion of the database query, wherein performance of the parallel portion comprises processing, at respective ones of the storage nodes, different sub-portions of the parallel portion of the database query using different respective data of the database stored at respective ones of the plurality of storage nodes; receive, from the plurality of storage nodes, an operation result for the parallel portion of the database query, the operation result comprising output from the storage nodes resulting from processing the parallel portion of the database query; and access, by the query engine node, further data pages from one or more of the plurality of storage nodes to perform the query engine node portion of the database query to obtain an operation result for the query engine node portion of the database query; select, from among the plurality of query plans, the particular query plan to perform the database query, the selection based, at least in part, on a comparison of respective cost estimates of the particular query plan with respective cost estimates of one or more other query plans of the plurality of query plans generated to perform the database query; and cause the storage nodes to perform the parallel portion of the database query and to return the operation result for the parallel portion of the database query to the query engine node according to the particular query plan as part of performing the database query.

2. The system of claim 1, wherein to cause the storage nodes to perform the parallel portion of the database query, the program instructions cause the at least one processor to perform a method to send, by the query engine node, respective instructions to the storage nodes to perform at least a portion of the first query plan generated for the database query that includes the parallel portion of the database query; wherein the program instructions further cause the at least one processor to perform the method to: receive, at the query engine node, one or more dirty tuples of the database from one or more of the storage nodes; apply, by the query engine node, one or more undo log records to at least one of the one or more dirty tuples to generate a different version of the at least one dirty tuple; perform, by the query engine node, one or more operations included in the first query plan to process the different version of the at least one dirty tuple; and combine, by the query engine node, the processed at least one dirty tuple with one or more clean tuples received from one or more of the storage nodes as part of generating a result to the database query.

3. The system of claim 1, wherein the program instructions further cause the at least one processor to perform a method to: substantially concurrent with performance of the database query: receive, by the query engine node, a request to perform another database query; obtain, by the query engine node from at least one of the storage nodes, one or more data pages of the database to perform the other database query; perform, by the query engine node, the other database query with respect to the one or more data pages at the query engine node; and return, by the query engine node, a result to the other database query.

4. The system of claim 1, wherein the at least one processor is implemented as part of a network-based database service that hosts the database, wherein the storage nodes are implemented as part of a separate storage service, and wherein an indication that specifies performance of the parallel portion of the database query is received via an interface for the database service.

5. A method, comprising: receiving, at a query engine node, a database query; generating, by the query engine node, a plurality of query plans to perform the database query, wherein the plurality of query plans includes a particular query plan to perform the database query, wherein to generate the particular query plan, the query engine node is configured to determine how to divide processing of the database query into a parallel portion of the database query and a query engine node portion of the database query; wherein the particular query plan comprises operations to: assign the parallel portion of the database query to a plurality of storage nodes that are configured to perform the parallel portion of the database query, wherein performance of the parallel portion comprises processing, at respective ones of the storage nodes, different sub-portions of the parallel portion of the database query using different respective data of a database stored at respective ones of the plurality of storage nodes; receive, from the plurality of storage nodes, an operation result for the parallel portion of the database query, the operation result comprising output from a plurality of nodes resulting from processing the parallel portion of the database query; and access, by the query engine node, further data pages from one or more of the plurality of storage nodes to perform the query engine node portion of the database query to obtain an operation result for the query engine node portion of the database query; selecting, from among the plurality of query plans, the particular query plan to perform the database query, the selection based, at least in part, on a comparison of respective cost estimates of the particular query plan with respective cost estimates of one or more other query plans generated to perform the database query; and causing the plurality of nodes to perform the parallel portion of the database query and to return the operation result for the parallel portion of the database query to the query engine node according to the particular query plan as part of performing the database query.

6. The method of claim 5, wherein causing the plurality of nodes to perform the parallel portion of the database query to return the operation result for the parallel portion of the database query according to the first query plan of performing the database query comprises sending instructions to perform the parallel portion of the database query included in the first query plan.

7. The method of claim 5, wherein the method further comprises: receiving, at the query engine node, one or more clean tuples from at least one of the plurality of nodes; and including, by the query engine node, the one or more clean tuples in a result provided in response to the database query.

8. The method of claim 7, further comprising: receiving, at the query engine node, one or more dirty tuples from at least one of the plurality of nodes; applying, by the query engine node, one or more undo log records to the one or more dirty tuples to generate different respective versions of the one or more dirty tuples; performing, by the query engine node, one or more operations to process the different respective versions of the on
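The cost-based plan selection recited in claims 1 and 5 can be sketched as follows. This is an illustration under stated assumptions, not the claimed method: the QueryPlan fields, the cost weights, and the 10% engine-side share in candidate_plans are all hypothetical, since the claims only require that candidate plans be compared by their cost estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    name: str
    engine_pages: int    # pages the query engine node fetches and scans itself
    storage_pages: int   # pages scanned in place by the storage nodes
    storage_nodes: int   # nodes sharing the parallel portion (0 = none)


# Hypothetical cost weights; a real optimizer would derive these from statistics.
ENGINE_PAGE_COST = 1.0
STORAGE_PAGE_COST = 0.2
DISPATCH_COST_PER_NODE = 50.0


def estimate_cost(plan):
    """Engine-side scan cost plus the parallel portion's scan and dispatch cost."""
    parallel = 0.0
    if plan.storage_nodes > 0:
        parallel = (
            plan.storage_pages * STORAGE_PAGE_COST / plan.storage_nodes
            + DISPATCH_COST_PER_NODE * plan.storage_nodes
        )
    return plan.engine_pages * ENGINE_PAGE_COST + parallel


def candidate_plans(total_pages, storage_nodes):
    """Generate an engine-only plan and one split plan (assumed 10% engine share)."""
    engine_share = total_pages // 10
    return [
        QueryPlan("engine_only", total_pages, 0, 0),
        QueryPlan("split", engine_share, total_pages - engine_share, storage_nodes),
    ]


def select_plan(plans):
    """Pick the plan with the lowest estimated cost."""
    return min(plans, key=estimate_cost)


large = select_plan(candidate_plans(100_000, 4))
small = select_plan(candidate_plans(100, 4))
print(large.name, small.name)
```

With these assumed weights, the split plan wins for the large scan because the storage nodes share the work, while the small scan stays on the query engine node because the fixed dispatch cost outweighs the savings.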

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11615083B1 cover?
Storage level query processing may be implemented for processing database queries. Nodes that can access a database may perform parallel processing for at least a portion of a database query. An indication may be received that specifies parallel processing for the database query. The nodes can then be caused to perform the portion of the query as part of providing a result in response to the da…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24532. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 28, 2023 (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).