Database query processing for data in a remote data store

US11640399B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11640399-B2
Application number: US-202017115909-A
Country: US
Kind code: B2
Filing date: Dec 9, 2020
Priority date: Dec 9, 2020
Publication date: May 2, 2023
Grant date: May 2, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

In some examples, a database system identifies a plurality of query portions in a database query that contain references to a first external table, the first external table being based on data from a remote data store coupled to the database system over a network. The database system creates a common spool portion that includes projections and selections of the plurality of query portions, and rewrites the plurality of query portions into rewritten query portions that refer to a spool containing an output of the common spool portion. For execution of the database query, the database system determines, as part of optimizer planning, whether to use the plurality of query portions or the common spool portion and the rewritten query portions.
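
The abstract's central step can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `QueryPortion` model, the helper names, and the `orders` sample data are all assumptions made for the example. It shows a spool built from the union of referenced columns and the OR of the portions' predicates, followed by the same portions evaluated against that spool instead of the external table.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Row = Dict[str, object]


@dataclass(frozen=True)
class QueryPortion:
    """Hypothetical model of one query portion over the external table."""
    output: Tuple[str, ...]             # projected columns
    predicate_columns: Tuple[str, ...]  # columns the selection reads
    predicate: Callable[[Row], bool]    # the selection itself


def spool_columns(portions: List[QueryPortion]) -> List[str]:
    """Union of every column that appears in any portion, in first-seen order."""
    seen = (c for p in portions for c in (*p.output, *p.predicate_columns))
    return list(dict.fromkeys(seen))


def build_spool(portions: List[QueryPortion], external_rows: List[Row]) -> List[Row]:
    """Scan the external table once, keeping rows that satisfy ANY portion's
    predicate (the disjunction) and only the union of referenced columns."""
    columns = spool_columns(portions)
    return [
        {c: row[c] for c in columns}
        for row in external_rows
        if any(p.predicate(row) for p in portions)
    ]


def run_portion(portion: QueryPortion, rows: List[Row]) -> List[Row]:
    """Evaluate a portion against a row source: the external table for the
    original portion, or the spool for its rewritten form."""
    return [{c: r[c] for c in portion.output} for r in rows if portion.predicate(r)]


# Assumed sample data standing in for the remote data store.
orders: List[Row] = [
    {"id": 1, "region": "EU", "amount": 50, "status": "open"},
    {"id": 2, "region": "US", "amount": 150, "status": "closed"},
    {"id": 3, "region": "US", "amount": 20, "status": "open"},
    {"id": 4, "region": "EU", "amount": 300, "status": "open"},
]

eu_orders = QueryPortion(("id", "amount"), ("region",), lambda r: r["region"] == "EU")
large_orders = QueryPortion(("id", "status"), ("amount",), lambda r: r["amount"] > 100)
portions = [eu_orders, large_orders]

spool = build_spool(portions, orders)  # one pass over the external data
```

In this sketch each rewritten portion reapplies its own predicate to the spool, so it returns the same rows as the original portion while the external data is read only once. Whether that is cheaper than separate reads is the question the optimizer answers at planning time.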

First claim

Opening claim text (preview).

What is claimed is:
1. A non-transitory machine-readable storage medium comprising instructions that upon execution cause a database system to: identify a plurality of query portions in a database query that contain references to a first external table, the first external table being based on data from a remote data store coupled to the database system over a network; create a common spool portion containing query logic that comprises projections and selections of the plurality of query portions, the query logic of the common spool portion comprising: a conjunction of the projections in corresponding query portions of the plurality of query portions, the conjunction of the projections in the corresponding query portions comprising a union of columns of the first external table that appear in the corresponding query portions, and a disjunction of the selections in the corresponding query portions of the plurality of query portions; rewrite the plurality of query portions into rewritten query portions that refer to a spool containing an output of the common spool portion; and for execution of the database query, determine, as part of optimizer planning, whether to use the plurality of query portions or the common spool portion and the rewritten query portions.
2. The non-transitory machine-readable storage medium of claim 1, wherein the determining of whether to use the plurality of query portions or the common spool portion and the rewritten query portions is based on a comparison of a first cost associated with executing the plurality of query portions and a second cost associated with executing the common spool portion and the rewritten query portions.
3. The non-transitory machine-readable storage medium of claim 2, wherein the first cost associated with executing the plurality of query portions is based on a sum of costs of executing respective query portions of the plurality of query portions, and the second cost associated with executing the common spool portion and the rewritten query portions is based on a sum of a cost of executing the common spool portion and costs of executing the rewritten query portions.
4. The non-transitory machine-readable storage medium of claim 2, wherein the instructions upon execution cause the database system to: select use of the plurality of query portions in response the first cost not exceeding the second cost, and select use of the common spool portion and the rewritten query portions in response to the first cost exceeding the second cost.
5. The non-transitory machine-readable storage medium of claim 1, wherein the selections in the corresponding query portions are based on respective predicates in the corresponding query portions.
6. The non-transitory machine-readable storage medium of claim 5, wherein a predicate in the query logic of the common spool portion is a disjunction of the respective predicates in the corresponding query portions.
7. The non-transitory machine-readable storage medium of claim 5, wherein a predicate in the query logic of the common spool portion is a logical OR of the respective predicates in the corresponding query portions.
8. The non-transitory machine-readable storage medium of claim 1, wherein the spool referred to by the rewritten query portions comprises a temporary storage to store an output of the query logic in the common spool portion.
9. The non-transitory machine-readable storage medium of claim 1, wherein the database system comprises a plurality of processing engines to process the database query on respective portions of data of the spool stored in memories of respective processing engines of the plurality of processing engines.
10. The non-transitory machine-readable storage medium of claim 1, wherein the determining of whether to use the plurality of query portions or the common spool portion and the rewritten query portions is performed on an individual query basis for the database query.
11. The non-transitory machine-readable storage medium of claim 1, wherein the determining of whether to use the plurality of query portions or the common spool portion and the rewritten query portions is performed during processing of the database query.
12. The non-transitory machine-readable storage medium of claim 1, wherein use of the common spool portion and the rewritten query portions avoids plural reads of data from the first external table in response to the plurality of query portions in the database query that contain references to the first external table.
13. The non-transitory machine-readable storage medium of claim 1, wherein use of the common spool portion and the rewritten query portions avoids plural reprocessing of data from the first external table in response to the plurality of query portions in the database query that contain references to the first external table.
14. A database system comprising: at least one processor; and a non-transitory storage medium storing instructions executable on the at least one processor to: identify a plurality of query portions in a database query that contain query logic referring to a common external table, the common external table being based on data from a remote data store coupled to the database system over a network; create a common query logic including projections and selections of the plurality of query portions, the common query logic comprising: a conjunction of the projections in corresponding query portions of the plurality of query portions, the conjunction of the projections in the corresponding query portions comprising a union of columns of the common external table that are referred to by the corresponding query portions, and a logical OR of predicates in the corresponding query portions of the plurality of query portions; rewrite the plurality of query portions into rewritten query portions that contain query logic referring to a spool containing an output of the common query logic; and for execution of the database query, determine, as part of optimizer planning, whether to use the plurality of query portions or the common query logic and the rewritten query portions.
15. The database system of claim 14, wherein the data from the remote data store comprises one or more objects in non-relational format.
16. The database system of claim 14, wherein the determining of whether to use the plurality of query portions or the common query logic and the rewritten query portions is based on a comparison of a first cost associated with executing the plurality of query portions and a second cost associated with executing the common query logic and the rewritten query portions.
17. A method of a database system comprising a hardware processor, comprising: parsing a database query that identifies a plurality of query blocks in the database query that contain references to a common external table, the common external table being based on data from a remote data store coupled to the database system over a network; creating a common query logic that comprises projections and selections of the plurality of query blocks, the common query logic comprising: a conjunction of projections in corresponding query blocks of the plurality of query blocks, the conjunction of the projections in the corresponding query blocks comprising a union of columns of the common external table that are referred to by the corresponding query blocks, and a logical OR of predicates in the corresponding query blocks of the plurality of query blocks; rewriting the plurality of query blocks into rewritten query blocks that refer to a spool
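
The cost comparison recited in claims 2 through 4 reduces to a small decision rule, sketched here in Python. The function name `choose_plan`, the plan labels, and the numeric costs are hypothetical; the claims do not specify how an optimizer derives the individual cost estimates, so they are taken as inputs.

```python
from typing import Sequence


def choose_plan(
    portion_costs: Sequence[float],
    spool_cost: float,
    rewritten_costs: Sequence[float],
) -> str:
    """Pick between the original query portions and the common-spool plan.

    first_cost:  sum of the costs of the original query portions.
    second_cost: cost of the common spool portion plus the costs of the
                 rewritten query portions that read from the spool.
    """
    first_cost = sum(portion_costs)
    second_cost = spool_cost + sum(rewritten_costs)
    # The original portions are kept unless they are strictly more expensive.
    return "original_portions" if first_cost <= second_cost else "common_spool"


# Assumed estimates: two remote scans versus one remote scan plus two
# inexpensive local reads of the spool.
plan = choose_plan(portion_costs=[100.0, 100.0], spool_cost=110.0, rewritten_costs=[5.0, 5.0])
```

With these assumed numbers the first cost is 200 and the second is 120, so the sketch selects the common-spool plan. A tie goes to the original portions, matching the "not exceeding" wording of claim 4.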

Assignees

Inventors

Classifications

  • Selectivity estimation or determination · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11640399B2 cover?
In some examples, a database system identifies a plurality of query portions in a database query that contain references to a first external table, the first external table being based on data from a remote data store coupled to the database system over a network. The database system creates a common spool portion that includes projections and selections of the plurality of query portions, and …
Who is the assignee on this patent?
Teradata Us Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24545. Mapped technology areas include Physics.
When was this patent published?
Publication date May 2, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).