Elimination of query fragment duplication in complex database queries

US11475005B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-11475005-B2
Application numberUS-202017064490-A
CountryUS
Kind codeB2
Filing dateOct 6, 2020
Priority dateDec 21, 2018
Publication dateOct 18, 2022
Grant dateOct 18, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A database engine includes one or more computing devices, each having one or more processors and memory. The memory stores programs configured for execution by the processors. The database engine receives a database query from a client, and parses the database query to build a query operator tree. The query operator tree includes a plurality of query operators. The database engine performs one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan. The deduplication optimization pass includes determining that a first query operator is equivalent to a second query operator during a traversal of the query operator tree, and replacing the second query operator with a link to reuse results from the first query operator. The database engine executes the optimized execution plan to retrieve a result set from the database and returns the result set to the client.

First claim

Opening claim text (preview).

What is claimed is: 1. A database engine, comprising: one or more computing devices, each having one or more processors and memory, wherein the memory stores one or more programs configured for execution by the one or more processors and the one or more programs comprise instructions for: receiving a database query from a client; parsing the database query to build a query operator tree, the query operator tree including a plurality of query operators; performing one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan, the deduplication optimization pass including: determining that a first query operator is equivalent to a second query operator during a traversal of the query operator tree; and replacing the second query operator with a link to reuse results from the first query operator; executing the optimized execution plan to retrieve a result set from the database; and returning the result set to the client. 2. The database engine of claim 1 , wherein the one or more programs further comprise instructions for computing a list of dependencies among the query operators in the tree, wherein the traversal comprises a breadth-first post-order traversal of the query operator tree based on the list of dependencies so that query operators that do not have dependencies are visited before query operators with dependencies. 3. The database engine of claim 2 , wherein the one or more programs further comprise instructions for: in accordance with a determination that a third query operator has no equivalent query operators in the query operator tree, updating the list of dependencies to specify that the third query operator has dependencies so that a parent of the third query operator is not selected during the breadth-first post-order traversal. 4. The database engine of claim 1 , wherein the deduplication optimization pass further includes determining that the first query operator can be materialized. 5. The database engine of claim 4 , wherein the first query operator is one of: a GROUPBy operator, a GROUPJOIN operator, a SORT operator, a WINDOW operator, or a TEMP operator. 6. The database engine of claim 1 , wherein the one or more programs further comprise instructions for performing a tree refactoring optimization pass before the deduplication optimization pass to refactor the query operator tree so as to increase duplicate query operators in the query operator tree. 7. The database engine of claim 1 , wherein determining if the first query operator is equivalent to the second query operator comprises: determining if input operators of the first query operator and the second query operator are equivalent; and determining if the first query operator and the second query operator have equivalent properties. 8. The database engine of claim 7 , wherein determining if the first query operator is equivalent to the second query operator uses commutativity and associativity rules. 9. A non-transitory computer-readable storage medium storing one or more programs configured for execution by a computer system having one or more processors and memory, the one or more programs comprising instructions for: receiving a database query from a client; parsing the database query to build a query operator tree, the query operator tree including a plurality of query operators; performing one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan, the deduplication optimization pass including: determining that a first query operator is equivalent to a second query operator during a traversal of the query operator tree; and replacing the second query operator with a link to reuse results from the first query operator; executing the optimized execution plan to retrieve a result set from the database; and returning the result set to the client. 10. The computer-readable storage medium of claim 9 , wherein the deduplication optimization pass further includes determining that the first query operator can be materialized. 11. The computer-readable storage medium of claim 9 , wherein the one or more programs further comprise instructions for performing a tree refactoring optimization pass before the deduplication optimization pass to refactor the query operator tree so as to increase duplicate query operators in the query operator tree. 12. The computer-readable storage medium of claim 9 , wherein determining if the first query operator is equivalent to the second query operator comprises: determining if input operators of the first query operator and the second query operator are equivalent; and determining if the first query operator and the second query operator have equivalent properties. 13. A method of retrieving data from a database, comprising: at a computer system having one or more computing devices, each computing device having one or more processors and memory storing one or more programs configured for execution by the one or more processors: receiving a database query from a client; parsing the database query to build a query operator tree, the query operator tree including a plurality of query operators; performing one or more optimization passes on the query operator tree, including a deduplication optimization pass, to form an optimized execution plan, the deduplication optimization pass including: determining that a first query operator is equivalent to a second query operator during a traversal of the query operator tree; and replacing the second query operator with a link to reuse results from the first query operator; executing the optimized execution plan to retrieve a result set from the database; and returning the result set to the client. 14. The method of claim 13 , further comprising computing a list of dependencies among the query operators in the tree, wherein the traversal comprises a breadth-first post-order traversal of the query operator tree based on the list of dependencies so that query operators that do not have dependencies are visited before query operators with dependencies. 15. The method of claim 14 , further comprising: in accordance with a determination that a third query operator has no equivalent query operators in the query operator tree, updating the list of dependencies to specify that the third query operator has dependencies so that a parent of the third query operator is not selected during the breadth-first post-order traversal. 16. The method of claim 13 , wherein the deduplication optimization pass further includes determining that the first query operator can be materialized. 17. The method of claim 16 , wherein the first query operator is one of: a GROUPBy operator, a GROUPJOIN operator, a SORT operator, a WINDOW operator, or a TEMP operator. 18. The method of claim 13 , further comprising performing a tree refactoring optimization pass before the deduplication optimization pass to refactor the query operator tree so as to increase duplicate query operators in the query operator tree. 19. The method of claim 13 , wherein determining if the first query operator is equivalent to the second query operator comprises: determining if input operators of the first query operator and the second query operator are equivalent; and determining if the first query operator and the second query operator have equivalent properties. 20. The method of claim 19 , wherein determining if the first query

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11475005B2 cover?
A database engine includes one or more computing devices, each having one or more processors and memory. The memory stores programs configured for execution by the processors. The database engine receives a database query from a client, and parses the database query to build a query operator tree. The query operator tree includes a plurality of query operators. The database engine performs one …
Who is the assignee on this patent?
Tableau Software Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24539. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Oct 18 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 7 related publications on this page (citations in our corpus or others sharing the same primary CPC).