Hybrid comparison for unicode text strings consisting primarily of ASCII characters
US-10540425-B2 · Jan 21, 2020 · US
US11068520B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-11068520-B1 |
| Application number | US-201815890277-A |
| Country | US |
| Kind code | B1 |
| Filing date | Feb 6, 2018 |
| Priority date | Nov 6, 2016 |
| Publication date | Jul 20, 2021 |
| Grant date | Jul 20, 2021 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method is executed at a computer system to retrieve data from a database. Upon receiving a database query, a database engine of the computer system parses the query to form an operator tree including a plurality of join operators. For each of the plurality of clauses, the database engine adds to the operator tree a respective node that specifies a mark join operator, a single join operator, an inner join operator, or an outer join operator. Specifically, the database engine adds the mark join operator when the respective clause includes one of a predetermined set of predicate subqueries, and adds the single join operator when the respective clause includes a scalar subquery. The database engine performs one or more optimization passes on the operator tree to form an optimized execution plan, and executes the optimized execution plan to retrieve a result set from the database.
Opening claim text (preview).
What is claimed is: 1. A method for retrieving data from a database, comprising: at a computer system having one or more processors and memory storing one or more programs configured for execution by the one or more processors: receiving a database query from a client, the database query including a plurality of clauses; parsing the database query to form an operator tree including a plurality of join operators, including, for each of the plurality of clauses: in accordance with a determination that the respective clause includes one of a predetermined set of predicate subqueries, adding to the operator tree a respective node that specifies a mark join operator between a respective left data set and a respective right data set, wherein the mark join operator is configured to create a mark column in a respective intermediate result set for the respective node, the mark column specifying, for each tuple of the respective intermediate result set, whether or not the respective left data set has a join partner from the right data set, and wherein the predetermined set of predicate subqueries comprises an EXISTS subquery, a NOT EXISTS subquery, a UNIQUE subquery, and a quantified comparison predicate; in accordance with a determination that the respective clause includes a scalar subquery, adding to the operator tree a respective node that specifies a single join operator between a respective left data set and a respective right data set, wherein the single join operator is configured to raise an error when there is a row in the respective left data set having two or more join partners in the respective right data set, and the single join operator is configured to operate as a left outer join otherwise; and in accordance with a determination that the respective clause does not include a scalar subquery and does not include any of the predetermined set of predicate subqueries, adding a respective inner join operator or a respective outer join operator to the operator tree for each join condition in the respective clause; performing one or more optimization passes on the operator tree to form an optimized execution plan, including unnesting one or more mark joins or single joins from the operator tree; and executing the optimized execution plan to retrieve a result set from the database. 2. The method of claim 1 , wherein performing the one or more optimization passes further comprises: identifying a first single join operator in the operator tree, the first single join operator joining a first column of a first left data set to a second column of a first right data set; and in accordance with a determination that the second column is a primary key for the first right data set, replacing the first single join operator with a corresponding left outer join operator between the first left data set and the second left data set. 3. The method of claim 1 , wherein the mark column created for a mark join operator has a Boolean data type. 4. The method of claim 1 , wherein each of the plurality of clauses has a clause type selected from the group consisting of: from clause, where clause, group by clause, having clause, select clause, and order by clause. 5. The method of claim 4 , wherein the database query has at most one clause of each clause type. 6. The method of claim 4 , wherein forming the operator tree comprises translating the plurality of clauses in an order according to clause type, in the order: 1) from clause, 2) where clause, 3) group by clause, 4) having clause, 5) select clause, and 6) order by clause. 7. The method of claim 6 , wherein forming the operator tree comprises incrementally adding operators at a top node of an interim operator tree as the clauses are processed in order. 8. The method of claim 1 , wherein performing one or more optimization passes on the operator tree further includes changing an order of the plurality of join operators in the operator tree. 9. The method of claim 8 , wherein the plurality of join operators includes a first mark join operator and a first inner join operator, and changing the order of the plurality of join operators in the operator tree includes performing the first mark join operator prior to the first inner join operator. 10. The method of claim 8 , wherein the order of the plurality of join operators is determined according to a cost-based join enumeration method. 11. The method of claim 1 , wherein performing the one or more optimization passes on the operator tree further includes, for each of the plurality of join operators, selecting a left variant or a right variant to implement the respective join operator according to respective sizes of left and right data sets for a respective join operation. 12. The method of claim 1 , wherein performing the one or more optimization passes on the operator tree includes translating one or more mark join operators to one or more semi join operators. 13. The method of claim 1 , wherein performing the one or more optimization passes on the operator tree includes translating one or more outer join operators to one or more inner join operators. 14. A computer system having one or more computing devices, each computing device having one or more processors and memory, wherein the memory stores one or more programs configured for execution by the one or more processors, the one or more programs comprising instructions for: receiving a database query from a client, the database query including a plurality of clauses; parsing the database query to form an operator tree including a plurality of join operators, including, for each of the plurality of clauses: in accordance with a determination that the respective clause includes one of a predetermined set of predicate subqueries, adding to the operator tree a respective node that specifies a mark join operator between a respective left data set and a respective right data set, wherein the mark join operator is configured to create a mark column in a respective intermediate result set for the respective node, the mark column specifying, for each tuple of the respective intermediate result set, whether or not the respective left data set has a join partner from the right data set, and wherein the predetermined set of predicate subqueries comprises an EXISTS subquery, a NOT EXISTS subquery, a UNIQUE subquery, and a quantified comparison predicate; in accordance with a determination that the respective clause includes a scalar subquery, adding to the operator tree a respective node that specifies a single join operator between a respective left data set and a respective right data set, wherein the single join operator is configured to raise an error when there is a row in the respective left data set having two or more join partners in the respective right data set, and the single join operator is configured to operate as a left outer join otherwise; and in accordance with a determination that the respective clause does not include a scalar subquery and does not include any of the predetermined set of predicate subqueries, adding a respective inner join operator or a respective outer join operator to the operator tree for each join condition in the respective clause; performing one or more optimization passes on the operator tree to form an optimized execution plan, including unnesting one or more mark joins or single joins from the operator tree; and executing the optimized execution plan to retrieve a result set from the database. 15. The computer system of claim 14 , wherein each of the plurality of clauses has a clause type selected from the group consisting of: from clause, where
Query rewriting; Transformation · CPC title
Updating · CPC title
Query translation · CPC title
Query execution (filtering based on additional data G06F16/335) · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.