Database group-by query cardinality estimation
US-2024143586-A1 · May 2, 2024 · US
US12450233B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-12450233-B1 |
| Application number | US-202418747670-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 19, 2024 |
| Priority date | Jun 19, 2024 |
| Publication date | Oct 21, 2025 |
| Grant date | Oct 21, 2025 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method of incorporating a bloom filter (BF) into cost-based bottom-up query optimization includes: receiving a query including a request to join at least two data tables that include a first data table and a second data table; generating, in response to the request, at least two query plans that includes a first query plan including subplans to: create the BF based on the second data table, apply the BF to the first data table for filtering data of the first data table before the first data table is joined with another data table, and join the first data table with the another data table; submitting the at least two query plans to a cost-based bottom-up optimizer to obtain a target query plan; and providing, as a response to the received query, a set of data that is retrieved from a data retrieval system by executing the target query plan.
Opening claim text (preview).
What is claimed is: 1. A method of incorporating a bloom filter (BF) into cost-based bottom-up query optimization, performed by a database management system (DBMS), the method comprising: receiving a query that includes a request to join at least two data tables, the at least two data tables including a first data table and a second data table; determining whether to create one or more additional subplans for scan of the first data table that include application of the BF based on the second data table, based on a target information, wherein the target information includes semantic information of a join predicate between the at least two data tables, and wherein the target information further includes at least one of a filtering capacity represented by the join predicate corresponding to the join or a data amount of a data table associated with the join; after the determining, generating one or more additional BF subplans for the scan of the first data table, wherein each BF subplan includes directives to: create the BF based on the second data table; and apply the BF to the first data table for filtering data of the first data table before the first data table is joined with another data table, wherein each BF subplan includes a revised estimate of a number of rows and a cost for applying the BF such that the scan of the first data table includes at least two subplans, the at least two subplans include a BF subplan that includes the application of the BF, and a non-BF subplan that includes no BF; submitting the at least two subplans to a cost-based bottom-up optimizer to obtain a target query plan, wherein the cost-based bottom-up optimizer considers the revised estimate of the number of rows and cost of the one or more additional BF subplans in comparison to non-BF subplans and arrive at the target query plan with a different join order, join method, or other operations had the BF subplan not been included; and providing, as a response to the query, a set of data that is retrieved from a data retrieval system by executing the target query plan. 2. The method of claim 1 , wherein a first subplan of creating the BF includes operations to: for a join of the at least two data tables, scan the second data table; and create the BF based on the second data table; a second subplan of applying the BF to the first data table includes operations to: scan the first data table; and apply the BF to a scan plan to filter data of the first data table, so as to obtain a plan where the BF is applied; and a third subplan of joining the first data table with the another data table includes operations to: join the filtered first data table and the another data table in the at least two data tables. 3. The method of claim 1 , wherein the submitting the at least two subplans to the cost-based bottom-up optimizer to obtain the target query plan, includes: submitting the at least two subplans to the cost-based bottom-up optimizer to obtain costs of at least two query plans, wherein a cost of a first query plan in the at least two query plans includes a cost of building the BF and a cost of applying the BF; and determining the target query plan based on the costs of the at least two query plans. 4. The method of claim 1 , wherein the join is used to join a data table on a creating side of the BF with a data table on an applying side of the BF out of the at least two data tables, and wherein the determining whether to create and apply the BF before the join based on the target information includes: determining not to create the BF based on the data table on the creating side of the BF and not to apply the BF to the data table on the applying side of the BF before the join, in a case where the semantic information indicates that a primary key of the data table on the creating side of the BF is joined with a foreign key of the data table on the applying side of the BF and no filter predicate exists in the data table on the creating side of the BF. 5. The method of claim 1 , wherein the filtering capacity represented by the join predicate corresponding to the join, and the join is used to join a data table on a creating side of the BF with a data table on an applying side of the BF out of the at least two data tables, and wherein the determining whether to create and apply the BF before the join based on the target information includes: determining to create the BF based on the data table on the creating side of the BF and apply the BF to the data table on the applying side of the BF before the join if the filtering capacity represented by the join predicate corresponding to the join is greater than a capacity threshold in at least one case where: the semantic information indicates that a primary key of the data table on the creating side of the BF is joined with a field other than a foreign key of the data table on the applying side of the BF, the semantic information indicates that a field other than the primary key of the data table on the creating side of the BF is joined with the foreign key of the data table on the applying side of the BF, the semantic information indicates that the field other than the primary key of the data table on the creating side of the BF is joined with the field other than the foreign key of the data table on the applying side of the BF, the semantic information indicates that at least one of the primary key of the data table on the creating side of the BF or the foreign key of the data table on the applying side of the BF does not exists, or a filter predicate exists in the data table on the creating side of the BF. 6. The method of claim 1 , wherein the data amount of the data table associated with the join, and the join is used to join two data tables out of the at least two data tables, wherein the two data tables include the data table on a creating side of the BF and the data table on an applying side of the BF, and wherein the determining whether to create and apply the BF before the join based on the target information includes: determining to apply the BF to a data table having a larger data amount in the two data tables before the join in at least one case where: the semantic information indicates that a primary key of the data table on the creating side of the BF is joined with a field other than a foreign key of the data table on the applying side of the BF, the semantic information indicates that a field other than the primary key of the data table on the creating side of the BF is joined with the foreign key of the data table on the applying side of the BF, the semantic information indicates that the field other than the primary key of the data table on the creating side of the BF is joined with the field other than the foreign key of the data table on the applying side of the BF, the semantic information indicates that at least one of the primary key of the data table on the creating side of the BF or the foreign key of the data table on the applying side of the BF does not exists, or a filter predicate exists in the data table on the creating side of the BF. 7. The method of claim 1 , further comprising: in a case where it is determined to apply at least two BFs to a same data table associated with the join between the at least two data tables based on the target information, the at least two BFs including at least one first BF and at least one second BF, a filtering capacity of the at least one first BF being lower than a filtering capacity of the at least one second BF, discarding the at least one first BF before the join. 8. The method of claim 1 , wherein the at least two subplans further include a second query plan, and the BF is applied to filter data of the first data table in both a first
Plan optimisation · CPC title
Join order optimisation · CPC title
Selectivity estimation or determination · CPC title
Join operations · CPC title
for performance assessment · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.