Table scan predicate with integrated semi-join filter
US-2024419650-A1 · Dec 19, 2024 · US
US9852181B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9852181-B2 |
| Application number | US-201314076598-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 11, 2013 |
| Priority date | Dec 4, 2012 |
| Publication date | Dec 26, 2017 |
| Grant date | Dec 26, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A computer-implemented method, system, and/or computer program product optimizes an order of execution of column join operations. A first partitioning of the first data column splits the first data column into first subsets of rows. A second partitioning of the second data column splits the second data column into a second subsets of rows. A first value frequency information indicates a frequency of attribute values within a subset of rows of the first data column processed. A second value frequency information indicates a frequency of attribute values within a subset of rows of the second data column. Cardinalities of sub-tables derived by a respective joining of the subsets of rows of the first and second data columns are estimated, based on the first and second value frequency information. An order of execution of multiple join operations is then optimized based on the estimated cardinalities of the sub-tables.
Opening claim text (preview).
What is claimed is: 1. A method for optimizing an order of execution of multiple join operations based on at least a first data column and a second data column in a database system having multiple processing units, the method comprising: providing, by one or more processors, at least a first partitioning of the first data column, wherein said at least the first partitioning splits the first data column into a plurality of first subsets of rows, each of the first subsets of rows being correlated with a distinct processing unit from the multiple processing units, wherein each of the first subsets of rows are handled by processing units that differ from one another; providing, by one or more processors, at least a second partitioning of the second data column, wherein said at least the second partitioning splits the second data column into a plurality of second subsets of rows, each of the second subsets of rows being correlated with a distinct processing unit from the multiple processing units, wherein each of the second subsets of rows are handled by processing units that differ from one another; providing, by one or more processors, at least a first value frequency information for each processing unit from the multiple processing units, the first value frequency information indicating a frequency of attribute values within a subset of rows of the first data column processed by a respective processing unit from the multiple processing units; providing, by one or more processors, at least a second value frequency information for each processing unit from the multiple processing units, the second value frequency information indicating a frequency of attribute values within a subset of rows of the second data column processed by the respective processing unit from the multiple processing units; estimating, by one or more processors, cardinalities of sub-tables derived by a respective joining of the subset of rows of the first data column and the subset of rows of the second data column which are processed by a same processing unit from the multiple processing units, wherein estimated cardinalities of the sub-tables are based on the first and second value frequency information of the respective processing unit from the multiple processing units, wherein the cardinalities of the sub-tables describe a quantity of rows in the sub-tables, and wherein the cardinalities of the sub-tables derived by the respective joining of the subset of rows of the first data column and the subset of rows of the second data column are estimated according to: F ( a , b ) = ∑ i = 0 n - 1 ( max ( ∫ x i x i + 1 f ( x ) dx , 0 ) u 1 i * max ( ∫ x i x i + 1 g ( x ) dx , 0 ) u 2 i ) * max ( u 1 i , u 2 i ) where x 0 =a; x n =b; x 1 ε[a,b]; f(x) is a density distribution function of a first data column R; g(x) is a density distribution function of a second data column S; a, b are a starting row and end row of respective columns R and S which are incorporated into a join; u 1 i is a number of unique values of the first data column R in a respective interval i; u 2 i is a number of unique values of the second data column S in the respective interval i; F(a,
Join order optimisation · CPC title
Query optimisation · CPC title
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.