Data processing method and related apparatus
US-2024152807-A1 · May 9, 2024 · US
US9244974B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9244974-B2 |
| Application number | US-85518607-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 14, 2007 |
| Priority date | Sep 14, 2007 |
| Publication date | Jan 26, 2016 |
| Grant date | Jan 26, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Embodiments of the invention provide techniques for generating predicted cardinality statistics for grouped aggregation functions included in database queries. In general, characteristics of a database query are determined, and are then supplied to a probability function configured to generate a predicted cardinality statistic. The generated statistic represents a prediction of the probable cardinality of the results of a grouped aggregation function in the event that the query is executed. The predicted cardinality statistic may be used by a query optimizer to determine an efficient query plan for executing the database query.
Opening claim text (preview).
What is claimed is: 1. A computer-implemented method to optimize database queries having grouped aggregation functions, the computer-implemented method comprising: determining characteristics of a database query having a grouped aggregation function; generating, by operation of one or more computer processors, a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; selecting, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and executing the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 2. The computer-implemented method of claim 1 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 3. The computer-implemented method of claim 1 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 4. The computer-implemented method of claim 1 , wherein the characteristics of the database query are selected from the group comprising: (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 5. The computer-implemented method of claim 1 , wherein the database query is composed in the Structured Query Language (SQL) query language. 6. The computer-implemented method of claim 5 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 7. The computer-implemented method of claim 5 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 8. A computer readable storage medium containing a program which, when executed, performs an operation to optimize database queries having grouped aggregation functions, the operation comprising: determining characteristics of a database query having a grouped aggregation function; generating, by operation of one or more computer processors when executing the program, a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; selecting, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and executing the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 9. The computer readable storage medium of claim 8 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 10. The computer readable storage medium of claim 8 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 11. The computer readable storage medium of claim 8 , wherein the characteristics of the database query are selected from the group comprising (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 12. The computer readable storage medium of claim 8 , wherein the database query is composed in the Structured Query Language (SQL) query language. 13. The computer readable storage medium of claim 12 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 14. The computer readable storage medium of claim 12 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 15. A system to optimize database queries having grouped aggregation functions, the system comprising: a database; a processor; and a memory containing a program, wherein the program is configured to: determine characteristics of a database query having a grouped aggregation function; generate a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; select, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and execute the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 16. The system of claim 15 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 17. The system of claim 15 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 18. The system of claim 15 , wherein the characteristics of the database query are selected from the group comprising: (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 19. The system of claim 15 , wherein the database query is composed in the Structured Query Language (SQL) query language. 20. The system of claim 19 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 21. The system of claim 19 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 22. The computer-implemented method of claim 1 , wherein the predicted cardinality statistic causes selection of the first candidate query plan, wherein without the predicted cardinality statistic, a second candidate query plan, different from the first candidate query plan, is selected. 23. The computer-implemented method of claim 22 , wherein in candidate query plan selection, the predicted
Query optimisation · CPC title
Approximate or statistical queries · CPC title
Physics · mapped topic
Physics · mapped topic
Related publications grouped by family.
Answers are generated from the same data shown on this page.