Who is the assignee on this patent?

Muras Brian Robert, Smith Mark Steven, IBM

What technology area does this patent fall under?

Primary CPC classification G06F16/2453. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Jan 26 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Optimization of database queries including grouped aggregation functions

US9244974B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9244974-B2
Application number	US-85518607-A
Country	US
Kind code	B2
Filing date	Sep 14, 2007
Priority date	Sep 14, 2007
Publication date	Jan 26, 2016
Grant date	Jan 26, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Embodiments of the invention provide techniques for generating predicted cardinality statistics for grouped aggregation functions included in database queries. In general, characteristics of a database query are determined, and are then supplied to a probability function configured to generate a predicted cardinality statistic. The generated statistic represents a prediction of the probable cardinality of the results of a grouped aggregation function in the event that the query is executed. The predicted cardinality statistic may be used by a query optimizer to determine an efficient query plan for executing the database query.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method to optimize database queries having grouped aggregation functions, the computer-implemented method comprising: determining characteristics of a database query having a grouped aggregation function; generating, by operation of one or more computer processors, a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; selecting, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and executing the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 2. The computer-implemented method of claim 1 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 3. The computer-implemented method of claim 1 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 4. The computer-implemented method of claim 1 , wherein the characteristics of the database query are selected from the group comprising: (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 5. The computer-implemented method of claim 1 , wherein the database query is composed in the Structured Query Language (SQL) query language. 6. The computer-implemented method of claim 5 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 7. The computer-implemented method of claim 5 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 8. A computer readable storage medium containing a program which, when executed, performs an operation to optimize database queries having grouped aggregation functions, the operation comprising: determining characteristics of a database query having a grouped aggregation function; generating, by operation of one or more computer processors when executing the program, a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; selecting, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and executing the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 9. The computer readable storage medium of claim 8 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 10. The computer readable storage medium of claim 8 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 11. The computer readable storage medium of claim 8 , wherein the characteristics of the database query are selected from the group comprising (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 12. The computer readable storage medium of claim 8 , wherein the database query is composed in the Structured Query Language (SQL) query language. 13. The computer readable storage medium of claim 12 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 14. The computer readable storage medium of claim 12 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 15. A system to optimize database queries having grouped aggregation functions, the system comprising: a database; a processor; and a memory containing a program, wherein the program is configured to: determine characteristics of a database query having a grouped aggregation function; generate a predicted cardinality statistic based on a probability function and the determined characteristics and without executing the database query, wherein the predicted cardinality statistic predicts a probable cardinality of results of the grouped aggregation function; select, from a plurality of candidate query plans and based at least on the predicted cardinality statistic, a first candidate query plan to use in executing the database query, thereby optimizing the database query based on the predicted cardinality statistic; and execute the selected query plan in order to generate a set of query results, wherein the set of query results is returned to a requesting entity. 16. The system of claim 15 , wherein the probability function is configured to determine a probability that multiple independent events will occur. 17. The system of claim 15 , wherein the probability function is configured to determine a probability that multiple dependent events will occur. 18. The system of claim 15 , wherein the characteristics of the database query are selected from the group comprising: (i) a number of rows returned by the database query if executed without grouping, (ii) a number of groups returned by the database query if executed with grouping, and (iii) an average number of rows per group. 19. The system of claim 15 , wherein the database query is composed in the Structured Query Language (SQL) query language. 20. The system of claim 19 , wherein the grouped aggregation function included in the database query is selected from the COUNT, AVG, SUM, MIN, MAX, VARIANCE, and STANDARD_DEVIATION aggregation functions. 21. The system of claim 19 , wherein the characteristics of the database query are selected from the group comprising: (i) attributes of a GROUP BY clause, (ii) attributes of a WHERE clause, (ii) attributes of a JOIN clause, (iv) attributes of a HAVING clause, and (v) attributes of the grouped aggregation function. 22. The computer-implemented method of claim 1 , wherein the predicted cardinality statistic causes selection of the first candidate query plan, wherein without the predicted cardinality statistic, a second candidate query plan, different from the first candidate query plan, is selected. 23. The computer-implemented method of claim 22 , wherein in candidate query plan selection, the predicted

Assignees

Inventors

Classifications

G06F16/2453Primary
Query optimisation · CPC title
G06F16/2462
Approximate or statistical queries · CPC title
G06F17/30536
Physics · mapped topic
G06F17/30442Primary
Physics · mapped topic

Patent family

Related publications grouped by family.

View patent family 40455642

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9244974B2 cover?: Embodiments of the invention provide techniques for generating predicted cardinality statistics for grouped aggregation functions included in database queries. In general, characteristics of a database query are determined, and are then supplied to a probability function configured to generate a predicted cardinality statistic. The generated statistic represents a prediction of the probable car…
Who is the assignee on this patent?: Muras Brian Robert, Smith Mark Steven, IBM
What technology area does this patent fall under?: Primary CPC classification G06F16/2453. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Jan 26 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).