Optimization of aggregate queries in database management systems using an early out join when processing min and max functions

US9524317B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9524317-B2
Application numberUS-84503107-A
CountryUS
Kind codeB2
Filing dateAug 24, 2007
Priority dateAug 24, 2007
Publication dateDec 20, 2016
Grant dateDec 20, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method, computer program product, and system for optimizing aggregate queries are provided. The method, computer program product, and system provide for receiving an aggregate query comprising a GROUP BY operation and an aggregate function, creating an access plan for executing the aggregate query, the access plan including a join between an outer relation and an inner relation, and designating the join included in the access plan as an early out join.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for optimizing aggregate queries, the method comprising: receiving an aggregate query comprising a GROUP BY operation, relations, and an aggregate function; identifying a first relation of the relations as an outer relation of a join between the outer relation and an inner relation; identifying a second relation of the relations as the inner relation of the join; marking the join as an early out join such that upon execution of the aggregate query in accordance with the access plan, each row in the outer relation is joined to a first row of one or more qualifying rows in the inner relation; creating a first subplan having a first cost and that includes the join that is marked as the early out join; creating a second subplan having a second cost and that does not include the join that is marked as the early out join; determining that the first subplan includes the join that is marked as the early out join and that the first cost is greater than the second cost; and creating an access plan that includes the first subplan based on the determination. 2. The method of claim 1 , wherein the join included in the access plan is designated as an early out join where, for each row in the outer relation to be joined to one or more qualifying rows in the inner relation, all rows in the inner relation to be joined to the row in the outer relation have a same value for a set of one or more group by columns, each of the one or more group by columns being a column involved in the GROUP BY operation, any predicate of the aggregate query not yet applied either completely filters out or completely preserves all rows in the inner relation to be joined to the row in the outer relation, all rows in the inner relation to be joined to the row in the outer relation are ordered descendingly on a column of the inner relation involved in the aggregate function when the aggregate function is a MAX function, and all rows in the inner relation to be joined to the row in the outer relation are ordered ascendingly on the column of the inner relation involved in the aggregate function when the aggregate function is a MIN function. 3. The method of claim 1 , further comprising: evaluating a plurality of relations involved in the aggregate query to identify a relation that is a source of a column involved in the GROUP BY operation, a source of a column involved in the aggregate function, and involved in a join that feeds the GROUP BY operation; and for each subplan generated that involves the identified relation, specially marking the subplan responsive to the subplan producing an output in which, for each row from another relation involved in the join that feeds the GROUP BY operation, all rows from the identified relation to be joined to the row from the other relation have a same value for a set of one or more group by columns, each of the one or more group by columns being a column involved in the GROUP BY operation, any predicate of the aggregate query not yet applied either completely filters out or completely preserves all rows from the identified relation to be joined to the row from the other relation, all rows from the identified relation to be joined to the row from the other relation are ordered descendingly on the column involved in the aggregate function when the aggregate function is a MAX function, and all rows from the identified relation to be joined to the row from the other relation are ordered ascendingly on the column involved in the aggregate function when the aggregate function is a MIN function, wherein the join included in the access plan is designated as an early out join responsive to the identified relation being the inner relation of the join, and a subplan for the inner relation of the join being a specially marked subplan. 4. The method of claim 3 , further comprising: preventing the first subplan from being pruned in favor of the second subplan responsive to the first subplan being a specially marked subplan and the second subplan not being a specially marked subplan. 5. The method of claim 1 , wherein, for each row in the outer relation to be joined to one or more qualifying rows in the inner relation, only the first of the one or more qualifying rows from the inner relation is retrieved to compute the result set for the aggregate query responsive to the first row having a non-nullable value in a column involved in the aggregate function, or none of the one or more qualifying rows having a non-nullable value in the column involved in the aggregate function. 6. The method of claim 1 , wherein the inner relation of the join is a relation derived from another join. 7. The method of claim 1 , further comprising: preventing a subplan for the join designated as an early out join from being pruned in favor of a subplan for a join that is not designated as an early out join. 8. A system for optimizing aggregate queries, wherein the system comprises: a processor; and a memory device coupled to the processor and storing program code, wherein the program code is executed by the processor to perform: receiving an aggregate query comprising a GROUP BY operation, relations, and an aggregate function; identifying a first relation of the relations as an outer relation of a join between the outer relation and an inner relation; identifying a second relation of the relations as the inner relation of the join; marking the join as an early out join such that upon execution of the aggregate query in accordance with the access plan, each row in the outer relation is joined to a first row of one or more qualifying rows in the inner relation; creating a first subplan having a first cost and that includes the join that is marked as the early out join; creating a second subplan having a second cost and that does not include the join that is marked as the early out join; determining that the first subplan includes the join that is marked as the early out join and that the first cost is greater than the second cost; and creating an access plan that includes the first subplan based on the determination. 9. The system of claim 8 , wherein the join included in the access plan is designated as an early out join where, for each row in the outer relation to be joined to one or more qualifying rows in the inner relation, all rows in the inner relation to be joined to the row in the outer relation have a same value for a set of one or more group by columns, each of the one or more group by columns being a column involved in the GROUP BY operation, any predicate of the aggregate query not yet applied either completely filters out or completely preserves all rows in the inner relation to be joined to the row in the outer relation, all rows in the inner relation to be joined to the row in the outer relation are ordered descendingly on a column of the inner relation involved in the aggregate function when the aggregate function is a MAX function, and all rows in the inner relation to be joined to the row in the outer relation are ordered ascendingly on the column of the inner relation involved in the aggregate function when the aggregate function is a MIN function. 10. The system of claim 8 , wherein the database management system further evaluates a plurality of relations involved in the aggregate query to identify a relation that is a source of a column involved in the GROUP BY operation, a source of a column involved in the aggregate function, and involved in a join that feeds the GROUP BY operation, and for each subplan generated that involves the identified relation, specially marks the subplan responsive to the subplan producing an output in which, for each

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9524317B2 cover?
A method, computer program product, and system for optimizing aggregate queries are provided. The method, computer program product, and system provide for receiving an aggregate query comprising a GROUP BY operation and an aggregate function, creating an access plan for executing the aggregate query, the access plan including a join between an outer relation and an inner relation, and designati…
Who is the assignee on this patent?
Branish Ii Edward Gust, Hornibrook John F, La Dieu Quang, and 2 more
What technology area does this patent fall under?
Primary CPC classification G06F16/24537. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Dec 20 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).