Query optimization with zone map selectivity modeling

US10042887B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10042887-B2
Application numberUS-201414562325-A
CountryUS
Kind codeB2
Filing dateDec 5, 2014
Priority dateDec 5, 2014
Publication dateAug 7, 2018
Grant dateAug 7, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

According to one embodiment of the present invention, a system for processes a query for accessing data within one or more database objects stores an element of a database object among a plurality of different storage regions. Each storage region is associated with first and second range values indicating a value range for element values within that storage region. The system examines the first and second range values for the storage regions of each database object element and determines an effectiveness value representing a degree of overlap between the storage regions of that database object element. The system determines a selectivity model for the storage regions for each database object utilizing the effectiveness value, determines a query plan based on the selectivity model, and executes the query plan. Embodiments of the present invention further include a method and computer program product for processing a query in substantially the same manners.

First claim

Opening claim text (preview).

What is claimed is: 1. A system for processing a query for accessing data within one or more database objects, wherein a database object element is stored among a plurality of different storage regions with each storage region being associated with a first range value and a second range value indicating a value range for element values within that storage region, the system comprising: at least one processor configured to: determine a zone map effectiveness value representing a degree of overlap between the storage regions of each database object element by examining the first range value and the second range value for the storage regions of that database object element, the at least one processor being configured to determine the zone map effectiveness value for each respective database object element by: sorting the first range values and the second range values for storage regions of the respective database object element, the first range values being minimum values for respective storage regions and the second range values being maximum values for the respective storage regions, for each respective sorted value of the first range values and the second range values, determining a respective rolling count, the rolling count being incremented for each minimum value encountered and being decremented for each maximum value encountered, adding each of the respective rolling counts to produce an accumulated sum, and producing the zone map effectiveness value based on the accumulated sum and a number of respective storage regions for the respective database object element; determine a zone map selectivity model for the storage regions for each database object element of the query utilizing the zone map effectiveness value of the each database object element of the query and a total quantity of the storage regions for the each database object element to model zone map selectivity and multiplying the zone map effectiveness value of each individual respective database object element of the each database object element of the query by the total quantity of the corresponding storage regions for the each individual respective database object element to produce an estimate of a quantity of storage regions to be read when performing a second query regarding the each individual respective database object; determine an order for executing a plurality of query plans for the query based on the determined zone map selectivity model for the each database object element of the query, the determined order for executing the plurality of query plans having a lowest expected cost among all possible orders for executing the plurality of query plans; and execute the plurality of query plans in the determined order to access data from the database object elements for the query. 2. The system of claim 1 , wherein the zone map effectiveness value for each database object element is in a value range between zero and one. 3. The system of claim 1 , wherein each database object includes a database table, and the database object element includes a database table column. 4. The system of claim 3 , wherein the query includes a restriction on plural database table columns and determining the zone map selectivity model further comprises: combining the zone map effectiveness values for the database table columns of the restriction with column correlation statistics to determine an amount of input/output operations for the restriction on the plural database table columns. 5. The system of claim 1 , wherein the determined order for executing the plurality of query plans indicates at least one of a scan order and a join order for the database objects based on the zone map selectivity model of the corresponding database object elements. 6. A computer program product for processing a query for accessing data within one or more database objects, wherein a database object element is stored among a plurality of different storage regions with each storage region being associated with a first range value and a second range value indicating a value range for element values within that storage region, the computer program product comprising: a computer readable storage medium having computer readable program code embodied therewith for execution on a processing system, the computer readable program code comprising computer readable program code configured to: determine a zone map effectiveness value representing a degree of overlap between the storage regions of each database object element by examining the first range value and the second range value for the storage regions of that database object element, the computer readable program code being further configured to determine the zone map effectiveness value for each respective database object element by: sorting the first range values and the second range values for storage regions of the respective database object element, the first range values being minimum values for respective storage regions and the second range values being maximum values for the respective storage regions, for each of respective sorted value of the first range values and the second range values, determining a respective rolling count, the rolling count being incremented for each minimum value encountered and being decremented for each maximum value encountered, adding each of the respective rolling counts to produce an accumulated sum, and producing the zone map effectiveness value based on the accumulated sum and a number of respective storage regions for the respective database object element; determine a zone map selectivity model for the storage regions for each database object element of the query utilizing the zone map effectiveness value of the each database object element of the query and a total quantity of the storage regions for the each database object element and multiplying the zone map effectiveness value of each individual respective database object element of the each database object element of the query by the total quantity of the corresponding storage regions for the each individual respective database object element to produce an estimate of a quantity of storage regions to be read when performing a second query regarding the each individual respective database object; determine an order for executing a plurality of query plans for the query based on the determined zone map selectivity model for the each database object element of the query, the determined order for executing the plurality of query plans having a lowest expected cost among all possible orders for executing the plurality of query plans; and execute the plurality of query plans in the determined order to access data from the database object elements for the query. 7. The computer program product of claim 6 , wherein the zone map effectiveness value for each database object element is in a value range between zero and one. 8. The computer program product of claim 6 , wherein each database object includes a database table, and the database object element includes a database table column. 9. The computer program product of claim 8 , wherein the query includes a restriction on plural database table columns and determining the zone map selectivity model further comprises: combining the zone map effectiveness values for the database table columns of the restriction with column correlation statistics to determine an amount of input/output operations for the restriction on the plural database table columns.

Assignees

Inventors

Classifications

  • Join order optimisation · CPC title

  • using directory or table look-up (use of a directory or look-up table in file systems G06F16/13) · CPC title

  • Selectivity estimation or determination · CPC title

  • Query processing · CPC title

  • Physics · mapped topic

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10042887B2 cover?
According to one embodiment of the present invention, a system for processes a query for accessing data within one or more database objects stores an element of a database object among a plurality of different storage regions. Each storage region is associated with first and second range values indicating a value range for element values within that storage region. The system examines the first…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/24544. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Aug 07 2018 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).