Query optimization in hybrid DBMS

US11061899B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11061899-B2
Application number  US-201715839283-A
Country             US
Kind code           B2
Filing date         Dec 12, 2017
Priority date       Sep 13, 2016
Publication date    Jul 13, 2021
Grant date          Jul 13, 2021

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A mechanism is provided for generating statistical information for query optimization in a data processing system. The mechanism comprises a first database engine maintaining a current first dataset currently being stored, a second database engine maintaining a second dataset. The second dataset is generated from previous first datasets or from the previous first datasets and current first dataset, the previous first datasets being datasets that were previously maintained by the first database engine. The first database engine receives a database query for accessing the first dataset, the database query involving one or more attributes of the first data set. The first database engine generates a query execution plan for the database query on the first dataset using collected statistical information on at least the second dataset. The first database engine processes the database query according to the query execution plan.
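The flow in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the class names `Accelerator` and `FirstEngine`, the `collect_stats` and `plan` methods, the uniform-distribution selectivity estimate, and the `index_threshold` cutoff are all assumptions made for illustration. The only point it shows is that the engine holding the current dataset plans its query from statistics gathered on the other engine's dataset.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Statistics on one attribute, collected on the second dataset."""
    row_count: int
    distinct_values: int


class Accelerator:
    """Hypothetical stand-in for the second database engine."""

    def __init__(self, rows):
        # Second dataset: records previously maintained by the first engine.
        self.rows = rows

    def collect_stats(self, attribute):
        values = [r[attribute] for r in self.rows if r[attribute] is not None]
        return ColumnStats(row_count=len(self.rows),
                           distinct_values=len(set(values)))


class FirstEngine:
    """Hypothetical stand-in for the first database engine."""

    def __init__(self, current_rows, accelerator):
        self.current_rows = current_rows  # current first dataset
        self.accelerator = accelerator

    def plan(self, attribute, index_threshold=0.1):
        # Statistics come from the second dataset, not from current_rows.
        stats = self.accelerator.collect_stats(attribute)
        # Assumed estimate: an equality predicate matches 1/NDV of the rows.
        selectivity = 1.0 / max(stats.distinct_values, 1)
        return "index_scan" if selectivity <= index_threshold else "table_scan"

    def execute(self, attribute, value):
        chosen = self.plan(attribute)
        # Both toy plans return the same rows; only the label differs here.
        rows = [r for r in self.current_rows if r[attribute] == value]
        return chosen, rows
```

In this sketch an attribute with many distinct values in the historical data yields a low estimated selectivity and an index scan, while a two-valued attribute falls back to a table scan.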

First claim

Opening claim text (preview).

What is claimed is:

1. A method, in a data processing system comprising at least one processor and at least one memory coupled to the at least one processor, for generating statistical information for query optimization in the data processing system, the data processing system comprising a first database engine maintaining a current first dataset and an analytics accelerator maintaining a second dataset generated from previous first datasets, the method comprising: receiving, at the first database engine, a database query for accessing the current first dataset, the database query involving one or more attributes of the current first data set; obtaining, by the first database engine from the analytics accelerator, statistical information on one or more attributes of the current first dataset collected from the second dataset that is generated from previous first datasets, wherein the previous first datasets are datasets that were maintained previous to the current first dataset by the first database engine; generating, by the first database engine, a query execution plan for the database query on the current first dataset using the statistical information collected on the second dataset; and processing, by the first database engine, the database query according to the query execution plan.

2. The method of claim 1, wherein the collecting of the statistical information further comprises: receiving by the first database engine from the analytics accelerator the statistical information.

3. The method of claim 1, wherein the collecting of the statistical information further comprises: receiving, by the first database engine, from the analytics accelerator a random sample on the second dataset; and calculating, by the first database engine, the statistical information based on the random sample.

4. The method of claim 3, wherein the receiving of the random sample is performed in response to sending a request from the first database engine to the analytics accelerator.

5. The method of claim 3, wherein the receiving of the random sample is automatically performed on a predefined periodic basis.

6. The method of claim 1, wherein the current first dataset comprises records of a given table having a commit date after a predefined date and wherein the second dataset comprises records of the given table having a commit date before that predefined date.

7. The method of claim 1, wherein the current first dataset comprises records of a given table having an access frequency higher than a predefined access frequency threshold and wherein the second dataset comprises records of the given table having an access frequency smaller than the predefined access frequency threshold.

8. The method of claim 1, wherein the data processing system is a hybrid on-line transaction processing (OLTP) and on-line analytical processing (OLAP) database system, wherein the first database engine is configured for performing OLTP processes, and wherein the analytics accelerator is configured for performing OLAP processes.

9. The method of claim 1, further comprising: receiving, by the analytics accelerator, another database query for accessing the second dataset; generating, by the analytics accelerator, a query execution plan for the other database query using the collected statistical information; and processing, by the analytics accelerator, the other database query in the second database engine according to the query execution plan.

10. The method of claim 1, wherein the statistical information comprises at least one of: the number of distinct values of the one or more attributes; the cardinality of values of the one or more attributes; minimum and maximum values of the one or more attributes; the fraction of NULL values of the one or more attributes; histogram of values of the one or more attributes; or correlation factor between values of different attributes.
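Claims 3 and 10 together describe computing statistics from a random sample of the second dataset. The following Python sketch shows one way such statistics could be derived; it is an illustration under assumptions, not the method the patent prescribes. The function names `draw_sample` and `summarize`, the fixed seed, and the equi-width bucketing are hypothetical choices, and the distinct-value count is the count within the sample, with no extrapolation to the full dataset.

```python
import random
from collections import Counter


def draw_sample(second_dataset, size, seed=0):
    """Return a random sample of rows (hypothetical accelerator-side step)."""
    rng = random.Random(seed)
    return rng.sample(second_dataset, min(size, len(second_dataset)))


def summarize(sample, attribute, buckets=4):
    """Compute per-attribute statistics on the sample (engine-side step)."""
    values = [row[attribute] for row in sample]
    non_null = [v for v in values if v is not None]
    stats = {
        "sample_size": len(values),
        "null_fraction": (len(values) - len(non_null)) / len(values) if values else 0.0,
        "distinct_values": len(set(non_null)),  # distinct within the sample only
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
    }
    if non_null:
        lo, hi = stats["min"], stats["max"]
        # Equi-width buckets; fall back to width 1 when all values are equal.
        width = (hi - lo) / buckets or 1
        counts = Counter(min(int((v - lo) / width), buckets - 1) for v in non_null)
        stats["histogram"] = [counts.get(i, 0) for i in range(buckets)]
    else:
        stats["histogram"] = [0] * buckets
    return stats
```

A caller would pass the result of `draw_sample` to `summarize` once per attribute that appears in the query; numeric attributes are assumed, since the histogram relies on subtraction.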

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11061899B2 cover?
A mechanism is provided for generating statistical information for query optimization in a data processing system. The mechanism comprises a first database engine maintaining a current first dataset currently being stored, a second database engine maintaining a second dataset. The second dataset is generated from previous first datasets or from the previous first datasets and current first data…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/24542. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 13, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).