Evaluating query performance

US11567916B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11567916-B2
Application number  US-202016813873-A
Country             US
Kind code           B2
Filing date         Mar 10, 2020
Priority date       Mar 10, 2020
Publication date    Jan 31, 2023
Grant date          Jan 31, 2023

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An approach is provided for evaluating a performance of a query. A risk of selecting a low performance access path for a query is determined. The risk is determined to exceed a risk threshold. Based on the risk exceeding the risk threshold and using a machine learning optimizer, first costs of access paths for the query are determined. Using a cost-based database optimizer, second costs of the access paths are determined. Using a strong classifier operating on the first costs and the second costs, a final access path for the query is selected from the access paths.
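
The control flow in the abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `select_access_path`, its parameters, the toy cost tables, and the stand-in combiner `pick` are all hypothetical. The low-risk branch follows claim 4, where only the cost-based optimizer is consulted.

```python
from typing import Callable, Dict, List

Costs = Dict[str, float]  # access path name -> estimated cost (lower is better)


def select_access_path(
    query: str,
    paths: List[str],
    estimate_risk: Callable[[str], float],             # hypothetical risk model
    ml_costs: Callable[[str, List[str]], Costs],       # hypothetical ML optimizer
    cbo_costs: Callable[[str, List[str]], Costs],      # hypothetical cost-based optimizer
    strong_classifier: Callable[[Costs, Costs], str],  # hypothetical combiner
    risk_threshold: float,
) -> str:
    """Pick an access path, consulting the ML optimizer only for risky queries."""
    risk = estimate_risk(query)
    second = cbo_costs(query, paths)
    if risk > risk_threshold:
        # High risk: cost the same paths with both optimizers and let the
        # strong classifier decide from the two sets of costs.
        first = ml_costs(query, paths)
        return strong_classifier(first, second)
    # Low risk: trust the cost-based optimizer alone and take its cheapest path.
    return min(second, key=lambda p: second[p])


# Toy stand-ins (assumptions for illustration only).
paths = ["index_scan", "table_scan"]
cbo = lambda q, p: {"index_scan": 10.0, "table_scan": 50.0}
ml = lambda q, p: {"index_scan": 90.0, "table_scan": 20.0}
# Stand-in for a trained strong classifier: smallest combined cost wins.
pick = lambda first, second: min(first, key=lambda p: first[p] + second[p])

risky = select_access_path("q", paths, lambda q: 0.9, ml, cbo, pick, 0.5)
safe = select_access_path("q", paths, lambda q: 0.1, ml, cbo, pick, 0.5)
print(risky, safe)
```

In this toy setup the two optimizers disagree, so the risky query is routed through the combiner while the safe query simply takes the cost-based optimizer's cheapest path.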

First claim

Opening claim text (preview).

What is claimed is:

1. A method of evaluating a performance of a query, the method comprising: determining, by one or more processors, a risk of selecting an access path for a query which provides a performance of the query that does not exceed a performance threshold, wherein the determining the risk is based on real count information, explain tables, and a machine learning model, wherein the real count information includes result rows of clauses of the query, the result rows including (i) an amount of rows that are qualified after applying a predicate and (ii) an amount of rows that are returned after tables in the query are joined, wherein the explain tables include information about a performance of SQL statements and functions included in an execution of the query, and wherein the machine learning model is trained for predicting a performance of the access path and is based on the real count information and the explain tables; determining, by the one or more processors, that the risk exceeds a risk threshold; based on the risk exceeding the risk threshold and using a machine learning optimizer that employs a machine learning system, determining, by the one or more processors, first costs of access paths for the query; using a cost-based database optimizer, determining, by the one or more processors, second costs of the access paths for the query; and using a strong classifier operating on the first costs and the second costs, selecting, by the one or more processors, a final access path for the query from the access paths.

2. The method of claim 1, further comprising: performing, by the one or more processors, the determining the risk and the determining that the risk exceeds the risk threshold by using a potential error module; and sending the final access path as feedback to enhance the potential error module and the machine learning system.

3. The method of claim 1, further comprising: prior to the determining the risk, receiving, by the one or more processors, the query; and parsing, by the one or more processors, the query, wherein the query in the determining the risk, the determining the first costs, the determining the second costs, and the selecting the final access path is the parsed query.

4. The method of claim 1, further comprising: receiving and parsing, by the one or more processors, a second query; determining, by the one or more processors and a potential error module, a second risk of selecting a second access path for the parsed second query which provides a second performance of the parsed second query that does not exceed the performance threshold; determining, by the one or more processors, that the second risk does not exceed the risk threshold; based on the second risk not exceeding the risk threshold, using the cost-based database optimizer, and without using the machine learning system, determining, by the one or more processors, third costs of second access paths for the parsed second query; and based on the third costs and without using the strong classifier, selecting, by the one or more processors, a second final access path for the second query from the second access paths.

5. The method of claim 1, further comprising providing a first performance of the query using the final access path that exceeds a second performance of the query using another access path determined by the cost-based database optimizer, without using the machine learning system, and without using the strong classifier.

6. The method of claim 1, wherein the determining the risk includes: receiving historical training data for a risk prediction model; and based on the historical training data and using the risk prediction model and a logical classifier, the machine learning optimizer determining the risk of selecting the access path for the query which provides the performance of the query that does not exceed the performance threshold.

7. The method of claim 1, wherein the selecting the final access path for the query includes employing a machine learning algorithm that uses a boosted classifier to select the final access path based on a combination of the first costs and the second costs.

8. The method of claim 1, further comprising: providing at least one support service for at least one of creating, integrating, hosting, maintaining, and deploying computer readable program code in the computer, the program code being executed by a processor of the computer to implement the determining the risk, the determining that the risk exceeds the risk threshold, the determining the first costs of the access paths, the determining the second costs of the access paths, and the selecting the final access path.

9. A computer program product comprising: a computer readable storage medium having computer readable program code stored on the computer readable storage medium, the computer readable program code being executed by a central processing unit (CPU) of a computer system to cause the computer system to perform a method of evaluating a performance of a query, the method comprising the steps of: the computer system determining a risk of selecting an access path for a query which provides a performance of the query that does not exceed a performance threshold, wherein the determining the risk is based on real count information, explain tables, and a machine learning model, wherein the real count information includes result rows of clauses of the query, the result rows including (i) an amount of rows that are qualified after applying a predicate and (ii) an amount of rows that are returned after tables in the query are joined, wherein the explain tables include information about a performance of SQL statements and functions included in an execution of the query, and wherein the machine learning model is trained for predicting a performance of the access path and is based on the real count information and the explain tables; the computer system determining that the risk exceeds a risk threshold; based on the risk exceeding the risk threshold and using a machine learning optimizer that employs a machine learning system, the computer system determining first costs of access paths for the query; using a cost-based database optimizer, the computer system determining second costs of the access paths for the query; and using a strong classifier operating on the first costs and the second costs, the computer system selecting a final access path for the query from the access paths.

10. The computer program product of claim 9, wherein the method further comprises: the computer system performing the determining the risk and the determining that the risk exceeds the risk threshold by using a potential error module; and the computer system sending the final access path as feedback to enhance the potential error module and the machine learning system.

11. The computer program product of claim 9, wherein the method further comprises: prior to the determining the risk, the computer system receiving the query; and the computer system parsing the query, wherein the query in the determining the risk, the determining the first costs, the determining the second costs, and the selecting the final access path is the parsed query.

12. The computer program product of claim 9, wherein the method further comprises: the computer system receiving and parsing a second query; the computer system determining, by a potential error module, a second risk of selecting a second access path for the parsed second query which provides a second performance of the parsed second query that does not exceed the performance threshold; the computer system determining that the second r
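
Claim 7 names a boosted classifier operating on a combination of the first and second costs. In boosting, a strong classifier is a weighted vote of weak classifiers, which the sketch below illustrates. The claims do not specify the weak learners, their weights, or any cutoffs, so every rule and number here is a hand-set assumption; in practice the weights would come from training. Ranking candidate paths by the vote score is likewise an assumption about how the classifier output is turned into a selection.

```python
from typing import Callable, Dict, List, Tuple

Features = Tuple[float, float]     # (ML optimizer cost, cost-based optimizer cost)
Weak = Callable[[Features], int]   # weak classifier: +1 (good path) or -1 (poor path)


def strong_score(x: Features, learners: List[Tuple[float, Weak]]) -> float:
    """Weighted vote of weak classifiers, the usual form of a boosted classifier."""
    return sum(alpha * h(x) for alpha, h in learners)


def choose_path(candidates: Dict[str, Features],
                learners: List[Tuple[float, Weak]]) -> str:
    """Return the candidate path with the highest weighted-vote score."""
    return max(candidates, key=lambda name: strong_score(candidates[name], learners))


# Hypothetical weak learners: (weight, threshold rule). All values are hand-set.
learners: List[Tuple[float, Weak]] = [
    (0.8, lambda x: 1 if x[0] < 50.0 else -1),         # ML cost under a cutoff
    (0.5, lambda x: 1 if x[1] < 30.0 else -1),         # cost-based cost under a cutoff
    (0.3, lambda x: 1 if x[0] + x[1] < 80.0 else -1),  # combined cost under a cutoff
]

# Hypothetical candidate access paths with their two cost estimates.
candidates = {"index_scan": (90.0, 10.0), "table_scan": (20.0, 50.0)}
chosen = choose_path(candidates, learners)
print(chosen)
```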

Assignees

Inventors

Classifications

  • G06F16/217 · Primary

    Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409) · CPC title

  • Access plan code generation and invalidation; Reuse of access plans · CPC title

  • Machine learning · CPC title

  • Feedforward networks · CPC title

  • Supervised learning · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11567916B2 cover?
An approach is provided for evaluating a performance of a query. A risk of selecting a low performance access path for a query is determined. The risk is determined to exceed a risk threshold. Based on the risk exceeding the risk threshold and using a machine learning optimizer, first costs of access paths for the query are determined. Using a cost-based database optimizer, second costs of the …
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F16/217. Mapped technology areas include Physics.
When was this patent published?
Publication date Jan 31, 2023 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).