Predicting execution times of concurrent queries

US2016203404A1 · US · A1

Patent metadata
Publication number: US-2016203404-A1
Application number: US-201314917074-A
Country: US
Kind code: A1
Filing date: Sep 14, 2013
Priority date: Sep 14, 2013
Publication date: Jul 14, 2016
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Example embodiments relate to predicting execution times of concurrent queries. In example embodiments, historical data is iteratively generated for a machine learning model by varying a concurrency level of query executions in a database, determining a query execution plan for a pending concurrent query, extracting query features from the query execution plan, and executing the pending concurrent query to determine a query execution time. The machine learning model may then be created based on the query features, variation in the concurrency level, and the query execution time. The machine learning model is used to generate an execution schedule for production queries, where the execution schedule satisfies service level agreements of the production queries.
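The data-generation loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `HistoryRow`, `extract_features`, `generate_history`, `fake_explain`, and `fake_execute` are hypothetical, the two plan features (`est_rows`, `num_joins`) are invented for the example because the abstract does not enumerate them, and the "database" is a stand-in that returns a made-up cost instead of running real queries.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class HistoryRow:
    features: Dict[str, float]  # extracted from the query execution plan
    concurrency: int            # concurrency level in effect for this run
    exec_time: float            # measured execution time, in seconds


def extract_features(plan: Dict[str, float]) -> Dict[str, float]:
    # Hypothetical feature set; a real system would read optimizer output.
    return {"est_rows": float(plan["est_rows"]),
            "num_joins": float(plan["num_joins"])}


def generate_history(queries: List[str],
                     explain: Callable[[str], Dict[str, float]],
                     execute: Callable[[str, int], float],
                     max_concurrency: int) -> List[HistoryRow]:
    """Run every query at each concurrency level from 2 up to the maximum."""
    if max_concurrency <= 2:
        raise ValueError("max_concurrency must be greater than two")
    history: List[HistoryRow] = []
    for level in range(2, max_concurrency + 1):   # vary the concurrency level
        for query in queries:
            plan = explain(query)                 # determine the execution plan
            features = extract_features(plan)     # extract query features
            elapsed = execute(query, level)       # execute and time the query
            history.append(HistoryRow(features, level, elapsed))
    return history


# Stand-ins for a real database (assumptions for illustration only).
FAKE_PLANS = {
    "q_small": {"est_rows": 1000, "num_joins": 0},
    "q_big": {"est_rows": 5000, "num_joins": 2},
}


def fake_explain(query: str) -> Dict[str, float]:
    return FAKE_PLANS[query]


def fake_execute(query: str, level: int) -> float:
    # Made-up cost: time grows with estimated rows and with concurrency.
    return FAKE_PLANS[query]["est_rows"] / 1000 * level


history = generate_history(list(FAKE_PLANS), fake_explain, fake_execute,
                           max_concurrency=4)
```

Each `HistoryRow` pairs plan features and a concurrency level with an observed time, which is the shape of training data the abstract says the machine learning model is built from.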

First claim

Opening claim text (preview).

1. A system for predicting execution times of concurrent queries, the system comprising: a processor to: iteratively generate historic data for creating a machine learning model by: varying a concurrency level of query executions in a database; determining a query execution plan for a pending concurrent query; extracting a plurality of query features from the query execution plan; and executing the pending concurrent query to determine a query execution time; create the machine learning model based on the plurality of query features, variation in the concurrency level, and the query execution time; and use the machine learning model to generate an execution schedule for a plurality of production queries, wherein the execution schedule satisfies service level agreements of the plurality of production queries.

2. The system of claim 1, wherein the processor uses the machine learning model to generate the execution schedule for the plurality of production queries by: matching one of the plurality of production queries to a subset of the plurality of query features; determining a predicted execution time for the one of the plurality of production queries based on the subset; and determining an execution order for the plurality of production queries based on the predicted execution time.

3. The system of claim 2, wherein the processor is further to: identify significant features of the plurality of features that are statistically used more often in production, wherein the subset includes the significant features.

4. The system of claim 1, wherein the processor is further to: determine a production query execution plan for each of the plurality of production queries; extract a plurality of production query features from each of the production query execution plan; execute each of the plurality of production queries to determine a production query execution time; update the machine learning model based on the plurality of production query features and the production query execution time of each of the plurality of production queries.

5. The system of claim 1, wherein the concurrency level is in a range of two to a maximum value greater than two, wherein each value in the range is iteratively used as the concurrency level to generate the historic data.

6. The system of claim 1, wherein the machine learning model is created using a boosted trees technique that generates a group of decision trees based on the plurality of query features, variation in the concurrency level, and the query execution time.

7. A method for predicting execution times of concurrent queries, comprising: receiving historic data associated with a database for creating a machine learning model, wherein the historic data includes query execution times for training queries that have been iteratively executed at varying concurrency levels and a plurality of query features that have been extracted from query execution plans of the training queries; using a boosted trees technique to create the machine learning model based on the plurality of query features, the varying concurrency levels, and the query execution times; and using the machine learning model to generate an execution schedule for a plurality of production queries, wherein the execution schedule satisfies service level agreements of the plurality of production queries.

8. The method of claim 7, wherein using the machine learning model to generate the execution schedule for the plurality of production queries comprises: matching one of the plurality of production queries to a subset of the plurality of query features; determining a predicted execution time for the one of the plurality of production queries based on the subset; and determining an execution order for the plurality of production queries based on the predicted execution time.

9. The method of claim 8, further comprising: identifying significant features of the plurality of features that are statistically used more often in production, wherein the subset includes the significant features.

10. The method of claim 7, further comprising: determining a production query execution plan for each of the plurality of production queries; extracting a plurality of production query features from each of the production query execution plan; executing each of the plurality of production queries to determine a production query execution time; updating the machine learning model based on the plurality of production query features and the production query execution time of each of the plurality of production queries.

11. The method of claim 7, wherein the varying concurrency levels are in a range of two to a maximum value greater than two, wherein each value in the range has been iteratively used to generate the historic data.

12. A non-transitory machine-readable storage medium encoded with instructions executable by a processor for predicting execution times of concurrent queries, the machine-readable storage medium comprising instructions to: iteratively generate historic data for creating a machine learning model by: varying a concurrency level of query executions in a database, wherein the concurrency level is iteratively varied to values in a range of two to a maximum value greater than two; determining a query execution plan for a pending concurrent query; extracting a plurality of query features from the query execution plan; and executing the pending concurrent query to determine a query execution time; create the machine learning model based on the plurality of query features, variation in the concurrency level, and the query execution time; and use the machine learning model to generate an execution schedule for a plurality of production queries, wherein the execution schedule satisfies service level agreements of the plurality of production queries.

13. The non-transitory machine-readable storage medium of claim 12, wherein using the machine learning model to generate the execution schedule for the plurality of production queries comprises: matching one of the plurality of production queries to a subset of the plurality of query features; determining a predicted execution time for the one of the plurality of production queries based on the subset; and determining an execution order for the plurality of production queries based on the predicted execution time.

14. The non-transitory machine-readable storage medium of claim 13, further comprising instructions to: identify significant features of the plurality of features that are statistically used more often in production, wherein the subset includes the significant features.

15. The non-transitory machine-readable storage medium of claim 12, further comprising instructions to: determine a production query execution plan for each of the plurality of production queries; extract a plurality of production query features from each of the production query execution plan; execute each of the plurality of production queries to determine a production query execution time; update the machine learning model based on the plurality of production query features and the production query execution time of each of the plurality of production queries.
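To make the boosted-trees and scheduling language in the claims more concrete, here is a small sketch under several assumptions. It boosts one-split regression trees (stumps) with a squared-error loss, which is only one simple member of the boosted-trees family and is not claimed to be the method used in the patent. The feature rows `[est_rows, concurrency]`, the toy training times, and the names `fit_stump`, `fit_boosted`, `predict`, and `schedule` are all hypothetical. Each SLA is modeled as a completion deadline in seconds, and the scheduler orders queries earliest deadline first and accumulates predicted times as if the queries ran back to back, a deliberate simplification of concurrent execution.

```python
from typing import List, Optional, Tuple

# A stump is (feature index, threshold, value if <= threshold, value otherwise).
Stump = Tuple[int, float, float, float]


def fit_stump(X: List[List[float]], r: List[float]) -> Optional[Stump]:
    """Find the single split that best fits residuals r by least squares."""
    best: Optional[Tuple[float, Stump]] = None
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for thr in values[:-1]:  # split between distinct feature values
            left = [ri for row, ri in zip(X, r) if row[j] <= thr]
            right = [ri for row, ri in zip(X, r) if row[j] > thr]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((v - lm) ** 2 for v in left)
                   + sum((v - rm) ** 2 for v in right))
            if best is None or sse < best[0]:
                best = (sse, (j, thr, lm, rm))
    return best[1] if best else None


def stump_predict(s: Stump, row: List[float]) -> float:
    j, thr, left_value, right_value = s
    return left_value if row[j] <= thr else right_value


def fit_boosted(X, y, rounds: int = 50, lr: float = 0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        s = fit_stump(X, residuals)
        if s is None:  # no feature has two distinct values to split on
            break
        stumps.append(s)
        preds = [p + lr * stump_predict(s, row) for p, row in zip(preds, X)]
    return base, stumps, lr


def predict(model, row: List[float]) -> float:
    base, stumps, lr = model
    return base + lr * sum(stump_predict(s, row) for s in stumps)


def schedule(model, queries):
    """queries: (name, feature row, SLA deadline in seconds) tuples."""
    order = sorted(queries, key=lambda q: q[2])  # earliest deadline first
    clock, plan, ok = 0.0, [], True
    for name, row, deadline in order:
        clock += predict(model, row)             # predicted finish time
        plan.append((name, clock))
        ok = ok and clock <= deadline
    return plan, ok


# Toy history (made-up numbers): rows are [est_rows, concurrency level].
X = [[1000.0, 2.0], [1000.0, 4.0], [5000.0, 2.0], [5000.0, 4.0]]
y = [1.0, 2.0, 5.0, 6.0]
model = fit_boosted(X, y)

plan, meets_slas = schedule(model, [
    ("report", [5000.0, 2.0], 10.0),
    ("lookup", [1000.0, 2.0], 2.0),
])
```

In this toy run the short `lookup` query is placed ahead of `report` because its deadline is tighter, and `meets_slas` reports whether every predicted finish time falls within its deadline. A production system would more likely rely on an established gradient-boosting library and a scheduler that accounts for queries overlapping in time.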

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016203404A1 cover?
Example embodiments relate to predicting execution times of concurrent queries. In example embodiments, historical data is iteratively generated for a machine learning model by varying a concurrency level of query executions in a database, determining a query execution plan for a pending concurrent query, extracting query features from the query execution plan, and executing the pending concurr…
Who is the assignee on this patent?
Hewlett Packard Entpr Dev Lp
What technology area does this patent fall under?
Primary CPC classification G06N5/04. Mapped technology areas include Physics.
When was this patent published?
Publication date Jul 14, 2016 (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).