System-wide query optimization

US11249997B1 · US · B1

Patent metadata
FieldValue
Publication numberUS-11249997-B1
Application numberUS-201815933140-A
CountryUS
Kind codeB1
Filing dateMar 22, 2018
Priority dateNov 30, 2012
Publication dateFeb 15, 2022
Grant dateFeb 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A locally optimized plan for executing a command using a sequence of steps can be determined for a single computing node. However, the locally optimized sequence of steps may not be optimized for a combined system comprising multiple computing nodes, any one of which may be tasked with executing the command. A plan that is optimized for the combined system may be determined by comparing the predicted cost of locally optimized plans for computing nodes in the combined system.

First claim

Opening claim text (preview).

What is claimed: 1. A system, comprising: a first computing device, comprising a hardware processor and at least one memory, that, in response to receiving information indicative of a query of a data set, determines a first plan for executing the query on a first partition of the data set, based at least in part on a first set of statistics, the first set of statistics based on performance of the first computing device; a second computing device that, in response to receiving information indicative of the query, determines a second plan for executing the query on a second partition of the data set, based at least in part on a second set of statistics different than the first set of statistics, the second set of statistics based on performance of the second computing device; and a third computing device that determines, by comparing a first cost of executing the first plan to a second cost of executing the second plan, that the first plan should be used for executing the query on both the first partition and the second partition and causes the second computing device to execute the query using the first plan. 2. The system of claim 1 , wherein the first computing device comprises a first database that accesses the first partition and the second computing device comprises a second database that accesses the second partition. 3. The system of claim 1 , wherein the third computing device determines that the first plan should be used by comparing a first cost of executing the first plan to a second cost of executing the second plan. 4. The system of claim 3 , wherein the comparing comprises calculating an adjusted first cost of executing based at least in part on a characteristic of the first computing device. 5. The system of claim 1 , wherein the second computing device receives information indicative of the first plan and executes the query on the second partition using the first plan. 6. The system of claim 1 , wherein the query is a parameterized query. 7. The system of claim 1 , wherein the third computing device is configured to determine whether the first plan or the second plan should be used to execute the query based on a metric collected during a prior execution of the query. 8. A non-transitory computer-readable storage medium comprising processor-executable instructions that, in response to execution by at least one processor of a computing device, cause the computing device to at least: obtain a first plan for executing a query on a first partition of a data set on a first computing node, the first plan based at least in part on a first set of statistics associated with performance of the first computing node; obtain a second plan for executing the query on a second partition of the data set on a second computing node, the second plan based at least in part on a second set of statistics associated with performance of the second computing node; determine, based on comparison of a predicted cost of executing the first plan with a predicted cost of executing the second plan, that the first plan is preferred for executing the query; and send the first plan for executing a query to the second computing node. 9. The non-transitory computer-readable storage medium of claim 8 , wherein the processor-executable instructions that cause the computing device to send the first plan for executing the query to the second computing node further cause the computing device to transmit the first plan to a client application. 10. The non-transitory computer-readable storage medium of claim 8 , wherein the processor-executable instructions that cause the computing device to send the first plan for executing the query to the second computing node further cause the computing device to transmit the first plan and metadata to a user, the metadata comprising one or more of a storage device, queue, or messaging system. 11. The non-transitory computer-readable storage medium of claim 8 , wherein the processor-executable instructions that cause the computing device to send the first plan for executing the query to the second computing node further cause the computing device to send suggestions for modifying the query to cause a query optimizer to produce a third plan for executing the query consistent with the first plan. 12. The non-transitory computer-readable storage medium of claim 8 , wherein the processor-executable instructions that cause the computing device to send the first plan for executing the query to the second computing node further cause the computing device to send suggestions to route the query to a specific computing device. 13. A computer-implemented method comprising: determining, based at least in part on statistics associated with performance of a first computing device, a first plan for executing a query on a first partition of a data set; determining, based at least in part on statistics associated with performance of a second computing device, a second plan for executing the query on a second partition of the data set; determining, based at least in part on a comparison of a first predicted cost of executing the first plan with a second predicted cost of executing the second plan, that the first plan is preferred for executing the query; and causing the second computing device to execute the query on the second partition using the first plan. 14. The computer-implemented method of claim 13 , wherein the first computing device comprises a first database that accesses the first partition and the second computing device comprises a second database that accesses the second partition. 15. The computer-implemented method of claim 13 , wherein the first predicted cost is adjusted based on the statistics associated with the first computing device and on performance characteristics of the first computing device. 16. The computer-implemented method of claim 13 , wherein the comparison is based on weighing characteristics of the first computing device and of the second computing device. 17. The computer-implemented method of claim 15 , wherein the comparison is executed by a third computing device and in response to the third computing device being idle. 18. The computer-implemented method of claim 13 , wherein determining the first plan is further based on a received hint, the hint indicating a type of scan as preferable for a specified query. 19. The computer-implemented method of claim 13 , wherein determining the first plan is based on a hardware configuration changing at the first computing device. 20. The computer-implemented method of claim 19 , wherein determining the second plan is based on at least one of: the hardware configuration changing at the first computing device; and receiving a notification that the first plan indicated a change in cost relative to a previous plan utilized by the first computing device.

Assignees

Inventors

Classifications

  • Distributed queries · CPC title

  • Unary operations; Data partitioning operations · CPC title

  • Plan optimisation · CPC title

  • Selectivity estimation or determination · CPC title

  • Access plan code generation and invalidation; Reuse of access plans · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11249997B1 cover?
A locally optimized plan for executing a command using a sequence of steps can be determined for a single computing node. However, the locally optimized sequence of steps may not be optimized for a combined system comprising multiple computing nodes, any one of which may be tasked with executing the command. A plan that is optimized for the combined system may be determined by comparing the pre…
Who is the assignee on this patent?
Amazon Tech Inc
What technology area does this patent fall under?
Primary CPC classification G06F16/24542. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Feb 15 2022 00:00:00 GMT+0000 (Coordinated Universal Time) (B1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).