Systems and methods for a distributed query execution engine

US10176236B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10176236-B2
Application numberUS-201514807807-A
CountryUS
Kind codeB2
Filing dateJul 23, 2015
Priority dateJul 29, 2014
Publication dateJan 8, 2019
Grant dateJan 8, 2019

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to receive at least one database query to be executed. Code corresponding to the at least one database query can be generated. One or more optimizations to the generated code can be performed to produce specialized modular code. The one or more optimizations can include Just-In-Time (JIT) compilation techniques. Respective portions of the code can be distributed to a plurality of distributed computing systems for execution, wherein each of the distributed computing systems is connected to a portion of the plurality of distributed computing systems. A result for the at least one database query can be provided.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method comprising: receiving, by a computing system that comprises one or more processors, a database query to be executed; generating, by the computing system, code corresponding to the database query; performing, by the computing system, one or more optimizations to the generated code to produce specialized modular code, the one or more optimizations including at least a Just-In-Time (JIT) compilation technique that modifies at least a portion of the generated code to tailor the portion of generated code to a particular distributed computing system of a plurality of distributed computing systems; distributing, by the computing system, respective portions of the modular code to the plurality of distributed computing systems for execution, each of the distributed computing systems being connected to a portion of the plurality of distributed computing systems, and wherein at least the modified portion of the modular code tailored to the particular computing system is distributed to the particular computing system for execution; and providing, by the computing system, a result for the database query. 2. The computer-implemented method of claim 1 , wherein the specialized modular code for the at least one database query is generated entirely on-the-fly. 3. The computer-implemented method of claim 1 , wherein performing, by the computing system, one or more optimizations to the generated code further comprises: performing, by the computing system, at least one of: register allocation, inlining, constant folding, loop strength reduction, or loop-invariant code motion. 4. The computer-implemented method of claim 1 , wherein performing, by the computing system, one or more optimizations to the generated code further comprises: combining, by the computing system, the generated code with at least a portion of pre-compiled code. 5. The computer-implemented method of claim 4 , wherein the code for the at least one database query is generated entirely on-the-fly and wherein the pre-compiled code is generated prior to receiving the at least one database query. 6. The computer-implemented method of claim 4 , wherein the pre-compiled code is customized based at least in part on the code generated for the database query. 7. The computer-implemented method of claim 4 , wherein the pre-compiled code is written in a high-level programming language. 8. The computer-implemented method of claim 4 , wherein the pre-compiled code, when executed, performs vectorized instructions (SIMD) with a variable number of operands, and wherein combining, by the computing system, the generated code with at least a portion of pre-compiled code further comprises: customizing, by the computing system, the vectorized instructions (SIMD) using one or more operands included in the database query. 9. The computer-implemented method of claim 1 , wherein execution of the code is based at least in part on one or more iterator-based execution nodes, wherein respective code associated with the one or more iterator-based execution nodes is able to be suppressed upon determining that the respective code does not need to be executed. 10. The computer-implemented method of claim 9 , wherein performing, by the computing system, the one or more optimizations to the generated code further comprises: performing, by the computing system, one or more levels of code optimization, wherein each level of optimization introduces a further specialization to the code, and wherein each level of optimization is cached. 11. The computer-implemented method of claim 9 , wherein at least one result associated with an iterator-based execution node resides in one or more processor registers and does not materialize into memory. 12. The computer-implemented method of claim 9 , further comprising: determining that the database query involves manipulation of one or more nested arrays; and evaluating the one or more nested arrays based at least in part on the one or more iterator-based execution nodes. 13. The computer-implemented method of claim 12 , wherein evaluating the one or more nested arrays further comprises: processing each array in the one or more nested arrays as a series of rows. 14. The computer-implemented method of claim 1 , wherein the computing system utilizes a caching mechanism that allows data items to be scheduled to the same distributed computing system across instances of a same query as the at least one database query. 15. The computer-implemented method of claim 1 , wherein the computing system utilizes a caching mechanism that allows data items to be scheduled to the same distributed computing system across instances of different queries. 16. The computer-implemented method of claim 1 , the method further comprising: determining, by the computing system, that at least one of the distributed computing systems are non-responsive; and re-distributing, by the computing system, respective portions of the optimized code corresponding to the at least one of the distributed computing system to one or more of the distributed computing systems that are responsive. 17. The computer-implemented method of claim 16 , wherein determining, by the computing system, that at least one of the distributed computing systems are non-responsive further comprises: determining, by the computing system, that at least one of the distributed computing systems are non-responsive while the at least one database query is being executed. 18. The computer-implemented method of claim 1 , distributing, by the computing system, respective portions of the optimized code to a plurality of distributed computing systems for execution further comprises: dispatching, by the computing system, the respective portions of the optimized code to a plurality of distributed computing systems based at least in part on a local chunked sort technique. 19. A system comprising: at least one processor; and a memory storing instructions that, when executed by the at least one processor, cause the system to perform: receiving a database query to be executed; generating code corresponding to the database query; performing one or more optimizations to the generated code, the one or more optimizations including at least a Just-In-Time (JIT) compilation technique that modifies at least a portion of the generated code to tailor the portion of generated code to a particular type of distributed computing system of a plurality of distributed computing systems types; distributing respective portions of the optimized code to a plurality of distributed computing systems for execution, each of the distributed computing systems being connected to a portion of the plurality of distributed computing systems, and wherein at least the modified portion of the generated code tailored to the particular type of computing system is distributed to a particular computing system of the particular type for execution; and providing a result for the database query. 20. A computer-readable hardware storage medium including instructions that, when executed by at least one processor of a computing system, cause the computing system to perform: receiving at least one database query to be executed; generating code corresponding to the at least one database query; performing one or more optimizations to the generated code, the one or more optimizations including at least a Just-In-Time (JIT) compilation technique that modifies at least a portion of the generated code to tailor th

Assignees

Inventors

Classifications

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10176236B2 cover?
Various embodiments of the present disclosure can include systems, methods, and non-transitory computer readable media configured to receive at least one database query to be executed. Code corresponding to the at least one database query can be generated. One or more optimizations to the generated code can be performed to produce specialized modular code. The one or more optimizations can incl…
Who is the assignee on this patent?
Microsoft Technology Licensing Llc
What technology area does this patent fall under?
Primary CPC classification G06F17/30545. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jan 08 2019 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).