Dynamic resource-based parallelization in distributed query execution frameworks

US10114825B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10114825-B2
Application number: US-201414212163-A
Country: US
Kind code: B2
Filing date: Mar 14, 2014
Priority date: Mar 14, 2014
Publication date: Oct 30, 2018
Grant date: Oct 30, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

As part of query processing within a distributed execution environment framework, available resources are taken into account when generating an execution plan and/or executing an execution plan to determine whether to parallelize any operations. Related apparatus, systems, methods and articles are also described.

First claim

Opening claim text (preview).

What is claimed is:

1. A method comprising:
    receiving, at runtime, by a database server from a remote application server, a query associated with a calculation scenario defining a data flow model, the data flow model comprising a plurality of calculation nodes defining a plurality operations to be executed at runtime by a calculation engine on the database server, the plurality of calculation nodes comprising a dynamic split operator, the dynamic split operator identifying at least one operation of the plurality of operations as available for parallelizing, the dynamic split operator further comprising a criterion to be evaluated to determine, based at least on an input of the at least one operation, a quantity of partitions for splitting the input, the at least one operation being performed on each partition of the input in parallel, the criterion being evaluated during an execution of an execution plan associated with the plurality of operation instead of during a generation of the execution plan, and the criterion quantifying an amount of available database resources necessary to allow parallelization of the at least one operation without negatively affecting other processes running on the database server;
    instantiating, by the database server, a runtime model of the calculation scenario based on the plurality of calculation nodes;
    generating, at runtime from the runtime model of the calculation scenario, the execution plan specifying how the plurality of operations are to be executed against a database managed by the database server, the runtime model comprising the dynamic split operator;
    executing, by the database server, the execution plan, the execution of the execution plan comprising:
        evaluating the criterion to at least determine, during the execution of the execution plan, the quantity of partitions for splitting the input, the criterion being evaluated based at least on the input to the at least operation; and
        splitting, during the execution of the execution plan, the input into a first partition and a second partition, the splitting being based at least on the quantity of partitions, the first partition and the second partition being operated upon by two or more parallel processor threads comprising the at least one operation; and
    providing, by the database server to the application server, the data set.

2. A method as in claim 1, wherein the fetched data characterizes a number of available processor threads.

3. A method as in claim 1, wherein the fetched data characterizes a number of available processor cores.

4. A method as in claim 1, wherein the fetched data characterizes an amount of available memory.

5. A method as in claim 1, wherein the fetched data characterizes an amount of available I/O bandwidth.

6. A method as in claim 1, wherein at least a portion of paths and/or attributes defined by the calculation scenario are not required to respond to the query, and wherein the instantiated calculation scenario omits the paths and attributes defined by the calculation scenario that are not required to respond to the query.

7. A method as in claim 1, wherein at least one of the calculation nodes filters results obtained from the database server.

8. A method as in claim 1, wherein at least one of the calculation nodes sorts results obtained from the database server.

9. A method as in claim 1, wherein the calculation scenario is instantiated in a calculation engine layer by the calculation engine.

10. A method as in claim 9, wherein the calculation engine layer interacts with a physical table pool and a logical layer, the physical table pool comprising physical tables containing data to be queried, and the logical layer defining a logical metamodel joining at least a portion of the physical tables in the physical table pool.

11. A method as in claim 9, wherein the calculation engine invokes an SQL processor for executing set operations.

12. A method as in claim 1, wherein an input for each calculation node comprises one or more of: a physical index, a join index, an OLAP index, and another calculation node.

13. A method as in claim 1, wherein the executing comprises: forwarding the query to a calculation node in the calculation scenario that is identified as a default node if the query does not specify a calculation node at which the query should be executed.

14. A method comprising:
    receiving, at runtime, by a database server from a remote application server, a query associated with a calculation scenario defining a data flow model, the data flow model comprising a plurality of calculation nodes defining a plurality operations to be executed at runtime by a calculation engine on the database server, the plurality of calculation nodes comprising a dynamic split operator, the dynamic split operator identifying at least one operation of the plurality of operations as available for parallelizing, the dynamic split operator further comprising a criterion to be evaluated to determine, based at least on an input of the at least one operation, a quantity of partitions for splitting the input, the at least one operation being performed on each partition of the input in parallel, the criterion being evaluated during an execution of an execution plan associated with the plurality of operation instead of during a generation of the execution plan, and the criterion quantifying an amount of available database resources necessary to allow parallelization of the at least one operation without negatively affecting other processes running on the database server;
    instantiating, by the database server, a runtime model of the calculation scenario based on the plurality of calculation nodes;
    generating, at runtime from the runtime model of the calculation scenario, the execution plan specifying how the plurality of operations are to be executed against a database managed by the database server, the runtime model comprising the dynamic split operator;
    executing, by the database server, the execution plan, the execution of the execution plan comprising:
        evaluating the criterion to at least determine, during the execution of the execution plan, the quantity of partitions for splitting the input, the criterion being evaluated based at least on the input to the at least operation; and
        splitting, during the execution of the execution plan, the input into a first partition and a second partition, the splitting being based at least on the quantity of partitions, the first partition and the second partition being operated upon by two or more parallel processor threads comprising the at least one operation; and
    providing, by the database server to the application server, the data set.

15. A method as in claim 14, wherein the fetched data characterizes a number of available processor threads.

16. A method as in claim 14, wherein the fetched data characterizes a number of available processor cores.

17. A method as in claim 14, wherein the fetched data characterizes an amount of available memory.

18. A method as in claim 1, wherein the fetched data characterizes an amount of available I/O bandwidth.

19. A method comprising: receiving, at runtime, by a database server from a remote application server, a query associated with a calculation scenario defining a data flow model, the data flow model comprising a plurality of calculation nodes defining a plurality operations to be executed at runtime by a calculation engine on the database server, the plurality of calculation nodes comprising a dynamic split operator, the dynamic split opera
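
Illustrative sketch (not part of the patent text). The execution-time behavior recited in claim 1, where a criterion is evaluated while the plan runs to pick a quantity of partitions and the operation is then applied to each partition in parallel, can be pictured roughly as follows. Everything named here is an assumption made for illustration: the Resources snapshot, the partition_count formula, and the thresholds min_rows_per_partition and bytes_per_row do not come from the patent, which does not specify a concrete formula. A Python thread pool stands in for the database server's parallel processor threads.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Resources:
    """Hypothetical snapshot of resources free at execution time."""
    free_threads: int
    free_memory_bytes: int


def partition_count(n_rows: int, res: Resources,
                    min_rows_per_partition: int = 1000,
                    bytes_per_row: int = 64) -> int:
    """Assumed criterion: choose a partition count from input size and free resources."""
    # Without enough memory headroom for the input, do not parallelize.
    if n_rows * bytes_per_row > res.free_memory_bytes:
        return 1
    # Each partition should carry at least min_rows_per_partition rows.
    by_size = n_rows // min_rows_per_partition
    # Never use more partitions than there are free threads.
    return max(1, min(by_size, res.free_threads))


def split(rows: Sequence, k: int) -> List[Sequence]:
    """Split rows into k contiguous partitions of near-equal size."""
    size, extra = divmod(len(rows), k)
    parts, start = [], 0
    for i in range(k):
        end = start + size + (1 if i < extra else 0)
        parts.append(rows[start:end])
        start = end
    return parts


def dynamic_split(rows: Sequence,
                  operation: Callable[[Sequence], list],
                  res: Resources) -> list:
    """Evaluate the criterion at execution time, then run the operation per partition."""
    k = partition_count(len(rows), res)
    if k == 1:
        return operation(rows)  # serial path: resources or input too small
    with ThreadPoolExecutor(max_workers=k) as pool:
        # pool.map preserves partition order, so results concatenate in input order.
        partials = list(pool.map(operation, split(rows, k)))
    return [row for part in partials for row in part]
```

For instance, calling dynamic_split with 5,000 rows, an operation that doubles each value, and Resources(free_threads=4, free_memory_bytes=10**9) would use four partitions under these assumed thresholds, while the same call with 500 rows would run serially. Because the count is computed inside dynamic_split rather than when the plan is built, the same plan adapts to whatever resources are free when it runs.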

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10114825B2 cover?
As part of query processing within a distributed execution environment framework, available resources are taken into account when generating an execution plan and/or executing an execution plan to determine whether to parallelize any operations. Related apparatus, systems, methods and articles are also described.
Who is the assignee on this patent?
Weyerhaeuser Christoph, Mindnich Tobias, Merx Johannes, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F16/10. Mapped technology areas include Physics.
When was this patent published?
Publication date Oct 30, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).