Split processing paths for a database calculation engine

US10146834B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10146834-B2
Application number  US-201414518593-A
Country             US
Kind code           B2
Filing date         Oct 20, 2014
Priority date       Dec 23, 2011
Publication date    Dec 4, 2018
Grant date          Dec 4, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned according to the partition specification. A separate processing path can be set for each partition, and execution of the calculation plan can continue using the separate processing paths, each of which can be assigned to a processing node of a plurality of available processing nodes.
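
The split-then-union flow described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the use of a thread pool to stand in for processing nodes, and the per-partition "total" computation are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def split_by_column(rows, column):
    """Group rows so that each unique value of `column` gets its own partition."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[column]].append(row)
    return dict(partitions)


def total_amount(key, rows):
    """Hypothetical per-partition work: sum an assumed 'amount' field."""
    return [{"key": key, "total": sum(r["amount"] for r in rows)}]


def run_split_union(rows, column, worker):
    """Split rows on `column`, run `worker` on each partition, union the results."""
    partitions = split_by_column(rows, column)
    # One worker thread per partition stands in for one processing node per path.
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        futures = [pool.submit(worker, key, part) for key, part in partitions.items()]
        intermediate = [f.result() for f in futures]
    # Union step: concatenate the intermediate results of all processing paths.
    return [row for result in intermediate for row in result]


rows = [
    {"region": "EU", "amount": 10},
    {"region": "US", "amount": 5},
    {"region": "EU", "amount": 7},
]
result = run_split_union(rows, "region", total_amount)
```

Here the number of partitions is discovered from the data at run time (two unique values of the assumed "region" column give two processing paths), which is the sense in which the split is dynamic.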

First claim

Opening claim text (preview).

What is claimed is:

1. A computer program product comprising a non-transitory machine-readable storage medium storing instructions that, when executed by at least one programmable processor, cause the at least one programmable processor to perform operations comprising: receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the calculation plan generated at runtime and tailored to both the query and a response to the query; applying, at the first node and in response to receiving the first data, a partition specification to a column in the first data, the applying comprising: determining, at the first node and in response to receiving the first data, a number of unique values in a column in the first data; splitting, at the first node and according to the partition specification, a first table into a number of partitions equal to the number of unique values in the column, wherein each unique value has a corresponding partition; and placing records of the first table having the same unique value together in a partition; assigning, at the first node, a separate processing path for each of the number of partitions, each separate processing path being assigned for generation, by a respective processing node of the plurality of processing nodes, of one or more intermediate results based on a respective partition of the number of partitions; executing, at a second node of the plurality of processing nodes, a union operation to union the generated one or more intermediate results, wherein the unioned one or more intermediate results comprise a union result; using the union result as a subsequent intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan; and returning the union result as a final result for the query if no additional execution operations are required by the calculation plan.

2. A computer program product as in claim 1, wherein the operations further comprise: initiating the query; receiving, by a recipient node of the plurality of processing nodes, a data request in response to the initiated query, wherein the data request is received directly from a requesting machine without being handled by the master node; identifying, by the recipient node, a target node of the plurality of processing nodes to handle the data request, the identifying comprising: applying partitioning information to determine one partition of the number of partitions to which the data request should be directed and mapping information associating each of the number of partitions with an assigned node of the plurality of processing nodes; and redirecting, by the recipient node, the data request to the target node, wherein the target node acts on the one partition in response to the data request to obtain at least a portion of the received data specified by the calculation plan.

3. A computer program product as in claim 2, wherein the operations further comprise: accessing the partitioning information and the mapping information from at least one of a local storage accessible to the recipient node and a metadata repository accessible to each of the plurality of processing nodes.

4. A system comprising: at least one programmable processor; and a machine-readable medium storing instructions that, when executed by the at least one processor, cause the at least one programmable processor to perform operations comprising: receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the calculation plan generated at runtime and tailored to both the query and a response to the query; applying, at the first node and in response to receiving the first data, a partition specification to a column in the first data, the applying comprising: determining, at the first node and in response to receiving the first data, a number of unique values in a column in the first data; splitting, at the first node and according to the partition specification, a first table into a number of partitions equal to the number of unique values in the column, wherein each unique value has a corresponding partition; and placing records of the first table having the same unique value together in a partition; assigning, at the first node, a separate processing path for each of the number of partitions, each separate processing path being assigned for generation, by a respective processing node of the plurality of processing nodes, of one or more intermediate results based on a respective partition of the number of partitions; executing, at a second node of the plurality of processing nodes, a union operation to union the generated one or more intermediate results, wherein the unioned one or more intermediate results comprise a union result; using the union result as a subsequent intermediate result by additional execution operations of the calculation plan if additional operations are required by the calculation plan; and returning the union result as a final result for the query if no additional execution operations are required by the calculation plan.

5. A system as in claim 4, wherein the operations further comprise: initiating the query; receiving, by a recipient node of the plurality of processing nodes, a data request in response to the initiated query, wherein the data request is received directly from a requesting machine without being handled by the master node; identifying, by the recipient node, a target node of the plurality of processing nodes to handle the data request, the identifying comprising: applying partitioning information to determine one partition of the number of partitions to which the data request should be directed and mapping information associating each of the number of partitions with an assigned node of the plurality of processing nodes; and redirecting, by the recipient node, the data request to the target node, wherein the target node acts on the one partition in response to the data request to obtain at least a portion of the received data specified by the calculation plan.

6. A system as in claim 5, wherein the operations further comprise: accessing the partitioning information and the mapping information from at least one of a local storage accessible to the recipient node and a metadata repository accessible to each of the plurality of processing nodes.

7. A method comprising: receiving, at a first node defined within a calculation model, first data, the calculation model including a master node and a plurality of processing nodes comprising the first node, the plurality of processing nodes being controlled by the master node, the first data comprising at least one of a first intermediate result of an earlier operation in a calculation plan generated from the calculation model or second data from a table upon which the calculation plan is currently operating in response to a query, the
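
The request redirection recited in claims 2 and 5 (a recipient node looks up the partition for a data request, then forwards the request to the node assigned to that partition) can be sketched as follows. The dictionary layouts for the partitioning information and mapping information, the node names, and the function names are hypothetical; the claims do not specify any of them.

```python
def find_partition(partitioning_info, request):
    """Apply the partitioning information to pick the partition for a request."""
    column = partitioning_info["column"]
    return partitioning_info["value_to_partition"][request[column]]


def route_request(recipient, request, partitioning_info, mapping_info):
    """Identify the target node for a request received directly by `recipient`."""
    partition = find_partition(partitioning_info, request)
    # Mapping information associates each partition with its assigned node.
    target = mapping_info[partition]
    return {
        "partition": partition,
        "target": target,
        # No redirect is needed when the recipient already holds the partition.
        "redirected": target != recipient,
    }


# Assumed metadata, e.g. read from local storage or a shared metadata repository.
partitioning_info = {"column": "region", "value_to_partition": {"EU": 0, "US": 1}}
mapping_info = {0: "node-a", 1: "node-b"}

decision = route_request("node-a", {"region": "US"}, partitioning_info, mapping_info)
```

In this sketch the master node is never consulted: the recipient resolves the target from metadata it can read itself, which mirrors the claim language about requests arriving "without being handled by the master node".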

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10146834B2 cover?
A dynamic split node defined within a calculation model can receive data being operated on by a calculation plan generated based on the calculation model. A partition specification can be applied to one or more reference columns in a table containing at least some of the received data. The applying can cause the table to be split such that a plurality of records in the table are partitioned acc…
Who is the assignee on this patent?
Baeumges Daniel, Bensberg Christian, Fricke Lars, and 1 more
What technology area does this patent fall under?
Primary CPC classification G06F17/30486. Mapped technology areas include Physics.
When was this patent published?
Publication date Dec 4, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).