Processing a database query using a shared metadata store

US11120022B2 · US · B2

Patent metadata
Publication number: US-11120022-B2
Application number: US-201816123981-A
Country: US
Kind code: B2
Filing date: Sep 6, 2018
Priority date: Feb 25, 2013
Publication date: Sep 14, 2021
Grant date: Sep 14, 2021

Abstract

Official abstract text for this publication.

A method and system for executing a query in parallel is disclosed. A master node may receive a query from a client and develop query plans from that query. The query plans may be forwarded to worker nodes for execution, and each query plan may be accompanied by query metadata. The metadata may be stored in a catalog on the master node.
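The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `CATALOG`, `QueryPlan`, `develop_plans`, `execute_plan`, and `run_query` are hypothetical, the "query" is reduced to summing one column, and worker nodes are simulated with threads. The point is only that each plan travels together with the metadata needed to execute it, so a worker does not have to consult the master's catalog.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# Hypothetical catalog held on the master node: table name -> column names.
CATALOG = {"orders": ["id", "amount"], "users": ["id", "name"]}


@dataclass(frozen=True)
class QueryPlan:
    table: str      # table this plan fragment reads
    segment: int    # index of the data slice assigned to one worker
    metadata: dict  # query metadata shipped together with the plan


def develop_plans(table: str, num_workers: int) -> list[QueryPlan]:
    """Master side: build one plan per worker, each carrying its metadata."""
    metadata = {table: list(CATALOG[table])}
    return [QueryPlan(table, seg, metadata) for seg in range(num_workers)]


def execute_plan(plan: QueryPlan, rows: list[tuple]) -> int:
    """Worker side: sum the 'amount' column of this worker's slice,
    resolving the column position from the metadata that came with the plan."""
    col = plan.metadata[plan.table].index("amount")
    return sum(row[col] for row in rows)


def run_query(table: str, segments: list[list[tuple]]) -> int:
    """Master side: dispatch plans in parallel and combine partial results."""
    plans = develop_plans(table, len(segments))
    with ThreadPoolExecutor(max_workers=len(segments)) as pool:
        partials = list(pool.map(execute_plan, plans, segments))
    return sum(partials)


if __name__ == "__main__":
    data = [[(1, 10), (2, 20)], [(3, 5)]]  # two simulated worker slices
    print(run_query("orders", data))
```

In a real parallel database the workers would be separate processes or hosts; threads are used here only to keep the sketch self-contained.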

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising:
  receiving a query at a master node, the master node having access to a database catalog that comprises metadata defining database objects;
  in response to receiving the query, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for a separate query that is not the same as the query received at the master node;
  communicating a query plan and query metadata to a worker node, wherein the query plan is generated based at least in part on the query, the query metadata includes metadata to be used in connection with execution of the query plan, the query metadata is obtained based at least in part on the snapshot of the metadata associated with the catalog server session, and the query metadata is communicated to the worker node contemporaneous with respect to the query plan;
  in response to receiving the query plan and the query metadata, determining, by the worker node, that additional metadata is required for the worker node to execute the query plan;
  in response to determining that additional metadata is required, requesting, by the worker node, the additional metadata, wherein the worker node queries a parent in a tree structure of a plurality of worker nodes in a parallel processing database system for the additional metadata, and the parent node is a node between the master node and the worker node in relation to the tree structure;
  receiving the additional metadata from another worker node, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
  executing the query plan on the worker node; and
  returning, to the master node, a result associated with the execution of the query plan on the worker node.

2. The method of claim 1, wherein the query metadata includes database table definitions that define database objects.

3. The method of claim 1, further comprising generating a plurality of query plans based at least in part on the query, the plurality of query plans comprising the query plan that is communicated to the worker node.

4. The method of claim 3, further comprising transmitting the plurality of query plans to the plurality of worker nodes.

5. The method of claim 4, further comprising executing the plurality of query plans in parallel.

6. The method of claim 1, further comprising storing the query metadata in a cache on the worker node.

7. The method of claim 6, further comprising clearing the cache after executing the query plan.

8. The method of claim 6, further comprising retrieving the query metadata from the cache while executing the query plan.

9. The method of claim 8, wherein the querying a parent in the tree structure of the plurality of worker nodes comprises successively querying one or more parent nodes for the additional metadata before querying the master node for the additional metadata.

10. The method of claim 9, wherein the successively querying one or more parent nodes for the additional metadata before querying the master node for the additional metadata comprises at least one of the one or more parent nodes forwarding a request for the additional metadata to another of the one or more parent nodes.

11. The method of claim 8, wherein the query metadata includes one or more of a user defined database function, a system defined database function, a database view, and a database index.

12. The method of claim 8, further comprising: receiving another query at the master node; in response to receiving the other query, initiating another catalog server session, taking, another snapshot of the metadata as the metadata existed when the other catalog server session is initiated, and associating the other snapshot of the metadata with the other catalog server session; and transmitting another query plan based on the other query and other query metadata to the worker node, wherein the other query metadata is retrieved from the other snapshot of the metadata associated with the other catalog server session.

13. The method of claim 8, further comprising: compiling the result associated with the execution of the query plan on the worker node with another result associated with the query; and returning the compiled result and other result as a final query result to a client.

14. The method of claim 8, wherein the query plan comprises the query metadata.

15. The method of claim 8, wherein the request for the additional metadata by the worker node is transmitted is a multicast request.

16. The method of claim 8, wherein the worker node maintains a list of other worker nodes, and the another worker node is selected from the list.

17. A computer program product for executing queries in a parallel processing database system, comprising a non-transitory computer readable medium having program instructions embodied therein for:
  receiving a query at a master node, the master node having access to a database catalog that comprises metadata defining database objects;
  in response to receiving the query, initiating a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for a separate query that is not the same as the query received at the master node;
  communicating a query plan and query metadata to a worker node, wherein the query plan is generated based at least in part on the query, the query metadata includes metadata to be used in connection with execution of the query plan, the query metadata is obtained based at least in part on the snapshot of the metadata associated with the catalog server session, and the query metadata is communicated to the worker node contemporaneous with respect to the query plan;
  in response to receiving the query plan and the query metadata, determining, by the worker node, that additional metadata is required for the worker node to execute the query plan;
  in response to determining that additional metadata is required, requesting, by the worker node, the additional metadata, wherein the worker node queries a parent in a tree structure of a plurality of worker nodes in a parallel processing database system for the additional metadata, and the parent node is a node between the master node and the worker node in relation to the tree structure;
  receiving the additional metadata from another worker node, wherein the additional metadata is retrieved from a same session as the catalog server session corresponding to the query;
  executing the query plan on the worker node; and
  returning, to the master node, a result associated with the execution of the query plan on the worker node.

18. A system for executing queries in a parallel processing database, comprising a non-transitory computer readable medium and a processor configured to: receive a query at a master node, the master node having access to a database catalog that comprises metadata defining database objects; in response to receiving the query, initiate a catalog server session, taking a snapshot of the metadata, and associating the snapshot of the metadata with the catalog server session, wherein a separate catalog server session is initiated and a separate snapshot of the metadata is taken for a separate query that is not the same as the query rece
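The session and lookup mechanics recited in claims 1, 6, 7, 9, and 10 can be sketched as follows. This is an illustrative model under stated assumptions, not the patented implementation: `CatalogServer`, `Node`, and all method names are hypothetical, the snapshot is a plain deep copy, and the node tree is a simple parent pointer. The sketch shows a per-query snapshot tied to a session id, a worker-side cache, and a request for missing metadata that is forwarded parent by parent until it reaches the master.

```python
import copy
import itertools


class CatalogServer:
    """Hypothetical master-side catalog that snapshots metadata per query."""

    def __init__(self, metadata):
        self.metadata = metadata          # live catalog, may change over time
        self.sessions = {}                # session id -> frozen snapshot
        self._ids = itertools.count(1)

    def open_session(self):
        """Start a session for one query and snapshot the catalog for it."""
        session_id = next(self._ids)
        self.sessions[session_id] = copy.deepcopy(self.metadata)
        return session_id

    def lookup(self, session_id, name):
        """Read metadata from the snapshot tied to this session only."""
        return self.sessions[session_id][name]


class Node:
    """A node in the tree; the master is the node that holds the catalog."""

    def __init__(self, name, parent=None, catalog=None):
        self.name = name
        self.parent = parent
        self.catalog = catalog            # set on the master node only
        self.cache = {}                   # (session id, object name) -> metadata

    def receive(self, session_id, query_metadata):
        """Cache the metadata that arrived together with a query plan."""
        for name, meta in query_metadata.items():
            self.cache[(session_id, name)] = meta

    def get_metadata(self, session_id, name):
        """Use the local cache first, then ask successive parents."""
        key = (session_id, name)
        if key in self.cache:
            return self.cache[key]
        if self.catalog is not None:      # reached the master node
            return self.catalog.lookup(session_id, name)
        if self.parent is None:
            raise KeyError(key)
        meta = self.parent.get_metadata(session_id, name)
        self.cache[key] = meta            # keep a copy for later requests
        return meta

    def clear_session(self, session_id):
        """Drop cached entries for a finished query."""
        self.cache = {k: v for k, v in self.cache.items() if k[0] != session_id}
```

A short usage example: two queries open two sessions, the catalog changes in between, and each lookup resolves against its own snapshot.

```python
catalog = CatalogServer({"orders": ["id", "amount"], "users": ["id", "name"]})
master = Node("master", catalog=catalog)
mid = Node("mid", parent=master)
leaf = Node("leaf", parent=mid)

s1 = catalog.open_session()
catalog.metadata["users"].append("email")   # catalog changes after s1 opened
s2 = catalog.open_session()

leaf.receive(s1, {"orders": ["id", "amount"]})
print(leaf.get_metadata(s1, "users"))  # resolved from the s1 snapshot via mid
print(leaf.get_metadata(s2, "users"))  # resolved from the s2 snapshot
leaf.clear_session(s1)
```

Keying the cache by session id is a design choice of this sketch; it keeps metadata from different queries apart so that a worker never mixes snapshots.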

Assignees

EMC IP Holding Co LLC

Inventors

Classifications

  • Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446)

  • Access plan code generation and invalidation; Reuse of access plans

  • Parallel file systems, i.e. file systems supporting multiple processors

  • File system administration, e.g. details of archiving or snapshots (error detection or correction of the data by redundancy in operations G06F11/14)

  • Database tuning (G06F16/2282 takes precedence; database performance monitoring G06F11/3409)

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11120022B2 cover?
A method and system for executing a query in parallel is disclosed. A master node may receive a query from a client and develop query plans from that query. The query plans may be forwarded to worker nodes for execution, and each query plan may be accompanied by query metadata. The metadata may be stored in a catalog on the master node.
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/24542. Mapped technology areas include Physics.
When was this patent published?
Publication date: Sep 14, 2021 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).