MxN dispatching in large scale distributed system

US10698891B2 · US · B2

Patent metadata
Field               Value
Publication number  US-10698891-B2
Application number  US-201715668861-A
Country             US
Kind code           B2
Filing date         Aug 4, 2017
Priority date       Feb 25, 2013
Publication date    Jun 30, 2020
Grant date          Jun 30, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

M×N dispatching in a large scale distributed system is disclosed. In various embodiments, a query is received. A query plan is generated to perform the query. A subset of query processing segments is selected, from a set of available query processing segments, to perform an assigned portion of the query plan. An assignment to perform the assigned portion of the query plan is dispatched to the selected subset of query processing segments.
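The flow in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Slice` type, the `segments_wanted` field, the segment names, and the trivial "take the first N" selection rule are all assumptions made for illustration. It only shows the shape of the idea, in which each portion of a plan is dispatched to its own subset of the available segments and the subsets may differ in size and may overlap.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slice:
    """One portion of a query plan (hypothetical structure)."""
    slice_id: int
    segments_wanted: int  # how many segments this portion should run on


def select_segments(available, count):
    """Pick `count` segments from the pool (all of them if the pool is smaller).

    Sorting only makes the choice deterministic; a real selector would
    weigh factors such as data locality and available resources.
    """
    return sorted(available)[:count]


def dispatch(plan, available):
    """Map each plan portion to the subset of segments chosen to run it."""
    return {s.slice_id: select_segments(available, s.segments_wanted) for s in plan}


# Two portions of one plan, each given a different number of segments.
plan = [Slice(slice_id=0, segments_wanted=3), Slice(slice_id=1, segments_wanted=1)]
available = {"seg-a", "seg-b", "seg-c", "seg-d"}
assignments = dispatch(plan, available)
print(assignments)
```

Here portion 0 runs on three of the four available segments and portion 1 on one, so the same pool serves portions of different widths.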

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving a query; generating, by a master node, a query plan to perform the query, wherein the generating of the query plan includes dividing the query plan into at least a first portion and a second portion, and wherein the master node comprises one or more hardware processors; selecting, by the master node, from a set of available query processing segments a first subset of query processing segments to perform a first assigned portion of the query plan corresponding to the first portion of the query plan, and a second subset of query processing segments to perform a second assigned portion of the query plan corresponding to the second portion of the query plan; and dispatching to the selected first subset of query processing segments an assignment to perform the first assigned portion of the query plan, wherein the dispatching of the assignment to perform the first assigned portion of the query plan includes providing to the selected first subset of query processing segments with corresponding metadata that is obtained from a central metadata store, wherein the metadata provided to the corresponding selected first subset of query processing segments is determined to be used by the selected first subset of query processing segments to perform the first assigned portion of the query plan.

2. The method of claim 1, wherein a first number of segments selected to perform the first portion of the query plan is dynamically determined according to one or both of (1) data locality of data corresponding to the first portion of the query plan associated with the query in relation to the first subset of query processing segments, and (2) available resources.

3. The method of claim 2, wherein the first subset of query processing segments is selected based at least in part on a co-locality of one or more of the selected query processing segments with data with which the assigned portion of the query plan is associated.

4. The method of claim 2, wherein the first number of segments is selected to perform the first portion of the query plan and a second number of segments, different from the first number, is selected to perform the second portion of the query plan.

5. The method of claim 1, wherein the first subset of query processing segments and the second subset of query processing segments include at least a first segment.

6. The method of claim 1, wherein the metadata to be used by the selected first subset of query processing segments to perform the first assigned portion of the query plan is embedded in a communication from the master node to the selected first subset of query processing segments that comprises the first assigned portion of the query plan.

7. The method of claim 1, wherein the metadata comprises data indicating where data required to perform first assigned portion of the query plan is located.

8. The method of claim 1, wherein a first number of segments selected to perform the first portion of the query plan is dynamically determined.

9. The method of claim 1, wherein the first subset of query processing segments executes a plurality of query execution threads.

10. The method of claim 1, wherein the dividing the query plan into at least a first portion and a second portion includes dividing the query plan into a plurality of independently executable slices.

11. The method of claim 1, wherein selecting the first subset of query processing segments includes receiving from a resource manager an indication of a degree of availability of processing segments included in the set of available query processing segments.

12. The method of claim 1, wherein the first subset of query processing segments is selected according to a determination for each of a plurality of portions of the query plan a corresponding number of segments to be assigned to perform that portion.

13. The method of claim 1, wherein the assignment comprises a network communication sent via a network interconnect.

14. The method of claim 1, wherein the assignment includes the metadata, wherein the metadata is embedded in the assignment and is to be used to perform one or more tasks associated with the assigned portion of the query plan.

15. The method of claim 1, wherein segments comprising the subset of query processing segments are associated with one or more segments hosts, each of which is configured to provide one or more processing segments.

16. The method of claim 1, wherein the assignment includes the metadata, wherein the metadata indicates a location, within a distributed storage layer, of data associated with the assignment.

17. The method of claim 16, wherein the location includes identification of a table of data to be search in connection with the assignment.

18. The method of claim 1, wherein the first subset of query processing segments comprises one or more query processing segments, and the second subset of query processing segments comprises one or more query processing segments.

19. A system, comprising: a communication interface; and one or more hardware processors coupled to the communication interface and configured to: receive a query; generate a query plan to perform the query, wherein the query plan is generated such that the query plan is divided into at least a first portion and a second portion; select from a set of available query processing segments a first subset of query processing segments to perform a first assigned portion of the query plan, corresponding to the first portion of the query plan, and a second subset of query processing segments to perform a second assigned portion of the query plan corresponding to the second portion of the query plan; and dispatch to the selected first subset of query processing segments, via the communication interface, an assignment to perform the first assigned portion of the query plan, wherein to dispatch the assignment to perform the first assigned portion of the query plan includes providing to the selected first subset of query processing segments with corresponding metadata that is obtained from a central metadata store, wherein the metadata provided to the corresponding selected first subset of query processing segments is determined to be used by the selected first subset of query processing segments to perform the first assigned portion of the query plan.

20. A computer program product embodied in a tangible, non-transitory computer readable storage means, comprising computer instructions for: receiving a query; generating a query plan to perform the query, wherein the generating of the query plan includes dividing the query plan into at least a first portion and a second portion; selecting from a set of available query processing segments a first subset of query processing segments to perform a first assigned portion of the query plan corresponding to the first portion of the query plan, and a second subset of query processing segments to perform a second assigned portion of the query plan corresponding to the second portion of the query plan; and dispatching to the selected first subset of query processing segments an assignment to perform the first assigned portion of the query plan, wherein the dispatching of the assignment to perform the first assigned portion of the query plan includes providing to the selected first subset of query processing segments with corresponding metadata that is obtained from a central metadata store, wherein the metadata provided to the corresponding selected first subset of
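Two ideas recur in the claims: segments are chosen with regard to data locality and available resources, and the metadata a segment needs travels inside the assignment itself. The sketch below illustrates both under stated assumptions. The `METADATA_STORE` dictionary, the `Assignment` type, the host and segment names, and the ranking rule are hypothetical and are not taken from the patent text.

```python
from dataclasses import dataclass

# Hypothetical central metadata store: table name -> data location and the
# hosts that hold the data. A real store would be a separate service.
METADATA_STORE = {
    "orders": {"location": "/warehouse/orders", "hosts": ["host1", "host2"]},
    "users": {"location": "/warehouse/users", "hosts": ["host3"]},
}


@dataclass
class Assignment:
    """A dispatched unit of work with its metadata embedded."""
    portion: str
    segments: list
    metadata: dict


def select_by_locality(segment_hosts, data_hosts, free_slots, wanted):
    """Choose up to `wanted` segments that have free capacity,
    preferring segments on hosts that also hold the data."""
    candidates = [s for s in segment_hosts if free_slots.get(s, 0) > 0]
    # False sorts before True, so co-located segments come first.
    candidates.sort(key=lambda s: (segment_hosts[s] not in data_hosts, s))
    return candidates[:wanted]


def build_assignment(portion, table, segment_hosts, free_slots, wanted):
    """Look up metadata once at the master and embed it in the assignment."""
    meta = METADATA_STORE[table]
    segments = select_by_locality(segment_hosts, set(meta["hosts"]), free_slots, wanted)
    return Assignment(portion, segments, {"table": table, "location": meta["location"]})


segment_hosts = {"seg1": "host1", "seg2": "host3", "seg3": "host2", "seg4": "host1"}
free_slots = {"seg1": 1, "seg2": 2, "seg3": 1, "seg4": 0}
assignment = build_assignment("scan orders", "orders", segment_hosts, free_slots, wanted=2)
print(assignment)
```

In this example `seg4` is skipped because it has no free capacity, and `seg2` is passed over because it is not on a host that holds the `orders` data. Because the location travels with the assignment, the chosen segments in this sketch do not need to query the metadata store themselves.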

Assignees

Inventors

Classifications

  • Parallel file systems, i.e. file systems supporting multiple processors · CPC title

  • Details of archiving (lifecycle management in storage systems G06F3/0649; point-in-time backing up or restoration of persistent data G06F11/1446) · CPC title

  • Plan optimisation · CPC title

  • of parallel queries · CPC title

  • Network streaming of media packets · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10698891B2 cover?
M×N dispatching in a large scale distributed system is disclosed. In various embodiments, a query is received. A query plan is generated to perform the query. A subset of query processing segments is selected, from a set of available query processing segments, to perform an assigned portion of the query plan. An assignment to perform the assigned portion of the query plan is dispatched to the s…
Who is the assignee on this patent?
EMC IP Holding Co LLC
What technology area does this patent fall under?
Primary CPC classification G06F16/24542. Mapped technology areas include Physics.
When was this patent published?
Publication date Jun 30, 2020 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).