System and methods for collaborative query processing for large scale data processing with software defined networking

US9367366B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9367366-B2
Application number: US-201414504434-A
Country: US
Kind code: B2
Filing date: Oct 2, 2014
Priority date: Mar 27, 2014
Publication date: Jun 14, 2016
Grant date: Jun 14, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A system includes a task scheduler that works collaboratively with a flow scheduler; a network-aware task scheduler based on software-defined network, the task scheduler scheduling tasks according to available network bandwidth.
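
The collaboration described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the patented implementation: the class names `FlowScheduler` and `TaskScheduler`, the method names, and the path identifiers `p1` and `p2` are all hypothetical, and the per-path bandwidth figures are supplied by hand rather than collected from switches.

```python
class FlowScheduler:
    """Hypothetical flow scheduler: tracks bandwidth per path and installs flows."""

    def __init__(self, path_bandwidth):
        # path id -> currently available bandwidth (assumed to be known already)
        self.path_bandwidth = dict(path_bandwidth)
        self.installed = []  # (task, path) requests that were fulfilled

    def available(self, path):
        return self.path_bandwidth[path]

    def schedule_flow(self, task, path):
        # A real system would program switches here; the sketch only records it.
        self.installed.append((task, path))


class TaskScheduler:
    """Hypothetical network-aware task scheduler that consults the flow scheduler."""

    def __init__(self, flow_scheduler):
        self.flows = flow_scheduler

    def on_node_request(self, candidates):
        # candidates: task name -> path its input data would travel over
        task, path = max(candidates.items(),
                         key=lambda kv: self.flows.available(kv[1]))
        self.flows.schedule_flow(task, path)  # ask for the flow on that path
        return task


fs = FlowScheduler({"p1": 40.0, "p2": 90.0})
ts = TaskScheduler(fs)
chosen = ts.on_node_request({"map-1": "p1", "map-2": "p2"})
```

Here the node that asks for work receives `map-2`, because its path currently has more available bandwidth, and the flow scheduler records the matching flow request.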

First claim

Opening claim text (preview).

What is claimed is:

1. A method executed by a processor for efficient execution of analytic queries, comprising: collecting network flow information from switches within a cluster; receiving analytic queries with a programming model for processing and generating large data sets with a parallel, distributed process on the cluster with collaborative software-defined networking; determining A(h) as available bandwidth for a hop on a path by determining from a capacity Cap competitive flows with Flow sharing one or more of the hops in a path p_Flow as:

$$A(h) = \mathrm{Cap} - \sum_{h \in H(p_{\mathrm{Flow}'}) \,\wedge\, \mathrm{Flow}' \in \{\mathrm{Flow} \setminus \mathrm{Flow}\}} \mathrm{Flow}' \cdot \mathrm{rate}$$

and wherein: A(h) denotes the Available bandwidth for the hop on the path; h denotes a member of all hops H; p denotes a specific path; Cap denotes a capacity; Flow denotes a flow; Flow′ denotes a competitive Flow; p_Flow denotes the current path; and scheduling, based on the available bandwidth A(h), candidate tasks for a node when a node asks for tasks.

2. The method of claim 1, wherein the programming model comprises MapReduce.

3. The method of claim 1, wherein the cluster comprises a Hadoop cluster.

4. The method of claim 1, comprising obtaining task properties including local or non-local network paths.

5. The method of claim 1, comprising obtaining available bandwidth for candidate non-local paths.

6. The method of claim 1, comprising selecting the best candidate task and scheduling the task to a node to execute.

7. The method of claim 1, comprising scheduling network flow to a selected path according to a selected task.

8. The method of claim 1, comprising collecting network flow information from switches within Hadoop cluster.

9. The method of claim 1, comprising scheduling network flow to a specific path using OpenFlow switches.

10. The method of claim 1, comprising updating network flow information from switches within the cluster.

11. The method of claim 1, comprising receiving a scheduling request for a flow to a specific path from the task scheduler and fulfilling the request using OpenFlow switches.

12. The method of claim 1, comprising Hadoop Map Reduce task scheduler works collaboratively with a flow scheduler.

13. The method of claim 1, comprising checking if the largest bandwidth is larger than a lower bound setting, and if not skipping scheduling tasks on the current node and otherwise selecting a task and sending command to a flow scheduler to schedule the flow for a specified path.

14. The method of claim 1, comprising network-aware task scheduler based on software-defined network, the task scheduler scheduling tasks according to available network bandwidth.

15. The method of claim 1, comprising communicating with an application-aware flow scheduler working collaboratively with a task scheduler.

16. The method of claim 1, comprising receiving flow schedule requests from a task scheduler for precise estimation of traffic demand.

17. The method of claim 1, comprising dynamically updating network information and reporting to a task scheduler.

18. A system including a memory for efficient execution of analytic queries, comprising: an application-aware flow scheduler stored in the memory; a Hadoop Map Reduce task scheduler stored in the memory working collaboratively with the flow scheduler, wherein the task scheduler is network-aware and based on a software-defined network, the task scheduler scheduling tasks according to available network bandwidth, wherein the flow scheduler receives flow schedule requests from the task scheduler and the flow scheduler dynamically updating the network information and reports to the task scheduler and determining A(h) as available bandwidth for a hop on a path by determining from a capacity Cap competitive flows with Flow sharing one or more of the hops in a path p_Flow as:

$$A(h) = \mathrm{Cap} - \sum_{h \in H(p_{\mathrm{Flow}'}) \,\wedge\, \mathrm{Flow}' \in \{\mathrm{Flow} \setminus \mathrm{Flow}\}} \mathrm{Flow}' \cdot \mathrm{rate}$$
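
One way to read the available-bandwidth formula and the lower-bound check of claim 13 is sketched below. It is a rough illustration under assumptions, not the claimed implementation. The sketch reads "Flow′ · rate" as the current rate of a competing flow, treats a path as limited by its tightest hop (the claims do not say how per-hop values are combined), and uses hypothetical names throughout (`Flow`, `available_bandwidth`, `path_bandwidth`, `pick_flow`, and the hop labels `a`, `b`, `c`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    name: str
    path: tuple   # hops the flow traverses, e.g. ("a", "b")
    rate: float   # current rate, in the same unit as the hop capacity


def available_bandwidth(hop, flow, all_flows, capacity):
    """A(h): hop capacity minus the rates of competing flows that cross the hop."""
    used = sum(f.rate for f in all_flows if f != flow and hop in f.path)
    return capacity[hop] - used


def path_bandwidth(flow, all_flows, capacity):
    """Assumption: the path is limited by its tightest hop."""
    return min(available_bandwidth(h, flow, all_flows, capacity) for h in flow.path)


def pick_flow(candidates, all_flows, capacity, lower_bound):
    """Return the candidate with the largest bandwidth, or None to skip the node."""
    best = max(candidates, key=lambda f: path_bandwidth(f, all_flows, capacity))
    if path_bandwidth(best, all_flows, capacity) > lower_bound:
        return best
    return None  # largest bandwidth is not above the lower bound: skip this node


capacity = {"a": 100.0, "b": 100.0, "c": 100.0}
running = [Flow("f1", ("a", "b"), 30.0), Flow("f2", ("b",), 50.0)]
c1 = Flow("c1", ("a", "b"), 0.0)   # candidate sharing hops with f1 and f2
c2 = Flow("c2", ("c",), 0.0)       # candidate on an idle hop

bw_c1 = path_bandwidth(c1, running, capacity)
best = pick_flow([c1, c2], running, capacity, lower_bound=10.0)
skipped = pick_flow([c1, c2], running, capacity, lower_bound=150.0)
```

With these made-up numbers, hop `b` leaves 20 units for `c1` (100 minus 30 and 50), so `c2` on the idle hop is preferred; raising the lower bound to 150 makes the scheduler skip the node.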

Assignees

Inventors

Classifications

  • G06F9/5066 · Primary

    Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs (mapping at compile time, see G06F8/451) · CPC title

  • Exploiting fine grain parallelism, i.e. parallelism at instruction level (run-time instruction scheduling G06F9/3836) · CPC title

  • Optimisation · CPC title

  • G06F9/52 · Primary

    Program synchronisation; Mutual exclusion, e.g. by means of semaphores · CPC title

  • Dependency analysis; Data or control flow analysis · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9367366B2 cover?
A system includes a task scheduler that works collaboratively with a flow scheduler; a network-aware task scheduler based on software-defined network, the task scheduler scheduling tasks according to available network bandwidth.
Who is the assignee on this patent?
NEC Lab America Inc, NEC Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/5066. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jun 14, 2016 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).