Method for scheduling data flow task and apparatus

US10558498B2 · US · B2

Patent metadata
Field: Value
Publication number: US-10558498-B2
Application number: US-201715598696-A
Country: US
Kind code: B2
Filing date: May 18, 2017
Priority date: Nov 19, 2014
Publication date: Feb 11, 2020
Grant date: Feb 11, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for scheduling a data flow task and an apparatus. The method includes: preprocessing a data flow task to obtain at least one subtask; classifying the subtask into a central processing unit (CPU) task group, a graphics processing unit (GPU) task group, or a to-be-determined task group; allocating the subtask to a working node; when the subtask belongs to the CPU task group, determining that a CPU executes the subtask; when the subtask belongs to the GPU task group, determining that a GPU executes the subtask; or when the subtask belongs to the to-be-determined task group, determining, according to costs of executing the subtask by a CPU and a GPU, a running platform (e.g., the CPU or the GPU) executes the subtask, where the cost includes duration of executing the subtask.
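
The flow summarized in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the names `Subtask`, `classify`, and `choose_platform`, the threshold values `low` and `high`, and the idea of adding queue waiting time to the cost are all assumptions made for illustration. The ratio test follows the wording of claims 2 and 3 (CPU duration divided by GPU duration), and the sketch assumes that the platform with the lower cost is the one selected.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Group(Enum):
    CPU = "cpu"
    GPU = "gpu"
    TBD = "to-be-determined"


@dataclass
class Subtask:
    name: str
    cpu_seconds: float            # estimated duration on the CPU
    gpu_seconds: float            # estimated duration on the GPU
    hint: Optional[Group] = None  # explicit indication information, if any


def classify(task: Subtask, low: float = 0.5, high: float = 2.0) -> Group:
    """Pick a task group. `low` and `high` are hypothetical preset thresholds."""
    # Indication information, when present, decides the group directly.
    if task.hint in (Group.CPU, Group.GPU):
        return task.hint
    # Otherwise compare the estimated durations (CPU time / GPU time).
    ratio = task.cpu_seconds / task.gpu_seconds
    if ratio < low:
        return Group.CPU
    if ratio > high:
        return Group.GPU
    return Group.TBD


def choose_platform(task: Subtask, group: Group,
                    cpu_wait: float = 0.0, gpu_wait: float = 0.0) -> str:
    """Return "CPU" or "GPU" for a subtask already placed on a working node."""
    if group is Group.CPU:
        return "CPU"
    if group is Group.GPU:
        return "GPU"
    # To-be-determined group: compare costs. The cost here is the execution
    # duration plus an assumed queue waiting time; lower cost wins (assumption).
    cpu_cost = cpu_wait + task.cpu_seconds
    gpu_cost = gpu_wait + task.gpu_seconds
    return "CPU" if cpu_cost <= gpu_cost else "GPU"
```

For example, a subtask estimated at 3 s on the CPU and 2 s on the GPU has a ratio of 1.5, so it lands in the to-be-determined group under these assumed thresholds; it then runs on the GPU unless the GPU queue is busy enough to make the CPU cheaper.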

First claim

Opening claim text (preview).

What is claimed is:

1. A method for scheduling a data flow task in a distributed heterogeneous system comprising at least one working node, the method comprising: preprocessing the data flow task to obtain at least one subtask; classifying a first subtask in the at least one subtask into a task group, wherein task groups of the distributed heterogeneous system comprise a central processing unit (CPU) task group, a graphics processing unit (GPU) task group, and a to-be-determined task group, wherein classifying a first subtask in the at least one subtask into a task group comprises: classifying the first subtask into the CPU task group when the first subtask comprises indication information indicating that the first subtask is executed by a CPU; and classifying the first subtask into the GPU task group when the first subtask comprises indication information indicating that the first subtask is executed by a GPU; allocating the first subtask to a first working node in the at least one working node according to the task group to which the first subtask belongs and a resource status of the at least one working node; determining that a CPU corresponding to the first working node executes the first subtask when the first subtask belongs to the CPU task group; determining that a GPU corresponding to the first working node executes the first subtask when the first subtask belongs to the GPU task group; and determining, according to costs of executing the first subtask by a CPU and a GPU, a running platform that executes the first subtask when the first subtask belongs to the to-be-determined task group, wherein each of the costs comprises duration of executing the first subtask.

2. The method according to claim 1, wherein classifying a first subtask in the at least one subtask into a task group comprises: estimating first duration of executing the first subtask by the CPU and second duration of executing the first subtask by the GPU when the first subtask does not comprise indication information, and classifying the first subtask into a task group according to the first duration and the second duration.

3. The method according to claim 2, wherein classifying the first subtask into a task group according to the first duration and the second duration comprises: classifying the first subtask into the CPU task group if a ratio of the first duration to the second duration is less than a first preset threshold; classifying the first subtask into the GPU task group if the ratio of the first duration to the second duration is greater than a second preset threshold; and classifying the first subtask into the to-be-determined task group if the ratio of the first duration to the second duration is not less than the first preset threshold and is not greater than the second preset threshold.

4. The method according to claim 1, further comprising: recording execution log information of the first subtask into a performance database, wherein the execution log information comprises a data volume of the first subtask, required waiting duration before the first subtask is executed, and a running platform and running duration of the first subtask.

5. The method according to claim 4, further comprising: querying the performance database, and calculating first average duration of executing subtasks by the CPU corresponding to the first working node and second average duration of executing the subtasks by the GPU corresponding to the first working node; and adjusting, according to the first average duration and the second average duration, subtasks distribution on the first working node.

6. A method for scheduling a data flow task in a distributed heterogeneous system comprising at least one working node, the method comprising: preprocessing the data flow task to obtain at least one subtask; classifying a first subtask in the at least one subtask into a task group, wherein task groups of the distributed heterogeneous system comprise a central processing unit (CPU) task group, a graphics processing unit (GPU) task group, and a to-be-determined task group; allocating the first subtask to a first working node in the at least one working node according to the task group to which the first subtask belongs and a resource status of the at least one working node; determining that a CPU corresponding to the first working node executes the first subtask when the first subtask belongs to the CPU task group; determining that a GPU corresponding to the first working node executes the first subtask when the first subtask belongs to the GPU task group; determining, according to costs of executing the first subtask by a CPU and a GPU, a running platform that executes the first subtask when the first subtask belongs to the to-be-determined task group, wherein each of the costs comprises duration of executing the subtask; calculating first average duration of executing subtasks by the CPU corresponding to the first working node and second average duration of executing the subtasks by the GPU corresponding to the first working node; and adjusting, according to the first average duration and the second average duration, subtasks distribution on the first working node; wherein adjusting, according to the first average duration and the second average duration, subtasks distribution on the first working node comprises: allocating first O subtasks that are in the to-be-determined task group on the first working node and that have greatest acceleration ratios to the GPU task group on the first working node if the first average duration is greater than the second average duration, wherein the acceleration ratio is a ratio of a time of executing a subtask by the CPU to a time of executing the subtask by the GPU; and allocating first P subtasks that are in the to-be-determined task group on the first working node and that have smallest acceleration ratios to the CPU task group on the first working node if the first average duration is less than the second average duration, wherein O and P are positive integers.

7. The method according to claim 5, wherein adjusting, according to the first average duration and the second average duration, subtasks distribution on the first working node comprises: allocating first O subtasks that are in the CPU task group on the first working node and that have greatest acceleration ratios to the to-be-determined task group on the first working node if the first average duration is greater than the second average duration, wherein the acceleration ratio is a ratio of a time of executing a subtask by the CPU to a time of executing the subtask by the GPU; and allocating first P subtasks that are in the GPU task group on the first working node and that have smallest acceleration ratios to the to-be-determined task group on the first working node if the first average duration is less than the second average duration, wherein O and P are positive integers.

8. An apparatus for use in a distributed heterogeneous system comprising at least one working node, the apparatus comprising: a memory configured to store instructions; and a processor coupled to the memory and configured to execute the instructions to: preprocess a data flow task to obtain at least one subtask; classify a first subtask in the at least one subtask into a task group, wherein task groups of the distributed heterogeneous system comprise a central processing unit (CPU) task group, a graphics processing unit (GPU) task group, and a to-be-determined task group, wherein to classify a first subtask in the at least one subtask into a task group comprises: classify the first subtask into the CPU task group when the first subtask comprises indication information indicating that the first subta
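
The feedback loop described in claims 4 to 6 (log each execution, compare average CPU and GPU durations on a node, then move to-be-determined subtasks by acceleration ratio) can be sketched as follows. This is only an illustration under stated assumptions: `LogEntry`, `Pending`, `average_duration`, and `rebalance` are hypothetical names, the performance database is modeled as a plain list, the average is taken over running duration only, and a single `count` stands in for both O and P.

```python
from dataclasses import dataclass
from statistics import mean
from typing import List


@dataclass
class LogEntry:
    data_volume: int     # size of the subtask's input (units are an assumption)
    wait_seconds: float  # waiting duration before execution started
    platform: str        # "CPU" or "GPU"
    run_seconds: float   # running duration


@dataclass
class Pending:
    name: str
    cpu_seconds: float   # estimated time on the CPU
    gpu_seconds: float   # estimated time on the GPU

    @property
    def acceleration_ratio(self) -> float:
        # Ratio of CPU execution time to GPU execution time.
        return self.cpu_seconds / self.gpu_seconds


def average_duration(log: List[LogEntry], platform: str) -> float:
    """Mean running duration on one platform; 0.0 if nothing was logged."""
    runs = [e.run_seconds for e in log if e.platform == platform]
    return mean(runs) if runs else 0.0


def rebalance(log: List[LogEntry], tbd: List[Pending],
              cpu_group: List[Pending], gpu_group: List[Pending],
              count: int = 1) -> List[Pending]:
    """Move up to `count` to-be-determined subtasks; returns the moved ones."""
    if count < 1:
        raise ValueError("count must be a positive integer")
    cpu_avg = average_duration(log, "CPU")
    gpu_avg = average_duration(log, "GPU")
    ranked = sorted(tbd, key=lambda t: t.acceleration_ratio)
    if cpu_avg > gpu_avg:
        moved = ranked[-count:]   # greatest acceleration ratios go to the GPU
        gpu_group.extend(moved)
    elif cpu_avg < gpu_avg:
        moved = ranked[:count]    # smallest acceleration ratios go to the CPU
        cpu_group.extend(moved)
    else:
        moved = []                # averages are equal: leave the groups alone
    for task in moved:
        tbd.remove(task)
    return moved
```

The sketch covers the claim 6 variant, which moves subtasks out of the to-be-determined group. Claim 7 describes the opposite direction (from the CPU or GPU group back into the to-be-determined group) and would need the source and destination lists swapped.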

Assignees

  • Huawei Tech Co Ltd

Inventors

Classifications

  • G06F9/5027 · Primary

    the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

  • Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • G06F9/5044 · Primary

    considering hardware capabilities · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10558498B2 cover?
A method for scheduling a data flow task and an apparatus. The method includes: preprocessing a data flow task to obtain at least one subtask; classifying the subtask into a central processing unit (CPU) task group, a graphics processing unit (GPU) task group, or a to-be-determined task group; allocating the subtask to a working node; when the subtask belongs to the CPU task group, determining …
Who is the assignee on this patent?
Huawei Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/5027. Mapped technology areas include Physics.
When was this patent published?
Publication date: Feb 11, 2020 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).