Scheduling homogeneous and heterogeneous workloads with runtime elasticity in a parallel processing environment

US2017139749A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2017139749-A1
Application number: US-201715418828-A
Country: US
Kind code: A1
Filing date: Jan 30, 2017
Priority date: May 20, 2013
Publication date: May 18, 2017
Grant date: (none listed)

Abstract

Official abstract text for this publication.

Systems and methods are provided for scheduling homogeneous workloads including batch jobs, and heterogeneous workloads including batch and dedicated jobs, with run-time elasticity wherein resource requirements for a given job can change during run-time execution of the job.

First claim

Opening claim text (preview).

What is claimed is:

1. A computing system, comprising: a memory device to store program instructions for scheduling jobs in a HPC (high-performance computing) system; and a processor coupled to the memory, wherein the processor executes the program instructions stored in the memory to cause the computing system to perform a method comprising: maintaining a batch jobs queue to temporarily store batch jobs received by a HPC (high-performance computing) system; performing a scheduling process cycle at a given time to schedule one or more batch jobs pending in the batch jobs queue for execution by the HPC system, wherein performing the scheduling process cycle comprises: determining an available processor capacity of the HPC system at the given time; determining an assigned processor capacity for executing a head batch job in the batch jobs queue; determining a number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system; scheduling the head batch job for execution by the HPC system at the given time when (i) the assigned processor capacity for executing the head batch job is less than or equal to the available processor capacity of the HPC system and (ii) the number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system reaches a predetermined skip count threshold; skipping a scheduling of the head batch job for execution by the HPC system at the given time when (i) the assigned processor capacity for executing the head batch job is less than or equal to the available processor capacity of the HPC system, (ii) the number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system is less than the predetermined skip count threshold, and (iii) a set of one or more batch jobs exists in the batch jobs queue which can be scheduled for execution at the given time to maximize utilization of the processor capacity of the HPC system without scheduling execution of the head batch job; and scheduling a future time for executing the head batch job by the HPC system when the assigned processor capacity for executing the head batch job exceeds the available processor capacity of the HPC system.

2. The computing system of claim 1, wherein the assigned processor capacity for executing the head batch job, and the number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system comprise parameters that are stored in association with the head batch job in the batch jobs queue.

3. The computing system of claim 1, wherein skipping a scheduling of the head batch job for execution by the HPC system at the given time comprises: determining the set of one or more batch jobs in the batch jobs queue which can be scheduled for execution at the given time based on an assigned processor capacity of each of the batch jobs in the batch jobs queue; and increasing by one, the number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system.

4. The computing system of claim 1, wherein performing the scheduling process cycle further comprises: scheduling the head batch job for execution by the HPC system at the given time along with one or more additional batch jobs in the batch jobs queue when (i) the assigned processor capacity for executing the head batch job is less than or equal to the available processor capacity of the HPC system, (ii) the number of previous scheduling process cycles that the head batch job was skipped and not scheduled for execution by the HPC system is less than the predetermined skip count threshold, and (iii) the one or more additional batch jobs can be scheduled for execution at the given time along with the head batch job to maximize utilization of the processor capacity of the HPC system.

5. The computing system of claim 1, wherein scheduling the future time for executing the head batch job by the HPC system when the assigned processor capacity for executing the head batch job exceeds the available processor capacity of the HPC system comprises: making a reservation time for executing the head batch job at the future time based on a remaining execution time of each active job being executed in the HPC system; and selecting a set of one or more batch jobs in the batch jobs queue which can be scheduled for execution before the reservation time of the head batch job.

6. The computing system of claim 1, wherein making the reservation time for executing the head batch job comprises: accessing a list of active jobs in which all active jobs executing in the HPC system are sorted starting from an active job with a smallest remaining execution time to an active job with a largest remaining execution time; utilizing the list of active jobs to determine a set of active jobs, starting from the active job with the smallest remaining execution time, which will result in a sufficient amount of available processor capacity for the head batch job when execution of each active job in the set of active jobs is finished; computing a first value by adding to the given time a remaining execution time of an active job in the set of active jobs which has a greatest remaining execution time; computing a second value as a sum of (i) the available processor capacity of the HPC system at the given time and (ii) a total of each assigned processor capacity of each active job in the set of active jobs, less the assigned processor capacity for the head batch job; for each batch job in the batch jobs queue with an assigned processor capacity that is less than or equal to the available processor capacity of the HPC system at the given time, computing a third value which represents a total processor capacity of the HPC system that is required by the batch job at the computed first value; and making a reservation time for executing the head batch job based on the computed second value and the computed third value of each batch job.

7. The computing system of claim 6, wherein the third value of a given batch job is set equal to 0 when the given time plus an estimated execution time of the given batch job is less than the first value, otherwise the third value of a given batch job is set equal to an assigned processor capacity for executing the given batch job.

8. The computing system of claim 1, wherein performing the scheduling process cycle comprises commencing the scheduling process cycle in response to a triggering event.

9. The computing system of claim 8, wherein the triggering event comprises an arrival of a new batch job in the batch jobs queue or termination of an executing batch job in the HPC system.

10. The computing system of claim 8, wherein the triggering event comprises an arrival of a command that triggers a change in an estimated execution time of a batch job that is pending in the batch jobs queue or an active batch job that is executing in the HPC system.

11. A computing system, comprising: a memory device to store program instructions for scheduling jobs in a HPC (high-performance computing) system; and a processor coupled to the memory, wherein the processor executes the program instructions stored in the memory to cause the computing system to perform a method comprising: maintaining a batch jobs queue to temporarily store batch jobs received by a HPC (high-performance computing) system; maintaining a dedicated jobs queue to temporarily store dedicated jobs received by the HPC system; performing a scheduling process cycle at a giv
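
The three outcomes in claim 1 (schedule, skip, or reserve a future time for the head batch job) can be sketched as a single decision function. This is a minimal illustration, not the patented implementation: the names `BatchJob`, `best_fill`, and `schedule_cycle` are hypothetical, the exhaustive subset search stands in for whatever utilization-maximizing selection the full description uses, and the rule "skip only when the best packing without the head job uses strictly more capacity than the best packing with it" is an assumed reading of condition (iii).

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass
class BatchJob:
    name: str
    capacity: int        # assigned processor capacity (stored with the job, claim 2)
    skip_count: int = 0  # previous cycles in which this job was skipped (claim 2)


def best_fill(jobs: List[BatchJob], available: int) -> Tuple[BatchJob, ...]:
    """Return the subset of jobs that uses the most capacity without exceeding it.

    Brute force and exponential in len(jobs); an assumption for illustration.
    """
    best: Tuple[BatchJob, ...] = ()
    best_used = 0
    for size in range(1, len(jobs) + 1):
        for combo in combinations(jobs, size):
            used = sum(job.capacity for job in combo)
            if best_used < used <= available:
                best, best_used = combo, used
    return best


def schedule_cycle(queue: List[BatchJob], available: int,
                   skip_threshold: int) -> Tuple[str, List[str]]:
    """Return (decision for the head job, names of jobs started this cycle)."""
    if not queue:
        return "idle", []
    head, rest = queue[0], queue[1:]

    # Head job does not fit: claim 1 schedules a future time for it instead.
    if head.capacity > available:
        return "reserve", []

    # Best packing that includes the head job (claim 4 style co-scheduling).
    with_head = (head,) + best_fill(rest, available - head.capacity)

    # Skip count reached the threshold: the head job must run now.
    if head.skip_count >= skip_threshold:
        return "run", [job.name for job in with_head]

    # Assumed reading of condition (iii): skip only if leaving the head job
    # out yields strictly higher utilization.
    without_head = best_fill(rest, available)
    used_with = sum(job.capacity for job in with_head)
    used_without = sum(job.capacity for job in without_head)
    if used_without > used_with:
        head.skip_count += 1  # claim 3: increase the skip count by one
        return "skip", [job.name for job in without_head]
    return "run", [job.name for job in with_head]


if __name__ == "__main__":
    queue = [BatchJob("A", 6), BatchJob("B", 5), BatchJob("C", 5)]
    print(schedule_cycle(queue, available=10, skip_threshold=2))
    # → ('skip', ['B', 'C'])
```

With 10 free processors, running B and C together uses all 10, while running A leaves 4 idle, so the head job A is skipped and its skip count rises to 1. Once the count reaches the threshold, the same call returns `('run', ['A'])`, which bounds how long the head job can be passed over.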

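The first, second, and third values recited in claims 6 and 7 can be worked through with a small example. The sketch below is a hedged illustration only: `ActiveJob`, `QueuedJob`, `reservation_values`, and `third_values` are hypothetical names, and the claim preview does not say how the final reservation is derived from the second and third values, so that step is deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ActiveJob:
    capacity: int     # processors currently held by the running job
    remaining: float  # remaining execution time


@dataclass
class QueuedJob:
    name: str
    capacity: int     # assigned processor capacity
    estimate: float   # estimated execution time


def reservation_values(now: float, available: int, head_capacity: int,
                       active: List[ActiveJob]) -> Tuple[float, int]:
    """Return (first value, second value) as described in claim 6."""
    freed = 0
    latest = 0.0
    # Walk active jobs from smallest to largest remaining execution time
    # until their completion frees enough capacity for the head job.
    for job in sorted(active, key=lambda j: j.remaining):
        if available + freed >= head_capacity:
            break
        freed += job.capacity
        latest = job.remaining  # greatest remaining time in the set so far
    if available + freed < head_capacity:
        raise ValueError("head job exceeds total capacity of the system")
    first = now + latest                        # time the head job can start
    second = available + freed - head_capacity  # capacity left over at that time
    return first, second


def third_values(queue: List[QueuedJob], now: float, available: int,
                 first: float) -> Dict[str, int]:
    """Third value per claim 7, for queued jobs that fit in the capacity free now."""
    values: Dict[str, int] = {}
    for job in queue:
        if job.capacity > available:
            continue  # claim 6 only considers jobs that fit at the given time
        finishes_before = now + job.estimate < first
        values[job.name] = 0 if finishes_before else job.capacity
    return values


if __name__ == "__main__":
    running = [ActiveJob(4, 30.0), ActiveJob(3, 10.0), ActiveJob(5, 50.0)]
    first, second = reservation_values(100.0, available=2, head_capacity=8,
                                       active=running)
    print(first, second)  # → 130.0 1
```

In this example the head job needs 8 processors and only 2 are free at time 100. The two shortest running jobs free 7 more by time 130, so the first value is 130 and the second value is 2 + 7 - 8 = 1. A queued 2-processor job estimated at 20 time units ends before 130 and gets a third value of 0, while one estimated at 45 would still be running at 130 and gets a third value of 2.
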
Assignees

IBM

Inventors

Classifications

  • G06F9/5038 (Primary)

    considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title

  • G06F9/4887 (Primary)

    involving deadlines, e.g. rate based, periodic · CPC title

  • G06F9/4881 (Primary)

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • Priority · CPC title

  • Resource availability · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2017139749A1 cover?
Systems and methods are provided for scheduling homogeneous workloads including batch jobs, and heterogeneous workloads including batch and dedicated jobs, with run-time elasticity wherein resource requirements for a given job can change during run-time execution of the job.
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/5038. Mapped technology areas include Physics.
When was this patent published?
Publication date: May 18, 2017 (kind code A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 3 related publications on this page (citations in our corpus or others sharing the same primary CPC).