Deep learning job scheduling method and system and related device

US11954521B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11954521-B2
Application number  US-202017038720-A
Country             US
Kind code           B2
Filing date         Sep 30, 2020
Priority date       Mar 30, 2018
Publication date    Apr 9, 2024
Grant date          Apr 9, 2024

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A deep learning job scheduling method includes obtaining a job request of a deep learning job, determining a target job description file template from a plurality of pre-stored job description file templates based on the job request, determining an identifier of a target job basic image from identifiers of a plurality of pre-stored job basic images based on the job request, generating a target job description file based on the target job description file template and the identifier of the target job basic image, sending the target job description file to a container scheduler, and selecting the target job basic image from the pre-stored job base images based on the target job description file, and creating at least one container for executing the job request.
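The flow described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation: the template contents, the image identifiers, the request keys (`library_type`, `job_type`, `job_name`, `command`, `replicas`) and the function name `build_description` are all hypothetical. The abstract only says the selection is "based on the job request"; the sketch assumes the lookup key is the pair of deep learning library type and job type, as recited in the first claim. Sending the result to a container scheduler is left out.

```python
from string import Template

# Hypothetical pre-stored job description file templates,
# keyed by (deep learning library type, job type).
TEMPLATES = {
    ("tensorflow", "training"): Template(
        "name: $job_name\nimage: $image\ncommand: $command\nreplicas: $replicas\n"
    ),
    ("mxnet", "inference"): Template(
        "name: $job_name\nimage: $image\ncommand: $command\n"
    ),
}

# Hypothetical identifiers of pre-stored job basic images, using the same keys.
IMAGE_IDS = {
    ("tensorflow", "training"): "tf-train-base:v1",
    ("mxnet", "inference"): "mxnet-infer-base:v1",
}


def build_description(request: dict) -> str:
    """Pick a template and an image identifier, then fill the template."""
    key = (request["library_type"], request["job_type"])
    template = TEMPLATES[key]   # target job description file template
    image_id = IMAGE_IDS[key]   # identifier of the target job basic image
    # Remaining request fields (job name, command, ...) fill the placeholders;
    # keys the template does not use are ignored by substitute().
    fields = {k: v for k, v in request.items()
              if k not in ("library_type", "job_type")}
    return template.substitute(fields, image=image_id)


if __name__ == "__main__":
    print(build_description({
        "library_type": "tensorflow",
        "job_type": "training",
        "job_name": "mnist",
        "command": "python train.py",
        "replicas": 2,
    }))
```

Keeping the templates and image identifiers in two tables with a shared key means a user only states which library and which kind of job they want; the description file the container scheduler needs is assembled for them.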

First claim

Opening claim text (preview).

What is claimed is:

1. A deep learning job scheduling method, comprising: obtaining a job request of a deep learning job comprising a deep learning library type and a job type; determining a target job description file template from a plurality of pre-stored job description file templates based on the deep learning library type and the job type; determining an identifier of a target job basic image from identifiers of a plurality of pre-stored job basic images based on the deep learning library type and the job type, wherein the pre-stored job basic images comprise an image of a deep learning library, an image of a dependency library, and an image of a deep learning program; generating a target job description file based on the target job description file template and the identifier; sending the target job description file to a container scheduler; selecting, by the container scheduler, the target job basic image from the pre-stored job basic images based on the target job description file; and creating a container for executing the job request.

2. The deep learning job scheduling method of claim 1, wherein the deep learning job comprises a task, wherein the job request further comprises one of the following two options: (1) at least one of a job name, a deep learning program storage location, an application boot file, a dataset storage location, a type of the task, a quantity of the task, a job command line parameter, or a resource requirement of the task, or (2) at least one of the job name, the deep learning program, the application boot file, the dataset storage location, the type, the quantity, the job command line parameter, or the resource requirement, and wherein the deep learning scheduling method further comprises generating the target job description file based on the job request, the target job description file template, and the identifier.

3. The deep learning job scheduling method of claim 2, further comprising filling the target job description file template with information comprised in the job request and the identifier to obtain the target job description file.

4. The deep learning job scheduling method of claim 1, wherein the dependency library is used when the deep learning job is executed, and wherein an instantiation of the deep learning program is the deep learning job.

5. The deep learning job scheduling method of claim 1, wherein the pre-stored job description file templates is based on deep learning library types and job types, wherein each of the pre-stored job description file templates corresponds to one deep learning library type and one job type, wherein the pre-stored job basic images are based on the deep learning library types and the job types, and wherein each of the pre-stored job basic images corresponds to one deep learning library type and one job type.

6. The deep learning job scheduling method of claim 1, wherein after sending the target job description file to the container scheduler, the deep learning job scheduling method further comprises storing a job identifier indicating the job request in a queue when the container scheduler fails in scheduling, wherein the job identifier comprises at least one of the job request, information comprised in the job request, the target job description file, a pointer, or a data structure, wherein the pointer points to at least one of the job request, the information comprised in the job request, and the target job description file, and wherein the data structure points to at least one of the job request, the information carried in the job request, and the target job description file.

7. The deep learning job scheduling method of claim 6, wherein after storing the job identifier, the deep learning job scheduling method further comprises: determining that the container scheduler has a condition for resubmitting a job request; extracting the job identifier from the queue; and resubmitting the job request to the container scheduler based on the job identifier.

8. A deep learning job scheduling system, comprising: one or more processors; a memory coupled to the one or more processors and configured to store instructions that, when executed by the one or more processors, cause a job scheduler to: obtain a job request of a deep learning job comprising a deep learning library type and a job type; determine a target job description file template from a plurality of pre-stored job description file templates based on the deep learning library type and the job type; determine an identifier of a target job basic image from identifiers of a plurality of pre-stored job basic images based on the deep learning library type and the job type, wherein the pre-stored job basic images comprise an image of a deep learning library, an image of a dependency library, and an image of a deep learning program; generate a target job description file based on the target job description file template and the identifier; and send the target job description file; and a container scheduler coupled to the job scheduler and configured to: receive the target job description file from the job scheduler; select the target job basic image from the pre-stored job basic images based on the target job description file; and create at least one container for executing the job request.

9. The deep learning job scheduling system of claim 8, wherein the deep learning job comprises a task, wherein the job request further comprises one of the following two options: (1) at least one of a job name, a deep learning program storage location, an application boot file, a dataset storage location, a type of the task, a quantity of the task, a job command line parameter, and a resource requirement of the task, or (2) at least one of the job name, the deep learning program, the application boot file, the dataset storage location, the type of the task, the quantity of the task, the job command line parameter, and the resource requirement, and wherein the job scheduler is further configured to generate the target job description file based on the job request, the target job description file template, and the identifier.

10. The deep learning job scheduling system of claim 9, wherein the one or more processors are further configured to execute the instructions to cause the job scheduler is to fill the target job description file template with information comprised in the job request and the identifier to obtain the target job description file.

11. The deep learning job scheduling system of claim 8, wherein the dependency library is used when the deep learning job is executed, and wherein an instantiation of the deep learning program is the deep learning job.

12. The deep learning job scheduling system of claim 8, wherein the plurality of pre-stored job description file templates is based on deep learning library types and job types, wherein each of the pre-stored job description file templates corresponds to one deep learning library type and one job type, wherein the pre-stored job basic images are based on the deep learning library types and the job types, and wherein each of the pre-stored job basic images corresponds to one deep learning library type and one job type.

13. The deep learning job scheduling system of claim 8, wherein the container scheduler is further configured to store a job identifier indicating the job request in a queue in response to the container scheduler fails in scheduling, wherein the job identifier comprises at least one of the job request, information comprised in the job request, the target job description file, a pointer, and a data structure, wherein the pointer points to at least
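The queue-and-resubmit behavior recited in the method claims (store a job identifier in a queue when scheduling fails, then extract it and resubmit once the scheduler can accept jobs again) could look roughly like the sketch below. Everything named here is an assumption made for illustration: the class `RetryingSubmitter`, the `submit` callback standing in for the container scheduler, and the choice to use the job description text itself as the job identifier (one of the options the claim lists). How the real system detects the "condition for resubmitting" is not stated in the preview, so the sketch simply leaves it to the caller to decide when to call `resubmit_pending`.

```python
from collections import deque
from typing import Callable, Deque


class RetryingSubmitter:
    """Queues job descriptions the scheduler rejected and retries them later."""

    def __init__(self, submit: Callable[[str], bool]) -> None:
        # `submit` is a hypothetical stand-in for the container scheduler:
        # it returns True if the job was scheduled, False otherwise.
        self._submit = submit
        self.pending: Deque[str] = deque()

    def send(self, description: str) -> bool:
        """Send a job description; queue it if scheduling fails."""
        ok = self._submit(description)
        if not ok:
            self.pending.append(description)
        return ok

    def resubmit_pending(self) -> int:
        """Retry each queued job once, in FIFO order; return how many succeeded."""
        resubmitted = 0
        for _ in range(len(self.pending)):
            description = self.pending.popleft()
            if self._submit(description):
                resubmitted += 1
            else:
                # Still cannot be scheduled: keep it for a later attempt.
                self.pending.append(description)
        return resubmitted


if __name__ == "__main__":
    free_slots = {"count": 0}

    def fake_scheduler(description: str) -> bool:
        # Toy scheduler: succeeds only while free slots remain.
        if free_slots["count"] > 0:
            free_slots["count"] -= 1
            return True
        return False

    submitter = RetryingSubmitter(fake_scheduler)
    submitter.send("job-a")           # fails, job-a is queued
    free_slots["count"] = 1           # resources become available
    print(submitter.resubmit_pending(), list(submitter.pending))
```

A FIFO queue is used here only because it is the simplest choice; the claims do not say in which order queued jobs are retried.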

Assignees

Huawei Cloud Computing Tech Co Ltd

Inventors

Classifications

  • G06F9/4881 (Primary)

    Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title

  • Techniques for rebalancing the load in a distributed system · CPC title

  • Learning methods · CPC title

  • Machine learning · CPC title

  • G06F9/5027 (Primary)

    the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11954521B2 cover?
A deep learning job scheduling method includes obtaining a job request of a deep learning job, determining a target job description file template from a plurality of pre-stored job description file templates based on the job request, determining an identifier of a target job basic image from identifiers of a plurality of pre-stored job basic images based on the job request, generating a target …
Who is the assignee on this patent?
Huawei Cloud Computing Tech Co Ltd
What technology area does this patent fall under?
Primary CPC classification G06F9/4881. Mapped technology areas include Physics.
When was this patent published?
Publication date: Apr 9, 2024 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 5 related publications on this page (citations in our corpus or others sharing the same primary CPC).