Coordinated, topology-aware CPU-GPU-memory scheduling for containerized workloads
US-10896064-B2 · Jan 19, 2021 · US
US12399745B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-12399745-B2 |
| Application number | US-202217744396-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 13, 2022 |
| Priority date | Jul 7, 2021 |
| Publication date | Aug 26, 2025 |
| Grant date | Aug 26, 2025 |
Official abstract text for this publication.
A resource scheduling method and apparatus, a device, and a storage medium are provided, relating to the field of computer technology, and in particular to the field of deep learning technology. The method includes: acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of the computing nodes in the cluster; and, in a case where a task request for applying for a GPU resource for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task. The present disclosure can optimize resource scheduling.
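The topology-acquisition step can be pictured with a small sketch. The claims on this page say that a GPU connection level is derived from the GPU connection manner and that the level represents transmission performance between GPUs; everything else here is an assumption made for illustration. The connection-manner labels, the numeric levels, the `NodeTopology` class, the `build_topology` function, and the choice to take a node's level as the best level among its GPU pairs are all hypothetical and do not come from the patent text.

```python
from dataclasses import dataclass, field

# Hypothetical connection manners mapped to connection levels.
# Assumption: a higher level means better GPU-to-GPU transmission performance.
CONNECTION_LEVEL = {
    "NVLINK": 3,
    "PCIE_SWITCH": 2,
    "HOST_BRIDGE": 1,
    "CROSS_SOCKET": 0,
}


@dataclass
class NodeTopology:
    name: str
    # frozenset({gpu_a, gpu_b}) -> connection level of that GPU pair
    pair_levels: dict = field(default_factory=dict)
    gpus: set = field(default_factory=set)

    @property
    def level(self) -> int:
        """Node-level connection level: best level of any GPU pair (0 if none)."""
        return max(self.pair_levels.values(), default=0)


def build_topology(connection_info, node_links=()):
    """Build a cluster topology from per-node GPU connection information.

    connection_info: node name -> list of (gpu_a, gpu_b, connection_manner)
    node_links: iterable of (node_a, node_b) pairs describing node connections
    """
    nodes = {}
    for name, links in connection_info.items():
        node = NodeTopology(name)
        for gpu_a, gpu_b, manner in links:
            # The connection manner determines the connection level.
            node.pair_levels[frozenset((gpu_a, gpu_b))] = CONNECTION_LEVEL[manner]
            node.gpus.update((gpu_a, gpu_b))
        nodes[name] = node
    return {"nodes": nodes, "node_links": set(node_links)}
```

In this sketch a scheduler would call `build_topology` once per refresh of the cluster state and then consult `nodes[name].level` when ranking computing nodes.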
Opening claim text (preview).
What is claimed is:

1. A resource scheduling method, comprising:

acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of computing nodes in the cluster;

in a case where a task request, for applying for a GPU resource, for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to task information of the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task, wherein the task information comprises a task type of the target task and/or a number of GPUs applied for; and

executing the target task by using the target GPU,

in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a lowest transmission performance as the target computing node;

in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a highest transmission performance as the target computing node;

in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and determining whether there are candidate computing nodes with GPU resources that are able to satisfy the target task in the cluster, and determining the target computing node according to whether the candidate computing nodes exist;

wherein after determining the target computing node of the target task, a GPU connection level selected for the target task is written into an annotation of the target task, and determining the target GPU in the target computing node according to the task information of the task request and the GPU topology relationship comprises: reading the GPU connection level selected for the target task; and selecting a corresponding GPU as the target GPU according to the GPU connection level and the GPU topology relationship,

wherein the GPU connection information comprises a GPU connection relationship and a GPU connection manner, and the acquiring the GPU topology relationship of the cluster according to the GPU connection information of each of the computing nodes in the cluster, comprises: acquiring the GPU connection manner, the GPU connection relationship, and a node connection relationship of each of the computing nodes in the cluster; determining a GPU connection level of each of the computing nodes according to the GPU connection manner of each of the computing nodes; and obtaining the GPU topology relationship according to the GPU connection level, the GPU connection relationship, and the node connection relationship of each of the computing nodes, wherein the GPU connection level is used to represent a transmission performance between GPUs.

2. The method of claim 1, further comprising: in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, when a traversal result indicates that there are candidate computing nodes with GPU resources that are able to satisfy the target task, selecting a computing node with a highest transmission performance from the candidate computing nodes as the target computing node.

3. The method of claim 1, further comprising: in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, when a traversal result indicates that there are no candidate computing nodes with GPU resources that are able to satisfy the target task, for each task in the target task, selecting a computing node, in which a number of GPUs is able to satisfy the task, from the computing nodes as the target computing node, according to a descending order of transmission performances of the computing nodes.

4. The method of claim 1, wherein the determining the target GPU in the target computing node according to the task request and the GPU topology relationship, further comprises: acquiring a GPU connection level and a GPU connection relationship of the target computing node according to the GPU topology relationship; and selecting the target GPU from GPUs comprised in the target computing node according to the GPU connection level of the target task, and the GPU connection level and the GPU connection relationship of the target computing node.

5. An electronic device, comprising: at least one processor; and a memory connected communicatively to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform operations of:

acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of computing nodes in the cluster;

in a case where a task request, for applying for a GPU resource, for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to task information of the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task, wherein the task information comprises a task type of the target task and/or a number of GPUs applied for; and

executing the target task by using the target GPU,

in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a lowest transmission performance as the target computing node;

in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a highest transmission performance as the target computing node;

in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and determining whether there are candidate computing nodes with GPU resources that are able to satisfy the target task in the cluster, and determining the target computing node according to whether the candidate computing nodes exist;

wherein after determining the target computing node of the target task, a GPU connection level selected for the target task is written into an annotation of the target task, and determining the target GPU in the target computing node according to the task information of the task request and the GPU topology relationship comprises: reading the GPU connection level selected for the target task; and selecting a corresponding GPU as the target GPU according to the GPU connection level and the GPU topology relationship,

wherein the GPU connection information comprises a GPU connection relationship and a GPU connection manner, and the acquiring the GPU topology relationship of the cluster according to the GPU connection information of each of the computing nodes in the cluster, comprises: acquiring the GPU connection manner, the GPU connection relationship, and a node connecti
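The node-selection branches of claims 1 to 3 and the annotation-driven GPU pick can be read as the following minimal sketch. It is one interpretation, not the patented implementation. The `Node` class, the `select_nodes` and `select_gpus` helpers, the `"gpu-connection-level"` annotation key, the modeling of a multi-task request as a list of per-subtask GPU counts, and the rule that chosen GPUs must be pairwise connected at or above the annotated level are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    name: str
    level: int        # assumed: higher = better GPU-to-GPU transmission performance
    free_gpus: list   # ids of unallocated GPUs on this node
    pair_levels: dict = field(default_factory=dict)  # frozenset({a, b}) -> level


def select_nodes(nodes, task_type, subtasks):
    """Pick one node per subtask; `subtasks` lists the GPU count each one needs.

    Returns a list of Node objects, or None if the request cannot be placed.
    """
    total = sum(subtasks)
    if task_type == "single":
        fits = [n for n in nodes if len(n.free_gpus) >= total]
        if not fits:
            return None
        # One GPU: node with the lowest transmission performance.
        # More than one GPU: node with the highest transmission performance.
        pick = min if total == 1 else max
        return [pick(fits, key=lambda n: n.level)]
    # Multi-task: look for candidate nodes that satisfy the whole request.
    candidates = [n for n in nodes if len(n.free_gpus) >= total]
    if candidates:
        best = max(candidates, key=lambda n: n.level)
        return [best] * len(subtasks)
    # No candidate: place each subtask in descending order of node performance.
    remaining = {n.name: len(n.free_gpus) for n in nodes}
    ordered = sorted(nodes, key=lambda n: n.level, reverse=True)
    placement = []
    for need in subtasks:
        node = next((n for n in ordered if remaining[n.name] >= need), None)
        if node is None:
            return None
        remaining[node.name] -= need
        placement.append(node)
    return placement


def select_gpus(node, annotations, count):
    """Read the connection level from the task annotation and pick free GPUs."""
    level = annotations["gpu-connection-level"]  # hypothetical annotation key
    if count == 1:
        return node.free_gpus[:1]
    for combo in combinations(node.free_gpus, count):
        # Assumption: every pair in the chosen set must meet the annotated level.
        if all(node.pair_levels.get(frozenset(pair), 0) >= level
               for pair in combinations(combo, 2)):
            return list(combo)
    return None
```

In use, a scheduler following this sketch would call `select_nodes`, write the chosen node's level into the task's annotations, and later call `select_gpus` with those annotations on the target node. The exhaustive search over GPU combinations is only reasonable for the small GPU counts found on a single node.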
considering hardware capabilities · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title
Distributed learning, e.g. federated learning · CPC title
Offload · CPC title