Who is the assignee on this patent?

Beijing Baidu Netcom Sci & Tech Co Ltd

What technology area does this patent fall under?

Primary CPC classification G06F9/5044. Mapped technology areas include Physics.

When was this patent published?

Publication date: Aug 26, 2025 (kind code B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).

Resource scheduling method, device, and storage medium

US12399745B2 · US · B2

Patent metadata
Field	Value
Publication number	US-12399745-B2
Application number	US-202217744396-A
Country	US
Kind code	B2
Filing date	May 13, 2022
Priority date	Jul 7, 2021
Publication date	Aug 26, 2025
Grant date	Aug 26, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title: What the patent document calls the invention.
Abstract: A short plain-language summary of the technical disclosure.
Assignees and inventors: Who owns or filed the patent and who is credited as inventor.
Key dates: Filing, priority, publication, and grant dates set the timeline.
First independent claim: The legal scope of protection. Read this for what is actually claimed.
CPC / IPC classifications: Technology tags used to group this patent with similar filings.
Citations and related patents: Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A resource scheduling method and apparatus, a device, and a storage medium are provided, and relates to the field of computer technology, and in particular to the field of deep learning technology. The method includes: acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of computing nodes in the cluster; and in a case where a task request, for applying for a GPU resource, for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task. The present disclosure can optimize the resource scheduling.
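The first step in the abstract, deriving a cluster-wide GPU topology from each node's connection information, could be sketched roughly as follows. This is an illustration only: the connection manners (`NVLINK`, `PCIE_SWITCH`, `CPU_SOCKET`), the numeric levels assigned to them, the rule that a node's level is its best link level, and every class and function name are assumptions of this sketch, not details given on this page.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from GPU connection manner to connection level.
# Assumption of this sketch: a lower level means better transmission performance.
CONNECTION_LEVEL = {"NVLINK": 0, "PCIE_SWITCH": 1, "CPU_SOCKET": 2}


@dataclass
class NodeTopology:
    name: str
    # (gpu_a, gpu_b) -> connection level between the two GPUs
    gpu_links: dict = field(default_factory=dict)
    # names of directly connected computing nodes
    neighbors: list = field(default_factory=list)

    @property
    def level(self):
        """Node-level connection level: the best (lowest) level among its GPU links."""
        worst = max(CONNECTION_LEVEL.values())
        return min(self.gpu_links.values(), default=worst)


def build_topology(connection_info):
    """Build {node name: NodeTopology} from raw per-node connection information."""
    topology = {}
    for node, info in connection_info.items():
        links = {
            tuple(sorted((a, b))): CONNECTION_LEVEL[manner]
            for a, b, manner in info["gpu_links"]
        }
        topology[node] = NodeTopology(node, links, list(info.get("neighbors", [])))
    return topology


# Made-up example input: two nodes with different interconnects.
connection_info = {
    "node-a": {
        "gpu_links": [(0, 1, "NVLINK"), (2, 3, "PCIE_SWITCH")],
        "neighbors": ["node-b"],
    },
    "node-b": {
        "gpu_links": [(0, 1, "CPU_SOCKET")],
        "neighbors": ["node-a"],
    },
}
topology = build_topology(connection_info)
```

Keeping the per-link levels alongside the node-level summary lets a scheduler first rank nodes and then pick specific GPUs inside the chosen node.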

First claim

Opening claim text (preview).

What is claimed is:

1. A resource scheduling method, comprising:
acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of computing nodes in the cluster;
in a case where a task request, for applying for a GPU resource, for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to task information of the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task, wherein the task information comprises a task type of the target task and/or a number of GPUs applied for; and
executing the target task by using the target GPU,
in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a lowest transmission performance as the target computing node;
in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a highest transmission performance as the target computing node;
in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and determining whether there are candidate computing nodes with GPU resources that are able to satisfy the target task in the cluster, and determining the target computing node according to whether the candidate computing nodes exist;
wherein after determining the target computing node of the target task, a GPU connection level selected for the target task is written into an annotation of the target task, and determining the target GPU in the target computing node according to the task information of the task request and the GPU topology relationship comprises:
reading the GPU connection level selected for the target task; and
selecting a corresponding GPU as the target GPU according to the GPU connection level and the GPU topology relationship,
wherein the GPU connection information comprises a GPU connection relationship and a GPU connection manner, and the acquiring the GPU topology relationship of the cluster according to the GPU connection information of each of the computing nodes in the cluster, comprises:
acquiring the GPU connection manner, the GPU connection relationship, and a node connection relationship of each of the computing nodes in the cluster;
determining a GPU connection level of each of the computing nodes according to the GPU connection manner of each of the computing nodes; and
obtaining the GPU topology relationship according to the GPU connection level, the GPU connection relationship, and the node connection relationship of each of the computing nodes, wherein the GPU connection level is used to represent a transmission performance between GPUs.

2. The method of claim 1, further comprising:
in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, when a traversal result indicates that there are candidate computing nodes with GPU resources that are able to satisfy the target task, selecting a computing node with a highest transmission performance from the candidate computing nodes as the target computing node.

3. The method of claim 1, further comprising:
in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, when a traversal result indicates that there are no candidate computing nodes with GPU resources that are able to satisfy the target task, for each task in the target task, selecting a computing node, in which a number of GPUs is able to satisfy the task, from the computing nodes as the target computing node, according to a descending order of transmission performances of the computing nodes.

4. The method of claim 1, wherein the determining the target GPU in the target computing node according to the task request and the GPU topology relationship, further comprises:
acquiring a GPU connection level and a GPU connection relationship of the target computing node according to the GPU topology relationship; and
selecting the target GPU from GPUs comprised in the target computing node according to the GPU connection level of the target task, and the GPU connection level and the GPU connection relationship of the target computing node.

5. An electronic device, comprising:
at least one processor; and
a memory connected communicatively to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform operations of:
acquiring a graphics processing unit (GPU) topology relationship of a cluster according to GPU connection information of each of computing nodes in the cluster;
in a case where a task request, for applying for a GPU resource, for a target task is received, determining a target computing node of the target task and a target GPU in the target computing node according to task information of the task request and the GPU topology relationship, to complete GPU resource scheduling of the target task, wherein the task information comprises a task type of the target task and/or a number of GPUs applied for; and
executing the target task by using the target GPU,
in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a lowest transmission performance as the target computing node;
in a case where the task type of the target task comprised in the task information is single-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and selecting a computing node with a highest transmission performance as the target computing node;
in a case where the task type of the target task comprised in the task information is multi-task and the number of GPUs applied for is larger than 1, traversing each of the computing nodes according to the GPU topology relationship, and determining whether there are candidate computing nodes with GPU resources that are able to satisfy the target task in the cluster, and determining the target computing node according to whether the candidate computing nodes exist;
wherein after determining the target computing node of the target task, a GPU connection level selected for the target task is written into an annotation of the target task, and determining the target GPU in the target computing node according to the task information of the task request and the GPU topology relationship comprises:
reading the GPU connection level selected for the target task; and
selecting a corresponding GPU as the target GPU according to the GPU connection level and the GPU topology relationship,
wherein the GPU connection information comprises a GPU connection relationship and a GPU connection manner, and the acquiring the GPU topology relationship of the cluster according to the GPU connection information of each of the computing nodes in the cluster, comprises:
acquiring the GPU connection manner, the GPU connection relationship, and a node connecti…
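The node-selection cases in claims 1 to 3 can be read as a small decision procedure. The sketch below is one possible interpretation, not the patented implementation. It assumes each node exposes a single numeric `performance` score (higher meaning faster GPU-to-GPU links) and a count of free GPUs, and it models a multi-task request as `num_tasks` tasks that each need `gpus_per_task` GPUs. All names and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_gpus: int
    performance: int  # hypothetical score: higher = better transmission performance


def select_nodes(nodes, task_type, gpus_per_task, num_tasks=1):
    """Return one node name per task, or [] if the request cannot be placed."""
    if task_type == "single":
        fitting = [n for n in nodes if n.free_gpus >= gpus_per_task]
        if not fitting:
            return []
        if gpus_per_task == 1:
            # One GPU needs no fast links: take the lowest-performance node.
            pick = min(fitting, key=lambda n: n.performance)
        else:
            # Several GPUs in one task: take the highest-performance node.
            pick = max(fitting, key=lambda n: n.performance)
        return [pick.name]

    # Multi-task: first look for candidate nodes that can host the whole request.
    total = gpus_per_task * num_tasks
    candidates = [n for n in nodes if n.free_gpus >= total]
    if candidates:
        best = max(candidates, key=lambda n: n.performance)
        return [best.name] * num_tasks

    # No single node fits: place each task separately, trying nodes
    # in descending order of transmission performance.
    free = {n.name: n.free_gpus for n in nodes}
    ranked = sorted(nodes, key=lambda n: n.performance, reverse=True)
    placement = []
    for _ in range(num_tasks):
        for n in ranked:
            if free[n.name] >= gpus_per_task:
                free[n.name] -= gpus_per_task
                placement.append(n.name)
                break
        else:
            return []  # at least one task could not be placed
    return placement


# Made-up cluster for illustration.
nodes = [Node("a", 4, 3), Node("b", 2, 1), Node("c", 8, 2)]

print(select_nodes(nodes, "single", 1))              # ['b']
print(select_nodes(nodes, "single", 4))              # ['a']
print(select_nodes(nodes, "multi", 2, num_tasks=3))  # ['c', 'c', 'c']
print(select_nodes(nodes, "multi", 4, num_tasks=3))  # ['a', 'c', 'c']
```

Sending single-GPU tasks to the lowest-performance node keeps well-connected nodes available for multi-GPU work, which appears to be the motivation behind the first case in claim 1.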

Assignees

Beijing Baidu Netcom Sci & Tech Co Ltd

Inventors

Classifications

G06F9/5044 (primary): considering hardware capabilities
G06T1/20: Processor architectures; Processor configuration, e.g. pipelining
Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
G06N3/098: Distributed learning, e.g. federated learning
G06F2209/509: Offload

Patent family

Related publications grouped by family.

Patent family 77581320

External sources

Related patents

Coordinated, topology-aware CPU-GPU-memory scheduling for containerized workloads

Topology-aware provisioning of hardware accelerator resources in a distributed environment

Topology aware grouping and provisioning of GPU resources in GPU-as-a-Service platform

Topology-aware processor scheduling
