Topology aware grouping and provisioning of GPU resources in GPU-as-a-Service platform
US-10325343-B1 · Jun 18, 2019 · US
US10983828B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10983828-B2 |
| Application number | US-201916245500-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jan 11, 2019 |
| Priority date | Jan 18, 2018 |
| Publication date | Apr 20, 2021 |
| Grant date | Apr 20, 2021 |
Official abstract text for this publication.
Embodiments of the present disclosure relate to a method, apparatus and computer program product for scheduling dedicated processing resources. The method comprises: in response to receiving a scheduling request for a plurality of dedicated processing resources, obtaining a topology of the plurality of dedicated processing resources, the topology being determined based on connection attributes related to connections among the plurality of dedicated processing resources; and determining, based on the topology, a target dedicated processing resource satisfying the scheduling request from the plurality of dedicated processing resources. In this manner, the performance and the resource utilization rate of scheduling the dedicated processing resources are improved.
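The topology described in the abstract can be pictured as a weighted graph whose edge weights are derived from connection attributes. The following is a minimal sketch under stated assumptions, not the patented implementation: the `Topology` class, the `inter_server_distance` scoring formula, the connection-type names (`nvlink`, `pcie_switch`, `cpu_socket`), and all numeric values are hypothetical and chosen only for illustration. A smaller distance stands for a better-performing connection.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from an intra-server connection type to a distance.
# Lower distance means a better-performing connection; the values are made up.
INTRA_SERVER_DISTANCE = {"nvlink": 1.0, "pcie_switch": 2.0, "cpu_socket": 4.0}


@dataclass
class Topology:
    # edges[a][b] holds the distance between directly connected nodes a and b.
    edges: dict = field(default_factory=dict)

    def connect(self, a, b, distance):
        # Connections are treated as symmetric.
        self.edges.setdefault(a, {})[b] = distance
        self.edges.setdefault(b, {})[a] = distance

    def distance(self, a, b):
        # Nodes without a direct connection are infinitely far apart.
        return self.edges.get(a, {}).get(b, float("inf"))


def inter_server_distance(bandwidth_gbps, delay_ms):
    # Assumed scoring: more bandwidth and less delay give a smaller distance.
    return delay_ms + 100.0 / bandwidth_gbps


def build_topology(server_links, gpu_links):
    topo = Topology()
    # Attributes of connections between servers (bandwidth, delay).
    for (s1, s2), attrs in server_links.items():
        topo.connect(s1, s2, inter_server_distance(**attrs))
    # Types of connections between resources inside one server.
    for (g1, g2), conn_type in gpu_links.items():
        topo.connect(g1, g2, INTRA_SERVER_DISTANCE[conn_type])
    return topo


topo = build_topology(
    server_links={
        ("server-a", "server-b"): {"bandwidth_gbps": 100.0, "delay_ms": 0.5},
    },
    gpu_links={
        ("server-a/gpu0", "server-a/gpu1"): "nvlink",
        ("server-a/gpu0", "server-a/gpu2"): "cpu_socket",
    },
)
print(topo.distance("server-a/gpu0", "server-a/gpu1"))
```

Keeping server-level and resource-level edges in one structure is a simplification; a scheduler could just as well keep them in separate graphs.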
Opening claim text (preview).
What is claimed is:
1. A method of scheduling dedicated processing resources, comprising: in response to receiving a scheduling request for a plurality of dedicated processing resources, obtaining a topology of the plurality of dedicated processing resources, the topology being determined based on connection attributes related to connections among the plurality of dedicated processing resources; determining, based on the topology, a target dedicated processing resource satisfying the scheduling request from the plurality of dedicated processing resources; and scheduling the target dedicated processing resource satisfying the scheduling request; wherein the plurality of dedicated processing resources are distributed across a plurality of dedicated processing resource servers, and obtaining the topology comprises: obtaining a first connection attribute relating to multiple ones of the dedicated processing resource servers, the first connection attribute indicating at least one of delays, bandwidths, throughputs, transmission rates, transmission qualities and network utilization rates of connections among the plurality of dedicated processing resource servers; obtaining a second connection attribute relating to a particular one of the dedicated processing resource servers, the second connection attributes indicating types of connections among dedicated processing resources and relevant hardware resources in the particular dedicated processing resource server; and determining the topology based at least in part on the first and second connection attributes.
2. The method according to claim 1, wherein determining the topology comprises: determining the plurality of dedicated processing resource servers as a plurality of nodes in the topology; and determining, based on the first connection attribute, a distance between two nodes of the plurality of nodes that are connected, the distance indicating a performance of a connection between the two nodes.
3. The method according to claim 2, wherein determining the topology further comprises: determining the dedicated processing resources and the relevant hardware resources in the particular dedicated processing resource server as a plurality of sub-nodes of a respective node in the plurality of nodes; and determining, based on the second connection attribute, a distance between two sub-nodes of the plurality of sub-nodes that are connected, the distance indicating a performance of a connection between the two sub-nodes.
4. The method according to claim 1, wherein determining the topology comprises: determining the plurality of dedicated processing resources as a plurality of nodes in the topology; and determining, based on the first and second connection attributes, a distance between two nodes of the plurality of nodes that are connected, the distance indicating a performance of a connection between the two nodes.
5. The method according to claim 1, wherein obtaining the topology comprises: obtaining a utilization rate of each of the plurality of dedicated processing resources; and determining the topology based on the utilization rates and the connection attributes.
6. The method according to claim 1, wherein determining the target dedicated processing resource comprises: determining a group of available dedicated processing resources from the plurality of dedicated processing resources; obtaining a utilization rate of each dedicated processing resource in the group of available dedicated processing resources; selecting, based on the utilization rate, a set of dedicated processing resource candidates from the group of available dedicated processing resources; and determining, based on a required resource amount in the scheduling request, the target dedicated processing resource from the set of dedicated processing resource candidates.
7. The method according to claim 6, wherein selecting the set of dedicated processing resource candidates comprises: selecting, from the group of available dedicated processing resources, available dedicated processing resources having utilization rates below a predetermined threshold, as the set of dedicated processing resource candidates.
8. The method according to claim 6, wherein determining, based on the required resource amount, the target dedicated processing resource comprises: selecting, based on the topology, dedicated processing resources of which the connections with the set of dedicated processing resource candidates have performances above a predetermined threshold, until the resource amount of the set of dedicated processing resource candidates and the selected dedicated processing resources satisfies the required resource amount.
9. The method according to claim 1, wherein determining the target dedicated processing resource comprises: determining a first group of dedicated processing resources and a second group of dedicated processing resources being different from the first group of dedicated processing resources; and determining the target dedicated processing resource from the first and second groups of dedicated processing resources based on at least one of global load balance, a connection cost, and cross-rack traffic.
10. An apparatus for scheduling dedicated processing resources, comprising: at least one processing unit; at least one memory coupled to the at least one processing unit and storing instructions to be executed by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the apparatus to perform acts comprising: in response to receiving a scheduling request for a plurality of dedicated processing resources, obtaining a topology of the plurality of dedicated processing resources, the topology being determined based on connection attributes related to connections among the plurality of dedicated processing resources; determining, based on the topology, a target dedicated processing resource satisfying the scheduling request from the plurality of dedicated processing resources; and scheduling the target dedicated processing resource satisfying the scheduling request; wherein the plurality of dedicated processing resources are distributed across a plurality of dedicated processing resource servers, and obtaining the topology comprises: obtaining a first connection attribute relating to multiple ones of the dedicated processing resource servers, the first connection attribute indicating at least one of delays, bandwidths, throughputs, transmission rates, transmission qualities and network utilization rates of connections among the plurality of dedicated processing resource servers; obtaining a second connection attribute relating to a particular one of dedicated processing resource servers, the second connection attribute indicating types of connections among dedicated processing resources and relevant hardware resources in the particular dedicated processing resource server; and determining the topology based at least in part on the first and second connection attributes.
11. The apparatus according to claim 10, wherein determining the topology comprises: determining the plurality of dedicated processing resource servers as a plurality of nodes in the topology; and determining, based on the first connection attribute, a distance between two nodes of the plurality of nodes that are connected, the distance indicating a performance of a connection between the two nodes.
12. The apparatus according to claim 11, wherein determining the topology further comprises: determining the dedicated processing resources and the relevant hardware resources in the particular dedicat
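The selection steps recited in claims 6 to 8 can be sketched as follows. This is one possible reading under stated assumptions, not the claimed implementation: `select_targets`, both threshold values, and the sample data are hypothetical, the required resource amount is treated as a simple count of resources, and "performance above a threshold" is modeled as "distance at or below `max_distance`".

```python
def select_targets(available, utilization, distance, required,
                   max_utilization=0.5, max_distance=2.0):
    """Return `required` resources, or None if the request cannot be met."""
    # Candidates: available resources below the utilization threshold,
    # least-utilized first.
    candidates = sorted(
        (r for r in available if utilization[r] < max_utilization),
        key=lambda r: utilization[r],
    )
    seeds = candidates[:required]
    chosen = list(seeds)
    if len(chosen) < required and seeds:
        # Rank the remaining resources by their best connection to any seed.
        extras = sorted(
            (min(distance(r, s) for s in seeds), r)
            for r in available if r not in candidates
        )
        for d, r in extras:
            # Stop once the request is met or connections become too slow.
            if d > max_distance or len(chosen) == required:
                break
            chosen.append(r)
    return chosen if len(chosen) == required else None


# Hypothetical pairwise distances; unlisted pairs are treated as unconnected.
DIST = {
    frozenset(("gpu0", "gpu1")): 1.0,
    frozenset(("gpu0", "gpu2")): 4.0,
    frozenset(("gpu0", "gpu3")): 2.0,
}


def distance(a, b):
    return DIST.get(frozenset((a, b)), float("inf"))


utilization = {"gpu0": 0.1, "gpu1": 0.7, "gpu2": 0.6, "gpu3": 0.9}
targets = select_targets(list(utilization), utilization, distance, required=3)
print(targets)
```

In this sample only `gpu0` is below the utilization threshold, so it seeds the selection; `gpu1` and `gpu3` are then added because their connections to `gpu0` fall within the distance limit, while `gpu2` is excluded.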
Discovery or management of network topologies · CPC title
Routing or path finding of packets in data switching networks (routing or path finding in wireless networks H04W40/00) · CPC title
involving deadlines, e.g. rate based, periodic · CPC title
the resource being a machine, e.g. CPUs, Servers, Terminals · CPC title
Routing a service request depending on the request content or context · CPC title