Resource-utilization-based workload re-allocation system
US-2019325554-A1 · Oct 24, 2019 · US
US11720408B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11720408-B2 |
| Application number | US-201916392668-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 24, 2019 |
| Priority date | May 8, 2018 |
| Publication date | Aug 8, 2023 |
| Grant date | Aug 8, 2023 |
Official abstract text for this publication.
Disclosed are aspects of task assignment for systems that include graphics processing units (GPUs) that are virtual GPU (vGPU) enabled. In some examples, an algorithm is determined based on predetermined virtual machine assignment algorithms. The algorithm optimizes for a predetermined cost function. A virtual machine is queued in an arrival queue for assignment. A graphics configuration of a system is determined. The graphics configuration specifies a number of graphics processing units (GPUs) in the system. The system includes a vGPU enabled GPU. The algorithm is selected based on a correlation between the algorithm and the graphics configuration of the system. The virtual machine is assigned to a run queue based on the selected algorithm.
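The selection step described in the abstract can be pictured as a lookup from graphics configuration to assignment algorithm, followed by a move from the arrival queue to a run queue. The following is a minimal sketch under stated assumptions, not the patented implementation. The names `first_fit`, `least_loaded`, `ALGORITHM_BY_GPU_COUNT`, `select_algorithm`, and `assign_next`, the dictionary layout, and the GPU-count threshold are all hypothetical. The abstract states only that an algorithm is selected based on a correlation with the graphics configuration and that the virtual machine is assigned to a run queue using it.

```python
from collections import deque


def first_fit(vm, run_queues):
    """Hypothetical algorithm: first run queue whose vGPU profile matches."""
    for index, queue in enumerate(run_queues):
        if queue["profile"] == vm["profile"]:
            return index
    return None


def least_loaded(vm, run_queues):
    """Hypothetical algorithm: matching run queue holding the fewest VMs."""
    matches = [i for i, q in enumerate(run_queues) if q["profile"] == vm["profile"]]
    return min(matches, key=lambda i: len(run_queues[i]["vms"]), default=None)


# Assumed correlation table: (largest GPU count covered, algorithm to use).
# The threshold of 2 is illustrative only.
ALGORITHM_BY_GPU_COUNT = [(2, first_fit), (float("inf"), least_loaded)]


def select_algorithm(gpu_count):
    """Pick the algorithm correlated with the number of GPUs in the system."""
    for limit, algorithm in ALGORITHM_BY_GPU_COUNT:
        if gpu_count <= limit:
            return algorithm
    raise ValueError(f"no algorithm for GPU count {gpu_count!r}")


def assign_next(arrival_queue, run_queues, gpu_count):
    """Move the VM at the head of the arrival queue into a run queue.

    Returns the index of the chosen run queue, or None if no queue matches,
    in which case the VM is put back at the head of the arrival queue.
    """
    vm = arrival_queue.popleft()
    algorithm = select_algorithm(gpu_count)
    index = algorithm(vm, run_queues)
    if index is None:
        arrival_queue.appendleft(vm)
        return None
    run_queues[index]["vms"].append(vm)
    return index
```

A table keyed on GPU count keeps the sketch small. The claims also mention the virtual machine arrival rate as a graphics configuration parameter, which could be added as a second key.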
Opening claim text (preview).
What is claimed is:

1. A computer-implemented method, comprising:
identifying, by a scheduler service executed by at least one processor, a predetermined set of assignment algorithms;
modifying, by the scheduler service, at least one of the predetermined set of assignment algorithms to generate a plurality of trained assignment algorithms that are trained to maximize a cost function comprising: a ratio of an average execution time for a plurality of virtual machines, and an average total time corresponding to execution time and run queue wait time for the plurality of virtual machines;
generating, by the scheduler service, a data structure that correlates, using the cost function, a particular one of the plurality of trained assignment algorithms with a plurality of graphics configuration parameters;
identifying, by the scheduler service, a virtual machine that is assigned a virtual graphics processing unit (vGPU) profile from an arrival queue;
identifying, by the scheduler service, a graphics configuration of a system comprising a plurality of host computers, the graphics configuration specifying a total number of vGPU-enabled graphics processing units (GPUs) installed in the plurality of host computers in the system and a virtual machine arrival rate for the arrival queue of the system;
determining that an existing run queue of a vGPU-enabled GPU of the system matches the vGPU profile of the virtual machine;
receiving, by the scheduler service, data specifying a plurality of pre-existing virtual machines in the existing run queue of the vGPU-enabled GPU of the system;
selecting, by the scheduler service, the particular one of the trained assignment algorithms that is correlated, in the data structure, with the total number of vGPU-enabled GPUs and the virtual machine arrival rate specified by the graphics configuration of the system;
suspending, by the scheduler service, a particular one of the plurality of pre-existing virtual machines in the run queue in order to free up capacity for the virtual machine, and inserting the virtual machine in a particular position in the run queue to arrange a set of virtual machines in the run queue into an updated order provided by the trained assignment algorithm that is trained to optimize the cost function; and
executing the virtual machine and the pre-existing virtual machines according to the updated order of the run queue.

2. The computer-implemented method of claim 1, wherein the virtual machine is assigned to the run queue further based on expected virtual machine execution time of the virtual machine, and virtual machine arrival queue wait time of the virtual machine.

3. The computer-implemented method of claim 1, wherein the vGPU profile is assigned to the virtual machine based on a process list of the virtual machine.

4. The computer-implemented method of claim 1, wherein the virtual machine is assigned to the run queue further based on expected virtual machine execution time of the virtual machine.

5. The computer-implemented method of claim 1, wherein a number of run queues of the vGPU-enabled GPU is less than a maximum number of queues.

6. The computer-implemented method of claim 1, wherein the virtual machine is associated with a task comprising a group of virtual machines.

7. The computer-implemented method of claim 6, further comprising: determining that the vGPU-enabled GPU supports the vGPU profile.

8. A non-transitory computer-readable medium comprising executable instructions, wherein the instructions, when executed by at least one processor, cause at least one computing device to at least:
identify, by a scheduler service executed by at least one processor, a plurality of trained assignment algorithms that are trained to maximize a cost function comprising: a ratio of an average execution time for a plurality of virtual machines, and an average total time corresponding to execution time and run queue wait time for the plurality of virtual machines;
generate, by the scheduler service, a data structure that correlates, using the cost function, a particular one of the plurality of trained assignment algorithms with at least one of graphics configuration parameter;
identify, by the scheduler service, a virtual machine that is assigned a virtual graphics processing unit (vGPU) profile from an arrival queue;
identify, by the scheduler service, a graphics configuration of a system comprising a plurality of host computers, the graphics configuration specifying a total number of vGPU-enabled graphics processing units (GPUs) installed in the plurality of host computers in the system and a virtual machine arrival rate for an arrival queue of the system;
determine that an existing run queue of a vGPU-enabled GPU of the system matches the vGPU profile of the virtual machine;
receive, by the scheduler service, data specifying a plurality of pre-existing virtual machines in the existing run queue of the vGPU-enabled GPU of the system;
select, by the scheduler service, a particular one of the trained assignment algorithms that is correlated, by the data structure, with at least one of: the total number of vGPU-enabled GPUs and the virtual machine arrival rate specified by the graphics configuration of the system;
suspend, by the scheduler service, a particular one of the plurality of pre-existing virtual machines in the run queue in order to free up capacity for the virtual machine, and insert the virtual machine in a particular position in the run queue to arrange a set of virtual machines in the run queue into an updated order provided by the trained assignment algorithm that is trained to optimize the cost function; and
execute the virtual machine and the pre-existing virtual machines according to the updated order of the run queue.

9. The non-transitory computer-readable medium of claim 8, wherein the virtual machine is assigned to the run queue further based on expected virtual machine execution time of the virtual machine, and virtual machine arrival queue wait time of the virtual machine.

10. The non-transitory computer-readable medium of claim 8, wherein the vGPU profile is assigned to the virtual machine based on a process list of the virtual machine.

11. The non-transitory computer-readable medium of claim 8, wherein the virtual machine is assigned to the run queue further based on expected virtual machine execution time of the virtual machine.

12. The non-transitory computer-readable medium of claim 8, wherein a number of run queues of the vGPU-enabled GPU is less than a maximum number of queues.

13. The non-transitory computer-readable medium of claim 8, wherein the virtual machine is associated with a task comprising a group of virtual machines.

14. A system, comprising:
at least one computing device comprising at least one processor; and
a memory comprising executable instructions, wherein the instructions, when executed by the at least one processor, cause the at least one computing device to at least:
identify, by a scheduler service executed by at least one processor, a plurality of trained assignment algorithms that are trained to maximize a cost function comprising: a ratio of an average execution time for a plurality of virtual machines, and an average total time corresponding to execution time and run queue wait time for the plurality of virtual machines;
generate, by the scheduler service, a data structure that correlates, using the cost function, a particular one of the plurality of trained assignment algorithms with at least one of a plurality of graphics configuration parameters;
identify, by the scheduler service, a virtual machine that i
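
The cost function recited in the independent claims is a ratio of average execution time to average total time, where total time is execution time plus run queue wait time. The sketch below is one possible reading of that ratio and of the "insert in a particular position" step, offered as an illustration rather than as the claimed method. The names `cost` and `best_insert_position` are hypothetical. The sketch assumes that virtual machines in a run queue execute strictly one after another, so that a virtual machine's wait time is the sum of the execution times ahead of it, and that all execution times are positive. It uses a brute-force search in place of a trained assignment algorithm and leaves out the suspension of a pre-existing virtual machine.

```python
def cost(exec_times):
    """Ratio of average execution time to average (execution + wait) time.

    Assumption: VMs run sequentially in queue order, so each VM waits for
    the sum of the execution times of the VMs ahead of it. Execution times
    are assumed to be positive. The result lies in (0, 1]; 1.0 means no VM
    waits at all.
    """
    if not exec_times:
        raise ValueError("run queue is empty")
    elapsed = 0.0      # time consumed by VMs already placed ahead
    total_time = 0.0   # sum over VMs of (wait time + execution time)
    for exec_time in exec_times:
        total_time += elapsed + exec_time
        elapsed += exec_time
    count = len(exec_times)
    avg_exec = sum(exec_times) / count
    avg_total = total_time / count
    return avg_exec / avg_total


def best_insert_position(run_queue, new_exec_time):
    """Try every insertion position and keep the one with the highest cost.

    Brute force stands in for the trained algorithm of the claims; ties go
    to the earliest position.
    """
    def score(position):
        candidate = run_queue[:position] + [new_exec_time] + run_queue[position:]
        return cost(candidate)

    return max(range(len(run_queue) + 1), key=score)
```

Under the sequential-execution assumption the average execution time does not depend on the order, so maximizing the ratio amounts to minimizing total wait time, which places shorter jobs ahead of longer ones.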
Convolutional networks [CNN, ConvNet] · CPC title
considering hardware capabilities · CPC title
Hypervisor-specific management and integration aspects · CPC title
Correlation function computation {including computation of convolution operations (arithmetic circuits for sum of products per se, e.g. multiply-accumulators G06F7/5443; digital filters, e.g. FIR, IIR, adaptive filters H03H17/00)} · CPC title
Machine learning · CPC title