US9727385B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9727385-B2 |
| Application number | US-201213495597-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 13, 2012 |
| Priority date | Jul 18, 2011 |
| Publication date | Aug 8, 2017 |
| Grant date | Aug 8, 2017 |
Official abstract text for this publication.
Techniques and structures relating to virtual graphics processing units (VGPUs) are disclosed. A VGPU may appear to software as an independent hardware GPU. However, two or more VGPUs can be implemented on the same GPU through the use of control structures and by duplicating some (but not all) hardware elements of the GPU. For example, additional registers and storage space may be added in a GPU supporting multiple VGPUs. Different execution priorities may be set for tasks and threads that correspond to the different supported VGPUs. Memory address space for the VGPUs may also be managed, including use of virtual address space for different VGPUs. Halting and resuming execution of different VGPUs allows for fine-grained execution control in various embodiments.
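The priority handling described in the abstract can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the hardware design in the patent: `CommandFeeder`, `submit`, and `next_command` are hypothetical names, priorities are plain integers where a higher value runs first, and a command's priority is fixed at the moment it is submitted.

```python
import heapq
from itertools import count


class CommandFeeder:
    """Hypothetical model: orders GPU commands by the priority of their VGPU."""

    def __init__(self, vgpu_priority):
        # vgpu_id -> priority level (assumption: higher value runs first)
        self.vgpu_priority = dict(vgpu_priority)
        self._heap = []
        # Sequence counter keeps FIFO order among commands of equal priority.
        self._seq = count()

    def submit(self, vgpu_id, command):
        # The VGPU indicator travels with the command and selects its priority.
        priority = self.vgpu_priority[vgpu_id]
        heapq.heappush(self._heap, (-priority, next(self._seq), vgpu_id, command))

    def next_command(self):
        # Return (vgpu_id, command) for the highest-priority pending command,
        # or None when nothing is pending.
        if not self._heap:
            return None
        _, _, vgpu_id, command = heapq.heappop(self._heap)
        return vgpu_id, command


feeder = CommandFeeder({0: 1, 1: 5})
feeder.submit(0, "draw_a")
feeder.submit(1, "draw_b")
feeder.submit(0, "draw_c")
first = feeder.next_command()   # the command from VGPU 1, which has the higher priority
```

Each command carries its VGPU identifier, so software sees two independent queues while a single dispatcher decides what runs next.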
Opening claim text (preview).
What is claimed is:

1. An apparatus, comprising: a graphics processing unit (GPU) comprising a plurality of storage locations and a plurality of execution units and configured to implement a plurality of virtual GPUs having corresponding variable priority levels, wherein: the GPU is further configured to modify the corresponding variable priority levels of at least one of the plurality of virtual GPUs based on an arbitration scheme; the GPU is further configured to receive a plurality of GPU commands and a corresponding plurality of virtual GPU indicators that indicate, for the plurality of GPU commands, corresponding ones of the plurality of virtual GPUs; the plurality of execution units are configured to execute the GPU commands for the plurality of virtual GPUs; a priority level for a given GPU command is based on a priority level of a corresponding one of the plurality of virtual GPUs; and the GPU is further configured to store, in at least one of the plurality of storage locations, intermediate results that are usable to resume execution of one or more incomplete operations corresponding to one or more of the plurality of virtual GPUs.
2. The apparatus of claim 1, wherein first and second storage areas of the plurality of storage locations are configured to store intermediate results corresponding to respective virtual GPUs of the plurality of virtual GPUs, and wherein the first and second storage areas for each of the plurality of virtual GPUs are each distributed across a respective plurality of different locations in the GPU.
3. The apparatus of claim 1, wherein the GPU further comprises: a command buffer configured to store a plurality of the GPU commands; and one or more execution units; wherein the apparatus is configured to forward, based on a priority level for a given GPU command, one or more instructions corresponding to the given GPU command to the one or more execution units.
4. The apparatus of claim 3, wherein each of the plurality of virtual GPUs has its own priority level.
5. The apparatus of claim 3, further comprising one or more instruction buffers configured to store instructions corresponding to one or more of the plurality of the GPU commands and configured to store information indicating an identity of a virtual GPU to which each of the instructions corresponds.
6. The apparatus of claim 1, wherein the GPU is configured to resume the one or more incomplete operations on a per-thread basis.
7. The apparatus of claim 1, wherein the GPU is configured to allocate, to a first one of the plurality of virtual GPUs, an amount of physical memory corresponding to an entire virtual address space of the first virtual GPU, and to allocate, to a second one of the plurality of virtual GPUs, an amount of the physical memory corresponding to only part of a virtual address space of the second virtual GPU.
8. The apparatus of claim 1, wherein a particular GPU command of the plurality of GPU commands includes a corresponding virtual GPU indicator of the plurality of virtual GPU indicators.
9. A system, comprising: a graphics processing unit (GPU) configured to receive GPU commands from a central processing unit (CPU) and configured to implement a first virtual GPU and a second virtual GPU having corresponding variable priority levels, wherein: the GPU is further configured to implement a first virtual GPU and a second virtual GPU having corresponding variable priority levels; the GPU is further configured to modify the variable priority levels based on an arbitration scheme; the GPU is further configured to receive a first virtual GPU indicator that identifies, for a first GPU command, a first virtual GPU; the GPU is further configured to receive a second virtual GPU indicator that identifies, for a second GPU command, a second virtual GPU; the GPU is further configured, based on a priority level for the first GPU command being higher than a priority level for the second GPU command, to execute a first thread corresponding to the first GPU command before a second thread to the second GPU command; and the priority level for the first GPU command is based on a priority level of the first virtual GPU and the priority level for the second GPU command is based on a priority level of the second virtual GPU.
10. The system of claim 9, wherein the GPU is configured to receive, from the CPU, information indicating: the priority level for the first GPU command and the priority level for the second GPU command.
11. The system of claim 10, wherein instructions corresponding to the first virtual GPU are given execution preference over instructions corresponding to the second virtual GPU; and wherein instructions corresponding to the second virtual GPU are given execution preference over instructions corresponding to a third virtual GPU.
12. The system of claim 9, wherein the GPU comprises one or more execution units that are configured to execute a first type of thread; wherein the GPU is configured to execute a low-priority thread of the first type at the one or more execution units based on an indication that no other threads of the first type are ready to be executed and have a higher priority level than the low-priority thread.
13. The system of claim 9, wherein the GPU is configured to split a given GPU command into one or more tasks, wherein each of the one or more tasks comprises one or more threads; wherein the GPU is configured to assess a priority level of the given GPU command prior to executing a thread for a given one of the one or more tasks.
14. The system of claim 9, wherein the GPU is configured to allocate amounts of physical memory to each of a plurality of virtual GPUs based on a respective priority level for each of the virtual GPUs.
15. The system of claim 9, further comprising the CPU; wherein the CPU supports a plurality of CPU threads; and wherein priority levels for GPU commands are based on CPU priority levels for each of the plurality of CPU threads.
16. A graphics processing unit (GPU), comprising: one or more circuits configured to implement a plurality of virtual GPUs having corresponding variable priority levels based on an arbitration scheme, wherein each of the one or more circuits includes: one or more corresponding instruction buffers configured to store one or more GPU commands; and one or more corresponding storage locations configured to store execution results; a task manager circuit configured to generate one or more threads corresponding to a first GPU command, wherein the first GPU command corresponds to a virtual GPU indicator that indicates a corresponding virtual GPU of the plurality of virtual GPUs; a memory manager circuit; one or more execution units; and a feeding unit circuit configured to forward a given thread to at least one of the one or more execution units in response to a priority level for the given thread, wherein the priority level for the given thread is based on a priority level of the corresponding virtual GPU of the plurality of virtual GPUs.
17. The graphics processing unit of claim 16, wherein the one or more execution units comprise a first execution unit of a first type and a second execution unit of a second type; wherein the feeding unit circuit is configured to forward the given thread based on information indicating a type of execution unit used to execute the given thread.
18. The graphics processing unit of claim 16, wherein the feeding unit circuit is configured to forward a thread having a lower priority level to a first one of the o
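Two ideas recur in the claims above: intermediate results are stored so that an incomplete operation can be resumed per thread, and the priority levels of the virtual GPUs change under an arbitration scheme. The following is a minimal software sketch of how these could interact, not the claimed circuitry. The names `ThreadState`, `run_slice`, `arbitrate`, and `schedule` are hypothetical, the workload is a simple running sum, and the arbitration rule (raise the priority of every VGPU that did not just run) is an assumed aging policy that the claims do not specify.

```python
from dataclasses import dataclass


@dataclass
class ThreadState:
    """Hypothetical per-thread storage for intermediate results."""
    vgpu_id: int
    data: list
    index: int = 0     # next element to process when the thread resumes
    partial: int = 0   # intermediate result preserved across halts

    @property
    def done(self):
        return self.index >= len(self.data)


def run_slice(state, max_steps):
    """Run at most max_steps, then halt; progress stays in `state`."""
    steps = 0
    while not state.done and steps < max_steps:
        state.partial += state.data[state.index]
        state.index += 1
        steps += 1
    return state.done


def arbitrate(priorities, ran_vgpu, boost):
    """Assumed aging rule: boost every VGPU that did not just run."""
    for vgpu_id in priorities:
        if vgpu_id != ran_vgpu:
            priorities[vgpu_id] += boost


def schedule(threads, priorities, slice_steps=2, boost=2):
    """Repeatedly run the thread whose VGPU has the highest priority."""
    order = []
    pending = list(threads)
    while pending:
        current = max(pending, key=lambda t: priorities[t.vgpu_id])
        order.append(current.vgpu_id)
        if run_slice(current, slice_steps):
            pending.remove(current)
        arbitrate(priorities, current.vgpu_id, boost)
    return order


t0 = ThreadState(vgpu_id=0, data=[1, 2, 3, 4])
t1 = ThreadState(vgpu_id=1, data=[10, 20])
order = schedule([t0, t1], {0: 2, 1: 1})
```

In this run the thread for VGPU 0 is halted after two steps, the thread for VGPU 1 runs once its priority has been raised, and VGPU 0 then resumes from its saved `index` and `partial` rather than starting over.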
Multiprogramming arrangements · CPC title
Memory management · CPC title
involving task migration · CPC title
Logical partitioning of resources; Management or configuration of virtualized resources (specific details on emulation or internal functioning of virtual machines G06F9/455) · CPC title
Hypervisors; Virtual machine monitors · CPC title