Platform and software framework for data intensive applications in the cloud
US-9513934-B2 · Dec 6, 2016 · US
US9916636B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9916636-B2 |
| Application number | US-201615093965-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 8, 2016 |
| Priority date | Apr 8, 2016 |
| Publication date | Mar 13, 2018 |
| Grant date | Mar 13, 2018 |
Official abstract text for this publication.
Server resources in a data center are disaggregated into shared server resource pools, including a graphics processing unit (GPU) pool. Servers are constructed dynamically, on-demand and based on workload requirements, by allocating from these resource pools. According to this disclosure, GPU utilization in the data center is managed proactively by assigning GPUs to workloads in a fine granularity and agile way, and de-provisioning them when no longer needed. In this manner, the approach is especially advantageous to automatically provision GPUs for data analytic workloads. The approach thus provides for a “micro-service” enabling data analytic workloads to automatically and transparently use GPU resources without providing (e.g., to the data center customer) the underlying provisioning details. Preferably, the approach dynamically determines the number and the type of GPUs to use, and then during runtime auto-scales the GPUs based on workload.
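The allocate, scale, and de-provision cycle described in the abstract can be illustrated with a minimal sketch. Everything named here (`GpuPool`, `Assignment`, `autoscale`, the type label `"type-a"`, and the 0.85 / 0.30 utilization thresholds) is a hypothetical assumption chosen for illustration; the publication does not specify an API, data structure, or threshold values.

```python
from dataclasses import dataclass, field


@dataclass
class GpuPool:
    """Hypothetical shared pool: maps a GPU type name to its count of free devices."""
    free: dict[str, int] = field(default_factory=dict)

    def allocate(self, gpu_type: str, count: int) -> int:
        """Take up to `count` free GPUs of `gpu_type`; return how many were granted."""
        granted = min(count, self.free.get(gpu_type, 0))
        self.free[gpu_type] = self.free.get(gpu_type, 0) - granted
        return granted

    def release(self, gpu_type: str, count: int) -> None:
        """Return `count` GPUs of `gpu_type` to the pool (de-provisioning)."""
        self.free[gpu_type] = self.free.get(gpu_type, 0) + count


@dataclass
class Assignment:
    """GPUs currently assigned to one workload."""
    gpu_type: str
    count: int


def autoscale(pool: GpuPool, job: Assignment, utilization: float,
              high: float = 0.85, low: float = 0.30) -> Assignment:
    """One scaling step driven by monitored utilization (thresholds are assumed)."""
    if utilization > high:
        # Scale up by one GPU if the pool still has one of the right type.
        job.count += pool.allocate(job.gpu_type, 1)
    elif utilization < low and job.count > 1:
        # Scale down, but keep at least one GPU while the workload runs.
        pool.release(job.gpu_type, 1)
        job.count -= 1
    return job


pool = GpuPool({"type-a": 4})
job = Assignment("type-a", pool.allocate("type-a", 2))
autoscale(pool, job, utilization=0.95)  # busy: one more GPU is assigned
autoscale(pool, job, utilization=0.10)  # idle: one GPU goes back to the pool
```

Adjusting by one GPU per step keeps the sketch short; a real scaler would likely act on a window of measurements rather than a single reading.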
Opening claim text (preview).
The invention claimed is:

1. A method for processing a workload in a compute environment having a pool of graphics processing units (GPUs), comprising: receiving a request to process the workload; responsive to receipt of the request, determining a GPU configuration anticipated to be required to process the workload, the GPU configuration comprising a set of GPU requirements including a number of GPUs and a type of GPU; based on the set of GPU requirements, selecting GPUs from the pool that are available and assigning the selected GPUs to process the workload; and as the workload is being processed by the GPUs assigned, dynamically adjusting the GPU configuration as determined by monitored resource consumption of the workload.
2. The method as described in claim 1 wherein the GPU configuration is determined at least in part by determining whether a profile of the workload matches a profile associated with another workload that has been processed in the compute environment.
3. The method as described in claim 1 wherein the GPU configuration is determined at least in part by executing a test GPU configuration.
4. The method as described in claim 1 wherein dynamically adjusting the GPU configuration comprises: monitoring resource consumption associated with the GPUs assigned to process the workload; and based at least in part on the monitored resource consumption, modifying the number of assigned GPUs.
5. The method as described in claim 1 wherein the set of GPU requirements also include a value representing an extent to which the workload is suitable for processing on the GPUs.
6. The method as described in claim 1 wherein the GPU requirements are adjusted in accordance with one or more tasks in the workload.
7. The method as described in claim 1 wherein the GPU configuration is dynamically adjusted by provisioning or de-provisioning GPUs based on a current workload requirement.
8. The method as described in claim 1 wherein the compute environment is a disaggregated compute system comprising the GPUs assigned.
9. Apparatus for processing a workload in a compute environment having a pool of graphics processing units (GPUs), comprising: one or more hardware processors; computer memory holding computer program instructions executed by the hardware processors and operative to: receive a request to process the workload; responsive to receipt of the request, determine a GPU configuration anticipated to be required to process the workload, the GPU configuration comprising a set of GPU requirements including a number of GPUs and a type of GPU; based on the set of GPU requirements, select GPUs from the pool that are available and assign the selected available GPUs to process the workload; and as the workload is being processed by the GPUs assigned, dynamically adjust the GPU configuration as determined by monitored resource consumption of the workload.
10. The apparatus as described in claim 9 wherein the GPU configuration is determined at least in part by determining whether a profile of the workload matches a profile associated with another workload that has been processed in the compute environment.
11. The apparatus as described in claim 9 wherein the GPU configuration is determined at least in part by executing a test GPU configuration.
12. The apparatus as described in claim 9 wherein the computer program code to dynamically adjust the GPU configuration comprises computer program code to: monitor resource consumption associated with the GPUs assigned to process the workload; and based at least in part on the monitored resource consumption, modify the number of assigned GPUs.
13. The apparatus as described in claim 9 wherein the set of GPU requirements also include a value representing an extent to which the workload is suitable for processing on the GPUs.
14. The apparatus as described in claim 9 wherein the GPU requirements are adjusted in accordance with one or more tasks in the workload.
15. The apparatus as described in claim 9 wherein the GPU configuration is dynamically adjusted by provisioning or de-provisioning GPUs based on a current workload requirement.
16. The apparatus as described in claim 9 wherein the compute environment is a disaggregated compute system comprising the GPUs assigned.
17. A computer program product in a non-transitory computer readable medium for use in a data processing system for processing a workload in a compute environment having a pool of graphics processing units (GPUs), the computer program product holding computer program instructions executed in the data processing system and operative to: receive a request to process the workload; responsive to receipt of the request, determine a GPU configuration anticipated to be required to process the workload, the GPU configuration comprising a set of GPU requirements including a number of GPUs and a type of GPU; based on the set of GPU requirements, select GPUs from the pool that are available and assign the selected available GPUs to process the workload; and as the workload is being processed by the GPUs assigned, dynamically adjust the GPU configuration as determined by monitored resource consumption of the workload.
18. The computer program product as described in claim 17 wherein the GPU configuration is determined at least in part by determining whether a profile of the workload matches a profile associated with another workload that has been processed in the compute environment.
19. The computer program product as described in claim 17 wherein the GPU configuration is determined at least in part by executing a test GPU configuration.
20. The computer program product as described in claim 17 wherein the computer program code to dynamically adjust the GPU configuration comprises computer program code to: monitor resource consumption associated with the GPUs assigned to process the workload; and based at least in part on the monitored resource consumption, modify the number of assigned GPUs.
21. The computer program product as described in claim 17 wherein the set of GPU requirements also include a value representing an extent to which the workload is suitable for processing on the GPUs.
22. The computer program product as described in claim 17 wherein the GPU requirements are adjusted in accordance with one or more tasks in the workload.
23. The computer program product as described in claim 17 wherein the GPU configuration is dynamically adjusted by provisioning or de-provisioning GPUs based on a current workload requirement.
24. The computer program product as described in claim 17 wherein the compute environment is a disaggregated compute system comprising the GPUs assigned.
25. A data center facility, comprising: a set of server resource pools, the server resource pools comprising at least a graphics processing unit (GPU) resource pool; a GPU sizing component executing in a hardware processor responsive to receipt of a request to process a workload to determine a GPU configuration that includes a number of GPUs and a type of GPU; at least one disaggregated compute system comprising GPUs selected from the GPU resource pool to satisfy the GPU configuration; and a GPU scaling component executing in a hardware processor and responsive to receipt of resource consumption information as the workload is executing to scale-up or scale-down the GPU configuration.
26. The d
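The sizing step recited in the claims (a GPU count, a GPU type, and a suitability value, chosen in part by matching the workload's profile against previously processed workloads) might look roughly like the sketch below. The profile definition, the `HISTORY` table and its entries, the type labels, and the fallback configuration are all hypothetical assumptions; the claims do not define how a profile is represented or compared, and the separately claimed option of executing a test GPU configuration is not modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuConfig:
    count: int          # number of GPUs
    gpu_type: str       # type of GPU (labels here are made up)
    suitability: float  # assumed 0..1 score of how well the workload fits GPUs


# Hypothetical record of configurations that worked for earlier workloads,
# keyed by (workload kind, number of decimal digits in the input size in GiB).
HISTORY: dict[tuple[str, int], GpuConfig] = {
    ("matrix-factorization", 3): GpuConfig(4, "type-a", 0.9),
    ("log-aggregation", 2): GpuConfig(1, "type-b", 0.2),
}


def profile_of(kind: str, input_bytes: int) -> tuple[str, int]:
    """Build a coarse profile so that similarly sized jobs of one kind match."""
    gib = max(input_bytes // 2**30, 1)
    return kind, len(str(gib))


def size_workload(kind: str, input_bytes: int,
                  fallback: GpuConfig = GpuConfig(1, "type-a", 0.5)) -> GpuConfig:
    """Reuse a past configuration on a profile match, else start from the fallback."""
    return HISTORY.get(profile_of(kind, input_bytes), fallback)


known = size_workload("matrix-factorization", 250 * 2**30)  # matches a past profile
unknown = size_workload("graph-search", 10 * 2**30)         # no match: fallback
```

An exact-match lookup is the simplest possible comparison; a nearest-neighbor match over richer profile features would be a natural alternative, though nothing in the text shown here indicates which the patent uses.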
Techniques for rebalancing the load in a distributed system · CPC title
Processor architectures; Processor configuration, e.g. pipelining · CPC title
General purpose rendering architectures · CPC title
Grid computing · CPC title