Technologies for kernel scale-out
US-2019065260-A1 · Feb 28, 2019 · US
US11275622B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-11275622-B2 |
| Application number | US-201816204653-A |
| Country | US |
| Kind code | B2 |
| Filing date | Nov 29, 2018 |
| Priority date | Nov 29, 2018 |
| Publication date | Mar 15, 2022 |
| Grant date | Mar 15, 2022 |
Official abstract text for this publication.
Server resources in a data center are disaggregated into shared server resource pools, including an accelerator (e.g., FPGA) pool. Servers are constructed dynamically, on-demand and based on workload requirements, by allocating from these resource pools. According to this disclosure, accelerator utilization in the data center is managed proactively by assigning accelerators to workloads in a fine granularity and agile way, and de-provisioning them when no longer needed. In this manner, the approach is especially advantageous to automatically provision accelerators for data analytic workloads. The approach thus provides for a “micro-service” enabling data analytic workloads to automatically and transparently use FPGA resources without providing (e.g., to the data center customer) the underlying provisioning details. Preferably, the approach dynamically determines the number and the type of FPGAs to use, and then during runtime auto-scales the FPGAs based on workload.
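The provision, auto-scale, and de-provision cycle described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the names `AcceleratorPool`, `ServerEntity`, `autoscale`, and `deprovision`, the utilization thresholds, and the one-FPGA step size are all hypothetical choices made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class AcceleratorPool:
    """Hypothetical shared FPGA pool: free accelerator count per type."""
    free: Dict[str, int]

    def allocate(self, kind: str, count: int) -> int:
        # Grant at most what the pool still has; return the number granted.
        granted = min(count, self.free.get(kind, 0))
        self.free[kind] = self.free.get(kind, 0) - granted
        return granted

    def release(self, kind: str, count: int) -> None:
        # Return accelerators to the pool so other workloads can use them.
        self.free[kind] = self.free.get(kind, 0) + count


@dataclass
class ServerEntity:
    """Hypothetical dynamically constructed server holding some FPGAs."""
    kind: str
    fpgas: int = 0


def autoscale(server: ServerEntity, pool: AcceleratorPool,
              utilization: float, high: float = 0.8, low: float = 0.3) -> int:
    """Adjust the FPGA count by one step from monitored utilization.

    The thresholds `high` and `low` are assumed values, not from the source.
    """
    if utilization > high:
        server.fpgas += pool.allocate(server.kind, 1)
    elif utilization < low and server.fpgas > 1:
        pool.release(server.kind, 1)
        server.fpgas -= 1
    return server.fpgas


def deprovision(server: ServerEntity, pool: AcceleratorPool) -> None:
    """Return every FPGA to the pool once the workload no longer needs them."""
    pool.release(server.kind, server.fpgas)
    server.fpgas = 0


if __name__ == "__main__":
    pool = AcceleratorPool(free={"fpga-a": 4})
    server = ServerEntity(kind="fpga-a")
    server.fpgas = pool.allocate(server.kind, 2)   # initial configuration
    for observed in (0.9, 0.95, 0.5, 0.1):         # monitored utilization
        autoscale(server, pool, observed)
    deprovision(server, pool)
```

In this sketch the initial allocation stands in for the "number and type of FPGAs" decision, and the loop stands in for runtime monitoring. A fuller version would choose the initial configuration from a performance model rather than a fixed count.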
Opening claim text (preview).
The invention claimed is:
1. A method for dynamically provisioning and scaling accelerators for data analytic workloads in a disaggregated computing system, comprising:
receiving a request to process a data analytic workload;
responsive to receipt of the request, dynamically determining an accelerator configuration anticipated to be required to process the data analytic workload, the accelerator configuration comprising a set of accelerator requirements, wherein determining the accelerator configuration further comprises determining a first portion of the data analytic workload is most suitable for processing by one or more Field Programmable Gate Arrays (FPGAs) and a second portion of the data analytic workload is most suitable for processing by one or more Central Processing Units (CPUs);
responsive to determining the accelerator configuration, instantiating in real-time, a dynamically constructed server entity composed of individual components selected from a plurality of resource pools each physically maintaining a plurality of like-typed resources therein, the plurality of resource pools including at least a compute pool having the one or more CPUs, a memory pool, and an accelerator pool having the one or more FPGAs;
based on the set of accelerator requirements, assigning available accelerators from the accelerator pool that match the set of accelerator requirements to the dynamically constructed server entity executing the data analytic workload;
as the data analytic workload is being processed by the assigned accelerators, dynamically adjusting a number of individual accelerators from the accelerator pool provisioned to the dynamically constructed server entity to perform the data analytic workload as determined by monitoring resource consumption during a progression of the data analytic workload, wherein the adjusting of the number of individual accelerators provisioned to the dynamically constructed server entity is performed, at least in part, according to a determination that the first portion or the second portion of the data analytic workload has varied from being most suitably processed by the one or more FPGAs or the CPUs; and
grouping accelerators in the accelerator pool into one or more preassigned groups of accelerators, wherein a particular group is preconfigured to process a particular type of data analytic workload prior to assigning any of the one or more preassigned groups of accelerators to process a subsequent data analytic workload of the particular type, and wherein provisioning of the accelerators to process the data analytic workload of the particular type is performed according to an analyzation of an efficiency of those accelerators selected for the particular group to process the data analytic workload of the particular type as compared to being processed by alternative accelerators in the pool of accelerators.
2. The method of claim 1, further including: monitoring accelerators during processing of a first data analytic workload; using information derived from the monitoring to generate a model of accelerator performance; and using the accelerator performance model to determine an accelerator configuration for a second data analytic workload having an application type similar to an application type associated with the data analytic workload.
3. The method of claim 1, wherein the accelerator configuration comprises a number of accelerators, and one or more types of accelerator.
4. The method of claim 1, wherein the set of accelerator requirements also include a value representing an extent to which the workload is suitable for processing on the accelerators.
5. The method of claim 1, wherein the one or more FPGAs are selected from a set of FPGAs, wherein at least one subset of the set of FPGAs is programmed according to a type of workload.
6. The method of claim 1, wherein the accelerator configuration is dynamically adjusted based on a determined number of replicated kernel pipelines.
7. A system for dynamically provisioning and scaling accelerators for data analytic workloads in a disaggregated computing system, comprising:
one or more hardware processors;
computer memory holding computer program instructions executed by the hardware processors and operative to:
receive a request to process a data analytic workload;
responsive to receipt of the request, dynamically determine an accelerator configuration anticipated to be required to process the data analytic workload, the accelerator configuration comprising a set of accelerator requirements, wherein determining the accelerator configuration further comprises determining a first portion of the data analytic workload is most suitable for processing by one or more Field Programmable Gate Arrays (FPGAs) and a second portion of the data analytic workload is most suitable for processing by one or more Central Processing Units (CPUs);
responsive to determining the accelerator configuration, instantiate in real-time, a dynamically constructed server entity composed of individual components selected from a plurality of resource pools each physically maintaining a plurality of like-typed resources therein, the plurality of resource pools including at least a compute pool having the one or more CPUs, a memory pool, and an accelerator pool having the one or more FPGAs;
based on the set of accelerator requirements, assign available accelerators from the accelerator pool that match the set of accelerator requirements to the dynamically constructed server entity executing the data analytic workload;
as the data analytic workload is being processed by the assigned accelerators, dynamically adjust a number of individual accelerators from the accelerator pool provisioned to the dynamically constructed server entity to perform the data analytic workload as determined by monitoring resource consumption during a progression of the data analytic workload, wherein the adjusting of the number of individual accelerators provisioned to the dynamically constructed server entity is performed, at least in part, according to a determination that the first portion or the second portion of the data analytic workload has varied from being most suitably processed by the one or more FPGAs or the CPUs; and
group accelerators in the accelerator pool into one or more preassigned groups of accelerators, wherein a particular group is preconfigured to process a particular type of data analytic workload prior to assigning any of the one or more preassigned groups of accelerators to process a subsequent data analytic workload of the particular type, and wherein provisioning of the accelerators to process the data analytic workload of the particular type is performed according to an analyzation of an efficiency of those accelerators selected for the particular group to process the data analytic workload of the particular type as compared to being processed by alternative accelerators in the pool of accelerators.
8. The system of claim 7, wherein the computer program instructions are further operative to: monitor accelerators during processing of a first data analytic workload; use information derived from the monitoring to generate a model of accelerator performance; and use the accelerator performance model to determine an accelerator configuration for a second data analytic workload having an application type similar to an application type associated with the data analytic workload.
9. The system of claim 7, wherein the accelerator configuration comprises a number of accelerators, and one or more types of accelerator.
10. The system of claim 7, wherein the set of accelerator requirements also include a value representing an extent to which the workload i
Pool · CPC title
considering the load · CPC title
using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title
considering hardware capabilities · CPC title
where tasks reside in different layers, e.g. user- and kernel-space · CPC title