Utilizing accelerators to accelerate data analytic workloads in disaggregated systems

US11275622B2 · US · B2

Patent metadata
Publication number: US-11275622-B2
Application number: US-201816204653-A
Country: US
Kind code: B2
Filing date: Nov 29, 2018
Priority date: Nov 29, 2018
Publication date: Mar 15, 2022
Grant date: Mar 15, 2022

Abstract

Official abstract text for this publication.

Server resources in a data center are disaggregated into shared server resource pools, including an accelerator (e.g., FPGA) pool. Servers are constructed dynamically, on-demand and based on workload requirements, by allocating from these resource pools. According to this disclosure, accelerator utilization in the data center is managed proactively by assigning accelerators to workloads in a fine granularity and agile way, and de-provisioning them when no longer needed. In this manner, the approach is especially advantageous to automatically provision accelerators for data analytic workloads. The approach thus provides for a “micro-service” enabling data analytic workloads to automatically and transparently use FPGA resources without providing (e.g., to the data center customer) the underlying provisioning details. Preferably, the approach dynamically determines the number and the type of FPGAs to use, and then during runtime auto-scales the FPGAs based on workload.
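
The provision, auto-scale, and de-provision cycle described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the `AcceleratorPool` class, the `autoscale` function, and the utilization thresholds `low` and `high` are hypothetical names and values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class AcceleratorPool:
    """Hypothetical shared pool of interchangeable FPGAs (illustrative only)."""
    free: int

    def acquire(self, n: int) -> int:
        """Grant up to n free accelerators and return how many were granted."""
        granted = max(0, min(n, self.free))
        self.free -= granted
        return granted

    def release(self, n: int) -> None:
        """Return n accelerators to the pool (de-provisioning)."""
        self.free += n


def autoscale(pool: AcceleratorPool, assigned: int, utilization: float,
              low: float = 0.3, high: float = 0.8) -> int:
    """Run one scaling step and return the new number of assigned accelerators.

    utilization is the average busy fraction of the accelerators currently
    assigned to the workload. The thresholds are assumed values, not taken
    from the patent.
    """
    if utilization > high:
        # Workload is saturating its accelerators: try to add one from the pool.
        assigned += pool.acquire(1)
    elif utilization < low and assigned > 0:
        # Accelerators are mostly idle: give one back to the pool.
        pool.release(1)
        assigned -= 1
    return assigned
```

Calling `autoscale` periodically with a fresh utilization reading grows the allocation while the pool has free devices and shrinks it when the workload no longer needs them. Stepping by one device at a time is a simplification; a real scheduler could scale in larger increments.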

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for dynamically provisioning and scaling accelerators for data analytic workloads in a disaggregated computing system, comprising:
    receiving a request to process a data analytic workload;
    responsive to receipt of the request, dynamically determining an accelerator configuration anticipated to be required to process the data analytic workload, the accelerator configuration comprising a set of accelerator requirements, wherein determining the accelerator configuration further comprises determining a first portion of the data analytic workload is most suitable for processing by one or more Field Programmable Gate Arrays (FPGAs) and a second portion of the data analytic workload is most suitable for processing by one or more Central Processing Units (CPUs);
    responsive to determining the accelerator configuration, instantiating in real-time, a dynamically constructed server entity composed of individual components selected from a plurality of resource pools each physically maintaining a plurality of like-typed resources therein, the plurality of resource pools including at least a compute pool having the one or more CPUs, a memory pool, and an accelerator pool having the one or more FPGAs;
    based on the set of accelerator requirements, assigning available accelerators from the accelerator pool that match the set of accelerator requirements to the dynamically constructed server entity executing the data analytic workload;
    as the data analytic workload is being processed by the assigned accelerators, dynamically adjusting a number of individual accelerators from the accelerator pool provisioned to the dynamically constructed server entity to perform the data analytic workload as determined by monitoring resource consumption during a progression of the data analytic workload, wherein the adjusting of the number of individual accelerators provisioned to the dynamically constructed server entity is performed, at least in part, according to a determination that the first portion or the second portion of the data analytic workload has varied from being most suitably processed by the one or more FPGAs or the CPUs; and
    grouping accelerators in the accelerator pool into one or more preassigned groups of accelerators, wherein a particular group is preconfigured to process a particular type of data analytic workload prior to assigning any of the one or more preassigned groups of accelerators to process a subsequent data analytic workload of the particular type, and wherein provisioning of the accelerators to process the data analytic workload of the particular type is performed according to an analyzation of an efficiency of those accelerators selected for the particular group to process the data analytic workload of the particular type as compared to being processed by alternative accelerators in the pool of accelerators.

2. The method of claim 1, further including: monitoring accelerators during processing of a first data analytic workload; using information derived from the monitoring to generate a model of accelerator performance; and using the accelerator performance model to determine an accelerator configuration for a second data analytic workload having an application type similar to an application type associated with the data analytic workload.

3. The method of claim 1, wherein the accelerator configuration comprises a number of accelerators, and one or more types of accelerator.

4. The method of claim 1, wherein the set of accelerator requirements also include a value representing an extent to which the workload is suitable for processing on the accelerators.

5. The method of claim 1, wherein the one or more FPGAs are selected from a set of FPGAs, wherein at least one subset of the set of FPGAs is programmed according to a type of workload.

6. The method of claim 1, wherein the accelerator configuration is dynamically adjusted based on a determined number of replicated kernel pipelines.
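
Two steps of claim 1, splitting the workload into an FPGA portion and a CPU portion and then assigning accelerators that match the requirements, can be sketched as follows. This is a rough illustration under stated assumptions: the `Stage` and `Accelerator` types, the `fpga_suitability` score (loosely modeled on the suitability value of claim 4), the `bitstream` field (loosely modeled on the per-workload programming of claim 5), and the 0.5 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Stage:
    """One portion of a data analytic workload (hypothetical type)."""
    name: str
    fpga_suitability: float  # assumed score: 0.0 favors CPU, 1.0 favors FPGA


@dataclass(frozen=True)
class Accelerator:
    """One device in the accelerator pool (hypothetical type)."""
    ident: str
    kind: str       # e.g. "fpga"
    bitstream: str  # workload type the device is preconfigured for


def partition(stages: List[Stage],
              threshold: float = 0.5) -> Tuple[List[Stage], List[Stage]]:
    """Split stages into (FPGA portion, CPU portion) by suitability score."""
    fpga = [s for s in stages if s.fpga_suitability >= threshold]
    cpu = [s for s in stages if s.fpga_suitability < threshold]
    return fpga, cpu


def assign(pool: List[Accelerator], kind: str, bitstream: str,
           count: int) -> List[Accelerator]:
    """Remove and return up to count pool devices matching the requirements."""
    matches = [a for a in pool
               if a.kind == kind and a.bitstream == bitstream][:count]
    for a in matches:
        pool.remove(a)  # the device is no longer available to other workloads
    return matches
```

Re-running `partition` with updated scores while the workload progresses is one simple way to notice that a portion has stopped being well suited to its current processor type, which is the condition the claim ties the scaling adjustment to.
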
7. A system for dynamically provisioning and scaling accelerators for data analytic workloads in a disaggregated computing system, comprising:
    one or more hardware processors;
    computer memory holding computer program instructions executed by the hardware processors and operative to:
    receive a request to process a data analytic workload;
    responsive to receipt of the request, dynamically determine an accelerator configuration anticipated to be required to process the data analytic workload, the accelerator configuration comprising a set of accelerator requirements, wherein determining the accelerator configuration further comprises determining a first portion of the data analytic workload is most suitable for processing by one or more Field Programmable Gate Arrays (FPGAs) and a second portion of the data analytic workload is most suitable for processing by one or more Central Processing Units (CPUs);
    responsive to determining the accelerator configuration, instantiate in real-time, a dynamically constructed server entity composed of individual components selected from a plurality of resource pools each physically maintaining a plurality of like-typed resources therein, the plurality of resource pools including at least a compute pool having the one or more CPUs, a memory pool, and an accelerator pool having the one or more FPGAs;
    based on the set of accelerator requirements, assign available accelerators from the accelerator pool that match the set of accelerator requirements to the dynamically constructed server entity executing the data analytic workload;
    as the data analytic workload is being processed by the assigned accelerators, dynamically adjust a number of individual accelerators from the accelerator pool provisioned to the dynamically constructed server entity to perform the data analytic workload as determined by monitoring resource consumption during a progression of the data analytic workload, wherein the adjusting of the number of individual accelerators provisioned to the dynamically constructed server entity is performed, at least in part, according to a determination that the first portion or the second portion of the data analytic workload has varied from being most suitably processed by the one or more FPGAs or the CPUs; and
    group accelerators in the accelerator pool into one or more preassigned groups of accelerators, wherein a particular group is preconfigured to process a particular type of data analytic workload prior to assigning any of the one or more preassigned groups of accelerators to process a subsequent data analytic workload of the particular type, and wherein provisioning of the accelerators to process the data analytic workload of the particular type is performed according to an analyzation of an efficiency of those accelerators selected for the particular group to process the data analytic workload of the particular type as compared to being processed by alternative accelerators in the pool of accelerators.

8. The system of claim 7, wherein the computer program instructions are further operative to: monitor accelerators during processing of a first data analytic workload; use information derived from the monitoring to generate a model of accelerator performance; and use the accelerator performance model to determine an accelerator configuration for a second data analytic workload having an application type similar to an application type associated with the data analytic workload.

9. The system of claim 7, wherein the accelerator configuration comprises a number of accelerators, and one or more types of accelerator.

10. The system of claim 7, wherein the set of accelerator requirements also include a value representing an extent to which the workload i
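
The monitoring-based performance model recited in claim 8 (and in method claim 2) can be approximated by a very small sketch. The claims do not specify the form of the model, so everything here is an assumption: the `PerformanceModel` class, the per-accelerator throughput metric, and the averaging rule are hypothetical choices made only to illustrate the idea of reusing measurements from one workload to size the next workload of the same application type.

```python
import math
from collections import defaultdict
from statistics import mean


class PerformanceModel:
    """Hypothetical model: mean observed throughput per accelerator, by app type."""

    def __init__(self) -> None:
        self._samples = defaultdict(list)

    def record(self, app_type: str, records_per_sec_per_accel: float) -> None:
        """Store one monitoring sample taken while a workload was running."""
        self._samples[app_type].append(records_per_sec_per_accel)

    def accelerators_needed(self, app_type: str, target_records_per_sec: float,
                            default: int = 1) -> int:
        """Estimate the accelerator count for a new workload of a known type.

        Falls back to `default` when no samples exist for the application type.
        """
        samples = self._samples.get(app_type)
        if not samples:
            return default
        # Round up so the estimate meets or exceeds the target throughput.
        return max(1, math.ceil(target_records_per_sec / mean(samples)))
```

For example, after recording 100.0 and 300.0 records per second per accelerator for one application type, a later request of that type targeting 900 records per second would be sized at 5 accelerators under this averaging rule.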

Assignees

IBM

Classifications

  • Pool · CPC title

  • G06F9/505 · Primary

    considering the load · CPC title

  • using a secondary processor, e.g. coprocessor (peripheral processor G06F13/12) · CPC title

  • G06F9/5044 · Primary

    considering hardware capabilities · CPC title

  • where tasks reside in different layers, e.g. user- and kernel-space · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/505. Mapped technology areas include Physics.
When was this patent published?
Publication date: Mar 15, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 2 related publications on this page (citations in our corpus or others sharing the same primary CPC).