Reliability-aware resource allocation method and apparatus in disaggregated data centers

US12379971B2 · US · B2

Patent metadata
Field               Value
Publication number  US-12379971-B2
Application number  US-202217586818-A
Country             US
Kind code           B2
Filing date         Jan 28, 2022
Priority date       Jan 28, 2022
Publication date    Aug 5, 2025
Grant date          Aug 5, 2025

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for resource allocation in a disaggregated data center (DDC), comprising: a reliability model to determine an achievable reliability for a service request to the DDC; an integer linear programming (ILP) model to perform a resource allocation for the service request to the DDC such that the total number of service requests received by the DDC and accepted for execution is maximized, while the number of the accepted service requests allocated with backup computing resources is minimized; and a heuristic process to perform a resource allocation for the service request to the DDC such that the least reliable node of each needed computing resource type is allocated while still meeting the reliability requirement of the service request.
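The heuristic mentioned in the abstract can be illustrated with a minimal Python sketch. The page does not show the patent's actual procedure, so the greedy rule below is only one possible interpretation: it assumes that the achievable reliability is the product of the chosen nodes' reliabilities, it ignores node capacity and backup nodes, and the function name and sample data are hypothetical.

```python
from math import prod


def allocate_least_reliable(nodes_by_type, required):
    """Pick one node reliability per resource type (hypothetical helper).

    nodes_by_type maps a resource type to the reliabilities of its
    candidate nodes. Returns {type: chosen reliability}, or None when
    even the most reliable nodes cannot meet the requirement.
    """
    # Start from the most reliable node of every type.
    chosen = {t: max(rs) for t, rs in nodes_by_type.items()}
    if prod(chosen.values()) < required:
        return None  # request cannot be served without backups

    # For each type in turn, step down to the least reliable node that
    # still keeps the overall product at or above the requirement.
    for t, rs in nodes_by_type.items():
        for r in sorted(rs):
            trial = {**chosen, t: r}
            if prod(trial.values()) >= required:
                chosen = trial
                break
    return chosen


if __name__ == "__main__":
    candidates = {"cpu": [0.99, 0.95, 0.90], "mem": [0.99, 0.97]}
    print(allocate_least_reliable(candidates, 0.92))
    # {'cpu': 0.95, 'mem': 0.97}
```

Choosing the least reliable node that still suffices presumably leaves the more reliable nodes free for later requests with stricter requirements; the result of this sketch depends on the order in which resource types are visited.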

First claim

Opening claim text (preview).

What is claimed is:

1. A disaggregated data center (DDC), comprising: a plurality of working nodes, each of the working nodes comprises one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprising: central processing unit (CPU), graphical processing unit (GPU), transient memory circuitry, and non-transient memory circuitry; a plurality of backup nodes, each of the backup nodes comprising one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprising: a central processing unit (CPU), a graphical processing unit (GPU), transient memory circuitry, and non-transient memory circuitry; a first processor configured to execute a reliability model to determine an achievable reliability for a service request to the DDC; and a second processor configured to execute an integer linear programming (ILP) model to perform a resource allocation for the service request to the DDC; wherein the DDC comprises multiple computing resource types, and the execution of the service request received by the DDC requires performance of computing resource of at least one of the computing resource types; and wherein nodes of same computing resource type are configured to form a parallel system such that as long as at least one of the nodes in the parallel system is available, the parallel system is available for performance in the execution of the service request received by the DDC; wherein each service request received by the DDC is executed by one or more working nodes corresponding to one or more necessary computing resource type respectively for an execution of the service request; and if a reliability of the one or more working nodes is lower than a reliability requirement for the service request, the service request is executed by one or more backup nodes corresponding to the one or more necessary computing resource types respectively; wherein the performance of the ILP model comprises: maximizing total number of service requests received by the DDC accepted for execution; minimizing number of the accepted service requests allocated with backup nodes; and subjecting to one or more of constraints comprising: a working node and a backup node allocated to the service request do not share a same computing resource; if the reliability of the one or more working nodes allocated to the service request is equal or higher than the reliability requirement for the service request, the service request is accepted for execution regardless of whether one or more backup nodes are allocated to the service request; a total resource demand of all service requests allocated with a node and pending for execution is not higher than a resource capacity of the node; and a reliability of computing resources of a computing resource type in the DDC must equal or higher than the reliability requirement of the service request before its acceptance for execution.

2. The disaggregated data center (DDC) of claim 1, wherein the reliability model determines the achievable reliability by computing a product of reliabilities of all of the parallel systems; wherein a reliability is a probability of normal working of a system or a node; wherein each of the parallel systems comprises a working node r_W and a backup node r_B of computing resource type r; wherein the working node r_W having a reliability r_W; wherein the backup node r_B having a reliability r_B; wherein the working node r_W and the backup node r_B are arranged to form the parallel system of computing resource type r; and wherein a reliability of the parallel system of computing resource type r is obtained by computing: 1 − (1 − r_W)·(1 − r_B).

3. A method for autonomously allocating resources in the disaggregated data center (DDC) of claim 1 in order to improve reliability of the disaggregated data center (DDC), comprising: executing a reliability model in the first processor to determine an achievable reliability for a service request to the DDC, and executing an integer linear programming (ILP) model in the second processor to perform a resource allocation for the service request to the DDC; wherein the DDC comprises multiple computing resource types, and the execution of the service requested received by the DDC requires performance of computing resource of at least one of the computing resource types; and wherein nodes of same computing resource type are configured to form a parallel system such that as long as at least one of the nodes in the parallel system is available, the parallel system is available for performance in the execution of the service requested received by the DDC; wherein each service request received by the DDC is executed by one or more working nodes corresponding to one or more necessary computing resource type respectively for an execution of the service request; and if a reliability of the one or more working nodes is lower than a reliability requirement for the service request, the service request is executed by one or more backup nodes corresponding to the one or more necessary computing resource types respectively; wherein the performance of the ILP model comprises: maximizing total number of service requests received by the DDC accepted for execution; minimizing number of the accepted service requests allocated with backup nodes; and subjecting to one or more of constraints comprising: a working node and a backup node allocated to the service request do not share a same computing resource; if the reliability of the one or more working nodes allocated to the service request is equal or higher than the reliability requirement for the service request, the service request is accepted for execution regardless of whether one or more backup nodes are allocated to the service request; a total resource demand of all service requests allocated with a node and pending for execution is not higher than a resource capacity of the node; and a reliability of computing resources of a computing resource type in the DDC must equal or higher than the reliability requirement of the service request before its acceptance for execution.

4. The method of claim 3, wherein the reliability model determines the achievable reliability by computing a product of reliabilities of all of the parallel systems; wherein a reliability is a probability of normal working of a system or a node; wherein each of the parallel systems comprises a working node r_W and a backup node r_B of computing resource type r; wherein the working node r_W having a reliability r_W; wherein the backup node r_B having a reliability r_B; wherein the working node r_W and the backup node r_B are arranged to form the parallel system of computing resource type r; and wherein a reliability of the parallel system of computing resource type r is obtained by computing: 1 − (1 − r_W)·(1 − r_B).

5. A disaggregated data center (DDC), comprising: a plurality of working nodes, each of the working nodes comprises one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprising: central processing unit (CPU), graphical processing unit (GPU), transient memory circuitry, and non-transient memory circuitry; a plurality of backup nodes, each of the backup nodes comprising one or more computing resources of only one computing resource type, the computing resource type selected from a group of computing resource types comprisi
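The reliability formula of claims 2 and 4 and the objectives and constraints of claim 1 can be tried out on a toy instance. The Python sketch below is not an ILP solver and is not the patent's implementation: it enumerates every allocation exhaustively and ranks the feasible ones by the same two goals (most accepted requests first, then fewest requests with backups). The node and request data are hypothetical, and it assumes that a backup node reserves the same capacity as a working node.

```python
from itertools import product

# Hypothetical toy data: node -> (resource type, capacity, reliability).
NODES = {
    "cpu1": ("cpu", 4, 0.99),
    "cpu2": ("cpu", 4, 0.90),
    "mem1": ("mem", 8, 0.95),
    "mem2": ("mem", 8, 0.95),
}
# Hypothetical requests: name -> (demand per type, reliability requirement).
REQUESTS = {
    "a": ({"cpu": 2, "mem": 4}, 0.94),
    "b": ({"cpu": 2, "mem": 4}, 0.98),
}


def options(demand):
    """All ways to serve one request, plus None for rejection."""
    per_type = []
    for rtype in demand:
        nodes = [n for n, (t, _, _) in NODES.items() if t == rtype]
        # A backup is optional and must differ from the working node.
        per_type.append(
            [(w, b) for w in nodes for b in [None] + nodes if b != w]
        )
    return [None] + [dict(zip(demand, c)) for c in product(*per_type)]


def reliability(choice):
    """Product over types of 1 - (1 - r_W) * (1 - r_B)."""
    total = 1.0
    for working, backup in choice.values():
        r_w = NODES[working][2]
        r_b = NODES[backup][2] if backup else 0.0  # no backup: r_W alone
        total *= 1 - (1 - r_w) * (1 - r_b)
    return total


def feasible(plan):
    """Check the reliability requirement and node capacities."""
    load = {n: 0 for n in NODES}
    for name, choice in plan.items():
        if choice is None:
            continue
        demand, required = REQUESTS[name]
        if reliability(choice) < required:
            return False
        for rtype, (working, backup) in choice.items():
            load[working] += demand[rtype]
            if backup:
                load[backup] += demand[rtype]  # assumption: backup reserves capacity
    return all(load[n] <= NODES[n][1] for n in NODES)


def solve():
    """Return (plan, (accepted, -requests_with_backup)) for the best plan."""
    names = list(REQUESTS)
    best, best_key = None, None
    for combo in product(*(options(REQUESTS[n][0]) for n in names)):
        plan = dict(zip(names, combo))
        if not feasible(plan):
            continue
        accepted = sum(c is not None for c in combo)
        with_backup = sum(
            c is not None and any(b for _, b in c.values()) for c in combo
        )
        key = (accepted, -with_backup)  # lexicographic objective
        if best_key is None or key > best_key:
            best, best_key = plan, key
    return best, best_key


if __name__ == "__main__":
    print(solve())
```

In this instance request "a" is met by working nodes alone (0.99 × 0.95 = 0.9405), while request "b" needs a memory backup (0.99 × (1 − 0.05 × 0.05) = 0.987525), so both requests are accepted and only one uses a backup. Exhaustive search is only practical for very small inputs; a real deployment would hand the same objective and constraints to an ILP solver.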

Assignees

Inventors

Classifications

  • Partitioning or combining of resources · CPC title

  • G06F9/5083 (Primary)

    Techniques for rebalancing the load in a distributed system · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US12379971B2 cover?
A method for resource allocation in a disaggregated data center (DDC), comprising: a reliability model to determine an achievable reliability for a service request to the DDC; an integer linear programming (ILP) model to perform a resource allocation for the service request to the DDC such that the total number of service requests received by the DDC and accepted for execution is maximized, w…
Who is the assignee on this patent?
Univ City Hong Kong
What technology area does this patent fall under?
Primary CPC classification G06F9/5083. Mapped technology areas include Physics.
When was this patent published?
Publication date: Aug 5, 2025 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).