Reducing power consumption of a data center utilizing reinforcement learning framework

US11275429B2 · US · B2

Patent metadata
Field                Value
Publication number   US-11275429-B2
Application number   US-202016915133-A
Country              US
Kind code            B2
Filing date          Jun 29, 2020
Priority date        Jun 29, 2020
Publication date     Mar 15, 2022
Grant date           Mar 15, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprises a processing device configured to obtain first parameters characterizing an operating state of information technology (IT) resources of a data center and second parameters characterizing an operating state of cooling systems of the data center, to determine an overall operating state of the data center by aggregating the first and second parameters, to identify a power consumption profile based on the overall operating state, and to perform a joint training of first and second reinforcement learning agents based on the overall operating state and the power consumption profile. The processing device is also configured to generate first controls for the heterogeneous IT resources utilizing the first reinforcement learning agent and second controls for the cooling systems utilizing the second reinforcement learning agent, the first and second controls being configured to reduce power consumption while maintaining specified performance benchmarks for workloads executing in the data center.
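The data flow in the abstract (aggregate both parameter sets into one state, derive a power-based reward, train two agents jointly, emit two sets of controls) can be illustrated with a toy sketch. This is not the patented method. The two-server, one-AC layout, the made-up power model in `simulate`, the choice of tabular Q-learning with two independent learners sharing one reward, and every name below are assumptions introduced purely for illustration.

```python
import random
from collections import defaultdict

# Hypothetical toy data center: 2 servers and 1 air conditioning unit.
SERVERS = [0, 1]            # first controls: server that receives the next workload
SETPOINTS = [18, 22, 26]    # second controls: AC temperature setpoint (deg C)


def aggregate_state(it_params, cooling_params):
    """Overall operating state: IT parameters followed by cooling parameters."""
    return tuple(it_params) + tuple(cooling_params)


def simulate(load, server, setpoint):
    """Made-up power model: total power in watts for one period."""
    it_power = load * (200 if server == 0 else 100)  # server 1 assumed more efficient
    cooling_power = (30 - setpoint) * 10             # colder setpoints cost more
    # Stand-in for a missed performance benchmark: server 0 overheats when warm.
    penalty = 500 if (server == 0 and setpoint > 22) else 0
    return it_power + cooling_power + penalty


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step, applied in place to the table q."""
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def choose(q, state, actions, epsilon, rng):
    """Epsilon-greedy action selection."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def train_jointly(steps=2000, epsilon=0.1, seed=0):
    """Train both agents on the same overall state and the same joint reward."""
    rng = random.Random(seed)
    q_it, q_cool = defaultdict(float), defaultdict(float)
    load, setpoint = 1, 22
    state = aggregate_state([load], [setpoint])
    for _ in range(steps):
        server = choose(q_it, state, SERVERS, epsilon, rng)
        new_setpoint = choose(q_cool, state, SETPOINTS, epsilon, rng)
        reward = -simulate(load, server, new_setpoint)  # shared joint reward
        load = rng.choice([1, 2])                        # next workload arrives
        next_state = aggregate_state([load], [new_setpoint])
        q_update(q_it, state, server, reward, next_state, SERVERS)
        q_update(q_cool, state, new_setpoint, reward, next_state, SETPOINTS)
        state = next_state
    return q_it, q_cool
```

After training, the greedy action of each table for a given state plays the role of the first controls (workload placement) and second controls (setpoint). Sharing one reward is what couples the two learners here; a production system would likely use function approximation and a far richer state.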

First claim

Opening claim text (preview).

What is claimed is:

1. An apparatus comprising: at least one processing device comprising a processor coupled to a memory; the at least one processing device being configured to perform steps of: obtaining a first set of parameters characterizing an operating state of a plurality of heterogeneous information technology resources of a data center and a second set of parameters characterizing an operating state of one or more cooling systems of the data center; determining an overall operating state of the data center by aggregating the first and second sets of parameters; identifying a power consumption profile of the data center based at least in part on the determined overall operating state of the data center; performing a joint training of a first set of one or more reinforcement learning agents and a second set of one or more reinforcement learning agents based at least in part on the determined overall operating state of the data center and the identified power consumption profile; generating a first set of controls for the plurality of heterogeneous information technology resources of the data center utilizing the trained first set of one or more reinforcement learning agents and a second set of controls for the one or more cooling systems of the data center utilizing the trained second set of one or more reinforcement learning agents, the first and second sets of controls being configured to reduce power consumption by the data center while maintaining specified performance benchmarks for workloads executing in the data center; and controlling operation of the data center based at least in part on the first and second sets of controls.

2. The apparatus of claim 1 wherein the first set of parameters comprises telemetry information obtained from the plurality of heterogeneous information technology resources of the data center, the telemetry information comprising: temperature measurements for one or more hardware components of each of the plurality of heterogeneous information technology resources for a given period of time; and power consumption measurements for each of the plurality of heterogeneous information technology resources for the given period of time.

3. The apparatus of claim 1 wherein the first set of parameters comprises resource management information obtained from the plurality of heterogeneous information technology resources of the data center, the resource management information comprising two or more of: average central processing unit (CPU) speed measurements for each of the plurality of heterogeneous information technology resources for a given period of time; CPU load measurements for each of the plurality of heterogeneous information technology resources for the given period of time; average uptime measurements for each of the plurality of heterogeneous information technology resources for the given period of time; and average memory measurements for each of the plurality of heterogeneous information technology resources for the given period of time.

4. The apparatus of claim 1 wherein the first set of parameters comprises task management information for a plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources of the data center, the task management information comprising two or more of: expected central processing unit (CPU) requirements for at least a subset of the plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources for a given upcoming period of time; expected memory requirements for at least a subset of the plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources for the given upcoming period of time; expected time for completion for at least a subset of the plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources; a most recent wait time for the plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources; and a most recent execution time for the plurality of workloads scheduled for execution on the plurality of heterogeneous information technology resources.

5. The apparatus of claim 1 wherein the second set of parameters comprise telemetry information obtained from the one or more cooling systems, the telemetry information comprising two or more of: air flow measurements for each of a plurality of air conditioning units of the one or more cooling systems for a given period of time; input temperature measurements for each of the plurality of air conditioning units of the one or more cooling systems for the given period of time; output temperature measurements for each of the plurality of air conditioning units of the one or more cooling systems for the given period of time; and power consumption measurements for each of the plurality of air conditioning units of the one or more cooling systems for the given period of time.

6. The apparatus of claim 1 wherein obtaining the first and second sets of parameters, determining the overall operating state of the data center, identifying the power consumption profile, generating the first and second sets of controls, and controlling operation of the data center are performed for each of two or more time periods, each of the two or more time periods being associated with a change in the operating state of the plurality of heterogeneous information technology resources of the data center.

7. The apparatus of claim 6 wherein the change in the operating state of the plurality of heterogeneous information technology resources of the data center comprises at least one of: arrival of one or more new workloads in a queue of workloads to be scheduled on the plurality of heterogeneous information technology resources of the data center; and completion of one or more workloads currently operating on one or more of the plurality of heterogeneous information technology resources of the data center.

8. The apparatus of claim 1 wherein identifying the power consumption profile comprises identifying a joint reward characterizing power consumption by the data center as a weighted summation of reward components identified from the first and second sets of parameters.

9. The apparatus of claim 8 wherein the weighted summation comprises reward components for: at least one of central processing unit (CPU) speed measurements, CPU load measurements, uptime measurements and memory measurements for the plurality of heterogeneous information technology resources in the first set of parameters; at least one of a most recent wait time and a most recent execution time for workloads scheduled for execution on the plurality of heterogeneous information technology resources in the first set of parameters; power consumption measurements for the plurality of heterogeneous information technology resources in the first set of parameters; and power consumption measurements for each of a plurality of air conditioning units of the one or more cooling systems in the second set of parameters.

10. The apparatus of claim 1 wherein the first set of controls comprises identification of workloads to be assigned to respective ones of the plurality of heterogeneous information technology resources for execution in an upcoming period of time.

11. The apparatus of claim 1 wherein the second set of controls comprises temperature setpoint information for each of a plurality of air conditioning units of the one or more cooling system for an upcoming period of time.

12. The apparatus of claim 1 wherein jointly training the fi
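Claims 8 and 9 describe the joint reward as a weighted summation of components drawn from both parameter sets, and claims 6 and 7 tie each decision period to a workload arrival or completion. A minimal sketch of both ideas follows. The weight values, the sign convention (each component is treated as a negated cost), the lack of normalization across units, and the names `RewardWeights`, `joint_reward` and `state_changed` are all assumptions; the claims do not specify them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the claims do not give values.
    cpu_load: float = 0.1
    wait_time: float = 0.2
    it_power: float = 0.35
    cooling_power: float = 0.35


def joint_reward(cpu_loads, wait_time_s, it_power_w, ac_power_w, w=RewardWeights()):
    """Weighted sum of reward components; each component is a negated cost."""
    components = {
        "cpu_load": -sum(cpu_loads) / len(cpu_loads),  # mean CPU load across servers
        "wait_time": -wait_time_s,                     # most recent workload wait time
        "it_power": -sum(it_power_w),                  # total IT power draw (W)
        "cooling_power": -sum(ac_power_w),             # total AC unit power draw (W)
    }
    return sum(getattr(w, name) * value for name, value in components.items())


def state_changed(new_arrivals, completions):
    """A new decision period starts on a workload arrival or a completion."""
    return new_arrivals > 0 or completions > 0


if __name__ == "__main__":
    reward = joint_reward([0.5, 0.7], 10.0, [200.0, 150.0], [80.0])
    print(f"joint reward: {reward:.2f}")
    print("re-plan:", state_changed(new_arrivals=1, completions=0))
```

In practice the components would likely be normalized to comparable scales before weighting, since raw watts would otherwise dominate the sum.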

Assignees

Inventors

Classifications

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

  • G06F1/329 · Primary

    by task scheduling · CPC title

  • comprising thermal management · CPC title

  • Supervision thereof, e.g. detecting power-supply failure by out of limits supervision · CPC title

  • the criterion being a learning criterion · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11275429B2 cover?
An apparatus comprises a processing device configured to obtain first parameters characterizing an operating state of information technology (IT) resources of a data center and second parameters characterizing an operating state of cooling systems of the data center, to determine an overall operating state of the data center by aggregating the first and second parameters, to identify a power co…
Who is the assignee on this patent?
Dell Products L.P.
What technology area does this patent fall under?
Primary CPC classification G06F1/329. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 15, 2022 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).