Optimal operating point estimator for hardware operating under a shared power/thermal constraint

US11106261B2 · US · B2

Patent metadata
Field               Value
Publication number  US-11106261-B2
Application number  US-201816179620-A
Country             US
Kind code           B2
Filing date         Nov 2, 2018
Priority date       Nov 2, 2018
Publication date    Aug 31, 2021
Grant date          Aug 31, 2021

Abstract

Official abstract text for this publication.

Integrated circuits, or computer chips, typically include multiple hardware components (e.g. memory, processors, etc.) operating under a shared power (e.g. thermal) constraint that is sourced by one or more power sources for the chip. Typically, the hardware components can be individually configured to operate at certain states (e.g. to operate at a certain frequency by setting a clock speed for a clock dedicated to the hardware component). Thus, each hardware component can be configured to operate at an operating point that is determined to be optimal, usually in terms of achieving some desired goal for a specific application (e.g. frame rates for gaming, etc.). In the context of chip hardware that operates under a shared power/thermal constraint, a method, computer readable medium, and system are provided for determining the optimal operating point for the chip that takes into consideration both performance of the chip and power consumption by the chip.
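
As a rough illustration of the shared constraint described in the abstract, the following Python sketch checks whether a set of per-component power draws fits under one chip-level budget. The class name, field names, and all numbers are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentState:
    """Hypothetical operating state of one hardware component."""
    name: str
    clock_mhz: int
    power_w: float  # assumed measured or estimated draw at this clock


def total_power_w(states):
    """Sum the power drawn by every component on the chip."""
    return sum(s.power_w for s in states)


def fits_shared_budget(states, budget_w):
    """Return True when the components together stay within the shared budget."""
    return total_power_w(states) <= budget_w


# Made-up example: a GPU and DRAM sharing a 200 W budget.
baseline = [ComponentState("gpu", 1500, 150.0), ComponentState("dram", 2000, 30.0)]
boosted = [ComponentState("gpu", 1800, 185.0), ComponentState("dram", 2000, 30.0)]
```

In this made-up example, raising only the GPU clock pushes the total past the budget, which suggests why the components cannot be tuned independently of one another.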

First claim

Opening claim text (preview).

What is claimed is:

1. A method, comprising: receiving input associated with workloads executed on a chip having two or more hardware components that operate under a shared power constraint, the input including: descriptions of the workloads, performance metrics of each of the two or more hardware components when executing each of the workloads, and power consumption metrics for each of the workloads; training an artificial intelligence (AI) network that correlates the descriptions of the workloads, the performance metrics for each of the workloads, and the power consumption by each of the workloads; receiving a selection of an optimization mode for the chip that considers both performance and power consumption; determining using the AI network an optimal operating point of the chip, according to the selected optimization mode; and configuring the chip to operate at the determined optimal operating point.

2. The method of claim 1, wherein the two or more hardware components include memory and a processor.

3. The method of claim 2, wherein the memory is DRAM and the processor is a GPU.

4. The method of claim 1, wherein the input is accumulated by capture logic over a predefined window of time.

5. The method of claim 4, wherein for graphics workloads, the input is captured over an entire frame.

6. The method of claim 4, wherein for compute or deep learning workloads, the input is captured at a function call level or over a fixed period of time.

7. The method of claim 1, wherein each of the descriptions of the workloads includes at least one of: an indication of an application executing a workload, operations performed within the workload, or data on which the workload is performed.

8. The method of claim 1, wherein the performance metrics include a frame time.

9. The method of claim 1, wherein the performance metrics include instructions per second.

10. The method of claim 1, wherein the power consumption metrics for each of the workloads includes an amount of power consumed by each hardware component to perform the workload.

11. The method of claim 1, wherein the power consumption metrics for each of the workloads includes a total amount of power consumed by the two or more hardware components to perform the workload.

12. The method of claim 1, wherein the selection of the optimization mode for the chip indicates: a parameter for which to optimize operation of the chip, the parameter being one of performance, power, or efficiency.

13. The method of claim 12, wherein determining the optimal operating point of the chip, according to the selected optimization mode, includes: when the selected optimization mode is to optimize operation of the chip for performance: identifying the target value for the performance, identifying a power threshold, using the AI network to determine the optimal operating point for the chip to maximize performance without exceeding the power threshold, the optimal operating point for the chip including operating states for each hardware component of the two or more hardware components and a voltage state for the two or more hardware components.

14. The method of claim 12, wherein determining the optimal operating point of the chip, according to the selected optimization mode, includes: when the selected optimization mode is to optimize operation of the chip for power: identifying the target value for the power, identifying a performance threshold, using the AI network to determine the optimal operating point for the chip to minimize power consumption without falling below the performance threshold, the optimal operating point for the chip including operating states for each hardware component of the two or more hardware components and a voltage state for the two or more hardware components.

15. The method of claim 12, wherein determining the optimal operating point of the chip, according to the selected optimization mode, includes: when the selected optimization mode is to optimize operation of the chip for efficiency: identifying an efficiency threshold, wherein the efficiency threshold is defined based on a change in performance in relation to a change in power consumption, using the AI network to determine the optimal operating point for the chip to maximize performance without falling below the efficiency threshold, the optimal operating point for the chip including operating states for each hardware component of the two or more hardware components and a voltage state for the two or more hardware components.

16. The method of claim 1, wherein the optimal operating point of the chip includes operating states for each hardware component of the two or more hardware components, the operating states including clock frequencies for each hardware component of the two or more hardware components and a voltage state for the two or more hardware components.

17. The method of claim 1, where the optimal operating point of the chip is determined based on a description of one or more prior workloads executed by the chip, and is employed when executing a subsequent workload on the chip.

18. The method of claim 1, wherein the determining of the optimal operating point of the chip and the configuring the chip to operate at the determined optimal operating point are repeated: for each new workload executed on the chip, or during workload execution at various predefined operating points within an application.

19. The method of claim 1, wherein the determining of the optimal operating point of the chip and the configuring the chip to operate at the determined optimal operating point are performed by software executed by a CPU of the chip.

20. The method of claim 1, wherein the determining of the optimal operating point of the chip and the configuring the chip to operate at the determined optimal operating point are performed by dedicated hardware on the chip.

21. A non-transitory computer readable medium storing code executable by a processor to perform a method comprising: receiving input associated with workloads executed on a chip having two or more hardware components that operate under a shared power constraint, the input including: descriptions of the workloads, performance metrics of each of the two or more hardware components when executing each of the workloads, and power consumption metrics for each of the workloads; training an artificial intelligence (AI) network that correlates the descriptions of the workloads, the performance metrics for each of the workloads, and the power consumption by each of the workloads; receiving a selection of an optimization mode for the chip that considers both performance and power consumption; determining using the AI network an optimal operating point of the chip, according to the selected optimization mode; and configuring the chip to operate at the determined optimal operating point.

22. The non-transitory computer readable medium of claim 21, wherein the two or more hardware components include memory and a processor.

23. The non-transitory computer readable medium of claim 21, wherein the optimal operating point of the chip includes operating states for each hardware component of the two or more hardware components, the operating states including clock frequencies for each hardware component of the two or more hardware components and a voltage state for the two or more hardware components.

24. The non-transitor
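
Claims 12 to 15 describe three optimization modes. The Python sketch below shows one way the selection step might look once predictions are available. It is not the patented implementation: the trained AI network is replaced by a hard-coded table of predicted performance and power, the names Candidate, pick_for_performance, pick_for_power, and pick_for_efficiency are hypothetical, and the efficiency rule (stop climbing when the extra performance per extra watt drops below the threshold) is only one possible reading of claim 15.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical operating point with stand-in model predictions."""
    gpu_mhz: int
    dram_mhz: int
    voltage_mv: int
    perf: float     # predicted performance, e.g. frames per second
    power_w: float  # predicted total power in watts


def pick_for_performance(cands, power_cap_w):
    """Performance mode: highest predicted perf that stays under the power cap."""
    feasible = [c for c in cands if c.power_w <= power_cap_w]
    return max(feasible, key=lambda c: c.perf, default=None)


def pick_for_power(cands, perf_floor):
    """Power mode: lowest predicted power that still meets the perf floor."""
    feasible = [c for c in cands if c.perf >= perf_floor]
    return min(feasible, key=lambda c: c.power_w, default=None)


def pick_for_efficiency(cands, min_gain_per_watt):
    """Efficiency mode (assumed rule): walk up in power order and stop once
    the marginal perf gain per extra watt falls below the threshold."""
    ordered = sorted(cands, key=lambda c: c.power_w)
    if not ordered:
        return None
    best = ordered[0]
    for nxt in ordered[1:]:
        d_power = nxt.power_w - best.power_w
        d_perf = nxt.perf - best.perf
        if d_power <= 0 or d_perf / d_power < min_gain_per_watt:
            break
        best = nxt
    return best


# Made-up predictions standing in for the output of a trained network.
CANDIDATES = [
    Candidate(900, 1500, 750, 40.0, 90.0),
    Candidate(1200, 1750, 850, 55.0, 120.0),
    Candidate(1500, 2000, 950, 62.0, 160.0),
    Candidate(1800, 2000, 1050, 65.0, 210.0),
]
```

With these made-up numbers, a 170 W cap in performance mode selects the 1500 MHz point, while a floor of 50 in power mode and a threshold of 0.2 performance units per watt in efficiency mode both select the 1200 MHz point. Each candidate carries per-component clocks plus a single voltage, mirroring the operating-point structure recited in claim 16.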

Assignees

Nvidia Corp

Inventors

Classifications

  • Combinations of networks · CPC title

  • Supervised learning · CPC title

  • for performance assessment · CPC title

  • G06F9/4893 (Primary)

    taking into account power or heat criteria (power management in computers in general G06F1/3203; thermal management in computers in general G06F1/206) · CPC title

  • by lowering the supply or operating voltage · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11106261B2 cover?
Integrated circuits, or computer chips, typically include multiple hardware components (e.g. memory, processors, etc.) operating under a shared power (e.g. thermal) constraint that is sourced by one or more power sources for the chip. Typically, the hardware components can be individually configured to operate at certain states (e.g. to operate at a certain frequency by setting a clock speed fo…
Who is the assignee on this patent?
Nvidia Corp
What technology area does this patent fall under?
Primary CPC classification G06F11/3409. Mapped technology areas include Physics.
When was this patent published?
Publication date Aug 31, 2021 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 12 related publications on this page (citations in our corpus or others sharing the same primary CPC).