Dynamic compute composition
US-2024311210-A1 · Sep 19, 2024 · US
US9606842B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9606842-B2 |
| Application number | US-201313889577-A |
| Country | US |
| Kind code | B2 |
| Filing date | May 8, 2013 |
| Priority date | May 8, 2013 |
| Publication date | Mar 28, 2017 |
| Grant date | Mar 28, 2017 |
Official abstract text for this publication.
A multi-core processor provides circuitry for jointly scaling the number of operating cores and the amount of resources per core in order to maximize processing performance in a power-constrained environment. Such scaling is advantageously provided without the need for scaling voltage and frequency. Selection of the number of operating cores and the amount of resources per core is made by examining the degree of instruction and thread level parallelism available for a given application. Accordingly, performance counters (and other characteristics) implemented by a processor may be sampled on-line (in real time) and/or performance counters for a given application may be profiled and characterized off-line. As a result, improved processing performance may be achieved despite decreases in core operating voltages and increases in technology process variability over time.
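The joint-scaling idea in the abstract can be illustrated with a minimal Python sketch. This is not the patented circuitry. It assumes a hypothetical power model (a fixed cost per enabled core plus a cost proportional to the enabled per-core resources) and a hypothetical performance model (Amdahl-style speedup across cores, with per-core throughput capped by the application's instruction level parallelism). The names `Config`, `power`, `performance`, and `best_config`, and all numeric constants, are illustrative assumptions that do not come from the patent.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Config:
    cores: int             # number of enabled cores
    resource_scale: float  # fraction of per-core resources enabled, in (0, 1]


def power(cfg, core_base_w=1.0, resource_w=3.0):
    """Hypothetical power model in watts; voltage and frequency stay fixed."""
    return cfg.cores * (core_base_w + resource_w * cfg.resource_scale)


def performance(cfg, ilp, parallel_fraction, max_issue_width=4):
    """Hypothetical throughput estimate for one application."""
    # A core cannot retire more instructions per cycle than the application
    # exposes (ilp) or than its enabled resources can sustain.
    per_core = min(ilp, max_issue_width * cfg.resource_scale)
    # Amdahl's law stands in for the thread level parallelism available.
    speedup = 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / cfg.cores)
    return per_core * speedup


def best_config(ilp, parallel_fraction, power_budget_w,
                max_cores=8, scales=(0.25, 0.5, 0.75, 1.0)):
    """Search every (cores, resource_scale) pair that fits the power budget."""
    candidates = [Config(n, s)
                  for n, s in product(range(1, max_cores + 1), scales)]
    feasible = [c for c in candidates if power(c) <= power_budget_w]
    if not feasible:
        raise ValueError("no configuration fits the power budget")
    # Prefer higher estimated performance, then lower power on ties.
    return max(feasible,
               key=lambda c: (performance(c, ilp, parallel_fraction), -power(c)))


if __name__ == "__main__":
    # Thread-parallel, low-ILP workload: many narrow cores.
    print(best_config(ilp=1.0, parallel_fraction=0.95, power_budget_w=8.0))
    # Mostly serial, high-ILP workload: few wide cores.
    print(best_config(ilp=4.0, parallel_fraction=0.1, power_budget_w=8.0))
```

Under these assumed models and the same 8 W budget, the search favors more cores with fewer resources each when thread level parallelism dominates, and fewer cores with full resources when instruction level parallelism dominates. That is the trade-off the abstract describes, made without changing voltage or frequency.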
Opening claim text (preview).
What is claimed is:
1. A processor for operating in a power-constrained environment comprising: a plurality of central processing unit cores for executing one or more applications, each core comprising compute resources including caches and execution units, each core being configured to receive an operating voltage; circuitry for selectively enabling one or more of the plurality of cores; and circuitry for selectively scaling one or more of the resources available in each core independent of disabling and enabling the entire cores, wherein the processor is configured to determine an amount of instruction level parallelism and an amount of thread level parallelism presented by the one or more applications, wherein the amount of instruction level parallelism and thread level parallelism are determined by sampling one or more counters for a given application, the one or more counters indicating instructions per cycle and execution time, and wherein the processor is configured to disable one or more of the cores according to the determined amount of thread level parallelism while scaling one or more of the resources available in one or more of the cores according to the determined amount of instruction level parallelism independent of scaling the operating voltage and frequency.
2. The processor of claim 1, wherein a first performance counter indicates the instructions per cycle.
3. The processor of claim 1, wherein a second performance counter indicates the execution time.
4. The processor of claim 1, wherein a second performance counter indicates cache hit rate.
5. The processor of claim 1, wherein resources are increased with greater instruction level parallelism.
6. The processor of claim 1, wherein the number of cores is increased with greater thread level parallelism.
7. The processor of claim 1, wherein execution units comprise arithmetic logic units.
8. The processor of claim 1, wherein execution units comprise floating point logic units.
9. The processor of claim 1, wherein resources further include a branch target buffer and a translation look-aside buffer.
10. The processor of claim 1, wherein, resources further include an instruction queue, a physical register file, a re-order buffer and a load-store-queue.
11. A method for operating a processor in a power-constrained environment, the processor including a plurality of central processing unit cores for executing one or more applications, each core comprising compute resources including caches and execution units, and each core receiving an operating voltage, the method comprising: determining an amount of instruction level parallelism and an amount of thread level parallelism presented by the one or more applications; selectively enabling one or more of the cores for executing the one or more applications; and disabling at least one of the cores according to the determined amount of thread level parallelism while selectively scaling one or more of the resources available in the one or more enabled cores according to the determined amount of instruction level parallelism independent of scaling the operating voltage and frequency, wherein the amount of instruction level parallelism and thread level parallelism are determined by sampling one or more counters for a given application, the one or more counters indicating instructions per cycle and execution time.
12. The method of claim 11, further comprising sampling performance first and second counters that indicate instructions per cycle and execution time, respectively.
13. The method of claim 11, further comprising sampling performance counters for a given application off-line.
14. The method of claim 11, further comprising dividing instructions per cycle by execution time.
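The method of claim 11 can be read as a mapping from sampled counters to a scaling decision. The Python sketch below shows one way such a mapping might look. The claims do not say how the counters are converted into parallelism estimates, so the estimators here are assumptions: instruction level parallelism is approximated by measured instructions per cycle, and thread level parallelism by the ratio of single-core to all-core execution time. `CounterSample`, `plan_scaling`, and the quarter-step resource levels are hypothetical names and choices, not taken from the patent.

```python
import math
from dataclasses import dataclass


@dataclass
class CounterSample:
    """Hypothetical counter readings for one application."""
    instructions: int          # retired instructions in the sampling window
    cycles: int                # core cycles in the same window
    exec_time_single_s: float  # execution time on one core
    exec_time_all_s: float     # execution time on all cores


def estimate_ilp(sample):
    """Assumed ILP proxy: instructions per cycle (IPC)."""
    if sample.cycles <= 0:
        raise ValueError("cycles must be positive")
    return sample.instructions / sample.cycles


def estimate_tlp(sample):
    """Assumed TLP proxy: speedup observed when all cores are used."""
    if sample.exec_time_all_s <= 0:
        raise ValueError("execution time must be positive")
    return sample.exec_time_single_s / sample.exec_time_all_s


def plan_scaling(sample, total_cores=8, issue_width=4):
    """Pick a core count from TLP and a per-core resource level from ILP.

    Voltage and frequency are deliberately not part of the decision.
    """
    ilp = estimate_ilp(sample)
    tlp = estimate_tlp(sample)
    # Keep about as many cores active as the observed speedup justifies.
    active = max(1, min(total_cores, round(tlp)))
    # Enable resources in quarter steps, enough to cover the measured IPC.
    scale = math.ceil(ilp / issue_width * 4) / 4
    scale = max(0.25, min(1.0, scale))
    return {
        "active_cores": active,
        "disabled_cores": total_cores - active,
        "resource_scale": scale,
    }


if __name__ == "__main__":
    sample = CounterSample(instructions=2_000_000, cycles=1_000_000,
                           exec_time_single_s=8.0, exec_time_all_s=2.0)
    print(plan_scaling(sample))
```

The same function could be fed by on-line sampling or by an off-line profile of the application, as claims 12 and 13 distinguish. Only the source of `CounterSample` would change.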
Cross-Sectional Technologies · mapped topic
where the allocation takes into account power or heat criteria (power management in computers in general G06F1/3203; thermal management in computers in general G06F1/206) · CPC title
Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title