Compiler method and compiler apparatus
US-2015277874-A1 · Oct 1, 2015 · US
US9619290B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9619290-B2 |
| Application number | US-201514641220-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 6, 2015 |
| Priority date | Mar 6, 2015 |
| Publication date | Apr 11, 2017 |
| Grant date | Apr 11, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A method of balancing execution rates for a plurality of parallel program loops being executed concurrently by a processor may include estimating a completion time for each program loop of the plurality of program loops, determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, and decreasing the difference by adjusting an execution rate of the first program loop.
Opening claim text (preview).
What is claimed is: 1. A method, comprising: estimating a completion time for each program loop of a plurality of program loops executed by a processor; determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, wherein the first program loop is executed concurrently with the second program loop; and decreasing the difference by adjusting an execution rate of the first program loop. 2. The method of claim 1 , further comprising storing loop progress information for each of the plurality of program loops in one or more loop progress buffers, wherein the estimated completion time for each program loop of the plurality of program loops is determined based on the loop progress information. 3. The method of claim 2 , wherein the plurality of program loops is defined in an executable program executed by the processor, and wherein the method further comprises: identifying a dependent instruction block in the executable program to be executed after completion of all of the program loops of the plurality of program loops; and in response to identifying the dependent instruction block and in response to a backwards branch in the program corresponding to the first program loop, storing a program counter value of the backwards branch in the one or more loop progress buffers. 4. The method of claim 2 , wherein for each program loop of the plurality of program loops, the loop progress information comprises a number of instructions, a number of completed cycles, and a number of expected total cycles for the program loop. 5. The method of claim 2 , wherein for each program loop of the plurality of program loops, the loop progress information comprises a power state of a processing core executing the program loop and an estimated power consumption for the program loop. 6. The method of claim 2 , further comprising: for each program loop of the plurality of program loops, storing values in one or more loop termination buffers including a speculative iteration count, a non-speculative iteration count, a trip count, and a confidence bit, wherein estimating the completion time is based on the values stored in the one or more loop termination buffers. 7. The method of claim 1 , wherein the processor comprises a plurality of processing cores, and wherein adjusting the execution rate of the first program loop comprises adjusting an amount of power supplied to a processing core executing the first program loop. 8. The method of claim 1 , further comprising: comparing the difference with a sleep threshold; and transitioning a processing core assigned to execute the first program loop to a sleep state after completion of the first program loop if the difference is greater than the sleep threshold, wherein the first program loop has an earlier estimated completion time than the second program loop. 9. The method of claim 8 , wherein the sleep threshold is longer than a duration for the first program loop to enter the sleep state plus a duration for the first program loop to exit the sleep state. 10. The method of claim 8 , further comprising causing the processing core assigned to execute the first program loop to exit the sleep state prior to the estimated completion time of the second loop. 11. An apparatus, comprising: one or more loop progress buffers configured to store loop progress information for each of a plurality of program loops executed by a set of processing cores; a runtime system coupled with one or more loop progress buffers and configured to: estimate a completion time for each of the plurality of program loops based on the loop progress information; determine a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, wherein the first program loop is executed concurrently with the second program loop; and a power management unit coupled with the runtime system, wherein the power management unit is configured to decrease the difference by adjusting an execution rate of the first program loop. 12. The apparatus of claim 11 , wherein the set of processing cores comprises multiple processing cores each configured to execute one of the plurality of program loops. 13. The apparatus of claim 12 , wherein the power management unit is further configured to adjust the execution rate of the first program loop by redistributing power between two or more of the multiple processing cores. 14. The apparatus of claim 12 , wherein the power management unit is further configured to cause at least one of the multiple cores to enter a sleep state based on the difference between the estimated completion time of the first program loop and the estimated completion time of the second program loop. 15. The apparatus of claim 11 , wherein for each program loop of the plurality of program loops, the loop progress information comprises a number of instructions, a number of completed cycles, and a number of expected total cycles, a power state of a processing core executing the program loop, and an estimated power consumption for the program loop. 16. The apparatus of claim 11 , further comprising one or more loop termination buffers coupled with the runtime system, wherein the one or more loop termination buffers are configured to, for each of the plurality of program loops, store loop termination information comprising a speculative iteration count, a non-speculative iteration count, a trip count, and a confidence bit, wherein the runtime system is further configured to estimate the completion time based on the loop termination information. 17. A non-transitory computer-readable medium storing instructions that when executed by a processor, cause the processor to perform a method comprising: estimating a completion time for each program loop of a plurality of program loops executed by the processor; determining a difference between the estimated completion time of a first program loop of the plurality of program loops and the estimated completion time of a second program loop of the plurality of program loops, wherein the first program loop is executed concurrently with the second program loop; and decreasing the difference by adjusting an execution rate of the first program loop. 18. The non-transitory computer-readable medium of claim 17 , wherein the method further comprises storing loop progress information for each of the plurality of program loops in one or more loop progress buffers, wherein the estimated completion time for each program loop of the plurality of program loops is determined based on the loop progress information. 19. The non-transitory computer-readable medium of claim 17 , wherein the processor comprises a plurality of processing cores, and wherein adjusting the execution rate of the first program loop comprises adjusting an amount of power supplied to a processing core executing the first program loop. 20. The non-transitory computer-readable medium of claim 17 , wherein the method further comprises: comparing the difference with a sleep threshold; and transitioning a processing core assigned to execute the first program loop to a sleep state after completion of the first program loop if the difference is greater than the sleep threshold, wherein the first program loop has an earlier estimated completion time than the sec
by task scheduling · CPC title
where the allocation takes into account power or heat criteria (power management in computers in general G06F1/3203; thermal management in computers in general G06F1/206) · CPC title
considering the load · CPC title
Optimisation · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.