Progress meters in parallel computing

US2016188380A1 · US · A1

Patent metadata
Field: Value
Publication number: US-2016188380-A1
Application number: US-201414583254-A
Country: US
Kind code: A1
Filing date: Dec 26, 2014
Priority date: Dec 26, 2014
Publication date: Jun 30, 2016
Grant date:

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Systems and methods may provide a set of cores capable of parallel execution of threads. Each of the cores may run code that is provided with a progress meter that calculates the amount of work remaining to be performed on threads as they run on their respective cores. The data may be collected continuously, and may be used to alter the frequency, speed or other operating characteristic of the cores as well as groups of cores. The progress meters may be annotated into existing code.
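To make the progress meter idea concrete, here is a minimal sketch of how one might be annotated into an existing loop. The abstract does not give an implementation, so every name here (`ProgressMeter`, `run_task`, `report`) is hypothetical, and work is assumed to be countable in whole units.

```python
from dataclasses import dataclass


@dataclass
class ProgressMeter:
    """Hypothetical progress meter for one task (illustrative, not from the patent)."""
    total_work: int   # assumed: work to complete the task, in countable units
    done: int = 0     # work units completed so far

    def advance(self, units: int = 1) -> None:
        """Record completed work, clamped so it never exceeds the total."""
        self.done = min(self.total_work, self.done + units)

    def fraction_done(self) -> float:
        """Fraction of work completed; an empty task counts as finished."""
        return self.done / self.total_work if self.total_work else 1.0

    def fraction_remaining(self) -> float:
        """Fraction of work still to be completed."""
        return 1.0 - self.fraction_done()


def run_task(items, meter, report):
    """Existing loop with the meter annotated in: report after each item."""
    for item in items:
        _ = item * item                 # stand-in for the real work
        meter.advance()
        report(meter.fraction_done())   # e.g. hand off to a runtime monitor


meter = ProgressMeter(total_work=4)
samples = []
run_task([1, 2, 3, 4], meter, samples.append)
print(samples)  # [0.25, 0.5, 0.75, 1.0]
```

In this sketch the `report` callback stands in for whatever channel carries the work fraction to a monitor; a list's `append` is used only to keep the example self-contained.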

First claim

Opening claim text (preview).

1. A method of controlling a computational resource, comprising: globally synchronizing a plurality of tasks across a plurality of computational resources; computing an amount of work to complete at least one task of the plurality of tasks; processing the plurality of tasks in parallel to accomplish work corresponding to each task of the plurality of tasks; repeatedly computing a work fraction that corresponds to one or more of a fraction of work completed or work remaining to be completed with respect to the amount of work to complete the at least one task of the plurality of tasks; and calculating a skew of a plurality of measurements taken from the plurality of computational resources; and modifying a characteristic of at least one computational resource of the plurality of computational resources based on the work fraction and the skew.

2. The method of claim 1, wherein the plurality of computational resources includes a plurality of cores, and wherein a frequency of at least one core of the plurality of cores is varied based on the work fraction.

3. The method of claim 1, wherein the plurality of computational resources includes one or more of a core, a processor, a multi-core processor, a node, a cabinet, a cluster, a row, or a grid, and wherein at least a portion of the plurality of computational resources are in communication with one another.

4. The method of claim 1, wherein the plurality of tasks includes a plurality of threads, and wherein the plurality of computational resources includes a plurality of cores.

5. The method of claim 1, further including: reporting the work fraction by one or more of an application or an Application Programming Interface (API); and receiving an indication of the work fraction at a runtime monitor.

6. The method of claim 1, further including modifying one or more of a number, a distribution, a speed, or a frequency of at least one of the plurality of computational resources.

7. The method of claim 1, wherein the characteristic includes a speed, and wherein the speed of at least one computational resource of the plurality of computational resources is modified by changing an amount of electrical power provided to the at least one computation resource.

8. The method of claim 1, wherein the plurality of computational resources includes a plurality of nodes, and wherein the method further includes: calculating a skew of a plurality of measurements taken from the plurality of nodes; and modifying a speed of at least one node of the plurality of nodes based a comparison of a characteristic of the at least one node to the skew.

9. The method of claim 1, further including synchronizing the plurality of tasks at a barrier, wherein each task of the plurality of tasks includes a waiting time at the barrier, and wherein the method further includes repeatedly modifying the characteristic to reduce the waiting time for the at least one task.

10. An apparatus to process tasks, comprising: a plurality of computational resources to process a plurality of tasks in parallel, wherein the plurality of tasks are to be globally synchronized across the plurality of computational resources; progress meter logic, implemented at least partly in fixed functionality hardware, to: compute an amount of work to complete at least one task of the plurality of tasks; and repeatedly compute a work fraction that is to correspond to one or more of a fraction of work completed or work remaining to be completed with respect to the amount of work to complete the at least one task; skew calculator logic to compute a skew of a plurality of measurements taken from the plurality of computational resources; and performance balancer logic, implemented at least partly in fixed functionality hardware, to modify a characteristic of at least one computational resource of the plurality of computational resources based on the work fraction and the skew.

11. The apparatus of claim 10, wherein the plurality of computational resources is to include a plurality of cores, and wherein the performance balancer logic is to vary a frequency of at least one core of the plurality of cores based on the work function.

12. The apparatus of claim 10, wherein the performance balancer logic is to vary a speed of at least one of the plurality of computational resources by varying an amount of power supplied to the at least one of the plurality of computational resources.

13. The apparatus of claim 10, wherein the performance balancer logic is to vary a speed of at least two of the plurality of computational resources by steering power from a relatively faster one of the plurality of computational resources toward a relatively slower one of the plurality of computational resources.

14. The apparatus of claim 10, wherein the computational resources are to include a plurality of cores, and wherein the performance balancer logic is to vary a speed of at least one of the plurality of cores by varying an amount of power provided to the at least one of the plurality of cores.

15. The apparatus of claim 10, further including runtime monitor logic, implemented at least partly in fixed functionality hardware, to receive information from the progress meter logic that is to be indicative of the work fraction.

16. The apparatus of claim 10, wherein the plurality of computational resources are to include one or more of a core, a processor, a multi-core processor, a node, a cabinet, a cluster, a row, or a grid, and wherein at least a portion of the plurality of computational resources are to have a communications channel therebetween.

17. The apparatus of claim 10, further including: a plurality of nodes; and skew calculator logic to compute a skew of a plurality of measurements taken from the plurality of nodes, wherein the performance balancer logic is to vary a speed of at least one of the nodes based on the skew.

18. The apparatus of claim 10, wherein the performance balancer logic is to modify one or more of a number, a distribution, a speed, or a frequency of at least one of the plurality of computational resources.

19. At least one non-transitory computer readable storage medium comprising one or more instructions that when executed on a computing device cause the computing device to: globally synchronize a plurality of tasks across a plurality of computational resources; compute an amount of work to complete at least one task of the plurality of tasks; repeatedly compute a work fraction that corresponds to one or more of a fraction of work completed or work remaining to be completed with respect to the amount of work to complete the at least one task of the plurality of tasks; compute a skew of a plurality of measurements taken from the plurality of computational resources; and modify a characteristic of at least one computational resource of the plurality of computational resources based on the work fraction and the skew.

20. The at least one non-transitory computer readable storage medium of claim 19, wherein the plurality of computational resources is to include a plurality of cores, and wherein the instructions, when executed on a computing device, cause the computing device to modify a frequency of at least one of the plurality of cores.

21. The at least one non-transitory computer readable storage medium of claim 19, wherein the instructions, when executed, cause the computing device to: compute the work fraction; and receive inform
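The independent claims combine a work fraction with a skew to decide how to modify a resource. The claim text shown here does not define how the skew is computed or what adjustment rule is applied, so the following is only a sketch under stated assumptions: skew is taken to be the spread (maximum minus minimum) of per-core work fractions, and the balancer nudges core frequencies by a fixed step. The function names, threshold, step size, and frequency limits are all hypothetical.

```python
def compute_skew(fractions):
    """Assumed skew metric: gap between the most and least advanced core."""
    return max(fractions) - min(fractions)


def rebalance(freqs_ghz, fractions, threshold=0.05, step_ghz=0.1,
              f_min=1.0, f_max=3.0):
    """Return new per-core frequencies from work fractions and their skew.

    Assumed policy: if the skew is within the threshold, change nothing.
    Otherwise speed up cores behind the mean work fraction and slow down
    cores ahead of it, staying inside [f_min, f_max].
    """
    if compute_skew(fractions) <= threshold:
        return list(freqs_ghz)
    mean = sum(fractions) / len(fractions)
    new_freqs = []
    for freq, done in zip(freqs_ghz, fractions):
        if done < mean:
            freq = min(f_max, freq + step_ghz)   # lagging core: speed up
        elif done > mean:
            freq = max(f_min, freq - step_ghz)   # leading core: slow down
        new_freqs.append(round(freq, 3))
    return new_freqs


# Four cores at the same frequency, with uneven progress toward a barrier.
fractions = [0.50, 0.75, 0.25, 0.50]
print(compute_skew(fractions))                       # 0.5
print(rebalance([2.0, 2.0, 2.0, 2.0], fractions))    # [2.0, 1.9, 2.1, 2.0]
```

Lowering the leading core while raising the lagging one loosely mirrors the idea in claim 13 of steering power from a faster resource toward a slower one, and repeating the adjustment as new work fractions arrive would aim at the reduced barrier waiting time described in claim 9. A real balancer would act on measured power and hardware frequency controls rather than plain numbers.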

Assignees

Intel Corp

Inventors

Classifications

  • G06F9/522 · Primary

    Barrier synchronisation · CPC title

  • where the allocation takes into account power or heat criteria (power management in computers in general G06F1/3203; thermal management in computers in general G06F1/206) · CPC title

  • Monitor · CPC title

  • Techniques for rebalancing the load in a distributed system · CPC title

  • Energy efficient computing, e.g. low power processors, power management or thermal management · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Intel Corp
What technology area does this patent fall under?
Primary CPC classification G06F9/522. Mapped technology areas include Physics.
When was this patent published?
Published Jun 30, 2016 (kind code A1). Legal status and post-grant events are not shown on this page.