Method for convergence analysis based on thread variance analysis

US9292265B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9292265-B2
Application numberUS-201213467765-A
CountryUS
Kind codeB2
Filing dateMay 9, 2012
Priority dateMay 9, 2012
Publication dateMar 22, 2016
Grant dateMar 22, 2016

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Basic blocks within a thread program are characterized for convergence based on variance analysis or corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprising a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates for scalarization by a thread program compiler.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for characterizing a thread program, the method comprising: marking each basic block associated with the thread program as being convergent, wherein each basic block includes a plurality of instructions and starts with a label instruction and is terminated by a control transfer instruction; marking a set of instructions associated with each basic block as being invariant; initializing a work list that includes instructions that are known to be variant relative to the set of instructions; selecting a first instruction from the work list; marking the first instruction as variant; adding successor instructions to the work list based on the first instruction; and propagating a divergence attribute to identify associated basic blocks as divergent, and to identify instructions within the associated basic blocks as variant. 2. The method of claim 1 , wherein initializing comprises: visiting each basic block associated with the thread program; for each basic block, visiting each instruction associated with the basic block; determining that the instruction is variant; and adding the instruction to the work list. 3. The method of claim 2 , wherein an instruction that is variant accesses a thread identification register. 4. The method of claim 2 , wherein an instruction that is variant performs an atomic operation. 5. The method of claim 1 , wherein propagating comprises: determining that the first instruction is a conditional branch instruction; marking as divergent a set of basic blocks having a control dependence on the selected instruction; and adding each instruction associated with the set of basic blocks to the work list. 6. The method of claim 1 , wherein selecting an instruction comprises popping the instruction from the work list. 7. The method of claim 1 , wherein adding an instruction comprises pushing a unique instance of the instruction onto the work list. 8. The method of claim 1 , further comprising determining that a first basic block is a candidate for scalarization based on the first basic block being convergent. 9. The method of claim 8 , further comprising generating scalarized code for the first basic block for scalar execution. 10. A non-transitory computer-readable storage medium including instructions that, when executed by a processing unit, cause the processing unit to characterize a thread program, by performing the steps of: marking each basic block associated with the thread program as being convergent, wherein each basic block includes a plurality of instructions and starts with a label instruction and is terminated by a control transfer instruction; marking a set of instructions associated with each basic block as being invariant; initializing a work list that includes instructions that are known to be variant relative to the set of instructions; selecting a first instruction from the work list; marking the first instruction as variant; adding successor instructions to the work list based on the first instruction; and propagating a divergence attribute to identify associated basic blocks as divergent, and to identify instructions within the associated basic blocks as variant. 11. The non-transitory computer-readable storage medium of claim 10 , wherein initializing comprises: visiting each basic block associated with the thread program; for each basic block, visiting each instruction associated with the basic block; determining that the instruction is variant; and adding the instruction to the work list. 12. The non-transitory computer-readable storage medium of claim 11 , wherein an instruction that is variant accesses a thread identification register. 13. The non-transitory computer-readable storage medium of claim 11 , wherein an instruction that is variant performs an atomic operation. 14. The non-transitory computer-readable storage medium of claim 10 , wherein propagating comprises: determining that the first instruction is a conditional branch instruction; marking as divergent a set of basic blocks having a control dependence on the selected instruction; and adding each instruction associated with the set of basic blocks to the work list. 15. The non-transitory computer-readable storage medium of claim 10 , wherein selecting an instruction comprises popping the instruction from the work list. 16. The non-transitory computer-readable storage medium of claim 10 , wherein adding an instruction comprises pushing a unique instance of the instruction onto the work list. 17. The non-transitory computer-readable storage medium of claim 10 , further comprising determining that a first basic block is a candidate for scalarization based on the first basic block being convergent and including only invariant instructions. 18. The non-transitory computer-readable storage medium of claim 17 , further comprising scheduling the first basic block for scalar execution. 19. A computing device, comprising: a mass storage system configured to store at least a thread program; a processing unit coupled to the mass storage system and configured to: mark each basic block associated with the thread program as being convergent, wherein each basic block includes a plurality of instructions and starts with a label instruction and is terminated by a control transfer instruction; mark a set of instructions associated with each basic block as being invariant; initialize a work list that includes instructions that are known to be variant relative to the set of instructions; select a first instruction from the work list; mark the first instruction as variant; add successor instructions to the work list based on the first instruction; and propagate a divergence attribute to identify associated basic blocks as divergent, and to identify instructions within the associated basic blocks as variant. 20. The computing device of claim 19 , wherein the processing unit is further configured to: determine that a first basic block is a candidate for scalarization based on the first basic block being convergent and including only invariant instructions; and schedule the first basic block for scalar execution within a thread program executable.

Assignees

Inventors

Classifications

  • Optimisation · CPC title

  • Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions · CPC title

  • G06F8/456Primary

    Parallelism detection · CPC title

  • Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

  • G06F8/41Primary

    Compilation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9292265B2 cover?
Basic blocks within a thread program are characterized for convergence based on variance analysis or corresponding instructions. Each basic block is marked as divergent based on transitive control dependence on a block that is either divergent or comprising a variant branch condition. Convergent basic blocks that are defined by invariant instructions are advantageously identified as candidates …
Who is the assignee on this patent?
Grover Vinod, Lee Yunsup, Kong Xiangyun, and 3 more
What technology area does this patent fall under?
Primary CPC classification G06F8/456. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Mar 22 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).