Path selection based acceleration of conditionals in coarse grain reconfigurable arrays (cgras)

US2016246602A1 · US · A1

Patent metadata
FieldValue
Publication numberUS-2016246602-A1
Application numberUS-201615048680-A
CountryUS
Kind codeA1
Filing dateFeb 19, 2016
Priority dateFeb 19, 2015
Publication dateAug 25, 2016
Grant date

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

The present invention discloses a solution to accelerate control flow loops by utilizing the branch outcome. The embodiments of the invention eliminate fetching and execution of unnecessary operations and also the overhead due to predicate communication thus overcoming the inefficiencies associated with existing techniques. Experiments on several benchmarks are also disclosed, demonstrating that the present invention achieves optimal acceleration with minimum hardware overhead.

First claim

Opening claim text (preview).

What is claimed is: 1 . A method, comprising: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 2 . The method of claim 1 , further comprising communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the Coarse-grain Reconfigurable Array 3 . The method of claim 1 , further comprising utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 4 . The method of claim 3 , further comprising performing operations that are independent of the branching condition outcome in the delay slot. 5 . The method of any of claim 1 , further comprising mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition with a compiler. 6 . The method of claim 5 , further comprising pairing the mapped operations from the if-path and the else-path based on a common variable. 7 . The method of claim 6 , further comprising pairing a no op instruction with an if-path instruction when the else-path contains fewer operations than the if-path. 8 . The method of claim 6 , further comprising pairing a no op instruction with an else-path instruction when the if-path contains few operations than the else-path. 9 . The method of claim 8 , further comprising eliminating a select instruction or a phi operation based on the pairing by the common variable and based on the step of communicating which of the at least one paths of the branching condition is to be executed to the instruction fetch unit. 10 . A computer program product, comprising: a non-transitory computer readable medium comprising code for performing the steps of: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 11 . The computer program product of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the Coarse-grain Reconfigurable Array 12 . The computer program product of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 13 . The computer program product of claim 12 , wherein the non-transitory computer readable medium further comprises code for performing the step of performing operations that are independent of the branching condition outcome in the delay slot. 14 . The computer program product of any of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition. 15 . The computer program product of claim 14 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing the mapped operations from the if-path and the else-path based on a common variable. 16 . The computer program product of claim 15 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing a no op instruction with an if-path instruction when the else-path contains fewer operations than the if-path. 17 . The computer program product of claim 15 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing a no op instruction with an else-path instruction when the if-path contains few operations than the else-path. 18 . The computer program product of claim 17 , wherein the non-transitory computer readable medium further comprises code for performing the step of eliminating a select instruction or a phi operation based on the pairing by the common variable and based on the step of communicating which of the at least one paths of the branching condition is to be executed to the instruction fetch unit. 19 . An apparatus, comprising: a memory; and a processor coupled to the memory, wherein the processor is configured to execute the steps of: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 20 . The apparatus of claim 19 , wherein the processor is further configured to execute the step of communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the coarse-grain reconfigurable array. 21 . The apparatus of claim 19 , wherein the processor is further configured to execute the step of utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 22 . The apparatus of claim 21 , wherein the processor is further configured to execute the step of performing operations that are independent of the branching condition outcome in the delay slot. 23 . The apparatus of any of claim 19 , wherein the processor is further configured to execute the step of mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition. 24 . The apparatus of claim 23 , wherein the processor is further configured to execute the step of pairing the mapped operations from the if-path and the else-path based on a common variable. 25 . The apparatus of claim 24 , wherein the processor is further configured to execute the step of pairing a no op instruction with an if-path instruction wh

Assignees

Inventors

Classifications

  • for branches, e.g. hedging, branch folding · CPC title

  • Reducing the execution time required by the program code · CPC title

  • to perform conditional operations, e.g. using predicates or guards · CPC title

  • Software pipelining · CPC title

  • Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US2016246602A1 cover?
The present invention discloses a solution to accelerate control flow loops by utilizing the branch outcome. The embodiments of the invention eliminate fetching and execution of unnecessary operations and also the overhead due to predicate communication thus overcoming the inefficiencies associated with existing techniques. Experiments on several benchmarks are also disclosed, demonstrating tha…
Who is the assignee on this patent?
Radhika Shri Hari Rajendran, Shrivastava Aviral, Univ Arizona State
What technology area does this patent fall under?
Primary CPC classification G06F9/30058. Mapped technology areas include Physics.
When was this patent published?
Publication date Thu Aug 25 2016 00:00:00 GMT+0000 (Coordinated Universal Time) (A1). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).