Data processing device and method of controlling the same
US-9223573-B2 · Dec 29, 2015 · US
US2016246602A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016246602-A1 |
| Application number | US-201615048680-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 19, 2016 |
| Priority date | Feb 19, 2015 |
| Publication date | Aug 25, 2016 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The present invention discloses a solution to accelerate control flow loops by utilizing the branch outcome. The embodiments of the invention eliminate fetching and execution of unnecessary operations and also the overhead due to predicate communication thus overcoming the inefficiencies associated with existing techniques. Experiments on several benchmarks are also disclosed, demonstrating that the present invention achieves optimal acceleration with minimum hardware overhead.
Opening claim text (preview).
What is claimed is: 1 . A method, comprising: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 2 . The method of claim 1 , further comprising communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the Coarse-grain Reconfigurable Array 3 . The method of claim 1 , further comprising utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 4 . The method of claim 3 , further comprising performing operations that are independent of the branching condition outcome in the delay slot. 5 . The method of any of claim 1 , further comprising mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition with a compiler. 6 . The method of claim 5 , further comprising pairing the mapped operations from the if-path and the else-path based on a common variable. 7 . The method of claim 6 , further comprising pairing a no op instruction with an if-path instruction when the else-path contains fewer operations than the if-path. 8 . The method of claim 6 , further comprising pairing a no op instruction with an else-path instruction when the if-path contains few operations than the else-path. 9 . The method of claim 8 , further comprising eliminating a select instruction or a phi operation based on the pairing by the common variable and based on the step of communicating which of the at least one paths of the branching condition is to be executed to the instruction fetch unit. 10 . A computer program product, comprising: a non-transitory computer readable medium comprising code for performing the steps of: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 11 . The computer program product of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the Coarse-grain Reconfigurable Array 12 . The computer program product of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 13 . The computer program product of claim 12 , wherein the non-transitory computer readable medium further comprises code for performing the step of performing operations that are independent of the branching condition outcome in the delay slot. 14 . The computer program product of any of claim 10 , wherein the non-transitory computer readable medium further comprises code for performing the step of mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition. 15 . The computer program product of claim 14 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing the mapped operations from the if-path and the else-path based on a common variable. 16 . The computer program product of claim 15 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing a no op instruction with an if-path instruction when the else-path contains fewer operations than the if-path. 17 . The computer program product of claim 15 , wherein the non-transitory computer readable medium further comprises code for performing the step of pairing a no op instruction with an else-path instruction when the if-path contains few operations than the else-path. 18 . The computer program product of claim 17 , wherein the non-transitory computer readable medium further comprises code for performing the step of eliminating a select instruction or a phi operation based on the pairing by the common variable and based on the step of communicating which of the at least one paths of the branching condition is to be executed to the instruction fetch unit. 19 . An apparatus, comprising: a memory; and a processor coupled to the memory, wherein the processor is configured to execute the steps of: receiving at least one function executed by a computer program; resolving a branching condition to produce an outcome wherein at least one path of the branch is to be executed by a coarse grain reconfigurable array and at least one path of the branch is false; executing the branch condition in at least one processing element of the Coarse Grain Reconfigurable Array and communicating the branch outcome to an Instruction Fetch Unit; and selectively issuing instructions from an instruction memory for at least one path to be executed. 20 . The apparatus of claim 19 , wherein the processor is further configured to execute the step of communicating the branching condition outcome and a number of cycles required to execute the at least one paths to the instruction fetch unit by at least one processing element of the coarse-grain reconfigurable array. 21 . The apparatus of claim 19 , wherein the processor is further configured to execute the step of utilizing a delay slot after the step of resolving the branching condition outcome to communicate the branching condition outcome to the instruction fetch unit. 22 . The apparatus of claim 21 , wherein the processor is further configured to execute the step of performing operations that are independent of the branching condition outcome in the delay slot. 23 . The apparatus of any of claim 19 , wherein the processor is further configured to execute the step of mapping operations to processing elements in the coarse grain reconfigurable array from both an if-path of the branching condition and an else-path of the branching condition. 24 . The apparatus of claim 23 , wherein the processor is further configured to execute the step of pairing the mapped operations from the if-path and the else-path based on a common variable. 25 . The apparatus of claim 24 , wherein the processor is further configured to execute the step of pairing a no op instruction with an if-path instruction wh
for branches, e.g. hedging, branch folding · CPC title
Reducing the execution time required by the program code · CPC title
to perform conditional operations, e.g. using predicates or guards · CPC title
Software pipelining · CPC title
Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.