Apparatus and method for handling registers in pipeline processing
US-2016328236-A1 · Nov 10, 2016 · US
US9632978B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9632978-B2 |
| Application number | US-201313827280-A |
| Country | US |
| Kind code | B2 |
| Filing date | Mar 14, 2013 |
| Priority date | Mar 16, 2012 |
| Publication date | Apr 25, 2017 |
| Grant date | Apr 25, 2017 |
Official abstract text for this publication.
A reconfigurable processor based on mini-cores (MCs) includes a plurality of MCs, each MC of the MCs including a group of function units (FUs), the group of FUs having a capability of executing a loop iteration independently. The MCs include a first MC configured to execute a first loop iteration, and a second MC configured to execute a second loop iteration.
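As an illustration of the arrangement the abstract describes, the following is a minimal sketch that groups function units into mini-cores and assigns loop iterations to them. The `MiniCore` class, the function names, the fixed group size, and the round-robin assignment are all assumptions made for this example. The patent text shown here does not specify them.

```python
from dataclasses import dataclass, field


@dataclass
class MiniCore:
    """Hypothetical model of one mini-core (MC): a named group of function units."""
    name: str
    function_units: list
    iterations: list = field(default_factory=list)


def group_into_mini_cores(function_units, fus_per_mc):
    """Split a flat list of FUs into consecutive groups, one group per MC."""
    if fus_per_mc <= 0 or len(function_units) % fus_per_mc != 0:
        raise ValueError("FU count must be a positive multiple of fus_per_mc")
    return [
        MiniCore(name=f"MC{i // fus_per_mc}",
                 function_units=function_units[i:i + fus_per_mc])
        for i in range(0, len(function_units), fus_per_mc)
    ]


def map_iterations(mini_cores, num_iterations):
    """Assign loop iteration k to mini-core k mod N (round-robin is an assumption)."""
    for k in range(num_iterations):
        mini_cores[k % len(mini_cores)].iterations.append(k)
    return mini_cores


# Eight FUs grouped into two MCs of four FUs each; four loop iterations.
fus = [f"FU{i}" for i in range(8)]
mcs = map_iterations(group_into_mini_cores(fus, fus_per_mc=4), num_iterations=4)
for mc in mcs:
    print(mc.name, mc.function_units, mc.iterations)
```

In this sketch the first MC receives iterations 0 and 2 and the second MC receives iterations 1 and 3, so each MC works on different loop iterations.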
Opening claim text (preview).
What is claimed is:
1. A reconfigurable processor based on mini-cores (MCs), the reconfigurable processor comprising: MCs, each MC of the MCs comprising a group of function units (FUs), the group of FUs being configured to execute a loop iteration independently; a sub memory connected to the MCs and having a hierarchical structure with each section comprising a different code portion provided to the each MC of the MCs at delayed intervals between their respective loop iterations; wherein the MCs comprise: a first MC configured to execute a first loop iteration; and a second MC configured to execute a second loop iteration, wherein each MC executes different loop iterations according to loop scheduling based on software pipelining.
2. The reconfigurable processor of claim 1, wherein the second MC is further configured to start executing the second loop iteration in response to the first MC executing the first loop iteration in response to there being a dependency between the first loop iteration and the second loop iteration.
3. The reconfigurable processor of claim 1, wherein the FUs of each MC of the MCs are homogeneous FUs.
4. The reconfigurable processor of claim 1, wherein the FUs of each MC of the MCs are heterogeneous FUs.
5. The reconfigurable processor of claim 1, further comprising an external link configured to connect the MCs to one other; wherein each MC of the MCs further comprises an internal link configured to connect the FUs of the MC to one other.
6. A schedule apparatus based on mini-cores (MCs), the schedule apparatus comprising a local scheduler configured to: map a first loop iteration to a first MC of MCs; and map a second loop iteration to a second MC of the MCs, each MC of the MCs comprising a group of function units (FUs), the group of FUs being configured to execute a loop iteration independently, wherein each MC executes different loop iterations according to loop scheduling based on software pipelining and each MC is connected to a sub memory having a hierarchical structure with each section comprising a different code portion provided to the each MC of the MCs at delayed intervals between their respective loop iterations.
7. The schedule apparatus of claim 6, further comprising a global scheduler configured to adjust a mapping relationship between the first loop iteration and the second loop iteration to generate a loop skew in response to there being a dependency between the first loop iteration and the second loop iteration.
8. The schedule apparatus of claim 7, wherein the global scheduler is further configured to: delay the second loop iteration relative to the first loop iteration; and map the delayed second loop iteration to the second MC.
9. The schedule apparatus of claim 7, wherein the global scheduler is further configured to: map the first loop iteration to the first MC; delay the second loop iteration relative to the first loop iteration; and map the delayed second loop iteration to the second MC.
10. The schedule apparatus of claim 6, further comprising an MC configuration unit configured to group FUs in a reconfigurable processor into the group of FUs of each of the MCs.
11. A schedule method for a reconfigurable processor based on mini-cores (MCs), the reconfigurable processor comprising function units (FUs), the schedule method comprising: grouping the FUs in the reconfigurable processor into MCs, each MC of the MCs comprising a group of FUs, the group of FUs having a capability of executing a loop iteration independently; mapping a first loop iteration to a first MC of the MCs; mapping a second loop iteration to a second MC of the MCs; and connecting, to the MCs, a sub memory having a hierarchical structure with each section comprising a different code portion provided to the each MC of the MCs at delayed intervals between their respective loop iterations, wherein each MC executes different loop iterations according to loop scheduling based on software pipelining.
12. The schedule method of claim 11, further comprising adjusting a mapping relationship between the first loop iteration and the second loop iteration in response to there being a dependency between the first loop iteration and the second loop iteration.
13. The scheduling method of claim 12, wherein the adjusting of the mapping relationship comprises: delaying the second loop iteration relative to the first loop iteration; and mapping the delayed second loop iteration to the second MC.
14. The scheduling method of claim 12, wherein the adjusting of the mapping relationship comprises: mapping the first loop iteration to the first MC; delaying the second loop iteration relative to the first loop iteration; and mapping the delayed second loop iteration to the second MC.
15. A reconfigurable processor comprising: function units (FUs) having a capability of being reconfigured into groups of FUs to form mini-cores (MCs); wherein each group of the groups of FUs has a capability of executing a loop iteration independently; each MC of the MCs comprises a respective group of the groups of FUs; the each MC of the MCs is configured to execute a different loop iteration of loop iterations, wherein each MC executes different loop iterations according to loop scheduling based on software pipelining; and the each MC of the MCs is connected to a sub memory having a hierarchical structure with each section comprising a different code portion provided to the each MC of the MCs at delayed intervals between their respective loop iterations.
16. The reconfigurable processor of claim 15, wherein the MCs are configured to start executing respective ones of the loop iterations simultaneously in response to there being no dependency between successive ones of the loop iterations.
17. The reconfigurable processor of claim 15, wherein each MC of the MCs except one of the MCs further configured to execute a first loop iteration of the loop iterations is further configured to start executing a respective one of the loop iterations a predetermined time later than one of the MCs configured to execute an immediately preceding one of the loop iterations starts executing the immediately preceding loop iteration in response to there being a dependency between successive ones of the loop iterations.
18. The reconfigurable processor of claim 15, wherein the FUs in each of the groups of FUs are homogeneous FUs.
19. The reconfigurable processor of claim 15, wherein the FUs in each of the groups of FUs are heterogeneous FUs.
20. The schedule apparatus of claim 7, wherein the global scheduler is further configured to: delay a third loop iteration relative to the second loop iteration by a same time as the delay between the second loop iteration and the first loop iteration.
21. The schedule apparatus of claim 7, wherein the local scheduler is configured to map the first loop iteration and map the second loop iteration without using the global scheduler in response to there being no dependency between the first loop iteration and the second loop iteration.
22. The schedule apparatus of claim 7, further comprising: a local register file configured to store results of operations performed by or context information of the FUs and data related to recurrence of the MCs; and a global register file configured to store results of operations performed by or context information of the MCs; and wherein the global scheduler is configured to adjust a mapping rel
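The start-time behavior recited in claims 16, 17, and 20 can be sketched as follows: iterations start together when there is no dependency between successive iterations, and each iteration is delayed by the same amount relative to its predecessor when there is one. The function name `iteration_start_times`, the cycle-based time unit, and the `skew` parameter are assumptions for this example, and the sketch assumes one loop iteration per MC. It is not the scheduler described in the patent.

```python
def iteration_start_times(num_iterations, has_dependency, skew):
    """Return the start cycle of each loop iteration (hypothetical model).

    Without an inter-iteration dependency, every iteration starts at cycle 0.
    With a dependency, iteration k starts `skew` cycles after iteration k-1,
    so the delay between successive iterations is uniform.
    """
    if skew < 0:
        raise ValueError("skew must be non-negative")
    step = skew if has_dependency else 0
    return [k * step for k in range(num_iterations)]


# Three iterations, one per MC, with an assumed skew of 2 cycles.
print(iteration_start_times(3, has_dependency=False, skew=2))  # [0, 0, 0]
print(iteration_start_times(3, has_dependency=True, skew=2))   # [0, 2, 4]
```

With a dependency, the third iteration trails the second by the same 2 cycles that the second trails the first, which mirrors the uniform delay in claim 20.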
Software pipelining · CPC title
Two dimensional arrays, e.g. mesh, torus · CPC title
comprising an array of processing units with common control, e.g. single instruction multiple data processors (G06F15/82 takes precedence {; for correlation function computation G06F17/15}) · CPC title
Organisation of register space, e.g. banked or distributed register file · CPC title
organised in groups of units sharing resources, e.g. clusters · CPC title