Scalar optimizations for shaders
US-9430199-B2 · Aug 30, 2016 · US
US9760355B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9760355-B2 |
| Application number | US-201414302886-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jun 12, 2014 |
| Priority date | Jun 14, 2013 |
| Publication date | Sep 12, 2017 |
| Grant date | Sep 12, 2017 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
A parallelizing compile method includes, dividing a sequential program for an embedded system into multiple macro tasks, specifying (i) a starting end task and (ii) a termination end task, fusing (i) the starting end task, (ii) the termination end task, and (iii) a group of the multiple macro tasks, extracting a group of multiple new macro tasks from the multiple new macro tasks fused in the fusing based on a data dependency, performing a static scheduling assigning the multiple new macro tasks to the multiple processor units, so that the group of the multiple new macro tasks is parallelly executable by the multiple processor units, and generating a parallelizing program. In addition, a parallelizing compiler, a parallelizing compile apparatus and an onboard apparatus are provided.
Opening claim text (preview).
What is claimed is: 1. A parallelizing compile method comprising: dividing a sequential program that controls an embedded system into a plurality of macro tasks according to a predetermined method, the sequential program being executed by a single processor system; generating a macro flow graph based on analysis of a control flow and a data dependency among the macro tasks; specifying, based on the control flow indicated in the macro flow graph, at least one pair of a starting end task and a termination end task, wherein the starting end task is a macro task that has a conditional branch, and the termination end task is one of macro tasks which are commonly executed among a plurality of a series of processings of the macro tasks to be sequentially executed from the starting end task as a start point; fusing (i) the starting end task specified by the specifying, (ii) the termination end task specified by the specifying, and (iii) a group of the macro tasks executed after execution of the starting end task and before execution of the termination end task into a block task so as to hide the conditional branch of the starting end task into the block task, wherein a plurality of block tasks are provided; extracting, based on the data dependency among the macro tasks including the block tasks after the fusing, parallelly-executable macro tasks that are parallelly-executable by a plurality of processor units included in a multiprocessor system; performing a static scheduling assigning the parallelly-executable macro tasks to the processor units, causing the parallelly-executable macro tasks to be parallelly executed by the plurality of processor units; and generating parallelized program to be executed by the multiprocessor system, based on a result of the static scheduling. 2. The parallelizing compile method according to claim 1 , further comprising performing an inline expansion which transposes a description calling a function in the sequential program to another description representing the processing in the function, wherein, in the dividing, the sequential program after performing the inline expansion is divided into the macro tasks. 3. The parallelizing compile method according to claim 2 , further comprising: a procedure for finding a first processing block and a second processing block, in which local variables with an identical name are used, wherein the first processing block and the second processing block represent assemblies of descriptions for realizing a certain object in the function described in the sequential program; and a renaming procedure that the descriptions in functions are changed, so that the local variables in the first processing block and the second processing block have different names each other, wherein the specifying is performed after the names of the local variables are changed through the renaming procedure, and after the control flow between the macro tasks is analyzed. 4. A parallelizing compiler, comprising a computer or logic circuitry, the parallelizing compiler being configured to perform: receiving a sequential program that controls an embedded system, through an input portion; dividing a sequential program that controls an embedded system into a plurality of macro tasks according to a predetermined method, the sequential program being executed by a single processor system; generating a macro flow graph based on analysis of a control flow and a data dependency among the macro tasks; specifying, based on the control flow indicated in the macro flow graph, at least one pair of a starting end task and a termination end task, wherein the starting end task is a macro task that has a conditional branch, and the termination end task is one of macro tasks which are commonly executed among a plurality of series of processings of the macro tasks to be sequentially executed from the starting end task as a start point; fusing (i) the starting end task specified by the specifying, (ii) the termination end task specified by the specifying, and (iii) a group of the macro tasks executed after execution of the starting end task and before execution of the termination end task into a block task so as to hide the conditional branch of the starting end task into the block task, wherein a plurality of block tasks are provided; extracting, based on the data dependency among the macro tasks including the block tasks after the fusing, parallelly-executable macro tasks that are parallelly-executable by a plurality of processor units included in a multiprocessor system; performing a static scheduling assigning parallelly-executable macro tasks to the processor units, causing the parallelly-executable macro tasks to be parallelly executed by the plurality of processor units; generating a parallelized program to be executed by the multiprocessor system, based on a result of the static scheduling; and outputting the parallelized program generated. 5. A parallelizing compiler apparatus comprising: a division device to divide a sequential program that controls an embedded system into a plurality of macro tasks according to a predetermined method, the sequential program being executed by a single processor system; a generating device to generate a macro flow graph based on analysis of a control flow and a data dependency among the macro tasks; a specifying device to specify, based on the control flow indicated in the macro flow graph, at least one pair of a starting end task and a termination end task, wherein the starting end task is a macro task that has a conditional branch, and the termination end task is one of macro tasks which are commonly executed among a plurality of a series of processings of the macro tasks to be sequentially executed from the starting end task as a start point; a fusion device to fuse (i) the starting end task specified by the specifying, (ii) the termination end task specified by the specifying device, and (iii) a group of the macro tasks executed after execution of the starting end task and before execution of the termination end task into a block task so as to hide the conditional branch of the starting end task into the block task, wherein a plurality of block tasks are provided; an extraction device to extract, based on the data dependency among the macro tasks including the block tasks after the fusing, parallelly-executable macro tasks that are parallelly-executable by a plurality of processor units included in a multiprocessor system; a scheduling device to perform a static scheduling assigning the parallelly-executable macro tasks to the processor units, causing the parallelly-executable macro tasks to be parallelly executed by the plurality of processor units; and a generation device to generate a parallelized program executed by the multiprocessor system, based on a result of the static scheduling. 6. A vehicle onboard apparatus comprising: a multicore processor; a storage that stores a parallelized program generated by a parallelizing compile method; and a multiprocessor system that includes the multicore processor and is operated according to the parallelized program, wherein the parallelizing compile method includes: dividing a sequential program that controls an embedded system provided in the vehicle onboard apparatus into a plurality of macro tasks according to a predetermined method, the sequential program being executed by a single processor system; generating a macro flow graph based on analysis of a control flow and a data dependency among the macro tasks; specifying, based on the control flow indicated in the macro flow graph, at least one pair of a starting end task and a termination end task, wherein the starting end task is a macro task that has a conditional branch, and the termination end task is one of macro task
Related publications grouped by family.
Answers are generated from the same data shown on this page.