Parallel program generating method and parallelization compiling apparatus

US10698670B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-10698670-B2
Application numberUS-201715856306-A
CountryUS
Kind codeB2
Filing dateDec 28, 2017
Priority dateDec 28, 2016
Publication dateJun 30, 2020
Grant dateJun 30, 2020

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

There is provided a parallel program generating method capable of generating a static scheduling enabled parallel program without undermining the possibility of extracting parallelism. The parallel program generating method executed by the parallelization compiling apparatus 100 includes a fusion step (FIG. 2 /STEP 026 ) of fusing, as a new task, a task group including a reference task as a task having a conditional branch, and subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective of all branch directions of the conditional branch included in the reference task.

First claim

Opening claim text (preview).

What is claimed is: 1. A computer-implemented method for generating, from a sequential program, a parallel program executable in a system including a plurality of arithmetic processing units to perform arithmetic processing in parallel, the method comprising: dividing the sequential program into a plurality of tasks; first analyzing the plurality of tasks to determine data dependency and control dependency of each of the plurality of tasks; second analyzing an earliest executable condition of each of the plurality of tasks based on the data dependency between respective tasks and the control dependency of each task obtained from the first analyzing; and determining, based on results of the second analyzing, as a task group to be fused, a task group including, among the plurality of tasks, a reference task as a task having a conditional branch, and all subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective of all branch directions of the conditional branch included in the reference task, and fusing, as a new task, the task group to be fused, wherein the earliest executable conditions for an i-th task MTi are: the conditional branch of a i-th task MTi on which the i-th task MTi is control dependent branches to a path including the i-th task MTi; and a k-th task MTk (k≠i) on which the i-th task MTi is data dependent is fully completed, or non-execution of the k-th task MTk is determined. 2. The method according to claim 1 , further comprising: scheduling to assign each of a plurality of tasks including the new task to each of the plurality of arithmetic processing units based on the data dependency; and generating the parallel program based on the scheduling results. 3. The method according to claim 1 , wherein the determining includes identifying a task group including the reference task, and all first subsequent tasks as tasks control dependent or extended-control dependent on respective of all the branch directions of the conditional branch included in the reference task; adding, to the task group, all second subsequent tasks as tasks control dependent or extended-control dependent on respective of all branch directions of conditional branches included in the task group; repeating the adding until tasks control dependent or extended-control dependent on any of the branch directions of the conditional branches included in the task group run out; and determining the task group to be a task group to be fused. 4. The method according to claim 1 , further comprising: determining whether a plurality of tasks control dependent, indirect control dependent, or extended-control dependent on one branch direction of the conditional branch included in the reference task included in the task group to be fused satisfy a predetermined condition including such a parallelly executable condition as to have no control dependency, indirect control dependency, extended-control dependency, and data dependency on one another; and when the predetermined condition is determined not to be satisfied, fusing the task group to be fused as the new task, or when the predetermined condition is determined to be satisfied, duplicating the conditional branch included in the reference task, making the plurality of tasks having no control dependency, indirect control dependency, extended-control dependency, and data dependency on one another follow respective of a plurality of conditional branches including the duplicated conditional branch, and combining each of the plurality of conditional branches with the plurality of tasks, each of which is made to follow each of the plurality of conditional branches to generate a plurality of task groups, determining the plurality of task groups as a new plurality of task groups to be fused, and fusing, as the new task, each of the plurality of tasks groups to be fused. 5. The method according to claim 1 , wherein the second analyzing includes simplifying the earliest executable condition of each of the plurality of tasks by excluding a case that includes an earliest executable condition that is also included as the earliest executable condition of another case. 6. A parallelization compiling apparatus configured to generate, from a sequential program, a parallel program executable in a system including a plurality of arithmetic processing units to perform arithmetic processing in parallel, the parallelization compiling apparatus comprising at least one processor configured to function as: a task division element which divides the sequential program into a plurality of tasks, a dependency analysis element which analyzes the plurality of tasks divided by the task division element to determine data dependency and control dependency of each of the plurality of tasks; an earliest executable condition analysis element which analyzes an earliest executable condition of each of the plurality of tasks based on the data dependency between respective tasks and the control dependency of each task obtained from the dependency analysis element; and a fusion element which determines, based on results of the earliest executable condition analysis element, as a task group to be fused, a task group including, among the plurality of tasks, a reference task as a task having a conditional branch, and all subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective of all branch directions of the conditional branch included in the reference task, and fuses the task group to be fused as a new task, wherein the earliest executable conditions for an i-th task MTi are: the conditional branch of a i-th task MTi on which the i-th task MTi is control dependent branches to a path including the i-th task MTi; and a k-th task MTk (k≠i) on which the i-th task MTi is data dependent is fully completed, or non-execution of the k-th task MTk is determined. 7. A non-transitory computer-readable medium having stored thereon computer-readable instructions to cause a computer to execute a process to generate, from a sequential program, a parallel program executable in a system including a plurality of arithmetic processing units to perform arithmetic processing in parallel, the process comprising: dividing the sequential program into a plurality of tasks; first analyzing the plurality of tasks divided to determine data dependency and control dependency of each of the plurality of tasks; second analyzing an earliest executable condition of each of the plurality of tasks based on the data dependency between respective tasks and the control dependency of each task obtained from the first analyzing; and determining, based on results of the second analyzing, as a task group to be fused, a task group including, among the plurality of tasks, a reference task as a task having a conditional branch, and all subsequent tasks as tasks control dependent, extended-control dependent, or indirect control dependent on respective of all branch directions of the conditional branch included in the reference task, and fusing, as a new task, the task group to be fused, wherein the earliest executable conditions for an i-th task MTi are: the conditional branch of a i-th task MTi on which the i-th task MTi is control dependent branches to a path including the i-th task MTi; and a k-th task MTk (k≠i) on which the i-th task MTi is data dependent is fully completed, or non-execution of the k-th task MTk is determined. 8. The non-transitory computer-readable storage medium according to claim 7 , the process further comprising: scheduling to assign each of a plurality of tasks including the new task to each of the plurality of arithmetic processing units based on the data depe

Assignees

Inventors

Classifications

  • Lexical analysis · CPC title

  • Dependency analysis; Data or control flow analysis · CPC title

  • G06F8/456Primary

    Parallelism detection · CPC title

  • Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions · CPC title

  • G06F8/41Primary

    Compilation · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US10698670B2 cover?
There is provided a parallel program generating method capable of generating a static scheduling enabled parallel program without undermining the possibility of extracting parallelism. The parallel program generating method executed by the parallelization compiling apparatus 100 includes a fusion step (FIG. 2 /STEP 026 ) of fusing, as a new task, a task group including a reference task as a …
Who is the assignee on this patent?
Univ Waseda
What technology area does this patent fall under?
Primary CPC classification G06F8/456. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Jun 30 2020 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 4 related publications on this page (citations in our corpus or others sharing the same primary CPC).