Who is the assignee on this patent?

Martinez Canedo Arquimedes, Ludwig Hartmut, Siemens Ag

What technology area does this patent fall under?

Primary CPC classification G06F8/451. Mapped technology areas include Physics.

When was this patent published?

Publication date Tue Aug 08 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.

What related patents are in patentsdb?

We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).

Reducing the scan cycle time of control applications through multi-core execution of user programs

US9727377B2 · US · B2

Patent metadata
Field	Value
Publication number	US-9727377-B2
Application number	US-201214232352-A
Country	US
Kind code	B2
Filing date	Jul 16, 2012
Priority date	Jul 14, 2011
Publication date	Aug 8, 2017
Grant date	Aug 8, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

Title
What the patent document calls the invention.
Abstract
A short plain-language summary of the technical disclosure.
Assignees and inventors
Who owns or filed the patent and who is credited as inventor.
Key dates
Filing, priority, publication, and grant dates set the timeline.
First independent claim
The legal scope of protection — read this for what is actually claimed.
CPC / IPC classifications
Technology tags used to group this patent with similar filings.
Citations and related patents
Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

A method for pipeline parallelizing a control program for multi-core execution includes using ( 12 ) data dependency analysis on a control program to identify tasks that can be performed in parallel, identifying ( 13 ) a largest task Tmax requiring the most execution time of the identified tasks, identifying ( 14 ) cut-points in the largest task Tmax where data dependency delays decouple the task, inserting ( 15 ) delayed data dependencies into cut-points of the largest task Tmax to create N pipeline sub-tasks, in which N is a number of cores available to a processor on which the control program will be executed, and scheduling ( 16 ) the tasks and pipeline sub-tasks to the available processor cores.

First claim

Opening claim text (preview).

What is claimed is: 1. A method for pipeline parallelizing a control program for multi-core execution, comprising: performing a data dependency analysis on the control program to identify a plurality of tasks, each task comprising a set of operations executable over a plurality of scan cycles independent from the other tasks; identifying the largest task requiring the most execution time of the identified tasks; identifying cut-points in the largest task where data dependency delays decouple the largest task; inserting delayed data dependencies into the cut-points of the largest task to create N pipeline sub-tasks, wherein N is a number of cores available to a processor on which the control program will be executed; and scheduling the pipeline sub-tasks to the available processor cores in a manner that enables multiple scan cycles of the largest task to be executed in parallel, wherein each pipeline sub-task is scheduled to a first core for a first scan cycle, and a parallel execution of each pipeline sub-task is scheduled to one or more second cores for a second scan cycle which is interleaved with the first scan cycle to generate an intermediate output for each pipeline sub-task on the one or more second cores simultaneous with a subsequent pipeline sub-task on the first core. 2. The method of claim 1 , wherein the data dependency analysis is performed on a low-level intermediate representation of the control program. 3. The method of claim 1 , further comprising identifying a read step, a process step, and a write step of the largest task wherein the cut-points are identified in the process step. 4. The method of claim 1 , wherein the pipeline sub-tasks are load-balanced. 5. The method of claim 1 , further comprising scheduling each pipeline sub-task in its own processor core, wherein output of a first pipeline sub-task is communicated to a processor core of a subsequent pipeline sub-task to be input to the subsequent pipeline sub-task, and wherein the first pipeline sub-task can be executed in its core while a second pipeline sub-task is executing to generate an intermediate output that can be communicated to the core of the subsequent pipeline sub-task. 6. The method of claim 1 , further comprising: using multiple cores of a multi-core processor to execute intermediate iterations of a first pipeline sub-task simultaneous with an iteration of a subsequent pipeline sub-task; privatizing memory locations of sensor data for every execution cycle of the pipeline sub-tasks to relax loop carried dependencies; committing a first computation result of an intermediate iteration to one of said privatized memory locations, wherein a most recent computation result is available to subsequent executions of the control program; and committing a second computation result of the largest task to a main memory after each pipeline sub-task of said largest task has finished executing. 7. A method for pipeline parallelizing a control program for execution on a multi-core processor comprising: translating source code of the control program into a lower-level intermediate representation; analyzing a workload of a process step of a largest task to identify one or more cut-points where the process step can be decoupled into stages; inserting a data dependency delay into each cut point to divide the largest task into pipelined stages; and scheduling the pipeline stages to the processor cores of the processor in a manner that enables multiple scan cycles of the largest task to be executed in parallel, wherein each pipelined stage is scheduled to a first core for a first scan cycle, and a parallel execution of each pipelined stage is scheduled to one or more second cores for a second scan cycle which is interleaved with the first scan cycle to generate an intermediate output for each pipelined stage on the one or more second cores simultaneous with a subsequent pipelined stage on the first core. 8. The method of claim 7 , wherein the scheduling is performed statically. 9. The method of claim 7 , wherein the scheduling is performed dynamically. 10. The method of claim 7 , further comprising: using data dependency analysis on the lower-level intermediate representation to identify tasks that can be performed in parallel; and identifying the largest task from the tasks that can be performed in parallel. 11. The method of claim 7 , further comprising: using multiple cores of a multi-core processor to execute intermediate iterations of a first pipeline stage simultaneous with an iteration of a subsequent pipeline stage; privatizing memory locations of sensor data for every execution cycle of the pipeline stages to relax loop carried dependencies; committing a first computation result of an intermediate iteration to one of said privatized memory locations, wherein a most recent computation result is available to subsequent executions of the control program; and committing a second computation result of the process step of the largest task to a main memory after each pipeline stage of said process step has finished executing. 12. A non-transitory program storage device readable by a computer, tangibly embodying a program of instructions executed by the computer to perform a method for pipeline parallelizing a control program for multi-core execution, the method comprising: performing a data dependency analysis on the control program to identify a plurality of tasks, each task comprising a set of operations executable over a plurality of scan cycles independent from the other tasks; identifying the largest task requiring the most execution time of the identified tasks; identifying cut-points in the largest task where data dependency delays decouple the largest task; inserting delayed data dependencies into the cut-points of the largest task to create N pipeline sub-tasks, wherein N is a number of cores available to a processor on which the control program will be executed; and scheduling the tasks and pipeline sub-tasks to the available processor cores in a manner that enables multiple scan cycles of the largest task to be executed in parallel, wherein each pipeline sub-task is scheduled to a first core for a first scan cycle, and a parallel execution of each pipeline sub-task is scheduled to one or more second cores for a second scan cycle which is interleaved with the first scan cycle to generate an intermediate output for each pipeline sub-task on the one or more second cores simultaneous with a subsequent pipeline sub-task on the first core. 13. The non-transitory program storage device of claim 12 , wherein the data dependency analysis is performed on a low-level intermediate representation of the control program. 14. The non-transitory program storage device of claim 12 , the method further comprising identifying a read step, a process step, and a write step of the largest task wherein the cut-points are identified in the process step. 15. The non-transitory program storage device of claim 12 , wherein the pipeline sub-tasks are load-balanced. 16. The non-transitory program storage device of claim 12 , the method further comprising scheduling each pipeline sub-task in its own processor core, wherein output of a first pipeline sub-task is communicated to a processor core of a subsequent pipeline sub-task to be input to the subsequent pipeline sub-task, and wherein the first pipeline sub-task can be executed in its core while a second pipeline sub-task is executing to generate an intermediate output that can be communicated to the core of the subsequent pipeline sub-task. 17. T

Assignees

Inventors

Classifications

G06F8/451Primary
Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title
G06F8/41
Compilation · CPC title
G06F9/4881Primary
Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues · CPC title
G06F9/5038
considering the execution order of a plurality of tasks, e.g. taking priority or time dependency constraints into consideration (scheduling strategies G06F9/4881 and subgroups) · CPC title
G06F8/45
Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions · CPC title

Patent family

Related publications grouped by family.

View patent family 46584378

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9727377B2 cover?: A method for pipeline parallelizing a control program for multi-core execution includes using ( 12 ) data dependency analysis on a control program to identify tasks that can be performed in parallel, identifying ( 13 ) a largest task Tmax requiring the most execution time of the identified tasks, identifying ( 14 ) cut-points in the largest task Tmax where data dependency delays decouple the ta…
Who is the assignee on this patent?: Martinez Canedo Arquimedes, Ludwig Hartmut, Siemens Ag
What technology area does this patent fall under?: Primary CPC classification G06F8/451. Mapped technology areas include Physics.
When was this patent published?: Publication date Tue Aug 08 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?: We list 1 related publication on this page (citations in our corpus or others sharing the same primary CPC).