US10776085B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10776085-B2 |
| Application number | US-201815951354-A |
| Country | US |
| Kind code | B2 |
| Filing date | Apr 12, 2018 |
| Priority date | Mar 27, 2006 |
| Publication date | Sep 15, 2020 |
| Grant date | Sep 15, 2020 |
Official abstract text for this publication.
A computer-implemented method for creating a program for a multi-processor system comprising a plurality of interspersed processors and memories. A user may specify or create source code using a programming language. The source code specifies a plurality of tasks and communication of data among the plurality of tasks. However, the source code may not (and preferably is not required to) 1) explicitly specify which physical processor will execute each task and 2) explicitly specify which communication mechanism to use among the plurality of tasks. The method then creates machine language instructions based on the source code, wherein the machine language instructions are designed to execute on the plurality of processors. Creation of the machine language instructions comprises assigning tasks for execution on respective processors and selecting communication mechanisms between the processors based on location of the respective processors and required data communication to satisfy system requirements.
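The abstract's two compile-time decisions (where each task runs, and how tasks exchange data) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the patent: the function names `place_tasks` and `select_mechanism`, the rectangular grid of processors, the greedy placement heuristic, and the rule that processors at Manhattan distance 1 or less share a memory.

```python
from itertools import product

def manhattan(a, b):
    """Grid distance between two processor cells given as (row, col)."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def place_tasks(tasks, edges, rows, cols):
    """Hypothetical greedy placement: put each task on the free cell that is
    closest to the partners it communicates with that are already placed."""
    free = list(product(range(rows), range(cols)))
    placement = {}
    for task in tasks:
        partners = [placement[b] for a, b in edges if a == task and b in placement]
        partners += [placement[a] for a, b in edges if b == task and a in placement]
        best = min(free, key=lambda cell: sum(manhattan(cell, p) for p in partners))
        free.remove(best)
        placement[task] = best
    return placement

def select_mechanism(src, dst):
    """Assumed rule: neighbouring processors share an interspersed memory,
    so they use shared memory; anything farther apart uses message passing."""
    return "shared_memory" if manhattan(src, dst) <= 1 else "message_passing"

# The "source code" names tasks and data flows only; no processor is specified.
tasks = ["producer", "filter", "consumer", "logger"]
edges = [("producer", "filter"), ("filter", "consumer"), ("producer", "logger")]

placement = place_tasks(tasks, edges, rows=1, cols=4)
plan = {(a, b): select_mechanism(placement[a], placement[b]) for a, b in edges}
for edge, mechanism in plan.items():
    print(edge, placement[edge[0]], placement[edge[1]], mechanism)
```

On this 1-by-4 grid the logger ends up three cells from the producer, so that edge falls back to message passing while the other two edges use shared memory. A real tool would also weigh the system requirements the abstract mentions, which this sketch ignores.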
Opening claim text (preview).
The invention claimed is:

1. A method, comprising: configuring communications mechanisms for a program to be executed on a multi-processor system, wherein the multi-processor system comprises an array of processors and a plurality of memories coupled to the processors, wherein the plurality of memories are interspersed among the processors within an apparatus, wherein each of the processors is coupled to at least one other processor, and wherein the configuring includes: determining, for each of a plurality of communications of the program, sending and receiving processors and a location of data being communicated; generating performance modeling data based on the determined sending and receiving processors and data locations, wherein the performance modeling data includes indications of predicted messaging congestion; selecting communications mechanisms for the plurality of communications based on the performance modeling data, including selecting message passing for at least a portion of the communications and shared memory for at least a portion of the communications; routing, based on the performance modeling data, communications paths for communications for which message passing is selected; and synthesizing the routed communications paths.

2. The method of claim 1, wherein the selecting communications mechanisms includes selecting from among shared memory, memory to memory, memory to register, register to memory, and register to register transfers.

3. The method of claim 1, wherein the determining is based on symbolic processor indicators in the program.

4. The method of claim 1, wherein the synthesizing includes binding communications requirements specified in source code of the program to routing logic.

5. The method of claim 1, further comprising: assigning tasks to processors based on the performance modeling data.

6. The method of claim 1, wherein the selecting and routing are further based on one or more parameters associated with the performance modeling data including one or more of: latency, throughput, and power consumption.

7. The method of claim 1, further comprising: determining a schedule for multiple ones of the message passing communications that share at least a portion of a physical route.

8. The method of claim 1, wherein the determining includes copying data for one of the communications to multiple memories.

9. A non-transitory computer-readable medium having instructions stored thereon that are executable by a computing device to perform operations comprising: configuring communications mechanisms for a program to be executed on a multi-processor system, wherein the multi-processor system comprises an array of processors and a plurality of memories coupled to the processors, wherein the plurality of memories are interspersed among the processors within an apparatus, wherein each of the processors is coupled to at least one other processor, and wherein the configuring includes: determining, for each of a plurality of communications of the program, sending and receiving processors and a location of data being communicated; generating performance modeling data based on the determined sending and receiving processors and data locations, wherein the performance modeling data includes indications of predicted messaging congestion; selecting communications mechanisms for the plurality of communications based on the performance modeling data, including selecting message passing for at least a portion of the communications and shared memory for at least a portion of the communications; routing, based on the performance modeling data, communications paths for communications for which message passing is selected; and synthesizing the routed communications paths.

10. The non-transitory computer-readable medium of claim 9, wherein the selecting communications mechanisms includes selecting from among shared memory, memory to memory, memory to register, register to memory, and register to register transfers.

11. The non-transitory computer-readable medium of claim 9, wherein the determining is based on symbolic processor indicators in the program.

12. The non-transitory computer-readable medium of claim 9, wherein the synthesizing includes binding communications requirements specified in source code of the program to routing logic.

13. The non-transitory computer-readable medium of claim 9, wherein the operations further comprise: assigning tasks to processors based on the performance modeling data.

14. The non-transitory computer-readable medium of claim 9, wherein the routing is further based on a latency parameter associated with the performance modeling data.

15. The non-transitory computer-readable medium of claim 9, wherein the routing is further based on a throughput parameter associated with the performance modeling data.

16. The non-transitory computer-readable medium of claim 9, wherein the operations further comprise: implementing a deadlock avoidance mechanism for multiple ones of the message passing communications that share at least a portion of a physical route.

17. An apparatus, comprising: one or more processors; and one or more memories having program instructions stored thereon that are executable by the one or more processors to: configure communications mechanisms for a program to be executed on a multi-processor system, wherein the multi-processor system comprises an array of processors and a plurality of memories coupled to the processors, wherein the plurality of memories are interspersed among the processors within an apparatus, wherein each of the processors is coupled to at least one other processor, and wherein the configuration includes: determine, for each of a plurality of communications of the program, sending and receiving processors and a location of data being communicated; generate performance modeling data based on the determined sending and receiving processors and data locations, wherein the performance modeling data includes indications of predicted messaging congestion; select communications mechanisms for the plurality of communications based on the performance modeling data, including selecting message passing for at least a portion of the communications and shared memory for at least a portion of the communications; route, based on the performance modeling data, communications paths for communications for which message passing is selected; and synthesize the routed communications paths.

18. The apparatus of claim 17, wherein to select the communications mechanisms, the instructions are executable to select from among: shared memory, memory to memory, memory to register, register to memory, and register to register transfers.

19. The apparatus of claim 17, wherein the instructions are further executable to: assign tasks to processors based on the performance modeling data.

20. The apparatus of claim 17, wherein the instructions are further executable to: resolve one or more conflicts between multiple ones of the message passing communications that share at least a portion of a physical route.
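The sequence recited in claim 1 (determine endpoints, model predicted congestion, select a mechanism, route the message-passing paths) could be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the 2-D mesh, the helper names `axis_steps`, `mesh_path`, and `configure`, the per-link route count used as the congestion prediction, and the distance-only rule for choosing shared memory are all hypothetical.

```python
from collections import Counter

def axis_steps(a, b):
    """Coordinates visited when moving from a to b along one axis (a excluded)."""
    step = 1 if b > a else -1
    return list(range(a + step, b + step, step))

def mesh_path(src, dst, cols_first=True):
    """Links (cell, cell) of a dimension-ordered route on an assumed 2-D mesh."""
    (r, c), (tr, tc) = src, dst
    cells = [src]
    if cols_first:
        cells += [(r, x) for x in axis_steps(c, tc)]
        cells += [(y, tc) for y in axis_steps(r, tr)]
    else:
        cells += [(y, c) for y in axis_steps(r, tr)]
        cells += [(tr, x) for x in axis_steps(c, tc)]
    return list(zip(cells, cells[1:]))

def configure(comms):
    """comms: list of (sender_cell, receiver_cell) pairs.
    Returns the per-communication plan and the predicted load per link."""
    load = Counter()  # assumed congestion model: routes sharing each link
    plan = []
    for src, dst in comms:
        dist = abs(src[0] - dst[0]) + abs(src[1] - dst[1])
        if dist <= 1:
            # Assumption: neighbouring processors can reach a common memory.
            plan.append((src, dst, "shared_memory", []))
            continue
        candidates = [mesh_path(src, dst, True), mesh_path(src, dst, False)]
        # Pick the candidate whose busiest link is predicted to be least loaded.
        best = min(candidates, key=lambda path: max(load[link] for link in path))
        load.update(best)
        plan.append((src, dst, "message_passing", best))
    return plan, load

comms = [((0, 0), (0, 1)), ((0, 0), (1, 2)), ((0, 0), (1, 2))]
plan, load = configure(comms)
for src, dst, mechanism, path in plan:
    print(src, dst, mechanism, path)
```

In this example the two long-distance communications take different dimension orders, so no link carries more than one route. The final "synthesizing" step of the claim, and the scheduling, deadlock avoidance, and conflict resolution of claims 7, 16, and 20, are outside this sketch.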
Code distribution (considering CPU load at run-time G06F9/505; load rebalancing G06F9/5083) · CPC title
Interprogram communication · CPC title
Message passing systems or structures, e.g. queues · CPC title
Program synchronisation; Mutual exclusion, e.g. by means of semaphores · CPC title
Compilation · CPC title