Automated software program repair
US-2018165182-A1 · Jun 14, 2018 · US
US10853044B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-10853044-B2 |
| Application number | US-201816154560-A |
| Country | US |
| Kind code | B2 |
| Filing date | Oct 8, 2018 |
| Priority date | Oct 6, 2017 |
| Publication date | Dec 1, 2020 |
| Grant date | Dec 1, 2020 |
Official abstract text for this publication.
System and method of compiling a program having a mixture of host code and device code to enable Profile Guided Optimization (PGO) for device code execution. An exemplary integrated compiler can compile source code programmed to be executed by a host processor (e.g., CPU) and a co-processor (e.g., a GPU) concurrently. The compilation can generate an instrumented executable code which includes: profile instrumentation counters for the device functions; and instructions for the host processor to allocate and initialize device memory for the counters and to retrieve collected profile information from the device memory to generate instrumentation output. The output is fed back to the compiler for compiling the source code a second time to generate optimized executable code for the device functions defined in the source code.
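The two-pass flow in the abstract can be illustrated with a small host-only simulation. This is a minimal sketch, not the patented implementation: the "device" is an ordinary Python function, "device memory" is a plain list, and every name (`device_kernel`, `run_instrumented`, `second_compile`, the counter labels) is hypothetical. It only shows the order of steps: allocate and zero the counters, run the instrumented kernel, copy the counters back, then feed the profile into a second compilation decision.

```python
from typing import Dict, List

# All names below are illustrative assumptions; none come from the patent
# text or from a real compiler toolchain.


def device_kernel(x: int, counters: List[int]) -> int:
    """Instrumented 'device' function: each branch bumps its own counter."""
    if x % 4 == 0:
        counters[0] += 1  # counter 0: branch taken for multiples of 4
        return x // 4
    counters[1] += 1      # counter 1: every other input
    return x + 1


def run_instrumented(inputs: List[int]) -> Dict[str, int]:
    """Host-side steps from the abstract: allocate, initialize, run, copy back."""
    device_counters = [0, 0]            # allocate and zero-initialize "device memory"
    for x in inputs:                    # stand-in for launching the kernel over many threads
        device_kernel(x, device_counters)
    host_copy = list(device_counters)   # copy counters from "device" to host memory
    return {"branch_div4": host_copy[0], "branch_other": host_copy[1]}


def second_compile(profile: Dict[str, int]) -> List[str]:
    """Stand-in for the second compilation: order branches hottest first."""
    return sorted(profile, key=lambda name: profile[name], reverse=True)


profile = run_instrumented(list(range(100)))
layout = second_compile(profile)
print(profile)
print(layout)
```

On a real GPU the counter updates would need to be atomic, since many threads execute the same instrumented edge concurrently; the sequential loop here sidesteps that.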
Opening claim text (preview).
What is claimed is:
1. A method comprising: compiling a program a first time, wherein the program is to be performed by a co-processor and a host processor, and the compiling the program the first time generates instrumented executable code, the instrumented executable code being operable to cause the host processor to initialize one or more profile counters for updates to be made during a performance of the program; causing the performance of the program by the co-processor and the host processor after compiling the program the first time and storing profile information associated with the program resulting from the performance, wherein at least a portion of the profile information is based, at least in part, on the one or more profile counters that reflect the updates; and compiling the program a second time after storing the profile information, wherein the compiling the program the second time results in the program being executable by the co-processor and the host processor according to the profile information.
2. The method of claim 1, wherein the host processor is a Central Processing Unit (CPU) and the co-processor is a Graphics Processing Unit (GPU).
3. The method of claim 1, wherein the compiling the program the first time and the compiling the program the second time each comprise generating a representation of a Control Flow Graph (CFG) for the program and constructing a Minimum Spanning Tree (MST) of the Control Flow Graph (CFG).
4. The method of claim 3, wherein the constructing the MST of the CFG is for a function of the co-processor; and the method further comprises instrumenting edges of the MST with profile counters of the one or more profile counters that are configured to increment in atomic operations when the co-processor executes the instrumented executable code.
5. The method of claim 1, wherein the instrumented executable code is further operable when executed by the host processor to: cause the host processor to allocate a co-processor memory for the one or more profile counters.
6. The method of claim 5, wherein the one or more profile counters are associated with functions of a kernel, wherein the instrumented executable code is further operable to cause, after the host processor initializes the one or more profile counters, the host processor to invoke the kernel for execution by the co-processor.
7. The method of claim 6, wherein the instrumented executable code is operable to cause the host processor to copy the one or more profile counters from the co-processor memory to a host processor memory after execution completion of the kernel.
8. The method of claim 1, wherein the instrumented executable code is operable to cause the host processor to call a library to write the profile information into a file.
9. The method of claim 1, wherein the compiling the program the first time comprises performing a set of separate compilations for multiple portions of source code of the program, wherein the performing a separate compilation comprises: inserting instrumentation code for a portion of the source code in a separate compilation; and generating an initialized constant variable for the separate compilation, wherein the initialized constant variable comprises a partial function call list associated with the separate compilation.
10. The method of claim 9, the compiling the program the first time further comprises linking the instrumented code resulting from the set of separate compilations to generate the instrumented executable code, and wherein the linking comprises: generating a combined call list from partial function call lists; and generating a representation of a combined Call Graph comprising partial call graphs associated with the multiple portions of the source code respectively.
11. The method of claim 9, wherein the performing the separate compilation further comprises: sending instrumentation information for the portion from a co-processor compiler to a host-processor compiler; and declaring mirrors for counters at the host-processor compiler.
12. The method of claim 4, wherein the compiling the program the second time comprises: setting values of profile counters for the edges in the MST; populating profile counters of edges and basic blocks of the function using instrumented counts; and during the compiling the program the second time, querying the profile information to obtain counts for the edges and the basic blocks of the function.
13. A system comprising: at least one processor; and at least one memory coupled to the at least one processor and storing instructions that, when executed by the at least one processor, cause the system to perform a method comprising: compiling a program a first time, wherein the program is to be performed by a co-processor and a host processor, and the compiling the program the first time generates instrumented executable code, the instrumented executable code being operable to cause the host processor to as part of a performance of the program: initialize one or more profile counters corresponding to a kernel for updates to be made during the performance of the program; and invoke the kernel for execution by the co-processor after initializing the one or more profile counters; causing the performance of the program by the co-processor after compiling the program the first time and storing profile information associated with the program resulting from the performance, wherein at least a portion of the profile information is based, at least in part, on the one or more profile counters that reflect the updates; and compiling the program a second time after storing the profile information, wherein the compiling the program the second time results in the program being executable by the co-processor and the host processor according to the profile information.
14. The system of claim 13, wherein the compiling the program the first time and the compiling the program the second time each comprise generating a representation of a Control Flow Graph (CFG) for the source code and generating a Minimum Spanning Tree (MST) of the CFG for a function of the co-processor, the generating the MST including instrumenting edges of the MST with profile counters that are configured to increment in atomic operations.
15. The system of claim 13, wherein the instrumented executable code is operable when executed by the host processor to cause the host processor to allocate a co-processor memory for the one or more profile counters, and wherein the instrumented executable code when executed by the co-processor is operable to cause the co-processor to update one or more the profile counters during execution of the kernel.
16. The system of claim 15, wherein the instrumented executable code is operable to cause the host processor to copy the profile counters from said the co-processor memory to a host processor memory after execution completion of the kernel.
17. The system of claim 13, wherein the compiling the program the first time comprises performing a set of separate compilations for multiple portions of source code of the program, wherein performing a separate compilation comprises: inserting instrumentation code for a portion of the source code in a separate compilation; and generating an initialized constant variable for the separate compilation, wherein the initialized constant variable comprises a partial function call list associated with the separate compilation.
18. The system of claim 17, wherein the compiling the program the first time further comprises linking
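The separate-compilation claims (9, 10, and 17) describe each compilation unit carrying a partial function call list, with the link step producing a combined call list and a combined call graph. The sketch below is one hedged way to picture that merge. The representation is an assumption: a call list is modeled as a list of `(caller, callee)` pairs, and the function names (`combine_call_lists`, `build_call_graph`) and example symbols are hypothetical rather than taken from the patent.

```python
from typing import Dict, List, Set, Tuple

# Assumed representation: a partial call list is a list of (caller, callee)
# pairs emitted by one separate compilation. The patent preview does not
# specify the actual layout of the initialized constant variable.
CallList = List[Tuple[str, str]]


def combine_call_lists(partials: List[CallList]) -> CallList:
    """Merge partial call lists, dropping duplicates and keeping first-seen order."""
    seen: Set[Tuple[str, str]] = set()
    combined: CallList = []
    for partial in partials:
        for edge in partial:
            if edge not in seen:
                seen.add(edge)
                combined.append(edge)
    return combined


def build_call_graph(call_list: CallList) -> Dict[str, List[str]]:
    """Build an adjacency-list call graph; callees with no calls map to []."""
    graph: Dict[str, List[str]] = {}
    for caller, callee in call_list:
        graph.setdefault(caller, []).append(callee)
        graph.setdefault(callee, [])
    return graph


# Two hypothetical compilation units that share a callee.
unit_a: CallList = [("kernel_main", "helper_a"), ("helper_a", "shared_util")]
unit_b: CallList = [
    ("kernel_main", "helper_b"),
    ("helper_b", "shared_util"),
    ("helper_a", "shared_util"),  # duplicate of an edge already seen in unit_a
]

combined = combine_call_lists([unit_a, unit_b])
graph = build_call_graph(combined)
print(combined)
print(graph)
```

Deduplicating at merge time keeps the combined graph free of repeated edges when several units reference the same caller and callee pair.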
Compilation · CPC title
Optimisation · CPC title
Graphs; Linked lists (G06F16/9027 takes precedence) · CPC title
by performing operations on the source code, e.g. via a compiler · CPC title
Synchronisation, e.g. post-wait, barriers, locks (synchronisation among tasks G06F9/52) · CPC title