Generating object code from intermediate code that includes hierarchical sub-routine information
US-2016364216-A1 · Dec 15, 2016 · US
US10102015B1 · US · B1
| Field | Value |
|---|---|
| Publication number | US-10102015-B1 |
| Application number | US-201715630836-A |
| Country | US |
| Kind code | B1 |
| Filing date | Jun 22, 2017 |
| Priority date | Jun 22, 2017 |
| Publication date | Oct 16, 2018 |
| Grant date | Oct 16, 2018 |
Official abstract text for this publication.
A computing device for just-in-time cross-compiling compiled binaries of application programs that utilize graphics processing unit (GPU) executed programs configured to be executed on a first GPU having a first instruction set architecture (ISA), the computing device including a second GPU having a second ISA different from the first ISA of the first GPU, and a processor configured to execute an application program that utilizes a plurality of GPU-executed programs configured to be executed for the first ISA of the first GPU, execute a run-time executable cross-compiler configured to, while the application program is being executed, translate compiled binary of the plurality of GPU-executed programs from the first ISA to the second ISA, and execute the translated plurality of GPU-executed programs on the second GPU.
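The flow in the abstract (translate a compiled program from the first ISA to the second ISA while the application runs, then execute it on the second GPU) can be illustrated with a minimal sketch. Everything below is hypothetical and invented for illustration: the toy opcodes (`LDI`, `MAD`, `NOP`, `SET`, `MUL`, `ADD`), the `SCRATCH` register, and the functions `translate`, `run_on_second_gpu`, and `dispatch`. None of it comes from the patent, and a real GPU ISA is far richer.

```python
from typing import List, Tuple

Instr = Tuple[str, Tuple[int, ...]]  # (opcode, operands); a toy encoding, not a real ISA

SCRATCH = 15  # hypothetical register reserved for use by translated code


def translate(first_isa_program: List[Instr]) -> List[Instr]:
    """Translate toy first-ISA instructions into toy second-ISA instructions."""
    out: List[Instr] = []
    for op, args in first_isa_program:
        if op == "LDI":
            # Load-immediate maps one-to-one onto the toy second ISA's SET.
            out.append(("SET", args))
        elif op == "MAD":
            # dst = a * b + c is assumed to need two second-ISA instructions.
            dst, a, b, c = args
            out.append(("MUL", (SCRATCH, a, b)))
            out.append(("ADD", (dst, SCRATCH, c)))
        elif op == "NOP":
            # Translates to zero instructions.
            continue
        else:
            raise ValueError(f"no translation rule for opcode {op!r}")
    return out


def run_on_second_gpu(program: List[Instr]) -> List[int]:
    """Stand-in for the second GPU: interpret toy second-ISA instructions."""
    regs = [0] * 16
    for op, args in program:
        if op == "SET":
            regs[args[0]] = args[1]
        elif op == "MUL":
            regs[args[0]] = regs[args[1]] * regs[args[2]]
        elif op == "ADD":
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        else:
            raise ValueError(f"unknown second-ISA opcode {op!r}")
    return regs


def dispatch(first_isa_program: List[Instr]) -> List[int]:
    """Called while the application runs: translate first, then execute."""
    return run_on_second_gpu(translate(first_isa_program))


program = [("LDI", (0, 3)), ("LDI", (1, 4)), ("LDI", (2, 5)),
           ("NOP", ()), ("MAD", (3, 0, 1, 2))]
result = dispatch(program)
print(result[3])  # 3 * 4 + 5
```

The sketch walks the compiled instructions directly and emits second-ISA instructions as it goes. That is only one simple way to organize the loop and is not presented as the patented implementation.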
Opening claim text (preview).
The invention claimed is:
1. A computing device for just-in-time cross-compiling compiled binaries of application programs that utilize graphics processing unit (GPU) executed programs configured to be executed on a first GPU having a first instruction set architecture (ISA), the computing device comprising: a second GPU having a second ISA different from the first ISA of the first GPU; and a processor configured to: execute an application program that utilizes a plurality of GPU-executed programs configured to be executed for the first ISA of the first GPU; execute a run-time executable cross-compiler configured to, while the application program is being executed, translate compiled binary of the plurality of GPU-executed programs from the first ISA to the second ISA; and execute the translated plurality of GPU-executed programs on the second GPU.
2. The computing device of claim 1, wherein the run-time executable cross-compiler is configured to translate the plurality of GPU-executed programs without inflation to an intermediate representation including a control flow graph.
3. The computing device of claim 1, wherein the run-time executable cross-compiler is configured to translate the plurality of GPU-executed programs without co-mingling first ISA instructions of the plurality of GPU-executed programs.
4. The computing device of claim 1, wherein the run-time executable cross-compiler is configured to preprocess the plurality of GPU-executed programs before performing translation.
5. The computing device of claim 4, wherein to preprocess the plurality of GPU-executed programs, the run-time executable cross-compiler is configured to remove instructions of the first ISA that would translate to zero instructions in the second ISA.
6. The computing device of claim 4, wherein to preprocess the plurality of GPU-executed programs, the run-time executable cross-compiler is configured to remove instructions which would be unreachable during execution.
7. The computing device of claim 4, wherein to preprocess the plurality of GPU-executed programs, the run-time executable cross-compiler is configured to remove flow control instructions which would always flow to themselves during execution.
8. The computing device of claim 1, wherein the run-time executable cross-compiler is configured to, before translating each GPU-executed program of the plurality of GPU-executed programs, iterate through instructions of that GPU-executed program to gather summary data selected from the group consisting of register usage data, memory access pattern data, and implicit control flow graph data.
9. The computing device of claim 8, wherein the run-time executable cross-compiler is configured to translate the plurality of GPU-executed programs based on rules mapping between instructions of the first ISA and the second ISA, and the summary data.
10. The computing device of claim 1, wherein the plurality of GPU-executed programs are configured to be executed for a first application binary interface (ABI) of the first GPU; the second GPU having a second ABI different from the first ABI of the first GPU; and the run-time executable cross-compiler is configured to emulate the first ABI using hardware resources of the second GPU.
11. The computing device of claim 10, wherein to emulate the first ABI, the run-time executable cross-compiler is configured to translate from the second ABI to the first ABI before execution of a translated GPU-executed program.
12. The computing device of claim 10, wherein to emulate the first ABI, the run-time executable cross-compiler is configured to translate from the first ABI to the second ABI after execution of a translated GPU-executed program.
13. A method for just-in-time cross-compiling compiled binaries of application programs that utilize graphics processing unit (GPU) executed programs configured to be executed on a first GPU having a first instruction set architecture (ISA), the method comprising: providing a second GPU having a second ISA different from the first ISA of the first GPU; executing an application program that utilizes a plurality of GPU-executed programs configured to be executed for the first ISA of the first GPU; executing a run-time executable cross-compiler including, while the application program is executing, translating compiled binary of the plurality of GPU-executed programs from the first ISA to the second ISA; and executing the translated plurality of GPU-executed programs on the second GPU.
14. The method of claim 13, further comprising preprocessing the plurality of GPU-executed programs before performing translation.
15. The method of claim 14, wherein preprocessing the plurality of GPU-executed programs includes removing instructions of the first ISA that would translate to zero instructions in the second ISA.
16. The method of claim 14, wherein preprocessing the plurality of GPU-executed programs includes removing instructions which would be unreachable during execution.
17. The method of claim 14, wherein preprocessing the plurality of GPU-executed programs includes removing flow control instructions which would always flow to themselves during execution.
18. The method of claim 13, wherein executing the run-time executable cross-compiler includes, before translating each GPU-executed program of the plurality of GPU-executed programs, iterating through instructions of that GPU-executed program to gather summary data selected from the group consisting of register usage data, memory access pattern data, and implicit control flow graph data.
19. The computing device of claim 18, wherein executing the run-time executable cross-compiler includes translating the plurality of GPU-executed programs based on rules mapping between instructions of the first ISA and the second ISA, and the summary data.
20. A computing device for just-in-time cross-compiling compiled binaries of application programs that utilize graphics processing unit (GPU) executed programs configured to be executed on a first GPU having a first instruction set architecture (ISA) and a first application binary interface (ABI), the computing device comprising: a second GPU having a second ISA and a second ABI different from the first ISA and first ABI of the first GPU; and a processor configured to: execute an application program that utilizes a plurality of GPU-executed programs configured to be executed for the first ISA and first ABI of the first GPU; execute a run-time executable cross-compiler configured to, while the application program is being executed: preprocess the plurality of GPU-executed programs before performing translation; iterate through instructions of the plurality of GPU-executed programs to gather summary data selected from the group consisting of register usage data, memory access pattern data, and implicit control flow graph data; translate compiled binary of the plurality of GPU-executed programs from the first ISA to the second ISA based on rules mapping between instructions of the first ISA and the second ISA, and the summary data; emulate the first ABI using the hardware resources of the second GPU; and execute the translated plurality of GPU-executed programs on the second GPU.
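Two of the preprocessing steps recited in the claims (removing instructions that translate to zero instructions, and removing unreachable instructions) and the register-usage summary pass can be sketched as follows. This is a toy illustration under stated assumptions, not the claimed implementation: the opcodes (`ADD`, `NOP`, `JMP`, `BRZ`, `END`), the index-based branch targets, and the functions `reachable`, `preprocess`, and `register_usage` are all hypothetical. The self-flowing flow-control case in claims 7 and 17 is deliberately left out because the claim text does not pin down its exact form.

```python
from typing import List, Set, Tuple

Instr = Tuple[str, Tuple[int, ...]]  # (opcode, operands); hypothetical toy first ISA


def reachable(program: List[Instr]) -> Set[int]:
    """Return indices of instructions that can execute, starting from index 0."""
    seen: Set[int] = set()
    work = [0]
    while work:
        pc = work.pop()
        if pc in seen or pc >= len(program):
            continue
        seen.add(pc)
        op, args = program[pc]
        if op == "END":
            continue                      # no successor
        if op == "JMP":
            work.append(args[0])          # unconditional: target only
        elif op == "BRZ":
            work.extend((args[1], pc + 1))  # conditional: target and fall-through
        else:
            work.append(pc + 1)
    return seen


def preprocess(program: List[Instr]) -> List[Instr]:
    """Drop unreachable instructions and NOPs, then remap branch targets."""
    live = reachable(program)
    kept = [pc for pc in range(len(program))
            if pc in live and program[pc][0] != "NOP"]
    kept_set = set(kept)
    # new_index[pc] = number of kept instructions before pc, i.e. the new
    # position of the first kept instruction at or after the old index pc.
    new_index, count = {}, 0
    for pc in range(len(program) + 1):
        new_index[pc] = count
        if pc in kept_set:
            count += 1
    out: List[Instr] = []
    for pc in kept:
        op, args = program[pc]
        if op == "JMP":
            args = (new_index[args[0]],)
        elif op == "BRZ":
            args = (args[0], new_index[args[1]])
        out.append((op, args))
    return out


def register_usage(program: List[Instr]) -> Set[int]:
    """Summary pass: collect every register the program reads or writes."""
    used: Set[int] = set()
    for op, args in program:
        if op == "ADD":
            used.update(args)
        elif op == "BRZ":
            used.add(args[0])
    return used


example = [("ADD", (0, 1, 2)), ("NOP", ()), ("JMP", (4,)),
           ("ADD", (7, 7, 7)),            # never reached: skipped by the JMP
           ("BRZ", (0, 6)), ("ADD", (3, 0, 0)), ("END", ())]
cleaned = preprocess(example)
print(cleaned)
print(sorted(register_usage(cleaned)))
```

Running the summary pass after preprocessing means register 7, which appears only in the unreachable instruction, is not counted. A translator could use such a summary to pick free registers in the second ISA, though that use is an assumption of this sketch.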
Involving translation to a different instruction set architecture, e.g. just-in-time translation in a JVM · CPC title
Detection or removal of dead or redundant code · CPC title
for non-native instruction set, e.g. Javabyte, legacy code · CPC title
Reducing the execution time required by the program code · CPC title
Arrangements for executing machine instructions, e.g. instruction decode (for executing microinstructions G06F9/22) · CPC title