US9798528B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9798528-B2 |
| Application number | US-53131306-A |
| Country | US |
| Kind code | B2 |
| Filing date | Sep 13, 2006 |
| Priority date | Sep 13, 2006 |
| Publication date | Oct 24, 2017 |
| Grant date | Oct 24, 2017 |
Official abstract text for this publication.
A solution for cooperative data prefetching that enables software control of a memory-side data prefetch and/or a processor-side data prefetch is provided. In one embodiment, the invention provides a solution for generating an application, in which access to application data for the application is improved (e.g., optimized) in program code for the application. In particular, a push request, for performing a memory-side data prefetch of the application data, and a prefetch request, for performing a processor-side data prefetch, are added to the program code. The memory-side data prefetch results in the application data being copied from a first data store to a second data store that is faster than the first data store while the processor-side data prefetch results in the application data being copied from the second data store to a third data store that is faster than the second data store.
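The two-stage copy described in the abstract can be illustrated with a toy model of the three data stores. This is a minimal sketch and not the patented implementation: the class `Hierarchy`, its methods, the function `demand_cost`, and the cycle costs are all hypothetical names and values chosen for illustration. The model also assumes each prefetch completes before the demand load, so prefetch latency is treated as fully hidden.

```python
# Toy model of the three data stores named in the abstract.
# Every name and latency below is an illustrative assumption.

RAM_COST, L2_COST, L1_COST = 100, 10, 1  # assumed access costs in cycles


class Hierarchy:
    def __init__(self):
        self.l2 = set()  # addresses held in the second (faster) data store
        self.l1 = set()  # addresses held in the third (fastest) data store

    def push(self, addr):
        """Memory-side prefetch: copy from the first store (RAM) to the second (L2)."""
        self.l2.add(addr)

    def prefetch(self, addr):
        """Processor-side prefetch: copy from L2 to L1, only if L2 already holds it."""
        if addr in self.l2:
            self.l1.add(addr)

    def load(self, addr):
        """Return the assumed cost of a demand load and fill both caches on a miss."""
        if addr in self.l1:
            return L1_COST
        cost = L2_COST if addr in self.l2 else RAM_COST
        self.l2.add(addr)
        self.l1.add(addr)
        return cost


def demand_cost(addrs, cooperative):
    """Sum the demand-load cost, optionally issuing push + prefetch before each load."""
    h = Hierarchy()
    total = 0
    for a in addrs:
        if cooperative:
            h.push(a)      # RAM -> L2
            h.prefetch(a)  # L2 -> L1
        total += h.load(a)
    return total
```

In this model a processor-side prefetch issued without a preceding push has no effect, which is why the two requests are described as cooperative: the push stages the data in the second store so that the prefetch can move it into the third.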
Opening claim text (preview).
What is claimed is:
1. A method of generating an application, the method comprising: loading source code for an application into a memory of a computing system, compiling the source code into program code for the application, and during the compilation of the source code: for each reference to application data determined as likely to generate a memory miss, adding at a position in the program code that is several operations prior to the reference a push request into the program code for a memory-side data prefetch to the program code for the referenced application data, wherein the memory-side data prefetch causes a near memory processor (NMP) to copy referenced application data for the application from random access memory to an L2 cache; and thereafter adding a prefetch request for a processor-side data prefetch to the program code at a position in the program code that is a number operations prior to the reference equivalent to a number of operations required to complete the prefetch, wherein the processor-side data prefetch causes a main processor that is different than the NMP to copy the application data from the L2 cache to an L1 cache.
2. The method of claim 1, further comprising analyzing a memory access pattern for the application to identify at least one of: the application data, a location in the program code for the push request, or a location in the program code for the prefetch request.
3. The method of claim 2, wherein the analyzing includes determining that a request to access the application data is likely to incur a memory miss.
4. The method of claim 2, wherein the analyzing includes analyzing at least one of: application data dependencies or application data structure types at compile-time.
5. The method of claim 2, wherein the analyzing includes analyzing application data access patterns of the application during runtime.
6. The method of claim 1, further comprising translating source code for the application into the program code.
7. The method of claim 1, wherein the push request causes a program to be executed by an execution environment separately from the application.
8. The method of claim 7, wherein the improving further includes defining a custom program for the program.
9. The method of claim 1, further comprising: performing a set of high level optimizations on the program code prior to the improving; and performing a set of low level optimizations on the program code after the improving.
10. A computer hardware system for generating an application, the system comprising: at least one processor, wherein the at least one processor is configured to load source code for an application into a memory of the computing hardware system, compiling the source code into program code for the application, and during the compilation of the source code: for each reference to application data determined as likely to generate a memory miss, add at a position in the program code that is several operations prior to the reference a push request into the program code for a memory-side data prefetch to program code of the application for the referenced application data, wherein the memory-side data prefetch causes a near memory processor (NMP) to copy referenced application data for the application from random access memory to an L2 cache; and thereafter add a prefetch request for a processor-side data prefetch to the program code at a position in the program code that is a number operations prior to the reference equivalent to a number of operations required to complete the prefetch, wherein the processor-side data prefetch causes a main processor that is different than the NMP to copy the application data from the L2 cache to an L1 cache.
11. The system of claim 10, wherein the at least one processor is further configured to analyze a memory access pattern for the application to identify at least one of: the application data, a location in the program code for the push request, and a location in the program code for the prefetch request.
12. The system of claim 11, wherein the analyzing includes determining that a request to access the application data is likely to incur a memory miss.
13. The system of claim 11, wherein the analyzing includes analyzing at least one of: application data dependencies and application data structure types at compile-time.
14. The system of claim 11, wherein the system for analyzing includes a system for analyzing application data access patterns of the application during runtime.
15. The system of claim 10, wherein the at least one processor is further configured to translate source code for the application into the program code.
16. The system of claim 10, wherein the push request causes a program to be executed by the execution environment separately from the application.
17. The system of claim 10, wherein the at least one processor is further configured to: perform a set of high level optimizations on the program code prior improving access to the application data; and perform a set of low level optimizations on the program code after the improving access to the application data.
18. A computer program product comprising at least one non-transitory computer-readable storage medium having stored therein computer usable program code for generating an application, which when executed by a computer hardware system, causes the computer hardware system to perform: loading source code for an application into a memory of a computing system, compiling the source code into program code for the application, and during the compilation of the source code: for each reference to application data determined as likely to generate a memory miss, adding at a position in the program code that is several operations prior to the reference a push request into the program code for a memory-side data prefetch to the program code for the referenced application data, wherein the memory-side data prefetch causes a near memory processor (NMP) to copy referenced application data for the application from random access memory to an L2 cache; and thereafter adding a prefetch request for a processor-side data prefetch to the program code at a position in the program code that is a number operations prior to the reference equivalent to a number of operations required to complete the prefetch, wherein the processor-side data prefetch causes a main processor that is different than the NMP to copy the application data from the L2 cache to an L1 cache.
19. The computer program product of claim 18, wherein the computer hardware system is further configured to perform analyzing a memory access pattern for the application to identify at least one of: the application data, a location in the program code for the push request, and a location in the program code for the prefetch request.
20. The computer program product of claim 18, wherein the computer hardware system is further configured to perform translating source code for the application into the program code.
21. The computer program product of claim 18, wherein the push request causes a program to be executed by the execution environment separately from the application.
22. The computer program product of claim 21, wherein the computer hardware system is further configured to perform defining a custom program for the program.
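The insertion step recited in the independent claims (a push request placed several operations ahead of a likely-miss reference, followed by a prefetch request placed closer to it) can be sketched as a small pass over a linear operation list. This is a hedged illustration only: the `Op` type, the `insert_prefetches` function, the `likely_miss` set, and both distance parameters are hypothetical and are not taken from the patent, which does not specify a data structure or fixed distances. The miss analysis itself is assumed to have been done elsewhere and is passed in as a set of data names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    kind: str       # "load", "push", "prefetch", or any other operation name
    addr: str = ""  # symbolic name of the referenced application data


def insert_prefetches(ops, likely_miss, push_distance, prefetch_distance):
    """Return a new op list with a push and a prefetch added before each likely miss.

    Both distances count original operations ahead of the reference. The push
    distance must be at least the prefetch distance so that the push
    (RAM -> L2) is issued no later than the prefetch (L2 -> L1).
    """
    if push_distance < prefetch_distance:
        raise ValueError("push must be issued no later than the prefetch")

    # Map an original index to the requests emitted immediately before that op.
    pending = {}
    for i, op in enumerate(ops):
        if op.kind == "load" and op.addr in likely_miss:
            push_at = max(0, i - push_distance)      # clamp at program start
            pref_at = max(0, i - prefetch_distance)
            pending.setdefault(push_at, []).append(Op("push", op.addr))
            pending.setdefault(pref_at, []).append(Op("prefetch", op.addr))

    out = []
    for i, op in enumerate(ops):
        out.extend(pending.get(i, []))  # requests first, then the original op
        out.append(op)
    return out


if __name__ == "__main__":
    program = [Op("add"), Op("mul"), Op("sub"), Op("shift"), Op("load", "x")]
    for op in insert_prefetches(program, {"x"}, push_distance=4, prefetch_distance=1):
        print(op.kind, op.addr)
```

In a real compiler the prefetch distance would presumably be derived from the latency of the L2-to-L1 copy, as the claim language suggests; here both distances are plain parameters so the ordering constraint is easy to see.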
Reducing the number of cache misses; Data prefetching (cache prefetching G06F12/0862) · CPC title
Prefetching based on hints or prefetch instructions · CPC title
with multilevel cache hierarchies · CPC title
Transformation of program code · CPC title
with prefetch · CPC title