US9946523B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9946523-B2 |
| Application number | US-83055310-A |
| Country | US |
| Kind code | B2 |
| Filing date | Jul 6, 2010 |
| Priority date | Jul 15, 2009 |
| Publication date | Apr 17, 2018 |
| Grant date | Apr 17, 2018 |
Official abstract text for this publication.
A code region of an application is instrumented by a multi-pass profiler with first annotations for generating profile data. The application is executed with the first annotations, wherein executing the application with the first annotations generates first profile data for the code region. The multi-pass profiler identifies, from the first profile data, the code region as a delinquent code region. The multi-pass profiler determines second annotations based, at least in part, on the first profile data and the at least one of the first annotations that defines the delinquent code region. The multi-pass profiler instruments, based on the first profile data, a code sub-region of the delinquent code region with the second annotations for generating profile data. The application is executed with second annotations, wherein executing the application with the second annotations generates second profile data for the code sub-region.
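The two-pass flow in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the patented implementation: the region names, the miss trace, the threshold value, and every function name below are hypothetical, and a real profiler would instrument compiled code rather than replay a list of strings.

```python
from collections import Counter

# Hypothetical program model: each code region maps to the memory references it contains.
REGIONS = {
    "loop_a": ["a[i]", "b[i]"],
    "loop_b": ["node.next", "node.val"],
}

# Hypothetical cache-miss trace standing in for a real run of the application.
MISS_TRACE = ["node.next"] * 8 + ["node.val"] * 2 + ["a[i]"] * 1


def run_with_annotations(annotations, trace):
    """Simulate executing the instrumented application.

    `annotations` maps a counter name to the set of memory references it covers.
    The returned Counter (the "profile data") holds misses per counter.
    """
    profile = Counter()
    for ref in trace:
        for name, covered in annotations.items():
            if ref in covered:
                profile[name] += 1
    return profile


def first_annotations(regions):
    # Coarse pass: one counter per code region.
    return {region: set(refs) for region, refs in regions.items()}


def delinquent_regions(profile, threshold):
    # A region is treated as delinquent when its miss count exceeds the threshold.
    return [name for name, misses in profile.items() if misses > threshold]


def second_annotations(regions, delinquent):
    # Fine pass: one counter per memory reference, only inside delinquent regions.
    return {ref: {ref} for region in delinquent for ref in regions[region]}


first_profile = run_with_annotations(first_annotations(REGIONS), MISS_TRACE)
hot = delinquent_regions(first_profile, threshold=5)
second_profile = run_with_annotations(second_annotations(REGIONS, hot), MISS_TRACE)
```

In this sketch the second pass spends instrumentation only on the references inside `loop_b`, which is the point of narrowing from a region to a sub-region before profiling at finer grain.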
Opening claim text (preview).
What is claimed is:
1. A method implemented with one or more processors coupled to programmed instructions respectively stored on one or more computer readable storage media, where execution of the respectively stored instructions by the one or more processors causes the method to be performed, comprising: instrumenting, by one or more processors executing a multi-pass profiler from a first computer readable storage medium, a code region of an application with first annotations for generating profile data; executing, by the one or more processors, the application with the first annotations, wherein executing the application with the first annotations generates first profile data for the code region, the first profile data stored in a second computer readable storage medium; identifying, by the one or more processors executing the multi-pass profiler, from the first profile data, a delinquent code region within the code region, wherein the delinquent code region is defined by a location of at least one of the first annotations, wherein the delinquent code region comprises a code sub-region of the code region in which a count of cache misses exceeds a first predefined threshold; determining, by the one or more processors executing the multi-pass profiler, second annotations based, at least in part, on the first profile data and the at least one of the first annotations that defines the delinquent code region; instrumenting, by the one or more processors executing the multi-pass profiler, based on the first profile data, a code sub-region of the delinquent code region with the second annotations for generating profile data, wherein said instrumenting with the second annotations is finer grained cache miss profiling than said instrumenting with the first annotations, wherein the finer grained cache miss profiling comprises instrumenting each memory reference in the code sub-region associated with cache misses, and wherein the finer grained cache miss profiling identifies one or more lines of the code sub-region likely to lead to a cache miss; executing, by the one or more processors, the application with the second annotations, wherein said executing the application with the second annotations generates second profile data for the code sub-region; and in response to determining, by the one or more processors executing the multi-pass profiler, that a second count of cache misses associated with a memory reference in the code sub-region exceeds a second predefined threshold, providing, by the one or more processors executing the multi-pass profiler, a compiler hint for one or more lines of the code sub-region associated with the memory reference.
2. The method of claim 1 further comprising identifying the code sub-region of the delinquent code region as a delinquent code sub-region, wherein the delinquent code sub-region is defined by location of at least one of the second annotations.
3. The method of claim 2 further comprising optimizing the delinquent code sub-region based on the second profile data, wherein optimizing the delinquent code sub-region comprises performing at least one of inlining, cloning, outlining, indirect call specialization, delinquent load driven data prefetching, data reorganization, and instruction scheduling.
4. The method of claim 1, wherein the second annotations comprise one or more memory delay annotations.
5. The method of claim 4, wherein the finer grained cache miss profiling identifies one or more lines of the code sub-region likely to lead to the cache miss based on the one or more memory delay annotations specifying an address of an instruction in the code sub-region corresponding to a delinquent memory reference of executable code of the application.
6. The method of claim 5, wherein the one or more memory delay annotations comprise one or more function calls to at least one function that specifies one or more delay cycles expected for one or more delinquent loads from the code sub-region.
7. The method of claim 4, wherein the finer grained cache miss profiling identifies at least two lines of the code sub-region likely to lead to at least two cache misses in response to providing, as the second annotations, markers for each of the at least two lines of the code region, wherein the markers signal locations of memory references in the code sub-region.
8. The method of claim 1 further comprising: executing the application with the first annotations on a basic block of code for the code region; before determining the second annotations, automatically determining that the first profile data is associated with the basic block of code; and selecting, as the code sub-region, only individual memory references of the basic block of code that experienced one or more cache misses.
9. The method of claim 8 further comprising converting a sequence of high-level programming function call statements from the basic block of code associated with the first annotations to an in-line sequence of low-level code operations associated with the second annotations, said in-line sequence of low-level code operations comprising one or more of load or store operations for the memory references.
10. The method of claim 9 further comprising performing one or more optimizing operations after executing the application with the second annotations, wherein the performing the one or more optimizing operations comprises providing an expected value annotation having a formal parameter list of an expected value which specifies a likely value of an expression, wherein the providing the expected value annotation makes optimization tradeoffs in favor of the expression.
11. A computer program product for multiple-pass dynamic profiling, the computer program product comprising: a computer readable storage medium having computer readable program code embodied therewith, the computer readable program code configured to instrument a code region of an application with first annotations for generating profile data, execute the application with the first annotations, wherein executing the application with the first annotations generates first profile data for the code region, identify from the first profile data, a delinquent code region within the code region, wherein the delinquent code region is defined by location of at least one of the first annotations, wherein the delinquent code region comprises a code sub-region of the code region in which a count of cache misses exceeds a first predefined threshold, determine second annotations based, at least in part, on the first profile data and the at least one of the first annotations that defines the delinquent code region, instrument based on the first profile data, a code sub-region of the delinquent code region with the second annotations for generating profile data, wherein instrumenting with the second annotations is finer grained cache miss profiling than instrumenting with the first annotations, wherein the finer grained cache miss profiling comprises instrumenting each memory reference in the code sub-region, execute the application with second annotations, wherein said executing the application with the second annotations generates second profile data for the code sub-region, and in response to a determination that a second count of cache misses associated with a memory reference in the code sub-region exceeds a second predefined threshold, provide a compiler hint for one or more lines of the code sub-region associated with the memory reference.
12. The computer program product of claim 11 further comprising computer program code configured to identify the code sub-region of the delinquent code region as a delinquent c
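The last step of claims 1 and 11, turning second-pass profile data into compiler hints, might look like the sketch below. The `MemoryRef` type, the hint wording, the line numbers, and the threshold are all assumptions made for illustration; the claim text does not specify a hint format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRef:
    name: str  # hypothetical label for the memory reference
    line: int  # source line in the code sub-region that contains it


def compiler_hints(second_profile, second_threshold):
    """Return sorted (line, hint) pairs for references whose misses exceed the threshold."""
    hints = []
    for ref, misses in second_profile.items():
        if misses > second_threshold:
            # Illustrative hint text only; not a format taken from the patent.
            hints.append((ref.line, f"prefetch candidate: {ref.name} ({misses} misses)"))
    return sorted(hints)


# Hypothetical second-pass profile data: misses per instrumented memory reference.
profile = {
    MemoryRef("node.next", line=42): 8,
    MemoryRef("node.val", line=43): 2,
}
hints = compiler_hints(profile, second_threshold=4)
```

Only the reference on line 42 crosses the assumed threshold here, so it is the only line that receives a hint; a compiler could then apply one of the optimizations listed in claim 3, such as delinquent load driven data prefetching.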
Monitoring involving counting · CPC title
by performing operations on the source code, e.g. via a compiler · CPC title
Reducing the number of cache misses; Data prefetching (cache prefetching G06F12/0862) · CPC title
Monitoring specific for caches · CPC title
Performance evaluation by tracing or monitoring · CPC title