Using an inline stack to improve performance of an applications binary
US-9009691-B1 · Apr 14, 2015 · US
US9304748B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9304748-B2 |
| Application number | US-201314014571-A |
| Country | US |
| Kind code | B2 |
| Filing date | Aug 30, 2013 |
| Priority date | Aug 7, 2013 |
| Publication date | Apr 5, 2016 |
| Grant date | Apr 5, 2016 |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
The various aspects leverage the novel observation that the number of call sites in code is directly correlated with the code's compile time and provide methods implemented by a compiler operating on a computing device (e.g., a smartphone) for performing inline throttling based on a projected number of call sites in the code that would exist after performing inline expansion. The various aspects enable the compiler to improve the performance of the generated code by aggressive inlining while carefully managing increases in compile time, thereby decreasing the power required to compile the code while increasing performance of the computing device. Thus, by inlining enough call sites to reduce the costs of handling calls while accounting for the costs of inlining, the various aspects provide for an effective balance of short compile times and effective code performance.
Opening claim text (preview).
What is claimed is: 1. A method of optimizing inline code generation by a compiler operating on a computing device, comprising: selecting a call site detected during a scan of code, comprising: detecting a group of call sites during the scan of the code; setting a call-site counter equal to a number of call sites in the group of call sites; ranking the group of call sites based on a number of nested function calls included in a function called by each of the group of call sites; and selecting the call site based on rank, wherein call sites are selected beginning with a lowest rank; determining a number of nested function calls included in a called function of the call site; determining whether the call site is eligible for inlining based at least on the number of nested function calls included in the called function and the call-site counter indicating a total number of calls to any function currently included within the code; and inlining the call site in response to determining that the call site is eligible for inlining. 2. The method of claim 1 , wherein the code is bytecode. 3. The method of claim 1 , wherein determining whether the call site is eligible for inlining is based only on a sum of the call-site counter and one of the number of nested function calls and a net change in a number of call sites. 4. The method of claim 1 , further comprising: adding the number of nested function calls to the call-site counter after inlining the call site; and adding one to the call-site counter in response to determining that the call site is ineligible for inlining. 5. The method of claim 1 , wherein determining whether the call site is eligible for inlining comprises: determining whether a sum of the number of nested function calls and the call-site counter exceeds a call-site threshold; and determining that the call site is eligible for inlining in response to determining that the sum does not exceed the call-site threshold. 6. The method of claim 1 , wherein determining whether the call site is eligible for inlining comprises: determining whether the number of nested function calls is greater than one; and determining that the call site is eligible for inlining in response to determining that the number of nested function calls is not greater than one. 7. The method of claim 1 , wherein determining whether the call site is eligible for inlining is not based on any of whether the call site is on an execution path, a depth of nested function calls in the call site, how often the call site is called, a size of code in which the call site is located, effects of inlining on a size of the code, whether inlining would result in a stack overflow, and effects of inlining on execution time. 8. The method of claim 1 , wherein determining whether the call site is eligible for inlining comprises: determining whether a sum of a net change in the number of call sites and the call-site counter exceeds a call-site threshold; determining that the call site is eligible for inlining in response to determining that the sum does not exceed the call-site threshold; and determining that the call site is ineligible for inlining in response to determining that the sum exceeds the call-site threshold. 9. The method of claim 8 , further comprising adjusting the call-site counter by the net change in the number of nested function calls after inlining the call site. 10. The method of claim 1 , further comprising: determining whether each of the group of call sites has been selected; selecting an unselected call site in the group of call sites with a next lowest rank in response to determining that each of the group of call sites has not been selected; and ending inline optimization in response to determining that each of the group of call sites has been selected. 11. A computing device, comprising: a memory; and a processor coupled to the memory, wherein the processor is configured with processor-executable instructions to perform operations of an on-device compiler configured to optimize inline code generation, the operations comprising: selecting a call site detected during a scan of code, comprising: detecting a group of call sites during the scan of the code: setting a call-site counter equal to a number of call sites in the group of call sites; ranking the group of call sites based on a number of nested function calls included in a function called by each of the group of call sites; and selecting the call site based on rank, wherein call sites are selected beginning with a lowest rank; determining a number of nested function calls included in a called function of the call site; determining whether the call site is eligible for inlining based at least on the number of nested function calls included in the called function and the call-site counter indicating a total number of calls to any function currently included within the code; and inlining the call site in response to determining that the call site is eligible for inlining. 12. The computing device of claim 11 , wherein the code is bytecode. 13. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations such that determining whether the call site is eligible for inlining comprises determining whether the call site is eligible for inlining based only on a sum of the call-site counter and one of the number of nested function calls and a net change in a number of call sites. 14. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations further comprising: adding the number of nested function calls to the call-site counter after inlining the call site; and adding one to the call-site counter in response to determining that the call site is ineligible for inlining. 15. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations such that determining whether the call site is eligible for inlining comprises: determining whether a sum of the number of nested function calls and the call-site counter exceeds a call-site threshold; and determining that the call site is eligible for inlining in response to determining that the sum does not exceed the call-site threshold. 16. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations such that determining whether the call site is eligible for inlining comprises: determining whether the number of nested function calls is greater than one; and determining that the call site is eligible for inlining in response to determining that the number of nested function calls is not greater than one. 17. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations such that determining whether the call site is eligible for inlining comprises determining whether the call site is eligible for inlining not based on any of whether the call site is on an execution path, a depth of nested function calls in the call site, how often the call site is called, a size of code in which the call site is located, effects of inlining on a size of the code, whether inlining would result in a stack overflow, and effects of inlining on execution time. 18. The computing device of claim 11 , wherein the processor is configured with processor-executable instructions to perform operations such that determining wheth
Inlining · CPC title
Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators · CPC title
Optimisation · CPC title
Performance evaluation by tracing or monitoring · CPC title
Runtime code conversion or optimisation · CPC title
Related publications grouped by family.
Answers are generated from the same data shown on this page.