Scalar optimizations for shaders
US-9430199-B2 · Aug 30, 2016 · US
US2016139901A1 · US · A1
| Field | Value |
|---|---|
| Publication number | US-2016139901-A1 |
| Application number | US-201514620513-A |
| Country | US |
| Kind code | A1 |
| Filing date | Feb 12, 2015 |
| Priority date | Nov 18, 2014 |
| Publication date | May 19, 2016 |
| Grant date | — |
A practical reading order for non-experts. Skip the full description unless you need deep technical detail.
What the patent document calls the invention.
A short plain-language summary of the technical disclosure.
Who owns or filed the patent and who is credited as inventor.
Filing, priority, publication, and grant dates set the timeline.
The legal scope of protection — read this for what is actually claimed.
Technology tags used to group this patent with similar filings.
Prior art links and similar publications in this corpus.
Official abstract text for this publication.
Systems, methods, and computer programs are disclosed for performing runtime auto-parallelization of application code. One embodiment of such a method comprises receiving application code to be executed in a multi-processor system. The application code comprises an injected code cost computation expression for at least one loop in the application code defining a serial workload for processing the loop. A runtime profitability check of the loop is performed based on the injected code cost computation expression to determine whether the serial workload can be profitably parallelized. If the serial workload can be profitably parallelized, the loop is executed in parallel using two or more processors in the multi-processor system.
Opening claim text (preview).
What is claimed is: 1 . A method for performing runtime auto-parallelization of application code, the method comprising: receiving application code to be executed in a multi-processor system, the application code comprising an injected code cost computation expression for at least one loop in the application code defining a serial workload for processing the loop; performing a runtime profitability check of the loop based on the injected code cost computation expression to determine whether the serial workload can be profitably parallelized; and if the serial workload can be profitably parallelized, executing the loop in parallel using two or more processors in the multi-processor system. 2 . The method of claim 1 , wherein the performing the runtime profitability check comprises: computing a parallelized workload based on an available number of processors; and determining whether a sum of the parallelized workload and a parallelization overhead parameter exceeds the serial workload. 3 . The method of claim 1 , wherein the injected code cost computation expression defines a first static portion of the serial workload defined at compile time and a second dynamic portion of the serial workload to be computed at runtime. 4 . The method of claim 3 , wherein the performing the runtime profitability check comprises: computing the second dynamic portion of the serial workload; and defining the serial workload as a sum of the first static portion and the second dynamic portion. 5 . The method of claim 4 , wherein the runtime profitability check further comprises determining whether parallelizing the serial workload exceeds a breakeven point based on a parallelization overhead parameter. 6 . The method of claim 1 , wherein the performing the runtime profitability check comprises determining profiling information related to behavior of the application code. 7 . The method of claim 1 , further comprising: if the serial workload cannot be profitably parallelized, executing the loop in serial using only one of the two or more processors in the multi-processor system. 8 . The method of claim 1 , wherein the injected code cost computation expression is computed by a code cost analysis algorithm at compile time. 9 . The method of claim 8 , wherein the code cost analysis algorithm computes the code cost computation expression by constructing a directed acyclic graph for the loop. 10 . The method of claim 1 , wherein the multi-processor system is incorporated in a portable computing device comprising one or more of a mobile phone, a tablet computer, a gaming device, and a navigation device, and the multi-processor system comprises a plurality of processors comprising one or more of a multi-core processor, a central processing unit (CPU), a graphics processor unit (GPU), and a digital signal processor (DSP). 11 . A system for performing runtime auto-parallelization of application code, the method comprising: means for receiving application code to be executed in a multi-processor system, the application code comprising an injected code cost computation expression for at least one loop in the application code defining a serial workload for processing the loop; means for performing a runtime profitability check of the loop based on the injected code cost computation expression to determine whether the serial workload can be profitably parallelized; and means for executing the loop in parallel using two or more processors in the multi-processor system if the serial workload can be profitably parallelized. 12 . The system of claim 11 , wherein the means for performing the runtime profitability check comprises: means for computing a parallelized workload based on an available number of processors; and means for determining whether a sum of the parallelized workload and a parallelization overhead parameter exceeds the serial workload. 13 . The system of claim 11 , wherein the injected code cost computation expression defines a first static portion of the serial workload defined at compile time and a second dynamic portion of the serial workload to be computed at runtime. 14 . The system of claim 13 , wherein the means for performing the runtime profitability check comprises: means for computing the second dynamic portion of the serial workload; and means for defining the serial workload as a sum of the first static portion and the second dynamic portion. 15 . The system of claim 14 , wherein the runtime profitability check further comprises means for determining whether parallelizing the serial workload exceeds a breakeven point based on a parallelization overhead parameter. 16 . The system of claim 11 , wherein the means for performing the runtime profitability check comprises means for determining profiling information related to behavior of the application code. 17 . The system of claim 11 , further comprising: means for executing the loop in serial using only one of the two or more processors in the multi-processor system if the serial workload cannot be profitably parallelized. 18 . The system of claim 11 , wherein the injected code cost computation expression is computed by a code cost analysis algorithm at compile time. 19 . The system of claim 18 , wherein the code cost analysis algorithm computes the code cost computation expression by constructing a directed acyclic graph for the loop. 20 . The system of claim 11 , wherein the multi-processor system is incorporated in a portable computing device comprising one or more of a mobile phone, a tablet computer, a gaming device, and a navigation device, and the multi-processor system comprises a plurality of processors comprising one or more of a multi-core processor, a central processing unit (CPU), a graphics processor unit (GPU), and a digital signal processor (DSP). 21 . A computer program embodied in a computer-readable medium and executable by a processor for performing runtime auto-parallelization of application code, the computer program comprising logic configured to: receive application code to be executed in a multi-processor system, the application code comprising an injected code cost computation expression for at least one loop in the application code defining a serial workload for processing the loop; perform a runtime profitability check of the loop based on the injected code cost computation expression to determine whether the serial workload can be profitably parallelized; and if the serial workload can be profitably parallelized, execute the loop in parallel using two or more processors in the multi-processor system. 22 . The computer program of claim 21 , wherein the logic configured to perform the runtime profitability check comprises logic configured to: compute a parallelized workload based on an available number of processors; and determine whether a sum of the parallelized workload and a parallelization overhead parameter exceeds the serial workload. 23 . The computer program of claim 21 , wherein the injected code cost computation expression defines a first static portion of the serial workload defined at compile time and a second dynamic portion of the serial workload to be computed at runtime. 24 . The computer program of claim 23 , wherein the logic configured to perform the runtime profitability check comprises logic configured to: compute the second dynamic portion of the serial workload; and define the
Related publications grouped by family.
Answers are generated from the same data shown on this page.