Optimizing software code
US-2015378757-A1 · Dec 31, 2015 · US
US9690552B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9690552-B2 |
| Application number | US-201414583657-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 27, 2014 |
| Priority date | Dec 27, 2014 |
| Publication date | Jun 27, 2017 |
| Grant date | Jun 27, 2017 |
Official abstract text for this publication.
Technologies for generating composable library functions include a first computing device that includes a library compiler configured to compile a composable library and a second computing device that includes an application compiler configured to compose library functions of the composable library based on a plurality of abstractions written at different levels of abstraction. For example, the abstractions may include an algorithm abstraction at a high level, a blocked-algorithm abstraction at a medium level, and a region-based code abstraction at a low level. Other embodiments are described and claimed herein.
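To make the three levels concrete, here is a minimal Python sketch of one library function described at each level. It is an illustration only, not the patent's metadata encoding: the function `vec_add`, the `Region` class, the `block` parameter, and the string form of the algorithm abstraction are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field

# Hypothetical library function used for illustration: element-wise add.
def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

# High level: an algorithm abstraction, shown here as a symbolic formula
# (assumed representation; the source does not specify an encoding).
ALGORITHM = "out[i] = a[i] + b[i] for i in 0..n"

# Medium level: a blocked algorithm, i.e. a loop nest around calls to the
# library function, one call per partition of the iteration space.
def vec_add_blocked(a, b, block=4):
    out = []
    for start in range(0, len(a), block):
        out.extend(vec_add(a[start:start + block], b[start:start + block]))
    return out

# Low level: region-based code, modelled as a tree of regions. Each region
# records a data space, an iteration space, and optional tuning parameters.
@dataclass
class Region:
    data_space: tuple
    iteration_space: range
    tuning: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

def vec_add_region_tree(n, block=4):
    root = Region(("a", "b", "out"), range(0, n), {"block": block})
    for start in range(0, n, block):
        stop = min(start + block, n)
        root.children.append(Region(("a", "b", "out"), range(start, stop)))
    return root
```

The blocked version produces the same result as the plain call; what changes is that the partitioning of the iteration space becomes visible to a compiler that wants to compose this function with another one.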
Opening claim text (preview).
The invention claimed is:
1. A computing device to generate a composable library, the computing device comprising: a processor to establish a compiler module, wherein the compiler module is to generate the composable library that includes a binary representation and an intermediate representation of library functions based on source code of the library functions, and encode metadata into the composable library, wherein the metadata includes a plurality of abstractions of the library functions at different levels of abstraction and data access patterns of at least a portion of the plurality of abstractions, and wherein the plurality of abstractions comprises algorithm abstractions at a first abstraction level, blocked-algorithm abstractions at a second abstraction level, and region-based code abstractions at a third abstraction level.
2. The computing device of claim 1, wherein the algorithm abstractions at the first abstraction level comprises algorithm abstractions at an abstraction level higher than each of the blocked-algorithm abstractions and the region-based code abstractions, wherein the blocked-algorithm abstractions at the second abstraction level comprises blocked-algorithm abstractions at an abstraction level lower than the algorithm abstractions and at an abstraction level higher than the region-based code abstractions, and wherein the region-based code abstractions at the third abstraction level comprises region-based code abstractions at an abstraction level lower than each of the algorithm abstractions and the blocked-algorithm abstractions.
3. The computing device of claim 1, wherein the algorithm abstractions encode semantics of a library function at an abstraction level more abstract than language level.
4. The computing device of claim 1, wherein the blocked-algorithm abstractions comprise loop nests around calls to the library functions.
5. The computing device of claim 1, wherein the blocked-algorithm abstractions define partitions of an iteration space of the library functions.
6. The computing device of claim 1, wherein the region-based code abstractions comprise library functions written as trees that include one or more regions, wherein each region of the trees includes a data space and an iteration space of a library function.
7. The computing device of claim 6, wherein each region of the trees further includes one or more tuning parameters of the library function.
8. A computing device to generate an executable application, the computing device comprising: a processor to establish a compiler module to generate the executable application, wherein to generate the executable application includes to compose library functions of a composable library, wherein the composable library includes a binary representation of the library functions, an intermediate representation of the library functions, and metadata, wherein the metadata includes a plurality of abstractions for each library function and data access patterns of at least a portion of the plurality of abstractions, and wherein the plurality of abstractions comprises a plurality of algorithm abstractions, a plurality of blocked-algorithm abstractions, and a plurality of region-based code abstractions, wherein the compiler module is to use the plurality of abstractions and the data access patterns as a guide to compose the library functions.
9. The computing device of claim 8, wherein the compiler module performs a first library function composition process using the algorithm abstractions, wherein to perform the first library function composition process comprises to perform a first loop merge operation on a first algorithm abstraction of a first library function and a second algorithm abstraction of a second library function at a mathematical level.
10. The computing device of claim 9, wherein the compiler module performs a second library function composition process using the blocked-algorithm abstractions, wherein to perform the second library function composition process comprises to apply a second loop merge operation directly to a first loop of a first blocked-algorithm abstraction of the first library function and a second loop of a second blocked-algorithm abstraction of the second library function.
11. The computing device of claim 10, wherein to perform the second library function composition process further comprises to perform a comparison of the data access patterns of the first and second loops of the first and second blocked-algorithm abstractions to determine whether an array element of the second loop of the second blocked-algorithm abstraction is accessed by the first loop of the first blocked-algorithm abstraction in a next iteration.
12. The computing device of claim 10, wherein the compiler module performs a third library function composition process using the region-based code abstractions, wherein to perform the third library function composition process comprises to build a representation for each region-based code abstraction of the library functions and perform a third loop merge operation on a first loop of a first region-based code abstraction of the first library function and a second loop of a second region-based code abstraction of the second library function, and wherein the representation comprises a tree including one or more regions.
13. The computing device of claim 12, wherein to perform the third library function composition process using the region-based code abstractions further comprises to determine whether an intermediate array will become dead after use and convert the intermediate array to a scalar in response to a determination that the intermediate array will become dead after use.
14. One or more non-transitory, computer-readable storage devices comprising a plurality of instructions stored thereon that in response to being executed cause a computing device to: compile, by a compiler module of the computing device, source code of library functions; generate, by the compiler module, the composable library as a result of the compilation of the source code; and encode, by the compiler module, metadata into the composable library, wherein the composable library includes a binary representation and an intermediate representation of library functions, wherein the metadata includes a plurality of abstractions of the library functions at different levels of abstraction and data access patterns of at least a portion of the plurality of abstractions, and wherein the plurality of abstractions comprises algorithm abstractions at a first abstraction level, blocked-algorithm abstractions at a second abstraction level, and region-based code abstractions at a third abstraction level.
15. The one or more non-transitory, computer-readable storage devices of claim 14, wherein the algorithm abstractions at the first abstraction level comprises algorithm abstractions at an abstraction level higher than each of the blocked-algorithm abstractions and the region-based code abstractions, wherein the blocked-algorithm abstractions at the second abstraction level comprises blocked-algorithm abstractions at an abstraction level lower than the algorithm abstractions and at an abstraction level higher than the region-based code abstractions, and wherein the region-based code abstractions at the third abstraction level comprises region-based code abstractions at an abstraction level lower than each of the algorithm abstractions and the blocked-algorithm abstractions.
16. The one or more non-transitory, computer-readable storage devices of claim 14, wherein the algorithm abstractions enc
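The loop merge and the array-to-scalar conversion recited in the device claims can be pictured with a small Python sketch. This is a hand-written illustration under assumptions, not the compiler's actual transformation: `scale`, `add`, `unfused`, and `fused` are hypothetical names, and the merge is done by hand rather than derived from metadata.

```python
def scale(x, k):
    """Hypothetical library function 1: tmp[i] = k * x[i]."""
    tmp = [0] * len(x)
    for i in range(len(x)):
        tmp[i] = k * x[i]
    return tmp

def add(tmp, y):
    """Hypothetical library function 2: out[i] = tmp[i] + y[i]."""
    out = [0] * len(tmp)
    for i in range(len(tmp)):
        out[i] = tmp[i] + y[i]
    return out

def unfused(x, y, k):
    # Two separate loops; the intermediate array is fully materialized.
    return add(scale(x, k), y)

def fused(x, y, k):
    # Both loops cover the same iteration space, and iteration i of the
    # second loop reads only the element written by iteration i of the
    # first, so the loops can be merged. The intermediate array is dead
    # after its single use, so it is replaced by the scalar t.
    out = [0] * len(x)
    for i in range(len(x)):
        t = k * x[i]
        out[i] = t + y[i]
    return out
```

Both versions compute the same values; the merged loop avoids allocating and re-reading the intermediate array, which is the kind of saving the composition step is aiming for.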
Reducing the execution time required by the program code · CPC title
Data distribution · CPC title
Programming languages or programming paradigms · CPC title
Partial evaluation · CPC title
Checking; Contextual analysis · CPC title