System and method for endian correction of complex data structures in heterogeneous systems
US-2016217197-A1 · Jul 28, 2016 · US
US9626168B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9626168-B2 |
| Application number | US-201414584385-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 29, 2014 |
| Priority date | Aug 13, 2014 |
| Publication date | Apr 18, 2017 |
| Grant date | Apr 18, 2017 |
Official abstract text for this publication.
An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the programmer or programming environment. The compiler then analyzes the code and reduces the number of vector element reverse operations to improve the run-time performance of the code.
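The reasoning behind the abstract can be illustrated with a small sketch. It assumes a toy model in which vectors are Python lists and the compiler has wrapped an element-wise addition in reverse operations on the way in and on the way out. The function names (`vreverse`, `vadd`, `naive_codegen`, `optimized_codegen`) are hypothetical and do not come from the patent. Because an element-wise operation treats every lane independently, reversing the inputs and then reversing the result gives the same vector as applying the operation directly, so the three reverses are redundant.

```python
def vreverse(v):
    """Vector element reverse: returns the elements in opposite order."""
    return v[::-1]


def vadd(a, b):
    """Element-wise (SIMD-style) add; each lane is independent of the others."""
    return [x + y for x, y in zip(a, b)]


def naive_codegen(a, b):
    # Hypothetical unoptimized sequence: reverse after each load,
    # compute, then reverse again before the store.
    ra = vreverse(a)
    rb = vreverse(b)
    rs = vadd(ra, rb)
    return vreverse(rs)


def optimized_codegen(a, b):
    # With all three reverses eliminated, the lane-wise result is unchanged.
    return vadd(a, b)


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
same = naive_codegen(a, b) == optimized_codegen(a, b)
```

The equivalence holds only because `vadd` is lane-wise. An operation that reads a specific element position would not commute with the reverse in this way.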
Opening claim text (preview).
The invention claimed is:
1. An apparatus comprising: at least one processor; a memory coupled to the at least one processor; a computer program residing in the memory, the computer program including a plurality of instructions that includes at least one vector instruction; and a compiler residing in the memory and executed by the at least one processor, the compiler including a vector instruction optimization mechanism that eliminates at least one vector element reverse operation from the computer program to enhance run-time performance of the computer program, wherein the vector instruction optimization mechanism records characteristics of vector instructions and forms subgraphs of related instructions by analyzing def-use and use-def chains for the computer program in a first pass, determines whether any of the subgraphs cannot be optimized in a second pass, identifies a computation in the computer program where all operations performed on input vectors are single instruction multiple data (SIMD) instructions and marks at least one vector element reverse operation that corresponds to the computation for removal in a third pass, and deletes in a fourth pass the at least one vector element reverse operation marked for removal in the third pass.
2. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies in the third pass a first vector element reverse operation and a second vector element reverse operation in the computer program, such that the result of the first vector element reverse operation is the source of the second vector element reverse operation, and eliminates in the fourth pass at least one of the first and second vector element reverse operations.
3. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies a unary operation accompanied by at least one vector element reverse operation and changes order of instructions for the unary operation and the at least one vector element reverse operation.
4. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies in the third pass a binary operation accompanied by at least one vector element reverse operation and eliminates in the fourth pass the at least one vector element reverse operation that accompanies the binary operation.
5. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies in the third pass a first instruction that specifies an endian load followed by a second instruction that performs a vector element reverse operation, and eliminates in the fourth pass the second instruction by converting the first instruction into a third instruction that specifies an endian load that does not require the second instruction.
6. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies in the third pass a first instruction that is a vector element reverse operation that precedes a second instruction that is an endian store, and eliminates in the fourth pass the first instruction by converting the second instruction into a third instruction that specifies an endian store that does not require the first instruction.
7. The apparatus of claim 1 wherein the vector instruction optimization mechanism identifies in the third pass a first instruction that specifies a vector load of a literal value followed by a second instruction that is a vector element reverse operation, and eliminates in the fourth pass the second instruction by reversing order of the elements in the literal value in the first instruction.
8. An apparatus comprising: at least one processor; a memory coupled to the at least one processor; a computer program residing in the memory, the computer program including a plurality of instructions that includes at least one vector instruction; and a compiler residing in the memory and executed by the at least one processor, the compiler including a vector instruction optimization mechanism that eliminates at least one vector element reverse operation from the computer program to enhance run-time performance of the computer program, wherein the vector instruction optimization mechanism records characteristics of vector instructions and forms subgraphs of related instructions by analyzing def-use and use-def chains for the computer program in a first pass, determines whether any of the subgraphs cannot be optimized in a second pass, and in a third pass: identifies a first vector element reverse operation and a second vector element reverse operation in the computer program, such that the result of the first vector element reverse operation is the source of the second vector element reverse operation, and marks the first and second vector element reverse operations for removal; identifies a computation in the computer program where all operations performed on input vectors are single instruction multiple data (SIMD) instructions and marks at least one vector element reverse operation that corresponds to the computation for removal; identifies a unary operation accompanied by at least one vector element reverse operation and changes order of instructions for the unary operation and the at least one vector element reverse operation; identifies a first instruction that specifies an endian load followed by a second instruction that performs a vector element reverse operation, and marks the second instruction for removal by converting the first instruction into a third instruction that specifies an endian load that does not require the second instruction; identifies a binary operation accompanied by at least one vector element reverse operation and marks the at least one vector element reverse operation that accompanies the binary operation for removal; identifies a fourth instruction that is a vector element reverse operation that precedes a fifth instruction that is an endian store, and marks the fourth instruction for removal by converting the fifth instruction into a sixth instruction that specifies an endian store that does not require the fourth instruction; and identifies a seventh instruction that specifies a vector load of a literal value followed by an eighth instruction that is a vector element reverse operation, and marks the eighth instruction for removal by reversing order of the elements in the literal value in the seventh instruction; and deletes in a fourth pass the vector element reverse marked for removal in the third pass.
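The four-pass structure recited in the claims can be sketched roughly as follows. This is a toy illustration, not the patented implementation: the tuple-based IR, the `LANEWISE` set, the `extract` op, and the `optimize` and `run` helpers are all assumptions made for the example. It further assumes the input already has a reverse after every load and before every store, as compiler-inserted reverses would, so that dropping every reverse in an all-lane-wise subgraph preserves the stored result.

```python
# Toy IR: each instruction is (dest, op, srcs). Hypothetical, for illustration only.
LANEWISE = {"load", "store", "vreverse", "add"}  # assumed lane-insensitive ops


def optimize(program):
    # Pass 1: record definitions and group related instructions into
    # subgraphs (connected components over def-use edges, via union-find).
    def_of = {dest: i for i, (dest, _, _) in enumerate(program)}
    parent = list(range(len(program)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, _, srcs) in enumerate(program):
        for s in srcs:
            if s in def_of:
                parent[find(i)] = find(def_of[s])

    # Pass 2: a subgraph containing any lane-sensitive op cannot be optimized.
    blocked = {find(i) for i, (_, op, _) in enumerate(program)
               if op not in LANEWISE}

    # Pass 3: mark reverses in the remaining subgraphs for removal.
    # Assumption: reverses are balanced (one per load, one per store).
    marked = {i for i, (_, op, _) in enumerate(program)
              if op == "vreverse" and find(i) not in blocked}

    # Pass 4: delete marked reverses, forwarding each source to its users.
    alias, out = {}, []
    for i, (dest, op, srcs) in enumerate(program):
        srcs = tuple(alias.get(s, s) for s in srcs)
        if i in marked:
            alias[dest] = srcs[0]
        else:
            out.append((dest, op, srcs))
    return out


def run(program, memory):
    """Interpret the toy IR; vectors are Python lists held in `memory`."""
    memory, regs = dict(memory), {}
    for dest, op, srcs in program:
        if op == "load":
            regs[dest] = list(memory[srcs[0]])
        elif op == "store":
            memory[dest] = regs[srcs[0]]
        elif op == "vreverse":
            regs[dest] = regs[srcs[0]][::-1]
        elif op == "add":
            regs[dest] = [x + y for x, y in zip(regs[srcs[0]], regs[srcs[1]])]
        elif op == "extract":  # lane-sensitive: reads element 0
            regs[dest] = regs[srcs[0]][0]
    return memory


program = [
    ("v0", "load", ("a",)),
    ("v1", "vreverse", ("v0",)),
    ("v2", "load", ("b",)),
    ("v3", "vreverse", ("v2",)),
    ("v4", "add", ("v1", "v3")),
    ("v5", "vreverse", ("v4",)),
    ("out", "store", ("v5",)),
]
optimized = optimize(program)
```

In this sketch, running `program` and `optimized` through `run` with the same input memory yields the same `out` vector, while a subgraph containing `extract` is left untouched because pass 2 blocks it. The other rewrites named in the claims (swapping a unary operation with a reverse, converting endian loads and stores, reversing literal values) are not modeled here.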
Reducing the execution time required by the program code · CPC title
Optimisation · CPC title
with data re-ordering, e.g. Endian conversion · CPC title
Compilation · CPC title
by performing operations on the source code, e.g. via a compiler · CPC title