Compiler optimizations for vector instructions

US9619214B2 · US · B2

Patent metadata
FieldValue
Publication numberUS-9619214-B2
Application numberUS-201414576942-A
CountryUS
Kind codeB2
Filing dateDec 19, 2014
Priority dateAug 13, 2014
Publication dateApr 11, 2017
Grant dateApr 11, 2017

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the programmer or programming environment. The compiler then analyzes the code and reduces the number of vector element reverse operations to improve the run-time performance of the code.

First claim

Opening claim text (preview).

The invention claimed is: 1. A computer-implemented method executed by at least one processor for processing a plurality of instructions in a computer program, the method comprising: providing a computer program including a plurality of instructions that includes at least one vector operation; and processing the plurality of instructions to eliminate at least one vector element reverse operation from the computer program to enhance run-time performance of the computer program by: recording characteristics of vector instructions and forming subgraphs of related instructions by analyzing def-use and use-def chains for the computer program in a first pass; determining whether any of the subgraphs cannot be optimized in a second pass; identifying a computation in the computer program where all operations performed on input vectors are single instruction multiple data (SIMD) instructions and marking for removal at least one vector element reverse operation that corresponds to the computation in a third pass; and deleting in a fourth pass the at least one vector element reverse operation marked for removal in the third pass. 2. The method of claim 1 further comprising: identifying in the third pass a first vector element reverse operation and a second vector element reverse operation in the computer program, such that the result of the first vector element reverse operation is the source of the second vector element reverse operation; and eliminating in the fourth pass at least one of the first and second vector element reverse operations. 3. The method of claim 1 further comprising: identifying a unary operation accompanied by at least one vector element reverse operation; and changing order of instructions for the unary operation and the at least one vector element reverse operation. 4. The method of claim 1 further comprising: identifying in the third pass a binary operation accompanied by at least one vector element reverse operation; and eliminating in the fourth pass the at least one vector element reverse operation that accompanies the binary operation. 5. The method of claim 1 further comprising: identifying in the third pass a first instruction that specifies an endian load followed by a second instruction that performs a vector element reverse operation; and eliminating in the fourth pass the second instruction by converting the first instruction into a third instruction that specifies an endian load that does not require the second instruction. 6. The method of claim 1 further comprising: identifying in the third pass a first instruction that is a vector element reverse operation that precedes a second instruction that is an endian store; and eliminating in the fourth pass the first instruction by converting the second instruction into a third instruction that specifies an endian store that does not require the first instruction. 7. The method of claim 1 further comprising: identifying in the third pass a first instruction that specifies a vector load of a literal value followed by a second instruction that is a vector element reverse operation; and eliminating in the fourth pass the second instruction by reversing order of the elements in the literal value in the first instruction. 8. A computer-implemented method executed by at least one processor for processing a plurality of instructions in a computer program, the method comprising: providing a computer program including a plurality of instructions that includes at least one vector instruction; in a first pass, recording characteristics of vector instructions and forming subgraphs of related instructions by analyzing def-use and use-def chains for the computer program; in a second pass, determining whether any of the subgraphs cannot be optimized; in a third pass: identifying a first vector element reverse operation and a second vector element reverse operation in the computer program, such that the result of the first vector element reverse operation is the source of the second vector element reverse operation, and marking the first and second vector element reverse operations for removal; identifying a computation in the computer program where all operations performed on input vectors are single instruction multiple data (SIMD) instructions and marking at least one vector element reverse operation that corresponds to the computation for removal; identifying a unary operation accompanied by at least one vector element reverse operation and changing order of instructions for the unary operation and the at least one vector element reverse operation; identifying a first instruction that specifies an endian load followed by a second instruction that performs a vector element reverse operation, and marking the second instruction for removal by converting the first instruction into a third instruction that specifies an endian load that does not require the second instruction; identifying a binary operation accompanied by at least one vector element reverse operation and marking the at least one vector element reverse operation that accompanies the binary operation for removal; identifying a fourth instruction that is a vector element reverse operation that precedes a fifth instruction that is an endian store, and marking the fourth instruction for removal by converting the fifth instruction into a sixth instruction that specifies an endian store that does not require the fourth instruction; and identifying a seventh instruction that specifies a vector load of a literal value followed by an eighth instruction that is a vector element reverse operation, and marking the eighth instruction for removal by reversing order of the elements in the literal value in the seventh instruction; and in a fourth pass, deleting the vector element reverse operations marked for removal in the third pass.

Assignees

Inventors

Classifications

  • Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data · CPC title

  • by performing operations on the source code, e.g. via a compiler · CPC title

  • Reducing the execution time required by the program code · CPC title

  • with data re-ordering, e.g. Endian conversion · CPC title

  • Data position reversal, e.g. bit reversal, byte swapping · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9619214B2 cover?
An optimizing compiler includes a vector optimization mechanism that optimizes vector instructions by eliminating one or more vector element reverse operations. The compiler can generate code that includes multiple vector element reverse operations that are inserted by the compiler to account for a mismatch between the endian bias of the instruction and the endian preference indicated by the pr…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F8/443. Mapped technology areas include Physics.
When was this patent published?
Publication date Tue Apr 11 2017 00:00:00 GMT+0000 (Coordinated Universal Time) (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 10 related publications on this page (citations in our corpus or others sharing the same primary CPC).