Facilitating data processing using SIMD reduction operations across SIMD lanes

US11216281B2 · US · B2

Patent metadata
Field: Value
Publication number: US-11216281-B2
Application number: US-201916412072-A
Country: US
Kind code: B2
Filing date: May 14, 2019
Priority date: May 14, 2019
Publication date: Jan 4, 2022
Grant date: Jan 4, 2022

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

Various embodiments are provided for facilitating data processing by one or more processors in a computing system. An instruction to be executed may be obtained. The instruction is a single instruction multiple data (SIMD) reduction operation of an operand vector with a plurality of vector elements. The SIMD reduction operation may be executed to produce a result vector with a plurality of alternative vector elements. One or more reduction functions may be performed on each of a pair of vector elements from the plurality of vector elements of the operand vector and a result of the one or more reduction functions may be placed in a corresponding vector element of the result vector.
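
The pairwise step described in the abstract can be modeled in a few lines of Python. This is a minimal sketch under stated assumptions, not the patented instruction: the name `simd_reduce_step`, the `distance` parameter, and the wrap-around pairing of lane i with lane (i + distance) are hypothetical choices made for illustration. The abstract only says that a reduction function is applied to pairs of operand elements and the result is placed in the corresponding element of the result vector.

```python
import operator
from typing import Callable, List


def simd_reduce_step(
    operand: List[int],
    fn: Callable[[int, int], int],
    distance: int = 1,
) -> List[int]:
    """Model one reduction step over a vector of lanes.

    Assumption (not from the patent text): lane i is paired with the lane
    `distance` positions to its right, wrapping around at the end.
    """
    lanes = len(operand)
    # Each result lane holds fn(own element, partner element).
    return [fn(operand[i], operand[(i + distance) % lanes]) for i in range(lanes)]


if __name__ == "__main__":
    # Lane 0: 1+2, lane 1: 2+3, lane 2: 3+4, lane 3: 4+1 (wrap-around).
    print(simd_reduce_step([1, 2, 3, 4], operator.add))  # [3, 5, 7, 5]
```

Swapping `operator.add` for `max` or `min` gives a different two-operand reduction with the same lane pairing.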

First claim

Opening claim text (preview).

The invention claimed is:

1. A method for facilitating data processing in a computing environment by one or more processors comprising: obtaining an instruction to be executed, wherein the instruction is a single instruction multiple data (SIMD) reduction operation of an operand vector with a plurality of vector elements; and executing the SIMD reduction operation to produce a result vector with a plurality of alternative vector elements, wherein the SIMD reduction operation of the operand vector is performed using data identified from neighboring SIMD lanes according to a defined operand specifier in the instruction without performing any permute and add instructions on values within the neighboring SIMD lanes, and wherein the defined operand specifier includes an additional value indicating an explicit one of the neighboring SIMD lanes from which to retrieve the data and is included in the instruction irrespective of which binary function is to be performed specified by a function specifier in the instruction.

2. The method of claim 1, further including determining the instruction is a reduction function according to a field of the instruction to reduce one of a plurality of two-operand reduction functions.

3. The method of claim 1, further including selecting, from a plurality of operations for the SIMD reduction operation, a two-operand reduction function to operate on a pair of vector elements from the plurality of vector elements of the operand vector.

4. The method of claim 3, further including providing the result vector with a first SIMD lane of the result vector containing a result of applying a selected reduction function to all SIMD lanes of an 2^N-way input vector after applying the SIMD reduction operation a selected N times, wherein N is a positive integer or selected value.

5. The method of claim 1, further including selecting, according to the operand specifier in the instruction, a pair of vector elements from the plurality of vector elements of the operand vector for each of the plurality of alternative vector elements of the result vector.

6. The method of claim 1, further including performing a selected reduction function on each of a pair of vector elements from the plurality of vector elements of the operand vector and placing a result of the selected reduction function in a corresponding vector element of the result vector.

7. The method of claim 1, further including: defining in the instruction, the operand specifier for a set of two-operand reduction functions, a target vector, and a source vector; indicating a first operand or a second operand of the plurality of two-operand reduction functions is selected from a similar SIMD lane or from an identified alternative SIMD lane; or indicating, by the operand specifier in the instruction, for each of the plurality of alternative vector elements in the result vector and corresponding ones of the plurality of vector elements of the operand vector, that the operand vector and the result vector are from similar or non-similar SIMD lanes.

8. A system for facilitating data processing in a computing environment, comprising: a memory device storing an instruction; and a processor in communication with the memory device, the processor comprising: a plurality of vector registers, each vector register divided into a plurality of single instruction multiple data (SIMD) lanes; and execution circuitry coupled to the plurality of vector registers, wherein the execution circuitry: obtains the instruction to be executed from the memory device, wherein the instruction is a SIMD reduction operation of an operand vector with a plurality of vector elements; and executes the SIMD reduction operation to produce a result vector with a plurality of alternative vector elements, wherein the SIMD reduction operation of the operand vector is performed using data identified from neighboring SIMD lanes of the plurality of SIMD lanes according to a defined operand specifier in the instruction without performing any permute and add instructions on values within the neighboring SIMD lanes, and wherein the defined operand specifier includes an additional value indicating an explicit one of the neighboring SIMD lanes from which to retrieve the data and is included in the instruction irrespective of which binary function is to be performed specified by a function specifier in the instruction.

9. The system of claim 8, wherein the execution circuitry determines the instruction is a reduction function according to a field of the instruction to reduce one of a plurality of two-operand reduction functions.

10. The system of claim 8, wherein the execution circuitry selects, from a plurality of operations for the SIMD reduction operation, a two-operand reduction function to operate on a pair of vector elements from the plurality of vector elements of the operand vector.

11. The system of claim 10, wherein the execution circuitry provides the result vector with a first SIMD lane of the result vector containing a result of applying a selected reduction function to all SIMD lanes of an 2^N-way input vector after applying the SIMD reduction operation a selected N times, wherein N is a positive integer or selected value.

12. The system of claim 8, wherein the execution circuitry selects, according to the operand specifier in the instruction, a pair of vector elements from the plurality of vector elements of the operand vector for each of the plurality of alternative vector elements of the result vector.

13. The system of claim 8, wherein the execution circuitry performs a selected reduction function on each of a pair of vector elements from the plurality of vector elements of the operand vector and placing a result of the selected reduction function in a corresponding vector element of the result vector.

14. The system of claim 8, wherein: the operand specifier for a set of two-operand reduction functions, a target vector, and a source vector are defined in the instruction; a first operand or a second operand of the plurality of two-operand reduction functions is selected from a similar SIMD lane or from an identified alternative SIMD lane according to an indication in the instruction; or for each of the plurality of alternative vector elements in the result vector and corresponding ones of the plurality of vector elements of the operand vector, the operand specifier in the instruction indicates that the operand vector and the result vector are from similar or non-similar SIMD lanes.

15. A computer program product for, by a processor, facilitating data processing in a computing environment, the computer program product comprising a non-transitory computer-readable storage medium having computer-readable program code portions stored therein, the computer-readable program code portions comprising: an executable portion that obtains an instruction to be executed, wherein the instruction is a single instruction multiple data (SIMD) reduction operation of an operand vector with a plurality of vector elements; and an executable portion that executes the SIMD reduction operation to produce a result vector with a plurality of alternative vector elements, wherein the SIMD reduction operation of the operand vector is performed using data identified from neighboring SIMD lanes according to a defined operand specifier in the instruction without performing any permute and add instructions on values within the neighboring SIMD lanes, and wherein the defined operand specifier includes an additional value indicating an explicit one of the neighboring SIMD lanes from which to retrieve the d
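
Claims 4 and 11 state that applying the reduction operation N times to a 2^N-way input vector leaves the full reduction in the first lane of the result. The Python sketch below is one way to picture that, under stated assumptions: the `FUNCTION_SPECIFIERS` table, the names `execute_reduction` and `reduce_all_lanes`, the integer `lane_offset` standing in for the operand specifier, and the doubling-offset schedule with wrap-around are all hypothetical. The claims do not fix an encoding or a lane schedule. Each call to `execute_reduction` stands for a single instruction, so no separate permute step is modeled.

```python
import operator
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical function-specifier encoding; the claims only say that the
# instruction names one of several two-operand reduction functions.
FUNCTION_SPECIFIERS: Dict[str, Callable[[int, int], int]] = {
    "add": operator.add,
    "max": max,
    "min": min,
}


def execute_reduction(source: Sequence[int], function: str, lane_offset: int) -> List[int]:
    """Model one reduction instruction.

    `lane_offset` plays the role of the operand specifier (an assumption):
    it names which other lane supplies the second operand for each lane.
    """
    fn = FUNCTION_SPECIFIERS[function]
    lanes = len(source)
    return [fn(source[i], source[(i + lane_offset) % lanes]) for i in range(lanes)]


def reduce_all_lanes(vector: Sequence[int], function: str) -> Tuple[List[int], int]:
    """Apply the modeled instruction N times to a 2^N-way vector.

    Returns the final vector and the number of instructions issued (N).
    """
    lanes = len(vector)
    if lanes == 0 or lanes & (lanes - 1):
        raise ValueError("lane count must be a power of two")
    current, offset, steps = list(vector), 1, 0
    while offset < lanes:
        current = execute_reduction(current, function, offset)
        offset *= 2  # assumed schedule: offsets 1, 2, 4, ...
        steps += 1
    return current, steps


if __name__ == "__main__":
    result, steps = reduce_all_lanes([1, 2, 3, 4, 5, 6, 7, 8], "add")
    print(result[0], steps)  # 36 3
```

With the wrap-around pairing assumed here, every lane ends up holding the full reduction. A real design might only guarantee the first lane, which is all the claims require.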

Assignees

Inventors

Classifications

  • Instructions to perform operations on packed data, e.g. vector, tile or matrix operations · CPC title

  • G06F9/3887 · Primary

    controlled by a single instruction for multiple data lanes [SIMD] · CPC title

  • Details on data register access · CPC title

  • Movement instructions, e.g. MOVE, SHIFT, ROTATE, SHUFFLE · CPC title

  • Decoding the operand specifier, e.g. specifier format · CPC title

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US11216281B2 cover?
Various embodiments are provided for facilitating data processing by one or more processors in a computing system. An instruction to be executed may be obtained. The instruction is a single instruction multiple data (SIMD) reduction operation of an operand vector with a plurality of vector elements. The SIMD reduction operation may be executed to produce a result vector with a plurality of alte…
Who is the assignee on this patent?
IBM
What technology area does this patent fall under?
Primary CPC classification G06F9/3887. Mapped technology areas include Physics.
When was this patent published?
Publication date: Jan 4, 2022 (kind code B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 6 related publications on this page (citations in our corpus or others sharing the same primary CPC).