Apparatus and method for vector processing

US9916130B2 · US · B2

Patent metadata
Field: Value
Publication number: US-9916130-B2
Application number: US-201414564708-A
Country: US
Kind code: B2
Filing date: Dec 9, 2014
Priority date: Nov 3, 2014
Publication date: Mar 13, 2018
Grant date: Mar 13, 2018

How to read this patent

A practical reading order for non-experts. Skip the full description unless you need deep technical detail.

  1. Title

    What the patent document calls the invention.

  2. Abstract

    A short plain-language summary of the technical disclosure.

  3. Assignees and inventors

    Who owns or filed the patent and who is credited as inventor.

  4. Key dates

    Filing, priority, publication, and grant dates set the timeline.

  5. First independent claim

    The legal scope of protection — read this for what is actually claimed.

  6. CPC / IPC classifications

    Technology tags used to group this patent with similar filings.

  7. Citations and related patents

    Prior art links and similar publications in this corpus.

Abstract

Official abstract text for this publication.

An apparatus comprises processing circuitry for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry may support performing at least two of the lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector. This allows two or more calculations with different rounding modes to be executed in response to a single instruction, which can improve performance.
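As a rough illustration of the idea in the abstract, the sketch below rounds each element of an operand vector to an integer using a rounding mode chosen per lane, all within a single call. The names `RoundingMode` and `vector_round` and the choice of four modes are assumptions made for this example. The text on this page does not enumerate the supported modes, and real hardware would round floating-point results in circuitry rather than call Python functions.

```python
import math
from enum import Enum


class RoundingMode(Enum):
    # Hypothetical mode names; the page does not list the patent's modes.
    NEAREST_EVEN = "nearest_even"
    TOWARD_ZERO = "toward_zero"
    TOWARD_POS_INF = "toward_pos_inf"
    TOWARD_NEG_INF = "toward_neg_inf"


# Map each mode to a function that rounds a float to an integer.
# Python's built-in round() rounds halfway cases to the even neighbour.
_ROUNDERS = {
    RoundingMode.NEAREST_EVEN: round,
    RoundingMode.TOWARD_ZERO: math.trunc,
    RoundingMode.TOWARD_POS_INF: math.ceil,
    RoundingMode.TOWARD_NEG_INF: math.floor,
}


def vector_round(operand, modes):
    """Round every lane of `operand` using that lane's own rounding mode."""
    if len(operand) != len(modes):
        raise ValueError("one rounding mode is required per lane")
    return [_ROUNDERS[mode](value) for value, mode in zip(operand, modes)]


# One "instruction", four lanes, four different rounding modes.
result = vector_round(
    [2.5, 2.5, -2.5, -2.5],
    [
        RoundingMode.NEAREST_EVEN,
        RoundingMode.TOWARD_POS_INF,
        RoundingMode.TOWARD_ZERO,
        RoundingMode.TOWARD_NEG_INF,
    ],
)
print(result)  # [2, 3, -2, -3]
```

Without per-lane modes, the same four results would need one pass per rounding mode, which is the overhead the abstract says a single instruction avoids.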

First claim

Opening claim text (preview).

We claim:

1. An apparatus comprising: processing circuitry to perform, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein the processing circuitry supports performing at least two of said plurality of lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; a first control storage location to store a variable rounding mode parameter indicating a default rounding mode; and a second control storage location comprising a plurality of control fields each for storing control information for controlling a corresponding lane of processing; wherein for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second control storage location corresponding to a given lane of processing has a default value, the processing circuitry is to perform said given lane of processing using the default rounding mode indicated by the variable rounding mode parameter stored in the first control storage location; and when the rounding mode value stored in the second control storage location corresponding to said given lane of processing has a value other than said default value, the processing circuitry is to perform said given lane of processing using the rounding mode indicated by said rounding mode value stored in the second control storage location corresponding to said given lane of processing.

2. The apparatus according to claim 1, wherein the processing circuitry supports performing said at least two of said plurality of lanes of processing with different rounding modes at least when processing a floating-point vector instruction.

3. The apparatus according to claim 1, wherein the processing circuitry comprises rounding circuitry to generate a rounding increment for each lane in dependence on the rounding mode specified by the rounding mode value for the corresponding lane of processing.

4. The apparatus according to claim 1, wherein each control field includes a rounding field for indicating the rounding mode value for the corresponding lane of processing; and for at least one other type of vector instruction, the rounding field specifies information other than the rounding mode value.

5. The apparatus according to claim 4, wherein said information other than the rounding mode value is indicative of at least one of: a type of arithmetic or logical operation to be performed for the corresponding lane of processing; whether the corresponding lane of processing is to generate the result data element with saturating or non-saturating arithmetic; and which portion of a result of the corresponding lane of processing is to be represented by the corresponding result data element.

6. The apparatus according to claim 1, wherein the processing circuitry comprises a plurality of processing units to perform at least some of said plurality of lanes of processing in parallel.

7. The apparatus according to claim 6, wherein the processing circuitry comprises M processing units, where M > 1, and if the vector instruction requires more than M lanes of processing, then the processing circuitry is configured to perform the plurality of lanes of processing in multiple passes of said processing units.

8. An apparatus comprising: processing means for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein the processing means supports performing at least two of said plurality of lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; first means for storing a variable rounding mode parameter indicating a default rounding mode; and second means for storing a plurality of control fields comprising control information for controlling a corresponding lane of processing; wherein for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; and each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second means for storing corresponding to a given lane of processing has a default value, the processing means is to perform said given lane of processing using the default rounding mode indicated by the variable rounding mode parameter stored in the first means for storing; and when the rounding mode value stored in the second means for storing corresponding to said given lane of processing has a value other than said default value, the processing means is to perform said given lane of processing using the rounding mode indicated by said rounding mode value stored in the second means for storing corresponding to said given lane of processing.

9. A data processing method comprising: performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein at least two of said plurality of lanes of processing are performed with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; wherein a first control storage location stores a variable rounding mode parameter indicating a default rounding mode; a second control storage location comprises a plurality of control fields each for storing control information for controlling a corresponding lane of processing; for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; and each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second control storage location corresponding to a given lane of processing has a default value, the processing circuitry is
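The control scheme recited in claims 1, 6 and 7 can be sketched in software as follows. This is a minimal model under stated assumptions, not the patented circuit: `ControlField`, `execute_vector_round`, the string mode names, the use of `None` as the "default value", and the policy of leaving an inactive lane's destination element unchanged are all hypothetical choices made for illustration. The claims only require that an inactive lane's result be independent of the operand element.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical mode names mapped to integer-rounding functions.
ROUNDERS = {
    "nearest_even": round,
    "toward_zero": math.trunc,
    "toward_pos_inf": math.ceil,
    "toward_neg_inf": math.floor,
}


@dataclass
class ControlField:
    """One per-lane field of the 'second control storage location'."""
    active: bool
    # None models the "default value": fall back to the global default mode.
    rounding_mode: Optional[str] = None


def execute_vector_round(operand, dest, fields, default_mode, num_units=2):
    """Process all lanes in passes of `num_units` lanes each.

    `default_mode` models the 'first control storage location'.
    Returns the result vector and the number of passes taken.
    """
    result = list(dest)
    passes = 0
    for start in range(0, len(operand), num_units):
        passes += 1
        for lane in range(start, min(start + num_units, len(operand))):
            field = fields[lane]
            if not field.active:
                # Assumed policy: an inactive lane keeps its old destination value.
                continue
            mode = field.rounding_mode if field.rounding_mode is not None else default_mode
            result[lane] = ROUNDERS[mode](operand[lane])
    return result, passes


fields = [
    ControlField(active=True),                                  # uses the default mode
    ControlField(active=True, rounding_mode="toward_neg_inf"),  # per-lane override
    ControlField(active=False),                                 # inactive lane
    ControlField(active=True, rounding_mode="toward_pos_inf"),  # per-lane override
]
result, passes = execute_vector_round(
    [1.5, 1.5, 1.5, 1.5], [9, 9, 9, 9], fields, default_mode="nearest_even"
)
print(result, passes)  # [2, 1, 9, 2] 2
```

With four lanes and two modelled processing units, the loop takes two passes, mirroring the multi-pass behaviour of claim 7. Lane 0 follows the default mode, lanes 1 and 3 use their own overrides, and lane 2 ignores its operand entirely.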

Assignees

Inventors

Classifications

  • Saturation, i.e. clipping the result to a minimum or maximum value · CPC title

  • by tracing the execution of the program · CPC title

  • Significance control · CPC title

  • according to one or more bits in the instruction, e.g. prefix, sub-opcode · CPC title

  • according to execution mode, e.g. mode flag · CPC title

Patent family

Related publications grouped by family.

Frequently asked questions

Answers are generated from the same data shown on this page.

What does patent US9916130B2 cover?
An apparatus comprises processing circuitry for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry may support performing at least two of the lanes of processing with different rounding modes for generating roun…
Who is the assignee on this patent?
Advanced Risc Mach Ltd
What technology area does this patent fall under?
Primary CPC classification G06F11/3636. Mapped technology areas include Physics.
When was this patent published?
Publication date Mar 13, 2018 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).