US9916130B2 · US · B2
| Field | Value |
|---|---|
| Publication number | US-9916130-B2 |
| Application number | US-201414564708-A |
| Country | US |
| Kind code | B2 |
| Filing date | Dec 9, 2014 |
| Priority date | Nov 3, 2014 |
| Publication date | Mar 13, 2018 |
| Grant date | Mar 13, 2018 |
Official abstract text for this publication.
An apparatus comprises processing circuitry for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector. The processing circuitry may support performing at least two of the lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector. This allows two or more calculations with different rounding modes to be executed in response to a single instruction, to improve performance.
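The idea in the abstract, one instruction applying a different rounding mode in each lane, can be illustrated with a minimal software sketch. This is not the patented hardware: the function name `round_lanes` is hypothetical, rounding to an integer stands in for rounding to a representable floating-point value, and the modes are the constants of Python's standard `decimal` module.

```python
from decimal import (Decimal, ROUND_CEILING, ROUND_DOWN, ROUND_FLOOR,
                     ROUND_HALF_EVEN)


def round_lanes(values, modes):
    """Round each lane to an integer using that lane's own rounding mode.

    `values` holds one decimal string per lane and `modes` holds one
    `decimal` rounding constant per lane (hypothetical interface).
    """
    if len(values) != len(modes):
        raise ValueError("one rounding mode is needed per lane")
    return [
        int(Decimal(value).quantize(Decimal(1), rounding=mode))
        for value, mode in zip(values, modes)
    ]


# One "vector operation", four lanes, four different rounding modes.
result = round_lanes(
    ["2.5", "2.5", "-2.5", "-2.5"],
    [ROUND_HALF_EVEN, ROUND_CEILING, ROUND_FLOOR, ROUND_DOWN],
)
print(result)  # [2, 3, -3, -2]
```

Without per-lane modes, the same four results would need four separate operations with a change of the global rounding mode between them, which is the overhead the abstract says a single instruction avoids.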
Opening claim text (preview).
We claim:

1. An apparatus comprising: processing circuitry to perform, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein the processing circuitry supports performing at least two of said plurality of lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; a first control storage location to store a variable rounding mode parameter indicating a default rounding mode; and a second control storage location comprising a plurality of control fields each for storing control information for controlling a corresponding lane of processing; wherein for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second control storage location corresponding to a given lane of processing has a default value, the processing circuitry is to perform said given lane of processing using the default rounding mode indicated by the variable rounding mode parameter stored in the first control storage location; and when the rounding mode value stored in the second control storage location corresponding to said given lane of processing has a value other than said default value, the processing circuitry is to perform said given lane of processing using the rounding mode indicated by said rounding mode value stored in the second control storage location corresponding to said given lane of processing.

2. The apparatus according to claim 1, wherein the processing circuitry supports performing said at least two of said plurality of lanes of processing with different rounding modes at least when processing a floating-point vector instruction.

3. The apparatus according to claim 1, wherein the processing circuitry comprises rounding circuitry to generate a rounding increment for each lane in dependence on the rounding mode specified by the rounding mode value for the corresponding lane of processing.

4. The apparatus according to claim 1, wherein each control field includes a rounding field for indicating the rounding mode value for the corresponding lane of processing; and for at least one other type of vector instruction, the rounding field specifies information other than the rounding mode value.

5. The apparatus according to claim 4, wherein said information other than the rounding mode value is indicative of at least one of: a type of arithmetic or logical operation to be performed for the corresponding lane of processing; whether the corresponding lane of processing is to generate the result data element with saturating or non-saturating arithmetic; and which portion of a result of the corresponding lane of processing is to be represented by the corresponding result data element.

6. The apparatus according to claim 1, wherein the processing circuitry comprises a plurality of processing units to perform at least some of said plurality of lanes of processing in parallel.

7. The apparatus according to claim 6, wherein the processing circuitry comprises M processing units, where M > 1, and if the vector instruction requires more than M lanes of processing, then the processing circuitry is configured to perform the plurality of lanes of processing in multiple passes of said processing units.

8. An apparatus comprising: processing means for performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein the processing means supports performing at least two of said plurality of lanes of processing with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; first means for storing a variable rounding mode parameter indicating a default rounding mode; and second means for storing a plurality of control fields comprising control information for controlling a corresponding lane of processing; wherein for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; and each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second means for storing corresponding to a given lane of processing has a default value, the processing means is to perform said given lane of processing using the default rounding mode indicated by the variable rounding mode parameter stored in the first means for storing; and when the rounding mode value stored in the second means for storing corresponding to said given lane of processing has a value other than said default value, the processing means is to perform said given lane of processing using the rounding mode indicated by said rounding mode value stored in the second means for storing corresponding to said given lane of processing.

9. A data processing method comprising: performing, in response to a vector instruction, a plurality of lanes of processing on respective data elements of at least one operand vector to generate corresponding result data elements of a result vector; wherein at least two of said plurality of lanes of processing are performed with different rounding modes for generating rounded values for the corresponding result data elements of the result vector; wherein a first control storage location stores a variable rounding mode parameter indicating a default rounding mode; a second control storage location comprises a plurality of control fields each for storing control information for controlling a corresponding lane of processing; for at least one type of vector instruction, the control information includes a rounding mode value specifying the rounding mode to be used for the corresponding lane of processing; and each control field includes information identifying whether the corresponding lane of processing is as an active lane for which the corresponding result data element is to be generated in dependence on said corresponding data element of the at least one operand vector, or an inactive lane for which the corresponding result data element of the result vector is independent of the corresponding data element of the at least one operand vector; and in response to said at least one type of vector instruction: when the rounding mode value stored in the second control storage location corresponding to a given lane of processing has a default value, the processing circuitry is
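The lane-control rule recited in claim 1, together with the multi-pass arrangement of claim 7, can be sketched in software as follows. Everything named here is an assumption made for illustration: `LaneControl`, `execute`, the `DEFAULT` sentinel, and the mode names are hypothetical, rounding to an integer stands in for rounding a floating-point result, and leaving an inactive lane's destination value unchanged is only one possible way of making that result independent of the operand.

```python
import math
from dataclasses import dataclass

# Hypothetical encoding: this sentinel plays the role of the "default value"
# of the per-lane rounding mode value.
DEFAULT = "default"

# Hypothetical mode names mapped to integer-rounding helpers.
ROUNDERS = {
    "nearest_even": round,         # round() with one argument rounds halves to even
    "toward_zero": math.trunc,
    "toward_pos_inf": math.ceil,
    "toward_neg_inf": math.floor,
}


@dataclass
class LaneControl:
    """One control field of the second control storage location."""
    active: bool = True
    rounding: str = DEFAULT


def execute(operands, controls, default_mode, dest, units=2):
    """Process all lanes in passes of `units` lanes each.

    `default_mode` models the variable rounding mode parameter held in the
    first control storage location; `units` models M processing units.
    """
    result = list(dest)
    for start in range(0, len(operands), units):      # one pass per group
        for lane in range(start, min(start + units, len(operands))):
            control = controls[lane]
            if not control.active:
                continue  # inactive lane: result ignores the operand
            if control.rounding == DEFAULT:
                mode = default_mode                   # fall back to the default
            else:
                mode = control.rounding               # per-lane override
            result[lane] = ROUNDERS[mode](operands[lane])
    return result


controls = [
    LaneControl(),                            # uses the default mode
    LaneControl(rounding="toward_pos_inf"),   # per-lane override
    LaneControl(rounding="toward_zero"),      # per-lane override
    LaneControl(active=False),                # inactive lane
]
out = execute([2.5, 2.5, -1.5, 7.5], controls, "nearest_even", [0, 0, 0, 99])
print(out)  # [2, 3, -1, 99]
```

Changing `default_mode` affects only lanes whose rounding value is the default, so software can set a global mode once and override it only in the lanes that need something different.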
Saturation, i.e. clipping the result to a minimum or maximum value · CPC title
by tracing the execution of the program · CPC title
Significance control · CPC title
according to one or more bits in the instruction, e.g. prefix, sub-opcode · CPC title
according to execution mode, e.g. mode flag · CPC title