Vector floating point argument reduction

US9146901B2 · US · B2

Patent metadata
Field               Value
Publication number  US-9146901-B2
Application number  US-201113137576-A
Country             US
Kind code           B2
Filing date         Aug 26, 2011
Priority date       Sep 24, 2010
Publication date    Sep 29, 2015
Grant date          Sep 29, 2015

Abstract

Official abstract text for this publication.

A processing apparatus is provided with processing circuitry 6, 8 and decoder circuitry 10 responsive to a received argument reduction instruction FREDUCE4, FDOT3R to generate control signals 16 for controlling the processing circuitry 6, 8. The action of the argument reduction instruction is to subject each component of an input vector to a scaling which adds or subtracts an exponent shift value C to the exponent of the input vector component. The exponent shift value C is selected such that a sum of this exponent shift value C with the maximum exponent value B of any of the input vector components lies within a range between a first predetermined value and a second predetermined value. A consequence of execution of this argument reduction instruction is that the result vector when subject to a dot-product operation will be resistant to floating point underflows or overflows.
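
As a rough illustration of the scaling the abstract describes, the following Python sketch reads the largest biased exponent B from the single-precision encoding of the inputs, picks an exponent shift C so that B + C falls between the two bounds, and multiplies every component by 2^C. The bounds 190 and 64 are the values recited in claims 3 and 5. The function names, the choice C = 127 − B, and the use of Python doubles to carry the scaled values are assumptions made for illustration only; this is not the patented circuit.

```python
import math
import struct

E_DOTMAX = 190  # upper bound on B + C (value recited in claim 3)
E_DOTMIN = 64   # lower bound on B + C (value recited in claim 5)


def biased_exponent(x: float) -> int:
    """Return the 8-bit biased exponent field of x rounded to float32.

    Hypothetical helper; raises OverflowError if x does not fit in float32.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return (bits >> 23) & 0xFF


def reduce_vector(v):
    """Scale every component by 2**C and return (scaled_vector, C).

    Assumes finite inputs. C is left at zero when the largest exponent B
    is already strictly between E_DOTMIN and E_DOTMAX; otherwise this
    sketch assumes C = 127 - B, which moves the largest component into [1, 2).
    """
    b = max(biased_exponent(x) for x in v)
    if b == 0:
        # All components are zero or float32-subnormal: nothing to rescale here.
        return list(v), 0
    c = 0 if E_DOTMIN < b < E_DOTMAX else 127 - b
    # ldexp multiplies by an exact power of two, so mantissas are unchanged.
    return [math.ldexp(x, c) for x in v], c


reduced, shift = reduce_vector([3e30, 4e30])
print(shift, reduced)
```

Because every component is scaled by the same power of two, the direction of the vector is unchanged; only its magnitude moves into a range where squaring each component cannot overflow or underflow in single precision.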

First claim

Opening claim text (preview).

I claim:

1. Apparatus for processing data comprising: processing circuitry configured to perform processing operations upon data values; and decoder circuitry coupled to said processing circuitry and configured to decode program instructions to generate control signals for controlling said processing circuitry to perform processing operations specified by said program instructions; wherein said decoder circuitry is responsive to an argument reduction instruction to generate control signals to control said processing circuitry to perform a processing operation upon a vector floating point value having a plurality of components, each of said plurality of components including an integer exponent value and a mantissa value, said processing operation including generating a plurality of result components, the processing operation comprising: for each of said plurality of components, forming a high order exponent portion $E_{ho}$ being an uppermost P bits of said integer exponent value, where P is less than a total number of bits within said integer exponent value, and selecting a highest value $E_{homax}$ from among said high order exponent portions $E_{ho}$, wherein $E_{homax}$ identifies a highest integer exponent value B of said plurality of components; selecting an exponent shift value C such that (B+C) is less than a first predetermined value $E_{dotmax}$ and (B+C) is greater than a second predetermined value $E_{dotmin}$, where said exponent shift value C is an integer value; and for each of said plurality of components, if said exponent shift value C is non-zero, then adding a value of $(2^{(P-1)} - E_{homax})$ to said high order exponent portion $E_{ho}$ to generate one of said plurality of result components.

2. Apparatus as claimed in claim 1, wherein said first predetermined value $E_{dotmax}$ is a lowest integer value where a square of a floating point value with an integer exponent value of $E_{dotmax}$ and a mantissa M produces a floating point overflows for at least one value of M.

3. Apparatus as claimed in claim 2, wherein each component has a sign value $S_c$, an integer exponent value $E_c$ and a mantissa value $M_c$ representing a floating point number $(-1)^{S_c} \cdot 2^{(E_c - 127)} \cdot (1 + (M_c / 2^{24}))$ and $E_{dotmax}$ is 190.

4. Apparatus as claimed in claim 1, wherein said second predetermined value $E_{dotmin}$ is a highest integer value where a square of a floating point value with an integer exponent value of $E_{dotmin}$ and a mantissa M produces a floating point underflows for at least one value of M.

5. Apparatus as claimed in claim 4, wherein each component has a sign value $S_c$, an integer exponent value $E_c$ and a mantissa value $M_c$ representing a floating point number $(-1)^{S_c} \cdot 2^{(E_c - 127)} \cdot (1 + (M_c / 2^{24}))$ and $E_{dotmin}$ is 64.

6. Apparatus as claimed in claim 1, wherein for any one of said plurality of components, if when adding said exponent shift value C to an integer exponent value of said component to generate one of said plurality of result components, said one of said plurality of result components is subject to a floating point underflow, then replacing said one of said plurality of result components with a value of zero.

7. Apparatus as claimed in claim 1, wherein for any one of said plurality of components, if when adding said value of $(2^{(P-1)} - E_{homax})$ to said high order exponent portion $E_{ho}$, said value of $(2^{(P-1)} - E_{homax})$ is negative and said adding underflows, then replacing a corresponding one of said plurality of result components with a value of zero.

8. Apparatus as claimed in claim 1, wherein a total number of bits within said integer exponent value is 8 and P=3.

9. Apparatus as claimed in claim 1, wherein if any of said plurality of components is a floating point not-a-number, then all of said plurality of result components are set be floating point not-a-numbers.

10. Apparatus as claimed in claim 1, wherein if any of said plurality of components is a floating point infinity value, then each result component corresponding to a component with a float point infinity value is set to a floating point value with magnitude one and a sign matching said floating point infinity value of said component and all remaining result components are set to have a floating point value with magnitude zero.

11. Apparatus as claimed in claim 1, wherein said argument reduction instruction also generates a result scalar product with a value the same as given by a scalar product of said plurality of result components.

12. Apparatus as claimed in claim 1, wherein said processing circuitry and said decoder circuitry are responsive to said argument reduction instruction followed by a sequence of one or more further instructions to generate a normalised vector floating point value with a plurality of normalised components the same as given by: generating a result scalar product with a value the same as given by a scalar product of said plurality of result components; generating a reciprocal square root of said result scalar product; and for each result component, generating a corresponding normalised component by multiplying said result component by said reciprocal square root.

13. Apparatus as claimed in claim 1, wherein said processing circuitry and said decoder circuitry are part of a graphics processing unit.

14. A virtual machine comprising computer including a non-transitory computer readable storage medium storing a program which, when implement by the computer, provides an apparatus for processing data as claimed in claim 1.

15. Apparatus for processing data comprising: processing means for performing processing operations upon data values; and decoder means for decoding program instructions to generate control signals for controlling said processing circuitry to perform processing operations specified by said program instructions; wherein said decoder means is responsive to an argument reduction instruction to generate control signals to control said processing means to perform a processing operation upon a vector floating point value having a plurality of components, each of said plurality of components including an integer exponent value and a mantissa value, said processing operation including generating a plurality of result components, the processing operation comprising: for each of said plurality of components, forming a high order exponent portion $E_{ho}$ being an uppermost P bits of said integer exponent value, where P is less than a total number of bits within said integer exponent value, and selecting a highest value $E_{homax}$ from among said high order exponent portions $E_{ho}$, wherein $E_{homax}$ identifies a highest integer exponent value B of said plurality of components; selecting an exponent shift value C such that (B+C) is less than a first predetermined value $E_{dotmax}$ and (B+C) is greater than a second predetermined value $E_{dotmin}$, where said exponent shift value C is an integer value; and for each of said plurality of components, if said exponent shift value C is non-zero, then adding a value of $(2^{(P-1)} - E_{homax})$ to said high order exponent portion $E_{ho}$ to generate one of said plurality of result components.

16. A method of processing data comprising the step of: in response to decoding an argument reduction instruction by decoding circuitry, performing, by processing circuitry, a processing operation upon a vector floating point value having a plurality of components, each of said plurality of components including an integer exponent value and a mantissa value, said processing operation including generating a plurality of result components, the pr
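
The high-order-exponent step of claim 1, with the 8-bit exponent and P=3 of claim 8, the zero replacement of claim 7, and the normalisation sequence of claim 12, can be sketched in Python on single-precision bit patterns. All function names are hypothetical. The sketch assumes finite inputs (the not-a-number and infinity handling of claims 9 and 10 is omitted), always applies the adjustment (which is zero when the largest high-order portion already equals 2^(P−1)), and treats a resulting exponent field of zero as an underflow. It is an illustration under those assumptions, not the claimed hardware.

```python
import math
import struct

P = 3                    # high-order exponent bits examined (claim 8)
EXP_BITS = 8             # total exponent bits in float32 (claim 8)
LOW_BITS = EXP_BITS - P  # exponent bits left untouched


def f32_to_bits(x: float) -> int:
    """Bit pattern of x rounded to float32 (hypothetical helper)."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(b: int) -> float:
    """Value of a 32-bit float32 bit pattern (hypothetical helper)."""
    return struct.unpack("<f", struct.pack("<I", b))[0]


def reduce_high_order(v):
    """Add (2**(P-1) - E_homax) to the top P exponent bits of each component."""
    bits = [f32_to_bits(x) for x in v]
    exps = [(b >> 23) & 0xFF for b in bits]
    e_homax = max(e >> LOW_BITS for e in exps)
    delta = (1 << (P - 1)) - e_homax
    out = []
    for b, e in zip(bits, exps):
        # Adding delta to the high-order bits equals adding delta * 2**LOW_BITS.
        new_exp = e + delta * (1 << LOW_BITS)
        if e == 0 or new_exp <= 0:
            out.append(0.0)  # zero input or underflow: replace with zero
        else:
            # Keep sign and mantissa, swap in the adjusted exponent field.
            out.append(bits_to_f32((b & 0x807FFFFF) | (new_exp << 23)))
    return out


def normalise(v):
    """Reduce, take the scalar product, then scale by its reciprocal square root.

    Assumes v has at least one non-zero component.
    """
    r = reduce_high_order(v)
    dot = sum(x * x for x in r)
    inv = 1.0 / math.sqrt(dot)
    return [x * inv for x in r]


print(normalise([3e30, 4e30]))
```

Looking only at the top three exponent bits means the largest component lands in the exponent band 128 to 159 rather than at an exact target, which is still inside the bounds of claims 3 and 5 and needs only a 3-bit comparison and addition per component.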

Assignees

Inventors

Classifications

  • Roots or inverse roots of single operands · CPC title

  • G06F7/483 · Primary

    Computations with numbers represented by a non-linear combination of denominational numbers, e.g. rational numbers, logarithmic number system or floating-point numbers {(G06F7/4806, G06F7/4824, G06F7/49, G06F7/491, G06F7/544 take precedence)} · CPC title

  • G06F17/10 · Primary

    Complex mathematical operations {(function generation by table look-up G06F1/03; evaluation of elementary functions by calculation G06F7/544)} · CPC title

  • Inverse root of a number or a function, e.g. the reciprocal of a Pythagorean sum · CPC title

  • in floating-point computations · CPC title

Patent family

Related publications grouped by family.

External sources

Frequently asked questions

Answers are generated from the same data shown on this page.

Who is the assignee on this patent?
Nystad Jorn, Advanced Risc Mach Ltd
What technology area does this patent fall under?
Primary CPC classification G06F7/483. Mapped technology areas include Physics.
When was this patent published?
Publication date Sep 29, 2015 (B2). Legal status and post-grant events are not shown on this page.
What related patents are in patentsdb?
We list 8 related publications on this page (citations in our corpus or others sharing the same primary CPC).