Is there a way to get the correct rounding with the i387 fsqrt instruction?

Is there a way to get the correct rounding with the i387 fsqrt instruction? ...

... besides changing the accuracy mode in the x87 control word - I know that this is possible, but this is not a reasonable solution, because it has unpleasant problems such as reentration, where the precision mode will be incorrect if the sqrt operation is interrupted.

The problem I'm dealing with is this: the x87 fsqrt performs the correctly rounded (for IEEE 754) operation with the square root of exactly fpu registers, which I assume is extended (80-bit). However, I want to use it to implement effective square-root functions with single and double precision with the correct rounding of the results (in the current rounding mode). Since the result has excessive accuracy, the second step of converting the result to single or double precision rounds again, possibly leaving the result with incorrect rounding.

With some operations you can get around this with prejudice. For example, I can avoid excessive accuracy in the results of adding by adding an offset of two of the power, which forces 52 significant bits of the double-precision value to the last 52 bits of the 63-bit mantissa with extended precision, but I donโ€™t see any obvious way to do such a trick with square root.

Any smart ideas?

(C is also marked because the intended application is an implementation of the C sqrt and sqrtf .)

+7
source share
3 answers

First, let me get the obvious because of this: you should use SSE instead of x87. The SSE sqrtss and sqrtsd do exactly what you want, are supported on all modern x86 systems, and also significantly faster.

Now, if you insist on using x87, I will start with the good news: you do not need to do anything for the float. To calculate a correctly rounded square root in p-bit floating point format, you need bit 2p + 2 . Since 80 > 2*24 + 2 , additional rounding to one precision will always be correctly rounded, and you have a correctly rounded square root.

Now the bad news is: 80 < 2*53 + 2 , so there is no such luck for double precision. I can offer some workarounds; here is a good light from the top of the head.

  • let y = round_to_double(x87_square_root(x));
  • use the product Dekker (head-tail) to calculate a and b so that y*y = a + b sure.
  • calculate the residual r = x - a - b .
  • if (r == 0) return y
  • if (r > 0) , let y1 = y + 1 ulp and calculate a1 , b1 st y1*y1 = a1 + b1 . Compare r1 = x - a1 - b1 with r and return either y or y1 , depending on which one has the smallest remainder (or the one that has the least significant bit zero if the remainders are equal in magnitude).
  • if (r < 0) , do the same for y1 = y - 1 ulp .

In this mode, only the default rounding mode is processed; however, in directional rounding modes, just rounding to the destination format does the right thing.

+13
source

Ok, I think I have a better solution:

  • Calculate y=sqrt(x) in extended precision ( fsqrt ).
  • If the last 11 bits are not 0x400 , just convert them to double precision and return.
  • Add 0x100-(fpu_status_word&0x200) to the low word of the extended precision representation.
  • Convert to double precision and return.

Step 3 is based on the fact that bit C1 (0x200) of the status word is 1 if and only if the result fsqrt been rounded. This is true because, due to the test in step 2, x not a perfect square; if it were a perfect square, y would not have bits exceeding double precision.

It might be faster to complete step 3 with a conditional floating point rather than working on the presentation and reloading the bits.

Here's the code (seems to work in all cases):

 sqrt: fldl 4(%esp) fsqrt fstsw %ax sub $12,%esp fld %st(0) fstpt (%esp) mov (%esp),%ecx and $0x7ff,%ecx cmp $0x400,%ecx jnz 1f and $0x200,%eax sub $0x100,%eax sub %eax,(%esp) fstp %st(0) fldt (%esp) 1: add $12,%esp fstpl 4(%esp) fldl 4(%esp) ret 
+3
source

Perhaps this is not what you want, since it does not use the 387 fsqrt , but there is a surprisingly efficient sqrtf(float) in glibc implemented with 32-bit integer arithmetic. It even handles NaNs, Infs, subnormals correctly - perhaps some of these checks can be eliminated using real x87 / FP instructions. see glibc-2.14/sysdeps/ieee754/flt-32/e_sqrtf.c

The dbl-64/e_sqrt.c not very friendly. It is difficult to say what assumptions are made at first sight. Curiously, the i386 sqrt[f|l] implementation fsqrt just call fsqrt , but load the value differently. flds for SP, fldl for DP.

0
source

All Articles