After various additional experiments, I have convinced myself that a simple argument transformation that does not use higher precision than the argument and the result cannot achieve a tighter error bound than the one achieved by the first variant in the code I posted.
Since my question is about minimizing the error of the argument transformation that is incurred in addition to the error in log1pf(), the most straightforward approach for experimentation is to use a correctly rounded implementation of that logarithm function. Note that a correctly rounded implementation is quite unlikely to exist in the context of a high-performance environment. According to the works of J.-M. Muller et al., faithfully rounded single-precision results can be obtained by computing in x87 extended precision, for example:
float accurate_log1pf (float a)
{
    float res;
    __asm fldln2;               // push log(2) onto the x87 stack
    __asm fld dword ptr [a];    // push a
    __asm fyl2xp1;              // ST(1) * log2(ST(0) + 1) = log(1+a); pops once
    __asm fstp dword ptr [res]; // store result and pop, leaving the stack empty
    return res;
}
The implementation of asinhf() using the first option from my question is as follows:
float my_asinhf (float a)
{
    float fa, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        t = fmaf (fa / (1.0f + sqrtf (fmaf (fa, fa, 1.0f))), fa, fa);
        t = accurate_log1pf (t);
    }
    return copysignf (t, a); // restore sign
}
Exhaustive testing with all 2^32 IEEE-754 binary32 operands shows that the maximum error of 1.49486070 ulp occurs at ±0x1.ff5022p-9, and that there are 353,521,140 incorrectly rounded results. What happens if the entire argument transformation is performed in double-precision arithmetic? The code changes to:
float my_asinhf (float a)
{
    float fa, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        double tt = fa;
        tt = fma (tt / (1.0 + sqrt (fma (tt, tt, 1.0))), tt, tt);
        t = (float)tt;
        t = accurate_log1pf (t);
    }
    return copysignf (t, a); // restore sign
}
However, the error bound does not improve with this change! The maximum error of 1.49486070 ulp still occurs at ±0x1.ff5022p-9, and there are now slightly fewer incorrectly rounded results than before. The problem is that a float operand cannot convey enough information into log1pf() to produce more accurate results. A similar problem occurs when computing sinf() and cosf(): if the reduced argument, represented as a correctly rounded float operand, is passed to the core polynomials, the resulting error in sinf() and cosf() is just slightly below 1.5 ulp, exactly as we observe here with my_asinhf().
One solution is to compute the transformed argument to higher than single precision, for example as a double-float, i.e. a pair of float operands (a useful overview of double-float techniques can be found in this paper by Andrew Thall). In that case we can use the additional information in the tail to linearly interpolate the result, based on the knowledge that the derivative of the logarithm is the reciprocal: since d/dt log1p(t) = 1/(1+t), we have log1p(t+s) ≈ log1p(t) + s/(1+t) for a tiny tail s. This gives us:
float my_asinhf (float a)
{
    float fa, s, t;
    fa = fabsf (a);
    if (fa >= 0x1.0p64f) { // prevent overflow in intermediate computation
        t = log1pf (fa) + 0x1.62e430p-1f; // log(2)
    } else {
        double tt = fa;
        tt = fma (tt / (1.0 + sqrt (fma (tt, tt, 1.0))), tt, tt);
        t = (float)tt;               // "head" of double-float
        s = (float)(tt - (double)t); // "tail" of double-float
        t = fmaf (s, 1.0f / (1.0f + t), accurate_log1pf (t)); // interpolate
    }
    return copysignf (t, a); // restore sign
}
An exhaustive test of this version shows that the maximum error has been reduced to 0.99999948 ulp, occurring at ±0x1.deeea0p-22. There are 349,653,534 incorrectly rounded results. A faithfully rounded implementation of asinhf() has been achieved.
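For concreteness, here is what such an exhaustive test can look like. This is only a minimal sketch, not my actual test framework: it assumes my_asinhf() from above is in scope, uses double-precision asinh() from <math.h> as the reference (a reasonable proxy, though not itself correctly rounded), and ulp_error is a simplified helper of my own.

#include <float.h>
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* simplified estimate of the error of res, in ulps of float, against ref */
static double ulp_error (float res, double ref)
{
    int expo;
    float rf = (float)ref;
    frexpf ((rf == 0.0f) ? FLT_MIN : rf, &expo);
    return fabs ((double)res - ref) / ldexp (1.0, expo - 24); /* one ulp near rf */
}

int main (void)
{
    uint32_t i = 0;
    uint64_t incorrect = 0;
    double err, max_err = 0.0;
    float x, res;
    do {
        memcpy (&x, &i, sizeof x); /* enumerate every binary32 bit pattern */
        if (!isnan (x)) {
            double ref = asinh ((double)x);
            res = my_asinhf (x);
            err = ulp_error (res, ref);
            if (err > max_err) max_err = err;
            if (res != (float)ref) incorrect++; /* count mismatched roundings */
        }
        i++;
    } while (i != 0); /* wraps around after 0xffffffff */
    printf ("max. ulp error = %.8f  incorrectly rounded results = %llu\n",
            max_err, (unsigned long long)incorrect);
    return 0;
}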
Unfortunately, the practical usefulness of this result is limited. Depending on the hardware platform, the throughput of double arithmetic operations may be anywhere from 1/2 to 1/32 of the throughput of float operations. The double-precision computation could be replaced by a double-float computation, but that would incur significant cost as well. Finally, my approach here was to use the single-precision implementation as a proxy for future double-precision work, and many hardware platforms (certainly all the ones of interest to me) do not offer hardware support for a numeric format with higher precision than IEEE-754 binary64 (double precision). Therefore, any solution should not require higher-precision arithmetic in intermediate computations.
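For illustration of why double-float arithmetic is costly, its basic building blocks are the well-known error-free transformations covered in the Thall paper referenced above. The following is a minimal sketch (type and function names are mine); every double-float addition, division, or square root is composed of several such primitives plus normalization steps:

#include <math.h>

typedef struct { float hi, lo; } float2; /* double-float: value = hi + lo */

/* Knuth's TwoSum: hi + lo == a + b exactly, without comparing magnitudes */
static float2 two_sum (float a, float b)
{
    float2 r;
    float t;
    r.hi = a + b;
    t    = r.hi - a;
    r.lo = (a - (r.hi - t)) + (b - t);
    return r;
}

/* FMA-based TwoProd: hi + lo == a * b exactly */
static float2 two_prod (float a, float b)
{
    float2 r;
    r.hi = a * b;
    r.lo = fmaf (a, b, -r.hi); /* recovers the rounding error of a*b */
    return r;
}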
Since all the troublesome arguments in the case of asinhf() are small in magnitude, one could [partially?] address the accuracy problem by using a polynomial minimax approximation for arguments near the origin. However, since this would create another code branch, it would most likely complicate vectorization.
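As a sketch of what such a small-argument branch might look like: the coefficients below are simply the truncated Taylor coefficients of asinh(x) = x - x^3/6 + 3x^5/40 - 5x^7/112 + ..., shown for illustration only; an actual implementation would substitute minimax coefficients (e.g. generated with a tool such as Sollya), and the function name is hypothetical.

#include <math.h>

/* hypothetical small-argument branch: asinh(x) for |x| << 1 as an odd
   polynomial in x, evaluated in Horner fashion with fmaf */
static float asinhf_poly_small (float x)
{
    float s = x * x;
    float p =       -0x1.6db6dbp-5f;  // ~ -5/112  (x^7 term)
    p = fmaf (p, s,  0x1.333334p-4f); // ~  3/40   (x^5 term)
    p = fmaf (p, s, -0x1.555556p-3f); // ~ -1/6    (x^3 term)
    return fmaf (p * s, x, x);        // x + x^3 * poly(x^2)
}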