Arithmetic and floating point reproducibility

Question

Is IEEE-754 arithmetic reproducible across platforms?

I was testing some code written in R that uses random numbers. I thought that setting the seed of the random number generator on every tested platform would make the tests reproducible, but this does not seem to hold for rexp(), which generates exponentially distributed random numbers.

This is what I get on 32-bit Linux:

    options(digits=22) ; set.seed(9) ; rexp(1, 5)
    # [1] 0.2806184054728815824298
    sessionInfo()
    # R version 3.0.2 (2013-09-25)
    # Platform: i686-pc-linux-gnu (32-bit)

and this is what I get on 64-bit OS X 10.9:

    options(digits=22) ; set.seed(9) ; rexp(1, 5)
    # [1] 0.2806184054728815269186
    sessionInfo()
    # R version 3.0.2 (2013-09-25)
    # Platform: x86_64-apple-darwin10.8.0 (64-bit)

64-bit Linux gives the same result as 64-bit OS X, so this looks like a 32-bit versus 64-bit issue.
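To gauge how far apart the two results are, the printed values can be compared in units of the spacing between adjacent doubles. The following is a minimal sketch, assuming that both 22-digit literals parse back to the same doubles that R printed; the names x32, x64, ulp and ulps_apart are illustrative only. Doubles in the interval [0.25, 0.5) are spaced 2^-54 apart, so the ratio shows how many representable values separate the two results.

```r
# Values copied from the two sessions above (printed with digits = 22).
x32 <- 0.2806184054728815824298  # 32-bit Linux
x64 <- 0.2806184054728815269186  # 64-bit OS X

# Spacing between adjacent doubles in the interval [0.25, 0.5).
ulp <- 2^-54

# Difference between the two results, measured in units of that spacing.
ulps_apart <- (x32 - x64) / ulp
print(ulps_apart)

# Hexadecimal notation shows the exact binary value of each result.
print(sprintf("%a", c(x32, x64)))
```

If the literals parse exactly, the ratio is 1: the two platforms disagree only in the last bit of the result.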

Assume that both versions of R were compiled with the same version of GCC and with the same (default for R) compilation flags, which force the compiler to use IEEE-754 arithmetic.

My question is: can this be considered a bug in R? Or is it just a "normal" consequence of using approximate, finite-precision floating point arithmetic?

I sent the same question to the R-devel mailing list, but got no answer on the list and only one private reply, which tried to convince me that this is not a bug and that I have to live with it.

This is what IEEE-754 says about reproducibility (from Wikipedia):

IEEE 754-1985 allowed many variations in implementations (such as the encoding of some values and the detection of certain exceptions). IEEE 754-2008 has tightened many of these, but a few variations still remain (especially for binary formats). The reproducibility clause recommends that language standards provide a means to write reproducible programs (i.e. programs that will produce the same result in all implementations of a language), and describes what needs to be done to achieve reproducible results.

And this is in the "Recommendations" section.

My (subjective) opinion is that this is a bug, because the whole point of the IEEE-754 standard is to have reproducible, platform-independent floating point arithmetic.

floating-point ieee-754 r

Gabor Csardi Jan 19 '14 at 1:56

1 answer

Eric Postpischil · Accepted Answer · 2014-01-19T02:04:58+0000

There are issues with the reproducibility of even elementary floating point operations in high-level languages, but they can usually be controlled with platform-specific measures such as setting compiler switches, using custom code to set floating point controls and modes, or, if necessary, writing essential operations in assembly. As was developed in the comments, the specific problem you are facing may be that different C implementations use different precision to evaluate intermediate floating point expressions. Often this can be controlled with compiler switches or by including casts and assignments in expressions to require rounding to the nominal type (thus discarding excess precision).
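The effect of intermediate precision can be observed from R itself, without touching the compiler. The sketch below only illustrates the general phenomenon and is not a diagnosis of the rexp() difference. It relies on the documented behaviour that sum() uses an extended-precision accumulator where the platform provides one, while each R-level `+` produces a result rounded to double. The variable names are illustrative.

```r
half_ulp <- 2^-53  # half the spacing between 1 and the next double above it

# Each R-level `+` yields a double, so 1 + half_ulp rounds back to 1 both times.
stepwise <- (1 + half_ulp) + half_ulp

# sum() may keep its running total in a wider type and round only at the end.
accumulated <- sum(c(1, half_ulp, half_ulp))

print(stepwise == 1)                # expected TRUE: every intermediate was rounded
print(accumulated == 1 + 2^-52)     # TRUE only if extra precision was kept
print(capabilities("long.double"))  # whether this build has a wider long double
```

The same three numbers are added in both cases; whether the answers agree depends only on the precision in which the intermediate total is held, which is the kind of difference described above.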

However, the more complicated functions, such as exp and cos, are not routinely reproducible across platforms. Although the 2008 IEEE-754 standard recommends that these functions be implemented with correct rounding, this task has not been completed for any math library with bounded run time. Nobody in the world has done the math to accomplish this.

The CRlibm project has implemented some of the functions with known run-time bounds, but the work is incomplete. (Per a comment from Pascal Cuoq, when CRlibm does not have a proven run-time bound for correct rounding, it falls back to a result that is very likely to be correctly rounded because it is computed with very high precision.) Figuring out how to deliver correct rounding in bounded time, and proving it, is hard for many functions. (Consider how you would prove that the value of cos(x), where x is any double value, is never closer than some small distance e to the midpoint between two representable values. The midpoint is important because it is where rounding must switch from returning one result to returning the other, and e tells you how accurately you must compute an approximation in order to guarantee correct rounding.)

The current state of affairs is that many of the functions in the math library are approximated with an accuracy looser than correct rounding, and different vendors use different implementations with different approximations. I believe that R uses some of these functions in its rexp implementation and that it uses the native libraries of its target platforms, so it gets different results on different platforms.
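One way to check whether the platform's math library is involved is to print a few library results in hexadecimal on each machine and compare the output. This is a minimal sketch, under the assumption that R forwards exp, cos and log to the platform's C math library; the arguments are arbitrary and the names results and hex are illustrative.

```r
# Arbitrary test arguments; any values of interest can be substituted.
results <- c(
  exp  = exp(1),
  cos  = cos(1e22),
  log  = log(0.1),
  sqrt = sqrt(2)   # sqrt is a basic IEEE-754 operation, included as a control
)

# "%a" prints the exact binary value, so even a one-bit difference is visible.
hex <- sprintf("%a", results)
names(hex) <- names(results)
print(hex)
```

Running this on each platform and comparing the printed strings shows which functions, if any, return different bits.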

To remedy this, you might consider using a common math library on all your target platforms (perhaps CRlibm).
