Difference in accuracy with floating point division and multiplication

Is there any difference between this:

 average = (x1 + x2) / 2;
 deviation1 = x1 - average;
 deviation2 = x2 - average;
 variance = deviation1*deviation1 + deviation2*deviation2;

and this:

 average2 = (x1 + x2);
 deviation1 = 2*x1 - average2;
 deviation2 = 2*x2 - average2;
 variance = (deviation1*deviation1 + deviation2*deviation2) / 4;

Please note that in the second version I postpone the division as late as possible. Does delaying the division like this increase accuracy overall?

The snippet above is just an example, I'm not trying to optimize this snippet.

By the way, I am asking about division in general, not just division by 2 or a power of 2, since those come down to a simple exponent adjustment in the IEEE 754 representation. I chose division by 2 just to illustrate the problem with a very simple example.
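To make the comparison concrete, here is a sketch of the two versions in Python (function names are mine, the formulas are the ones above):

```python
def variance_v1(x1, x2):
    # Divide early: compute the average first.
    average = (x1 + x2) / 2
    d1 = x1 - average
    d2 = x2 - average
    return d1 * d1 + d2 * d2

def variance_v2(x1, x2):
    # Divide late: work with doubled deviations, divide by 4 at the end.
    average2 = x1 + x2
    d1 = 2 * x1 - average2
    d2 = 2 * x2 - average2
    return (d1 * d1 + d2 * d2) / 4

# With a divisor of 2 both versions agree bit-for-bit, because scaling
# by a power of 2 only changes the exponent of an IEEE 754 double.
print(variance_v1(0.1, 0.3) == variance_v2(0.1, 0.3))  # True
```

(For a power-of-2 divisor, every scaling step is exact, so the rounding of each subtraction and square lands on the same significand in both versions.)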

+8
language-agnostic floating-point algorithm numbers
4 answers

Nothing will come of it. You only change the scale; your calculations will not gain any more significant figures.

The Wikipedia article on algorithms for calculating variance explains, at a high level, how to compute variance in a numerically reliable way.

+3

You do not gain accuracy from this, since IEEE 754 (probably what you are using under the covers) gives you the same precision (number of significand bits) at any scale you work at. For example, 3.14159 × 10^7 is represented as accurately as 3.14159 × 10^10.

The only possible advantage (of the first version) is that you avoid overflow when computing the deviations. But as long as the values themselves are less than half the maximum representable value, this will not be a problem.
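The overflow caveat can be demonstrated with a small Python sketch (the values are chosen only for illustration): near the top of the double range, the delayed-division version's doubled deviations square past the largest double, while the early-division version stays finite.

```python
import math

x1, x2 = 9e153, -9e153  # large, but version 1 still stays in range

# Version 1: divide early.
average = (x1 + x2) / 2
d1, d2 = x1 - average, x2 - average
v1 = d1 * d1 + d2 * d2
print(v1)               # about 1.62e308, still a finite double

# Version 2: divide late. The doubled deviations square to roughly
# 3.24e308, past the largest double (~1.8e308), so the intermediate
# overflows to inf and the final division by 4 cannot undo it.
average2 = x1 + x2
d1, d2 = 2 * x1 - average2, 2 * x2 - average2
v2 = (d1 * d1 + d2 * d2) / 4
print(math.isinf(v2))   # True
```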

+2

I have to agree with David Heffernan: this will not give you higher accuracy.

The reason is how floating point values are stored. Some bits represent the significant digits (the significand) and some bits represent an exponent (e.g. 3.1714 × 10^-12). The number of bits for the significant digits is always the same no matter how large your number is, which means the result will not really be different.
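One way to see this (a sketch using `math.ulp`, available since Python 3.9): the gap to the next representable double grows with the magnitude, but relative to the value it is always the same tiny fraction, because the significand always holds the same number of bits.

```python
import math

# The spacing between adjacent doubles (one ulp) grows with the
# magnitude, but divided by the value it is always about 2**-52,
# whatever the exponent.
for x in (3.14159e-12, 3.14159, 3.14159e12):
    print(x, math.ulp(x) / x)  # relative gap ~1e-16 at every scale
```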

Even worse, delaying division can lead to overflow if you have very large numbers.

If you really need higher accuracy, there are many libraries that let you work with arbitrarily large numbers or numbers with greater precision.
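For example, Python's standard library already ships two such options (a sketch; other languages have analogous packages such as GMP or MPFR):

```python
from decimal import Decimal, getcontext
from fractions import Fraction

# decimal: base-10 floating point with configurable precision.
getcontext().prec = 50
print(Decimal(1) / Decimal(3))          # 0.33333... to 50 digits

# fractions: exact rational arithmetic, no rounding at all.
print(Fraction(1, 3) + Fraction(1, 6))  # 1/2, exactly
```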

+1

The best way to answer your question is to run tests (both with random values and with values distributed over the range?) and see whether the resulting numbers match in their binary representation.

Note that one of the problems you will have if you do this is that your functions will not work for values > MAX_INT/2, due to the way the code computes the average:

 avg = (x1+x2)/2       # clobbers numbers > MAX_INT/2
 avg = 0.5*x1 + 0.5*x2 # no clobbering
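A quick demonstration of the clobbering (hypothetical values, using `sys.float_info.max` as the floating-point analogue of MAX_INT above):

```python
import sys

x1 = x2 = sys.float_info.max * 0.75  # above half the largest double

print((x1 + x2) / 2)         # inf: the sum overflows before the divide
print(0.5 * x1 + 0.5 * x2)   # finite: scaling first avoids the overflow
```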

This is almost certainly not a problem unless you are writing a language-level library. And if most of your numbers are small, it may not matter at all? In fact, it is probably not worth worrying about here, since the variance will exceed MAX_INT even sooner, being a sum of squares; I would say you could use the standard deviation to stay in range, but nobody does.

Here are some experiments in Python (which, I think, uses IEEE 754 doubles regardless of whether it delegates the math to C libraries...):

 >>> from itertools import product
 >>> def compare(numer, denom):
 ...     assert ((numer/denom)*2).hex() == ((2*numer)/denom).hex()
 ...
 >>> [compare(a, b) for a, b in product(range(1, 100), range(1, 100))]

No problems, I think, because division and multiplication by 2 are exactly representable in binary. However, try multiplying and dividing by 3:

 >>> def compare(numer, denom):
 ...     assert ((numer/denom)*3).hex() == ((3*numer)/denom).hex(), '...'
 ...
 >>> [compare(a, b) for a, b in product(range(1, 100), range(1, 100))]
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<stdin>", line 1, in <listcomp>
   File "<stdin>", line 2, in compare
 AssertionError: 0x1.3333333333334p-1!=0x1.3333333333333p-1

Does this matter much? Perhaps, if you work with very small numbers (in which case you might use log arithmetic). But if you work with large numbers (which is rare in probability work) and you postpone the division, you risk overflow, as I mentioned, and even worse, you risk errors due to hard-to-read code.
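For the very-small-numbers case, the log-arithmetic idea looks like this (a sketch; the probability value is made up):

```python
import math

p = 1e-200           # a tiny probability
print(p * p)         # 0.0 -- the product underflows in linear space

# In the log domain, products become sums and stay representable.
log_p = math.log(p)
print(log_p + log_p) # about -921.03, no underflow
```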

+1
