Solving floating point precision problems

I was wondering whether there is a way to overcome an accuracy problem that seems to result from the machine's internal representation of floating point numbers.

For clarity, the problem is summarized as:

    // str is "4.600"; atof( str ) is 4.5999999999999996
    double mw = atof( str );

    // The variables used in the columns calculation below are:
    //
    // mw = 4.5999999999999996
    // p = 0.2
    // g = 0.2
    // h = 1 (integer)

    int columns = (int) ( ( mw - ( h * 11 * p ) ) / ( ( h * 11 * p ) + g ) ) + 1;

Before the cast to an integer type, the result of the columns calculation is 1.9999999999999996: so close, yet still short of the desired result of 2.0.

Any suggestions are welcome.

+9
c++ floating-point floating-accuracy
Feb 26 '09 at 14:39
8 answers

A very simple and effective way to round a floating point number to an integer:

 int rounded = (int)(f + 0.5); 

Note: this only works if f is always positive. (Thanks to j random hacker.)
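
Applied to the calculation in the question, the idea might look roughly like the sketch below. The function names (column_quotient, columns_truncated, columns_rounded) are made up for illustration and are not from the original code; the sketch assumes the quotient is never negative.

```cpp
#include <cassert>
#include <cmath>
#include <cstdlib>

// Hypothetical helper: the quotient from the question, before any conversion to int.
double column_quotient(double mw, double p, double g, int h) {
    const double step = h * 11 * p;
    assert(step + g != 0.0);  // guard against division by zero
    return (mw - step) / (step + g);
}

// Behaviour of the original code: the cast truncates toward zero.
int columns_truncated(double mw, double p, double g, int h) {
    return static_cast<int>(column_quotient(mw, p, g, h)) + 1;
}

// Rounded variant: add 0.5 before the cast (only valid for non-negative quotients).
int columns_rounded(double mw, double p, double g, int h) {
    const double q = column_quotient(mw, p, g, h);
    assert(q >= 0.0);
    return static_cast<int>(q + 0.5) + 1;
}
```

With the inputs from the question (mw = atof("4.600"), p = 0.2, g = 0.2, h = 1), columns_truncated should give 1 and columns_rounded should give 2 on a typical IEEE 754 double implementation.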

+4
Feb 26 '09 at 14:57

When you use floating point arithmetic, strict equality is almost meaningless. Usually you want to compare against a range of acceptable values.

Note that some values cannot be represented exactly as floating point values.

See What Every Computer Scientist Should Know About Floating-Point Arithmetic and Comparison of Floating-Point Numbers.
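
One way to compare against a range of acceptable values is a combined absolute and relative tolerance check. The following is only a sketch: the name nearly_equal and the default tolerances are assumptions chosen for illustration, and suitable tolerances depend on the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true when a and b differ by no more than an absolute
// tolerance (useful near zero) or a relative tolerance (useful for large values).
bool nearly_equal(double a, double b,
                  double abs_tol = 1e-12, double rel_tol = 1e-9) {
    assert(abs_tol >= 0.0 && rel_tol >= 0.0);
    const double diff = std::fabs(a - b);
    if (diff <= abs_tol) {
        return true;
    }
    return diff <= rel_tol * std::max(std::fabs(a), std::fabs(b));
}
```

For example, 0.1 + 0.2 == 0.3 is false with IEEE 754 doubles, while nearly_equal(0.1 + 0.2, 0.3) is true.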

+15
Feb 26 '09 at 14:48

If you have not read it, the title of this article is really accurate. Please read it to learn more about the fundamentals of floating point arithmetic on modern computers, some of the pitfalls, and why floating point numbers behave the way they do.

+11
Feb 26 '09 at 14:47

There is no accuracy problem.

The result you obtained (1.9999999999999996) differs from the mathematical result (2) by a margin on the order of 1E-16. That is quite accurate, given your input of "4.600".

You do, of course, have a rounding problem. Converting a floating point value to an integer in C++ truncates; you want something similar to Kip's solution. The details depend on your exact domain: do you expect round(-x) == -round(x)?
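
To make the round(-x) == -round(x) question concrete, here is a small sketch contrasting two common conventions. The function names are hypothetical; std::lround and std::floor are the standard <cmath> functions.

```cpp
#include <cassert>
#include <cmath>

// Rounds halfway cases away from zero, so round(-x) == -round(x) holds.
long round_half_away(double x) {
    assert(std::isfinite(x));
    return std::lround(x);
}

// Rounds halfway cases toward positive infinity, so the symmetry is lost.
long round_half_up(double x) {
    assert(std::isfinite(x));
    return static_cast<long>(std::floor(x + 0.5));
}
```

For x = 2.5 both functions return 3, but for x = -2.5 the first returns -3 and the second returns -2.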

+11
Feb 26 '09 at 15:31

If precision is really important, you should consider using double precision floating point numbers rather than single precision ones, although from your question it seems you already are. However, you still have the problem of checking for specific values. You need code along these lines (assuming you are checking your value against zero):

    if (fabs(value) < epsilon)
    {
        // Do Stuff
    }

where "epsilon" is a small but non-zero value.

+5
Feb 26 '09 at 15:51

On computers, floating point numbers are generally not exact. They are usually just a close approximation. (1e-16 is close.)

Sometimes there are hidden bits that you do not see. Sometimes the basic rules of algebra no longer apply: a*b != b*a. Sometimes comparing a register with memory reveals these subtle differences. So does using a math coprocessor versus a runtime floating point library. (I have been doing this waayyy tooo long.)

C99 defines: (Look at math.h)

    double round(double x);
    float roundf(float x);
    long double roundl(long double x);

Or you can roll it yourself:

    template<class TYPE>
    inline int ROUND(const TYPE & x)
    {
        return int( (x > 0) ? (x + 0.5) : (x - 0.5) );
    }

For floating point equivalence try:

    template<class TYPE>
    inline TYPE ABS(const TYPE & t)
    {
        return t >= 0 ? t : -t;
    }

    template<class TYPE>
    inline bool FLOAT_EQUIVALENT(
        const TYPE & x, const TYPE & y, const TYPE & epsilon )
    {
        return ABS(x - y) < epsilon;
    }

+3
Feb 26 '09 at 16:36

You can read this article to find what you are looking for.

You can compare the absolute value of the difference against a small tolerance, as shown here:

    x = 0.2;
    y = 0.3;
    equal = (Math.abs(x - y) < 0.000001);

+2
Feb 26 '09 at 14:43

Use decimals: decNumber++
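
The sketch below does not use the decNumber++ API. It only illustrates the underlying idea of exact decimal arithmetic with plain scaled integers, under the assumption that every input has at most three decimal places. The function name and the choice of thousandths as the unit are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: all quantities are passed in thousandths
// (4.600 -> 4600, 0.2 -> 200), so the arithmetic is exact.
int columns_from_thousandths(std::int64_t mw, std::int64_t p,
                             std::int64_t g, int h) {
    const std::int64_t step = static_cast<std::int64_t>(h) * 11 * p;
    assert(step + g > 0);  // the divisor must be positive
    // Integer division truncates, but an exact quotient such as
    // 2400 / 2400 is no longer pushed just below a whole number.
    return static_cast<int>((mw - step) / (step + g)) + 1;
}
```

For the values in the question, columns_from_thousandths(4600, 200, 200, 1) evaluates (4600 - 2200) / (2200 + 200) + 1, which is 2.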

+1
Feb 26 '09 at 14:41


