A floating point calculation gives different results with a float than with a double

Question

I have the following line of code.

 hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));

void onBeingHit(int decHP) accepts an integer and updates the health points.
float getDefensePercent() is a getter that returns the hero's defense percentage.
ENEMY_ATTACK_POINT is a macro defined as #define ENEMY_ATTACK_POINT 20.

Let's say hero->getDefensePercent() returns 0.1. Then the calculation should be

 20 * (1.0 - 0.1) = 20 * (0.9) = 18

Whenever I tried it with the following code (no f appended to 1.0)

 hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));

I got 17.

But with the following code (f added after 1.0)

 hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0f - hero->getDefensePercent()));

I got 18.

What's happening? Does the f matter at all, given that hero->getDefensePercent() already returns a float?

+4

c++ double floating-point floating-accuracy

haxpor Feb 15 '13 at 7:51

3 answers

1.0 is interpreted as a double, unlike 1.0f, which the compiler treats as a float.

The f suffix simply tells the compiler which literal is a float and which is a double.

As the name suggests, a double occupies twice the storage of a float (typically 64 bits versus 32) and offers more than twice the precision. In general, double has 15 to 16 decimal digits of precision, while float has only about 7.

The lower precision of float makes rounding and truncation errors much more likely.

See MSDN (C++)
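The precision figures above can be read from std::numeric_limits. This is only a small sketch, and the values in the comments assume an IEEE 754 platform; the constant names are made up for illustration.

```cpp
#include <limits>

// Width of the significand in bits, including the implicit leading bit.
constexpr int floatBits  = std::numeric_limits<float>::digits;    // 24 on IEEE 754
constexpr int doubleBits = std::numeric_limits<double>::digits;   // 53 on IEEE 754

// Decimal digits guaranteed to survive a round trip through the type.
constexpr int floatDigits10  = std::numeric_limits<float>::digits10;   // 6 on IEEE 754
constexpr int doubleDigits10 = std::numeric_limits<double>::digits10;  // 15 on IEEE 754
```

Note that digits10 is a conservative guarantee, which is why it reports 6 for float even though "about 7 digits" is the usual rule of thumb.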

+4

user2166576 Feb 15 '13 at 7:55

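This happens because the calculation is carried out more accurately when you use a double, i.e. 1.0: the small representation error of the float 0.1 is no longer rounded away, so the result stays just below 18.

Try rounding the result, which gives a more accurate integer after the conversion: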

 hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()) + 0.5);

Note that adding 0.5 and then truncating to int rounds the result to the nearest integer: when your result is 17.999..., it becomes 18.499..., which is truncated to 18.
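As a rough sketch of the difference, the two helpers below (hypothetical names, not part of the question's code) compare plain truncation with the add-0.5 trick. constexpr is used only so the behaviour can be checked at compile time.

```cpp
// Plain conversion to int drops the fractional part.
constexpr int truncated(double x) {
    return static_cast<int>(x);
}

// Adding 0.5 first rounds to the nearest integer.
// This simple form is only correct for non-negative x.
constexpr int roundedHalfUp(double x) {
    return static_cast<int>(x + 0.5);
}
```

With a value just below 18, such as 17.99999997, truncated yields 17 while roundedHalfUp yields 18.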

+4

Liho Feb 15 '13 at 8:00

leemes · Accepted Answer · 2013-02-15T08:00:17+0000

What's happening? Why don't I get the integer result 18 in both cases?

The problem is that the result of the floating-point expression is rounded toward zero when it is converted to an integer value (in both cases).

0.1 cannot be represented exactly as a binary floating-point value (neither as a float nor as a double). The compiler converts the literal to an IEEE 754 binary floating-point number and decides whether to round up or down to the nearest representable value. The processor then performs the multiplication with this value at run time, and the result is truncated to obtain an integer value.
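To make the representation error visible, here is a minimal sketch that compares the value stored for 0.1f with the value stored for 0.1. The names are invented for illustration, and the remark about the direction of the error assumes IEEE 754 float and double.

```cpp
#include <cstdio>

// Neither type can store 0.1 exactly; each stores its nearest representable value.
constexpr float  tenthAsFloat  = 0.1f;
constexpr double tenthAsDouble = 0.1;

// Widening a float to double is exact, so these compare the two stored values.
constexpr bool storedValuesDiffer = static_cast<double>(tenthAsFloat) != tenthAsDouble;
// On IEEE 754 platforms the nearest float lies slightly above the nearest double.
constexpr bool floatIsLarger = static_cast<double>(tenthAsFloat) > tenthAsDouble;

// Prints both stored values with enough digits to see where they diverge.
void printTenths() {
    std::printf("float : %.20f\n", static_cast<double>(tenthAsFloat));
    std::printf("double: %.20f\n", tenthAsDouble);
}
```

Calling printTenths() shows that the two values agree only in the first several decimal digits.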

OK, but since both double and float behave like this, why do I get 18 in one case and 17 in the other? I am confused.

Your code takes the result of the function, 0.1f (a float), and then calculates 20 * (1.0 - 0.1f), which is a double expression, whereas 20 * (1.0f - 0.1f) is a float expression. The float version happens to come out at 18.0 (or marginally above) and is truncated to 18, while the double expression is slightly less than 18.0 and is truncated to 17.
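The two variants can be reproduced in isolation. In this sketch the literal 20 stands in for ENEMY_ATTACK_POINT and the parameter plays the role of hero->getDefensePercent(); the function names are hypothetical, and constexpr is used only so the results can be checked at compile time. The results quoted afterwards assume IEEE 754 arithmetic without extended intermediate precision.

```cpp
// 1.0 is a double literal, so the float argument is widened and the
// whole expression is evaluated in double before being truncated.
constexpr int damageViaDouble(float defense) {
    return static_cast<int>(20 * (1.0 - defense));
}

// 1.0f is a float literal, so every intermediate result is rounded
// to float precision before the final truncation.
constexpr int damageViaFloat(float defense) {
    return static_cast<int>(20 * (1.0f - defense));
}
```

For an argument of 0.1f, damageViaDouble gives 17 (the double product is about 17.99999997) and damageViaFloat gives 18 (rounding to float precision hides the small error).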

Unless you know exactly how IEEE 754 binary floating-point numbers are constructed from decimal numbers, it looks fairly random whether the stored value is slightly less or slightly more than the decimal number you typed in your code. So you should not count on it. Do not try to fix this problem by appending f to one of the numbers and saying "now it works, so I'll leave the f there", because another value will behave differently.

Why does the type of the expression depend on the presence of this f?

This is because a floating-point literal in C and C++ is of type double by default. If you append f, it is a float. The result of a floating-point expression has the "larger" of the operand types. An expression combining a double and an int is still a double, and one combining an int and a float is a float. So the result of your expression is either a float or a double.
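The resulting types can be checked at compile time with decltype. This is a small sketch; the alias names are made up for illustration.

```cpp
#include <type_traits>

// int * (double - float): the float is converted to double, then the int is too.
using DoubleVariant = decltype(20 * (1.0 - 0.1f));
// int * (float - float): the int is converted to float.
using FloatVariant = decltype(20 * (1.0f - 0.1f));

static_assert(std::is_same<DoubleVariant, double>::value, "1.0 makes the expression a double");
static_assert(std::is_same<FloatVariant, float>::value, "1.0f keeps the expression a float");
```

If either assumption were wrong, the static_assert would stop the build with the given message.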

Good, but I don't want to round toward zero. I want to round to the nearest integer.

To fix this problem, add one half to the result before converting it to an integer:

 hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()) + 0.5);

In C++11, there is std::round() for this. Earlier versions of the standard had no such function for rounding to the nearest integer. (See the comments for details.)
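A minimal sketch of the C++11 route, using std::lround (the sibling of std::round that returns a long directly). The helper name is hypothetical, 20 stands in for ENEMY_ATTACK_POINT, and the range check is an assumption about what getDefensePercent() returns.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: computes the damage and rounds it to the nearest integer.
long roundedDamage(float defensePercent) {
    // Assumption: the defense value is a fraction between 0 and 1.
    assert(defensePercent >= 0.0f && defensePercent <= 1.0f);
    // std::lround rounds to nearest, with halfway cases rounded away from zero.
    return std::lround(20 * (1.0 - defensePercent));
}
```

With this helper, a defense of 0.1f yields 18 regardless of whether the intermediate value lands just below or just above 18.0.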

If you don't have std::round, you can write it yourself. Be careful when dealing with negative numbers. When converting to an integer, the number is truncated (rounded toward zero), which means that negative values are rounded up, not down. Therefore, we must subtract one half if the number is negative:

 // Named roundToInt so that it cannot clash with ::round from <cmath>.
 int roundToInt(double x) { return static_cast<int>((x < 0.0) ? (x - 0.5) : (x + 0.5)); }
