Implicit conversion from long long to float gives unexpected result

In an attempt to verify (using VS2012) a book's claim (second sentence below) that

When we assign an integral value to an object of floating-point type, the fractional part is zero. Precision may be lost if the integer has more bits than the floating-point object can accommodate. 

I wrote the following program:

    #include <iostream>
    #include <iomanip>

    using std::cout;
    using std::setprecision;

    int main() {
        long long i = 4611686018427387905; // 2^62 + 2^0
        float f = i;
        std::streamsize prec = cout.precision();
        cout << i << " " << setprecision(20) << f << setprecision(prec) << std::endl;
        return 0;
    }

The output is

 4611686018427387905 4611686018427387900 

I was expecting output of the form

 4611686018427387905 4611690000000000000 

How is a 4-byte float able to store so much information about an 8-byte integer? Is there a value for i that actually demonstrates the claim?

+7
c++
2 answers

Floats do not store their data in base 10; they store it in base 2. Thus, 4611690000000000000 is actually not a very round number. This is its binary representation:

 100000000000000000000111001111100001000001110001010000000000000. 

As you can see, it takes a lot of data to write that down exactly. However, the number that actually got printed has the following binary representation:

 11111111111111111111111111111111111111111111111111111111111100 

As you can see, that is a pretty round number, and the fact that it is off by 4 from a power of two is most likely due to rounding in the convert-to-base-10 algorithm (the float itself holds exactly 2^62 = 4611686018427387904).

As an example of a number that does not fit in the float, try the number you expected:

 4611690000000000000 

You will notice that it will come out very differently.

+5

A float stores so much information because you are working with a number that is so close to a power of 2.

The float format stores numbers in what is basically binary scientific notation. In your case, it is stored as something like

1.0000000 ... [61 zeros] ... 00000001 * 2 ^ 62.

The float format cannot store 62 binary places after the point (an IEEE 754 single-precision float has only 23 fraction bits), so the final 1 gets cut off ... but we are left with 2 ^ 62, which almost exactly matches the number you are trying to store.

I am not good at producing examples, but CERT is; you can view an example of what happens with botched numeric conversions here. Note that the example is in Java, but C++ uses the same floating-point types. Also, the first example is a conversion between a 4-byte int and a 4-byte float, but that only reinforces your point (there is less integer information to store than in your example, and it still fails).

+4