Convert int to float: how to do it

I am new to the C programming language and I would like to ask a question.

Here the integer is converted to a float, and f (somehow) ends up holding 5.0:

 int i = 5;
 float f = i; // Something happened here...

However, if we try this approach:

 int i = 5;
 float f = *(float *)&i;

f will not get 5.0, because this reinterprets the bit pattern stored in i as a float. So, what is the magic that the compiler actually performs in the first case? It looks like a rather involved job... Can anyone explain it? Thanks.

+7
7 answers

This depends on the implementation, but any floating-point processor will provide instructions that do this conversion.

If you had to convert a two's-complement int to the IEEE float format yourself, you would (a rough C sketch follows the list):

  • take the base-2 log of the integer (closely related to the index of the highest set bit), which gives you the exponent. Bias it and store it in the float's exponent bits.
  • copy the top n bits of the int (starting with the bits after the most significant set bit) into the float's mantissa. n is however many bits of mantissa the float has (23 for a 32-bit single-precision float). If the int has bits left over (that is, if it is greater than 2^24), and the next bit after the ones you have room for is 1, it may or may not round up depending on the IEEE rounding mode in effect.
  • copy the sign bit from the int to the float.
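
Putting those three steps together, here is a rough C sketch of the whole conversion (my own illustration, not taken from this answer; it handles non-zero ints only and truncates instead of rounding):

 #include <stdint.h>
 #include <stdio.h>
 #include <string.h>

 /* Build the IEEE-754 single-precision bit pattern for a non-zero 32-bit int by hand. */
 static float int_to_float_by_hand(int32_t value)
 {
     uint32_t sign = 0;
     uint32_t magnitude = (uint32_t)value;

     if (value < 0) {                                 /* step 3: note the sign bit */
         sign = 1u;
         magnitude = (uint32_t)(-(int64_t)value);
     }

     /* step 1: index of the highest set bit = base-2 log of the magnitude */
     int highest_bit = 31;
     while ((magnitude & (1u << highest_bit)) == 0)
         highest_bit--;
     uint32_t exponent = (uint32_t)(highest_bit + 127);   /* apply the IEEE bias */

     /* step 2: the 23 bits after the leading 1 become the mantissa */
     uint32_t mantissa;
     if (highest_bit > 23)
         mantissa = (magnitude >> (highest_bit - 23)) & 0x7FFFFFu;   /* extra bits truncated */
     else
         mantissa = (magnitude << (23 - highest_bit)) & 0x7FFFFFu;

     uint32_t bits = (sign << 31) | (exponent << 23) | mantissa;

     float f;
     memcpy(&f, &bits, sizeof f);                     /* reinterpret the assembled bit pattern */
     return f;
 }

 int main(void)
 {
     printf("%f %f\n", int_to_float_by_hand(5), int_to_float_by_hand(-47)); /* 5.000000 -47.000000 */
     return 0;
 }

Real hardware does all of this in a single instruction, as the other answers show.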
+13

If you look at the assembly

 int i = 5;
 000D139E  mov  dword ptr [i],5
 float f = i;
 000D13A5  fild dword ptr [i]
 000D13A8  fstp dword ptr [f]

fild is the instruction that does the magic here: it loads the integer from memory and converts it to floating point in the FPU.

+6

On IA32 systems, the compiler generates the following:

 fild dword ptr [i]  ; load the integer into an FPU register; I believe all 32-bit integers can be represented exactly in an FPU register
 fstp dword ptr [f]  ; store the FPU register to RAM, truncating/rounding to 32 bits, so the value may not be the same as i
+2

Converting the bit pattern held in an int into the float value it represents:

 #include <math.h>

 float IntBitsToFloat(long long int bits)
 {
     int sign     = ((bits & 0x80000000) == 0) ? 1 : -1;
     int exponent = (int)((bits & 0x7f800000) >> 23);
     int mantissa = (int)(bits & 0x007fffff);

     mantissa |= 0x00800000;   /* restore the implicit leading 1 */

     /* Calculate the result: */
     float f = (float)(sign * mantissa * pow(2, exponent - 150));
     return f;
 }
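
As a quick sanity check of the function above (assuming #include <stdio.h> and the corrected version shown here), the single-precision bit pattern of 5.0 is 0x40A00000:

 int main(void)
 {
     printf("%f\n", IntBitsToFloat(0x40A00000));   /* prints 5.000000 */
     return 0;
 }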
+2

The magic depends on your platform.

One of the possibilities is that your processor has special instructions for converting between integral values and floating-point registers.

Of course, someone has to design these processors, so by itself that is not really an explanation of the algorithm.

A platform could also use a number format similar to this (it is actually a fixed-point format, used here just as an example):

 [sIIIIFFFF] 

where s is the sign, I is the part before the point and F is the part after the point (the point is virtual, shown only for presentation). For example:

  -47.5000   [sIIII.FFFF]

In this case, the conversion is almost trivial and can be implemented with a bit shift:

  -47.5000 >> 4
  -------------
  -47
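
To make that concrete, here is a tiny sketch in C (the type name, FRAC_BITS and the 4-fractional-bit layout are invented for this illustration, matching [sIIII.FFFF] above):

 #include <stdio.h>
 #include <stdint.h>

 /* Toy fixed-point format scaled by 2^4, i.e. 4 fractional bits. */
 typedef int32_t fixed_t;
 #define FRAC_BITS 4

 /* int -> fixed: the conversion the question asks about is just a shift/multiply */
 static fixed_t int_to_fixed(int i)   { return (fixed_t)(i * (1 << FRAC_BITS)); }

 /* fixed -> int: drop the fractional bits again (truncates toward zero) */
 static int fixed_to_int(fixed_t x)   { return (int)(x / (1 << FRAC_BITS)); }

 int main(void)
 {
     fixed_t five = int_to_fixed(5);   /* 5 becomes 0x50, i.e. 5.0000 */
     fixed_t x = -760;                 /* -47.5000 in this format (-47.5 * 16) */
     printf("0x%X %d\n", (unsigned)five, fixed_to_int(x));   /* prints 0x50 -47 */
     return 0;
 }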

And, as in this example, the C++ implementations you will actually meet use a floating-point representation, usually the so-called IEEE floating point, see also IEEE 754-1985. These are more complex than fixed-point numbers, but they essentially encode a value of the form ±m · 2^e; they have a well-defined interpretation, and you can decompose them into something more workable.
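
For instance, 5.0 decomposes as +1.25 × 2^2; in IEEE-754 single precision that is stored as sign 0, biased exponent 2 + 127 = 129 and fraction 0.25 (binary .01), giving the bit pattern 0x40A00000. That is the pattern i would have had to contain for the question's second snippet to print 5.0.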

+1

In almost all modern systems, the specification for floating-point arithmetic is the IEEE 754 standard. It details everything from the layout in memory to how truncation and rounding behave. It is a large area, and one that is worth knowing in some detail for scientific and engineering programming.

0

Well, I just compiled the code in question under VC++ and looked at the disassembly:

 int i = 5;
 00A613BE  mov  dword ptr [i],5
 float f = i;
 00A613C5  fild dword ptr [i]
 00A613C8  fstp dword ptr [f]
0
