Convert int to float: how to do it

I am new to the C programming language and I would like to ask a question.

Here the integer is converted to a float, and f (somehow) ends up holding 5.0:

 int i = 5;
 float f = i; // Something happened here...

However, if we try this approach:

 int i = 5;
 float f = *(float *)&i;

f will not get 5.0, because this reinterprets the bit pattern stored in i as a float. So, what is the magic that the compiler actually performs in the first case? It looks like a rather involved job... Can anyone explain it? Thanks.

+7
7 answers

This depends on the implementation, but any floating-point processor will provide instructions that do this conversion.

If you had to convert a two's-complement int to the IEEE float format yourself, you would (a rough C sketch follows the list):

  • take the base-2 log of the integer (closely related to the index of the highest set bit), which gives you the exponent. Bias it and store it in the float's exponent bits.
  • copy the top n bits of the int (starting with the bits after the most significant set bit) into the float's mantissa. n is however many bits of mantissa the float has (23 for a 32-bit single-precision float). If the int has bits left over (that is, if it is greater than 2^24), and the next bit after the ones you have room for is 1, it may or may not round up depending on the IEEE rounding mode in effect.
  • copy the sign bit from the int to the float.
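
Putting those three steps together, here is a rough C sketch of the whole conversion (my own illustration, not taken from this answer; it handles non-zero ints only and truncates instead of rounding):

 #include <stdint.h>
 #include <stdio.h>
 #include <string.h>

 /* Build the IEEE-754 single-precision bit pattern for a non-zero 32-bit int by hand. */
 static float int_to_float_by_hand(int32_t value)
 {
     uint32_t sign = 0;
     uint32_t magnitude = (uint32_t)value;

     if (value < 0) {                                 /* step 3: note the sign bit */
         sign = 1u;
         magnitude = (uint32_t)(-(int64_t)value);
     }

     /* step 1: index of the highest set bit = base-2 log of the magnitude */
     int highest_bit = 31;
     while ((magnitude & (1u << highest_bit)) == 0)
         highest_bit--;
     uint32_t exponent = (uint32_t)(highest_bit + 127);   /* apply the IEEE bias */

     /* step 2: the 23 bits after the leading 1 become the mantissa */
     uint32_t mantissa;
     if (highest_bit > 23)
         mantissa = (magnitude >> (highest_bit - 23)) & 0x7FFFFFu;   /* extra bits truncated */
     else
         mantissa = (magnitude << (23 - highest_bit)) & 0x7FFFFFu;

     uint32_t bits = (sign << 31) | (exponent << 23) | mantissa;

     float f;
     memcpy(&f, &bits, sizeof f);                     /* reinterpret the assembled bit pattern */
     return f;
 }

 int main(void)
 {
     printf("%f %f\n", int_to_float_by_hand(5), int_to_float_by_hand(-47)); /* 5.000000 -47.000000 */
     return 0;
 }

Real hardware does all of this in a single instruction, as the other answers show.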
+13

If you look at the assembly

 int i = 5;
 000D139E  mov  dword ptr [i],5
 float f = i;
 000D13A5  fild dword ptr [i]
 000D13A8  fstp dword ptr [f]

fild is the instruction that does the magic here: it loads the integer from memory and converts it to floating point in the FPU.

+6

On IA32 systems, the compiler generates the following:

 fild dword ptr [i]  ; load the integer into an FPU register; I believe all 32-bit integers can be represented exactly in an FPU register
 fstp dword ptr [f]  ; store the FPU register to RAM, truncating/rounding to 32 bits, so the value may not be the same as i
+2

Converting the bit pattern held in an int into the float value it represents:

 #include <math.h>

 float IntBitsToFloat(long long int bits)
 {
     int sign     = ((bits & 0x80000000) == 0) ? 1 : -1;
     int exponent = (int)((bits & 0x7f800000) >> 23);
     int mantissa = (int)(bits & 0x007fffff);

     mantissa |= 0x00800000;   /* restore the implicit leading 1 */

     /* Calculate the result: */
     float f = (float)(sign * mantissa * pow(2, exponent - 150));
     return f;
 }
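
As a quick sanity check of the function above (assuming #include <stdio.h> and the corrected version shown here), the single-precision bit pattern of 5.0 is 0x40A00000:

 int main(void)
 {
     printf("%f\n", IntBitsToFloat(0x40A00000));   /* prints 5.000000 */
     return 0;
 }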
+2

The magic depends on your platform.

One of the possibilities is that your processor has special instructions for converting between integral values and floating-point registers.

Of course, someone has to design these processors, so by itself that is not really an explanation of the algorithm.

A platform could also use a number format similar to this (it is actually a fixed-point format, used here just as an example):

 [sIIIIFFFF] 

where s is the sign, I is the part before the point and F is the part after the point (the point is virtual, shown only for presentation). For example:

  -47.5000   [sIIII.FFFF]

In this case, the conversion is almost trivial and can be implemented with a bit shift:

  -47.5000 >> 4
  -------------
  -47
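
To make that concrete, here is a tiny sketch in C (the type name, FRAC_BITS and the 4-fractional-bit layout are invented for this illustration, matching [sIIII.FFFF] above):

 #include <stdio.h>
 #include <stdint.h>

 /* Toy fixed-point format scaled by 2^4, i.e. 4 fractional bits. */
 typedef int32_t fixed_t;
 #define FRAC_BITS 4

 /* int -> fixed: the conversion the question asks about is just a shift/multiply */
 static fixed_t int_to_fixed(int i)   { return (fixed_t)(i * (1 << FRAC_BITS)); }

 /* fixed -> int: drop the fractional bits again (truncates toward zero) */
 static int fixed_to_int(fixed_t x)   { return (int)(x / (1 << FRAC_BITS)); }

 int main(void)
 {
     fixed_t five = int_to_fixed(5);   /* 5 becomes 0x50, i.e. 5.0000 */
     fixed_t x = -760;                 /* -47.5000 in this format (-47.5 * 16) */
     printf("0x%X %d\n", (unsigned)five, fixed_to_int(x));   /* prints 0x50 -47 */
     return 0;
 }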

And, as in this example, the C++ implementations you will actually meet use a floating-point representation, usually the so-called IEEE floating point, see also IEEE 754-1985. These are more complex than fixed-point numbers, but they essentially encode a value of the form ±m · 2^e; they have a well-defined interpretation, and you can decompose them into something more workable.
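
For instance, 5.0 decomposes as +1.25 × 2^2; in IEEE-754 single precision that is stored as sign 0, biased exponent 2 + 127 = 129 and fraction 0.25 (binary .01), giving the bit pattern 0x40A00000. That is the pattern i would have had to contain for the question's second snippet to print 5.0.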

+1

In almost all modern systems, the specification for floating-point arithmetic is the IEEE 754 standard. It details everything from the layout in memory to how truncation and rounding behave. It is a large area, and one that is worth knowing in some detail for scientific and engineering programming.

0

Well, I just compiled the code in question under VC++ and looked at the disassembly:

 int i = 5;
 00A613BE  mov  dword ptr [i],5
 float f = i;
 00A613C5  fild dword ptr [i]
 00A613C8  fstp dword ptr [f]
0
