What float values could not be converted to int without undefined behavior [C++]?

Question

I just read this in the C++14 standard (emphasis mine):

4.9 Floating-integral conversions [conv.fpint]
1 A prvalue of a floating point type can be converted to a prvalue of an integer type. The conversion truncates; that is, the fractional part is discarded. The behavior is undefined if the truncated value cannot be represented in the destination type. [...]

Which made me wonder:

What values of float, if any, cannot be represented as int after truncation? (Does it depend on the implementation?)
If there are any, does this mean that auto x = static_cast<int>(float) is unsafe?
What is the correct/safe way to convert float to int, then (assuming you want truncation)?

+6

c++ type-conversion implicit-conversion c++14

ricab Jan 31 '18 at 17:58

2 answers

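This is implementation-dependent in principle, but in practice the answer is fixed on platforms with IEEE 754 4-byte floats, 8-byte doubles, and two's-complement integers (int32_t at 4 bytes, int64_t at 8 bytes).

The edge cases below are given as bit patterns. To turn them into floating-point values without type punning (which is itself UB), copy the bits with memcpy.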

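Which values trigger UB depends on the pair of types involved and, where the source type has fractional precision at that magnitude, on whether you truncate or round, as with double → int32_t. For float, the min/max boundaries are not the integer limits themselves, because float cannot represent every integer of that size.

For the same reason, comparing against INT_MIN/INT_MAX (converted to floating point) is not reliable: the converted limit may be rounded, so the comparison can accept a value that does not fit.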

Converting Inf/NaN is UB as well.

// float->int64 edgecases
static const uint32_t FloatbitsMaxFitInt64 = 0x5effffff; // [9223371487098961920] the bit pattern of the largest float which still fits into a signed int64
static const uint32_t FloatbitsMinNofitInt64 = 0x5f000000; // [9223372036854775808] the bit pattern of the smallest float which is too big for a signed int64
static const uint32_t FloatbitsMinFitInt64 = 0xdf000000; // [-9223372036854775808] the bit pattern of the smallest float which still fits into a signed int64
static const uint32_t FloatbitsMaxNotfitInt64 = 0xdf000001; // [-9223373136366403584] the bit pattern of the largest float which is too small for a signed int64

// float->int32 edgecases
static const uint32_t FloatbitsMaxFitInt32 = 0x4effffff; // [2147483520] the bit pattern of the largest float which still fits into a signed int32
static const uint32_t FloatbitsMinNofitInt32 = 0x4f000000; // [2147483648] the bit pattern of the smallest float which is too big for a signed int32
static const uint32_t FloatbitsMinFitInt32 = 0xcf000000; // [-2147483648] the bit pattern of the smallest float which still fits into a signed int32
static const uint32_t FloatbitsMaxNotfitInt32 = 0xcf000001; // [-2147483904] the bit pattern of the largest float which is too small for a signed int32

// double->int64 edgecases
static const uint64_t DoubleBitsMaxFitInt64 = 0x43dfffffffffffff; // [9223372036854774784] Largest double which fits into an int64
static const uint64_t DoubleBitsMinNofitInt64 = 0x43e0000000000000; // [9223372036854775808] Smallest double which is too big for an int64
static const uint64_t DoubleBitsMinFitInt64 = 0xc3e0000000000000; // [-9223372036854775808] Smallest double which fits into an int64
static const uint64_t DoubleBitsMaxNotfitInt64 = 0xc3e0000000000001; // [-9223372036854777856] largest double which is too small to fit into an int64

// double->int32 edgecases[when truncating(round towards zero)]
static const uint64_t DoubleBitsMaxTruncFitInt32 = 0x41dfffffffffffff; // [~2147483647.9999998] Largest double that when truncated will fit into an int32
static const uint64_t DoubleBitsMinTruncNofitInt32 = 0x41e0000000000000; // [2147483648.0000000] Smallest double that when truncated won't fit into an int32
static const uint64_t DoubleBitsMinTruncFitInt32 = 0xc1e00000001fffff; // [~-2147483648.9999995] Smallest double that when truncated will fit into an int32
static const uint64_t DoubleBitsMaxTruncNofitInt32 = 0xc1e0000000200000; // [-2147483649.0000000] Largest double that when truncated won't fit into an int32

// double->int32 edgecases [when rounding via bankers method(round to nearest, round to even on half)]
static const uint64_t DoubleBitsMaxRoundFitInt32 = 0x41dfffffffdfffff; // [~2147483647.4999998] Largest double that when rounded will fit into an int32
static const uint64_t DoubleBitsMinRoundNofitInt32 = 0x41dfffffffe00000; // [2147483647.5000000] Smallest double that when rounded won't fit into an int32
static const uint64_t DoubleBitsMinRoundFitInt32 = 0xc1e0000000100000; // [-2147483648.5000000] Smallest double that when rounded will fit into an int32
static const uint64_t DoubleBitsMaxRoundNofitInt32 = 0xc1e0000000100001; // [~-2147483648.5000005] Largest double that when rounded won't fit into an int32
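
A few of the bit patterns above can be spot-checked by decoding them and comparing against the bracketed values. The following is a minimal sketch, assuming IEEE 754 binary32/binary64; bits_to_float, bits_to_double and check_edge_constants are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: reinterpret a 32-bit pattern as a float via memcpy.
inline float bits_to_float(std::uint32_t bits) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits");
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical helper: reinterpret a 64-bit pattern as a double via memcpy.
inline double bits_to_double(std::uint64_t bits) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "double must be 64 bits");
    double d;
    std::memcpy(&d, &bits, sizeof d);
    return d;
}

// Spot checks; every literal here is exactly representable in its type.
inline void check_edge_constants() {
    assert(bits_to_float(0x4effffffu) == 2147483520.0f);   // 2^31 - 2^7
    assert(bits_to_float(0x4f000000u) == 2147483648.0f);   // 2^31
    assert(bits_to_float(0xcf000000u) == -2147483648.0f);  // -2^31
    assert(bits_to_double(0x41e0000000000000ull) == 2147483648.0);
    assert(bits_to_double(0xc1e0000000200000ull) == -2147483649.0);
    assert(bits_to_double(0x41dfffffffe00000ull) == 2147483647.5);
}
```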

These constants can be used like this:

if( f >= B2F(FloatbitsMinFitInt32) && f <= B2F(FloatbitsMaxFitInt32))
    // cast is valid.

where B2F is:

#include <stdint.h>
#include <string.h>

float B2F(uint32_t bits)
{
    static_assert(sizeof(float) == sizeof(uint32_t), "Weird arch");
    float f;
    memcpy(&f, &bits, sizeof(float)); // copy the bits instead of type punning
    return f;
}

Note that this rejects NaNs/inf correctly (the comparisons fail), but it may not work in non-IEEE 754 modes (for example, -ffast-math on gcc or /fp:fast on msvc).
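
The same range check can also be written without bit patterns, because both boundaries of int32 are powers of two and therefore exact as floats. This is a minimal sketch under the same IEEE 754 assumption; try_float_to_int32 and try_float_to_int32_examples are hypothetical names, not part of any library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: stores the truncated value in `out` and returns true
// only when the float -> int32 conversion is well defined.
inline bool try_float_to_int32(float f, std::int32_t& out) {
    static_assert(std::numeric_limits<float>::is_iec559, "assumes IEEE 754 float");
    const float lower = -2147483648.0f; // -2^31: smallest float that still fits
    const float upper = 2147483648.0f;  //  2^31: smallest float that no longer fits
    // NaN fails every comparison and +/-infinity falls outside the range,
    // so both are rejected by the negated test below.
    if (!(f >= lower && f < upper)) {
        return false;
    }
    out = static_cast<std::int32_t>(f); // truncates toward zero
    return true;
}

// A few illustrative calls.
inline void try_float_to_int32_examples() {
    std::int32_t v = 0;
    assert(try_float_to_int32(-1.9f, v) && v == -1);
    assert(try_float_to_int32(2147483520.0f, v) && v == 2147483520);
    assert(!try_float_to_int32(2147483648.0f, v));
    assert(!try_float_to_int32(std::numeric_limits<float>::quiet_NaN(), v));
}
```

The upper bound is written as the literal 2^31 with a strict comparison rather than as a converted INT_MAX, since 2147483647 is not representable in float and would typically round up to 2^31.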

+7

Mike Vine Jan 31 '18 at 18:37

anatolyg · Accepted Answer · 2018-01-31T18:45:31+0000

It is no surprise that some float values are outside the range of int. Floating-point values were invented to represent very large (as well as very small) values.

INT_MAX + 1 (usually equal to 2147483648) cannot be represented by int, but can be represented by float.
Yes, static_cast<int>(float) is as dangerous as undefined behavior gets. However, something as simple as x + y, for sufficiently large integers x and y, is also UB, so there is nothing surprising here.

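As far as I know, there is no standard way to do this in C++. Boost has numeric_cast, which throws an exception on overflow; that may or may not be what you want. If you want saturation instead (clamping to INT_MIN and INT_MAX), you can write something like: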

float f;
int i;
...
if (static_cast<double>(INT_MIN) <= f && f < static_cast<double>(INT_MAX))
    i = static_cast<int>(f);
else if (f < 0)
    i = INT_MIN;
else
    i = INT_MAX;

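This is not ideal, though. Can double represent every int on your system? If int is 32 bits, yes. But what if int is 64 bits? For that reason it is better to use something like boost::numeric_cast, if it fits your needs.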
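
One way to avoid relying on double is to compare against 2^digits, which is a power of two and hence exact in any binary floating-point type with enough exponent range. The sketch below is an illustration only: saturating_trunc is a hypothetical helper (not a Boost or standard API), it assumes a two's-complement signed target, and mapping NaN to 0 is an arbitrary choice that differs from the snippet above.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <type_traits>

// Hypothetical helper: truncate toward zero, clamping out-of-range values.
template <typename Int, typename Float>
Int saturating_trunc(Float f) {
    static_assert(std::is_integral<Int>::value && std::is_signed<Int>::value,
                  "signed integer target expected");
    static_assert(std::is_floating_point<Float>::value,
                  "floating-point source expected");
    // 2^digits is one past the largest Int (2^31 for a 32-bit int).
    const Float limit = std::ldexp(Float(1), std::numeric_limits<Int>::digits);
    if (std::isnan(f)) return 0;                               // arbitrary NaN policy
    if (f >= limit)    return std::numeric_limits<Int>::max(); // too large, or +inf
    if (f < -limit)    return std::numeric_limits<Int>::min(); // too small, or -inf
    return static_cast<Int>(f);                                // now well defined
}

// A few illustrative calls.
inline void saturating_trunc_examples() {
    assert(saturating_trunc<int>(-2.7f) == -2);
    assert(saturating_trunc<int>(1e10f) == std::numeric_limits<int>::max());
    assert(saturating_trunc<int>(-1e10f) == std::numeric_limits<int>::min());
    assert(saturating_trunc<long long>(9.3e18) == std::numeric_limits<long long>::max());
}
```

Values strictly between -2^digits - 1 and -2^digits (possible for double → int32) are clamped to the minimum, which is also what truncation would give.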
