Extension of user71404's answer:
```cpp
int f(unsigned x)
{
    if (x <= INT_MAX)
        return static_cast<int>(x);

    if (x >= INT_MIN)
        return static_cast<int>(x - INT_MIN) + INT_MIN;

    throw x; // x does not fit in an int on this implementation
}
```
If x >= INT_MIN (keep the promotion rules in mind: INT_MIN gets converted to unsigned here), then x - INT_MIN <= INT_MAX, so this does not overflow.
If that is not obvious, consider the claim "if x >= -4u, then x + 4 <= 3", and keep in mind that INT_MAX is at least the mathematical value -INT_MIN - 1.
On the most common systems, where !(x <= INT_MAX) implies x >= INT_MIN, the optimizer should be able (and on my system, is able) to remove the second check, determine that the two return statements can compile to the same code, and remove the first check too. Generated assembly listing:
```
__Z1fj:
LFB6:
        .cfi_startproc
        movl    4(%esp), %eax
        ret
        .cfi_endproc
```
The hypothetical implementation in your question:
- INT_MAX equals 32767
- INT_MIN equals -2^32 + 32768
is impossible, and therefore needs no special consideration. INT_MIN will be either -INT_MAX or -INT_MAX - 1. This follows from C's representation of integer types (6.2.6.2), which requires n bits to be value bits, one bit to be a sign bit, and allows only a single trap representation (not counting representations that are invalid because of padding bits), namely the one that would otherwise represent negative zero / -INT_MAX - 1. C++ does not allow any integer representations beyond what C allows.
Update: Microsoft's compiler does not seem to notice that the checks x > 10 and x >= 11 test the same thing. It only generates the desired code if x >= INT_MIN is replaced with x > INT_MIN - 1u, which it can detect as the negation of x <= INT_MAX (on this platform).
[Update from the question's author (Nemo), elaborating on our discussion below]
I now believe this answer works in all cases, but for complicated reasons. I am likely to award the bounty to this solution, but I want to capture all the gory details in case anybody cares.
Let's start with C++11, section 18.3.3:
Table 31 describes the <climits> header.
...
The contents are the same as the Standard C library header <limits.h>.
Here, "Standard C" means C99, whose specification severely constrains the representation of signed integers. They are just like unsigned integers, but with one bit dedicated to "sign" and zero or more bits dedicated to "padding". The padding bits do not contribute to the value of the integer, and the sign bit contributes only as two's complement, ones' complement, or sign-magnitude.
Since C++11 inherits the <climits> macros from C99, INT_MIN is either -INT_MAX or -INT_MAX-1, and hvd's code is guaranteed to work. (Note that, due to padding, INT_MAX could be much less than UINT_MAX/2... But thanks to the way signed->unsigned casts work, this answer handles that fine.)
C++03/C++98 is trickier. It uses the same wording to inherit <climits> from "Standard C", but now "Standard C" means C89/C90.
All of these (C++98, C++03, C89/C90) have the wording I give in my question, but also include this (C++03 section 3.9.1 paragraph 7):
The representations of integral types shall define values by use of a pure binary numeration system. (44) [Example: this International Standard permits 2's complement, 1's complement and signed magnitude representations for integral types.]
Footnote (44) defines "pure binary numeration system":
A positional representation for integers that uses the binary digits 0 and 1, in which the values represented by successive bits are additive, begin with 1, and are multiplied by successive integral powers of 2, except perhaps the bit with the highest position.
What is interesting about this wording is that it contradicts itself, because the definition of "pure binary numeration system" does not permit a sign/magnitude representation! It does allow the high bit to have, say, the value -2^(n-1) (two's complement) or -(2^(n-1) - 1) (ones' complement), but there is no value for the high bit that results in sign/magnitude.
In any case, my "hypothetical implementation" does not qualify as "pure binary" under this definition, so it is ruled out.
However, since the high bit is special anyway, we can imagine it contributing any value at all: a small positive value, a huge positive value, a small negative value, or a huge negative value. (If the sign bit can contribute -(2^(n-1) - 1), why not -(2^(n-1) - 2)? And so on.)
So, imagine a signed integer representation that assigns some unusual value to the "sign" bit.
A small positive value for the sign bit would result in a positive range for int (possibly as large as unsigned's), and hvd's code handles that just fine.
A huge positive value for the sign bit would give int a maximum larger than unsigned's, which is prohibited.
A huge negative value for the sign bit would give int a non-contiguous range of values, which is prohibited by other wording in the spec.
Finally, what about a sign bit that contributes a small negative value? Could a 1 in the "sign bit" contribute, say, -37 to the value of the int? So then INT_MAX would be (say) 2^31 - 1, but INT_MIN would be -37?
This would result in some numbers having two representations... But ones' complement gives two representations to zero, and that is allowed by the "Example". Nowhere does the spec say that zero is the only integer that may have two representations. So I think this hypothetical is allowed by the spec.
Indeed, any negative value from -1 down to -INT_MAX-1 appears to be permissible as the value of the "sign bit", but nothing smaller (lest the range be non-contiguous). In other words, INT_MIN might be anything from -INT_MAX-1 to -1.
Now, guess what? For the second cast in hvd's code to avoid implementation-defined behavior, we just need x - (unsigned)INT_MIN to be less than or equal to INT_MAX. We just showed that INT_MIN is at least -INT_MAX-1. Obviously, x is at most UINT_MAX. Casting a negative number to unsigned is the same as adding UINT_MAX+1. Put it all together:
x - (unsigned)INT_MIN <= INT_MAX

if and only if (taking the worst case x = UINT_MAX):

UINT_MAX - (INT_MIN + UINT_MAX + 1) <= INT_MAX
-INT_MIN - 1 <= INT_MAX
-INT_MIN <= INT_MAX + 1
INT_MIN >= -INT_MAX - 1
We just showed that last line, so even in this perverse case the code actually works.

That exhausts all the possibilities, ending this extremely academic exercise.
Bottom line: There is some seriously underspecified behavior for signed integers in C89/C90 that got inherited by C++98/C++03. It is fixed in C99, and C++11 indirectly inherits the fix by incorporating <limits.h> from C99. But even C++11 retains the self-contradictory "pure binary representation" wording...