Negative zero in C (ones' complement and sign-magnitude)

All these functions give the expected result on my machine. Do they all work on other platforms?

More specifically, if x has the bit representation 0xffffffff on a ones' complement machine or 0x80000000 on a sign-magnitude machine, what does the standard say about the representation of (unsigned)x?

Also, I think the (unsigned) cast in v2, v2a, v3 and v4 is redundant. Is that correct?

Assume sizeof (int) = 4 and CHAR_BIT = 8

    int logicalrightshift_v1 (int x, int n) {
        return (unsigned)x >> n;
    }

    int logicalrightshift_v2 (int x, int n) {
        int msb = 0x4000000 << 1;
        return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
    }

    int logicalrightshift_v2a (int x, int n) {
        return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
    }

    int logicalrightshift_v3 (int x, int n) {
        return ((x & 0x7fffffff) >> n) | (x < 0 ? (unsigned)0x80000000 >> n : 0);
    }

    int logicalrightshift_v4 (int x, int n) {
        return ((x & 0x7fffffff) >> n) | (((unsigned)x & 0x80000000) >> n);
    }

    int logicalrightshift_v5 (int x, int n) {
        unsigned y;
        *(int *)&y = x;
        y >>= n;
        *(unsigned *)&x = y;
        return x;
    }

    int logicalrightshift_v6 (int x, int n) {
        unsigned y;
        memcpy (&y, &x, sizeof (x));
        y >>= n;
        memcpy (&x, &y, sizeof (x));
        return x;
    }
2 answers

If x has the bit representation 0xffffffff on a ones' complement machine or 0x80000000 on a sign-magnitude machine, what does the standard say about the representation of (unsigned)x?

A conversion to unsigned is specified in terms of values, not representations. If you convert -1 to unsigned, you always get UINT_MAX (so if your unsigned is 32 bits, you always get 4294967295). This happens regardless of the representation of signed numbers that your implementation uses.

Similarly, if you convert -0 to unsigned, you always get 0. -0 is numerically equal to 0.
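As a minimal sketch of the value-based rule, the cast can be compared against the same result spelled out arithmetically. The helper names convert_by_cast and convert_by_arithmetic are hypothetical and only serve the illustration.

```c
#include <limits.h>

/* The cast: defined by value on every conforming implementation. */
unsigned convert_by_cast(int x)
{
    return (unsigned)x;
}

/* The same result computed by hand (hypothetical helper). */
unsigned convert_by_arithmetic(int x)
{
    if (x >= 0)
        return (unsigned)x;          /* in range: value is unchanged */

    /* x < 0: the mathematical result is x + (UINT_MAX + 1).
       -(x + 1) lies in [0, INT_MAX], so negating it cannot overflow. */
    return UINT_MAX - (unsigned)(-(x + 1));
}
```

Both functions should agree for every int, whichever signed representation the machine uses, because neither one looks at the bits of x.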

Note that a ones' complement or sign-magnitude implementation is not required to support negative zeros; if it does not, then accessing such a representation causes the program to have undefined behavior.

Going through your functions one by one:

    int logicalrightshift_v1(int x, int n) {
        return (unsigned)x >> n;
    }

The result of this function for negative values of x depends on UINT_MAX, and is additionally implementation-defined if (unsigned)x >> n is not in the range of int. For example, before the conversion back to int, logicalrightshift_v1(-1, 1) computes the value UINT_MAX / 2 regardless of which representation the machine uses for signed numbers.

    int logicalrightshift_v2(int x, int n) {
        int msb = 0x4000000 << 1;
        return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
    }

Almost everything about this could be implementation-defined. Assuming you are trying to create a value in msb with 1 in the sign bit and zeros in the value bits, you cannot do this portably with shifts. You can use ~INT_MAX, but this is allowed to have undefined behavior on a sign-magnitude machine that does not allow negative zeros, and is allowed to give an implementation-defined result on two's complement machines.

The types of 0x7fffffff and 0x80000000 depend on the ranges of the various integer types, which in turn affects how the other values in this expression are promoted.

    int logicalrightshift_v2a(int x, int n) {
        return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
    }

If you create an unsigned value that is not in the range of int (for example, with a 32-bit int, values > 0x7fffffff), then the implicit conversion in the return statement produces an implementation-defined value. The same goes for v3 and v4.
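One way to stay clear of the implementation-defined conversion is to test the range before converting. The sketch below assumes a hypothetical helper named unsigned_to_int_checked; it is not part of the question's code.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores u in *out only when it fits in an int.
   Returns false when the conversion would be implementation-defined. */
bool unsigned_to_int_checked(unsigned u, int *out)
{
    if (u > (unsigned)INT_MAX)
        return false;     /* out of range for int: do not convert */
    *out = (int)u;        /* in range: the value is preserved exactly */
    return true;
}
```

A caller can then decide what an out-of-range result should mean instead of leaving it to the implementation.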

    int logicalrightshift_v5(int x, int n) {
        unsigned y;
        *(int *)&y = x;
        y >>= n;
        *(unsigned *)&x = y;
        return x;
    }

This is still implementation-defined, because it is unspecified whether the sign bit in the representation of int corresponds to a value bit or a padding bit in the representation of unsigned. If it corresponds to a padding bit, the result could be a trap representation, in which case the behavior is undefined.

    int logicalrightshift_v6(int x, int n) {
        unsigned y;
        memcpy (&y, &x, sizeof (x));
        y >>= n;
        memcpy (&x, &y, sizeof (x));
        return x;
    }

The same comments as for v5 apply here.

Also, I think the (unsigned) cast in v2, v2a, v3 and v4 is redundant. Is that correct?

It depends. As a hexadecimal constant, 0x80000000 has type int if that value is in the range of int; otherwise unsigned if that value is in the range of unsigned; otherwise long if that value is in the range of long; otherwise unsigned long (because that value is within the minimum allowed range of unsigned long).

If you want to make sure that it has an unsigned type, then suffix the constant with U, as in 0x80000000U.
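To see which type a given implementation picks, a C11 _Generic selection can report it. This is only a sketch: TYPE_NAME and the two wrapper functions are hypothetical names, and a compiler with C11 support is assumed.

```c
/* Hypothetical macro: maps the type of an expression to a string.
   Requires C11 for _Generic. */
#define TYPE_NAME(e) _Generic((e),      \
    int:           "int",               \
    unsigned int:  "unsigned int",      \
    long:          "long",              \
    unsigned long: "unsigned long",     \
    default:       "other")

/* Type of the constant without a suffix: depends on the range of int. */
const char *type_of_unsuffixed(void)
{
    return TYPE_NAME(0x80000000);
}

/* Type of the constant with the U suffix: always an unsigned type. */
const char *type_of_suffixed(void)
{
    return TYPE_NAME(0x80000000U);
}
```

With a 32-bit int, both functions should report "unsigned int", which is the case where the cast is redundant. On an implementation with a wider int, the unsuffixed constant would be a plain int instead.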


Summary:

  • Converting a number greater than INT_MAX to int gives an implementation-defined result (or, indeed, allows an implementation-defined signal to be raised).

  • Converting an out-of-range number to unsigned is done by repeatedly adding or subtracting UINT_MAX + 1, which means that it depends on the mathematical value, not on the representation.

  • Inspecting a negative int representation as unsigned is not portable (positive int representations are fine, though).

  • Generating a negative zero with bitwise operators and then trying to use the resulting value is not portable.

If you want "logical shifts", then you should use unsigned types everywhere. The signed types are designed for algorithms in which the value is what matters, not the representation.
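A minimal sketch of that advice, with a hypothetical function name and the extra assumption that unsigned has no padding bits (so its width is sizeof times CHAR_BIT):

```c
#include <limits.h>

/* Hypothetical helper: logical right shift done entirely in unsigned. */
unsigned logical_right_shift(unsigned x, unsigned n)
{
    /* Shifting by the width of the type or more is undefined behavior,
       so that case is handled explicitly. The width computed here
       assumes unsigned has no padding bits. */
    if (n >= sizeof x * CHAR_BIT)
        return 0u;
    return x >> n;    /* zero-fills from the left for unsigned operands */
}
```

Because both the operand and the result are unsigned, no conversion back to int takes place and nothing depends on how negative numbers are represented.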


If you follow the standard to the letter, none of these is guaranteed to behave the same on all platforms.

In v5, you violate strict aliasing, which is undefined behavior.

In v2 - v4, you have a signed right shift, which is implementation-defined. (See the comments for more details.)

In v1, you have a signed-to-unsigned cast, which is implementation-defined when the number is out of range.

EDIT:

v6 might actually work given the following assumptions:

  • 'int' is either two's complement or ones' complement.
  • unsigned and int are exactly the same size (in both bytes and bits, and are tightly packed).
  • The endianness of unsigned matches that of int.
  • The padding and bit layout are the same. (See caf's comment for more details.)
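Some of these assumptions can be checked at compile time. The sketch below uses C11 _Static_assert and a hypothetical function name. The checks are necessary conditions only: they cannot detect a differing byte order or bit layout between int and unsigned.

```c
#include <limits.h>
#include <string.h>

/* Same size in bytes. */
_Static_assert(sizeof(unsigned) == sizeof(int),
               "unsigned and int differ in size");

/* unsigned has exactly one more value bit than int, so the sign bit of
   int can line up with a value bit of unsigned rather than padding. */
_Static_assert(UINT_MAX / 2 == INT_MAX,
               "unsigned does not have one more value bit than int");

/* Hypothetical variant of v6. The caller must keep n non-negative and
   smaller than the width of unsigned. */
int logical_right_shift_memcpy(int x, int n)
{
    unsigned y;
    memcpy(&y, &x, sizeof y);   /* copy the object representation */
    y >>= n;                    /* logical shift on the unsigned copy */
    memcpy(&x, &y, sizeof x);   /* copy the bits back into the int */
    return x;
}
```

If either assertion fails, the build stops instead of producing a function whose result depends on padding bits.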