Why is this condition not met for comparing negative and positive integers?

  #include <stdio.h>

  int arr[] = {1,2,3,4,5,6,7,8};
  #define SIZE (sizeof(arr)/sizeof(int))

  int main()
  {
      printf("SIZE = %d\n", SIZE);
      if ((-1) < SIZE)
          printf("less");
      else
          printf("more");
  }

The output after compiling with gcc is "more". Why does the if condition fail even though -1 < 8?

+8
c++ c comparison if-statement signed
Aug 15 '13 at 7:14
6 answers

The problem is in your comparisons:

  if ((-1) < SIZE) 

sizeof yields a value of type size_t, which is typically unsigned long, so SIZE is an unsigned long, whereas -1 is just an int. The conversion rules in C and related languages mean that -1 is converted to size_t before the comparison, so -1 becomes a very large positive value (the maximum value of unsigned long).

One way to fix this is to change the comparison to:

  if (-1 < (long long)SIZE) 

although this is actually a meaningless comparison, since an unsigned value is always >= 0 by definition, and the compiler may well warn you about it.

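As @Nobilis later pointed out, you should always enable compiler warnings and take note of them: if you had compiled with, for example, gcc -Wall -Wextra ... the compiler would have warned you of your error (for C code, gcc enables -Wsign-compare with -Wextra; for C++ code, -Wall is enough).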

+17
Aug 15 '13 at 7:15

TL;DR

Be careful with mixed signed/unsigned operations (enable compiler warnings such as -Wall). The Standard has a long section about this. In particular, it is often, but not always, true that the signed value is converted to unsigned (although it is in your specific example). See the explanation below (taken from this Q&A).

Relevant quote from the C++ Standard:

5 Expressions [expr]

10 Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions, which are defined as follows:

[2 clauses on equal types or types of equal sign omitted]

- Otherwise, if the operand that has unsigned integer type has rank greater than or equal to the rank of the type of the other operand, the operand with signed integer type shall be converted to the type of the operand with unsigned integer type.

- Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, the operand with unsigned integer type shall be converted to the type of the operand with signed integer type.

- Otherwise, both operands shall be converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

Your actual example

To find out which of the 3 cases your program falls into, change it slightly to

  #include <stdio.h>

  int arr[] = {1,2,3,4,5,6,7,8};
  #define SIZE (sizeof(arr)/sizeof(int))

  int main()
  {
      printf("SIZE = %zu, sizeof(-1) = %zu, sizeof(SIZE) = %zu \n", SIZE, sizeof(-1), sizeof(SIZE));
      if ((-1) < SIZE)
          printf("less");
      else
          printf("more");
  }

On the Coliru online compiler, this prints 4 and 8 for sizeof(-1) and sizeof(SIZE) respectively, and takes the "more" branch (live example).

The reason is that the unsigned type has a higher rank than the signed type. Therefore clause 1 applies: the signed value is converted to the unsigned type (on most implementations by preserving the bit representation, so it wraps around to a very large unsigned number), and the comparison then selects the "more" branch.

Variations on the topic

Rewriting the condition as if ((long long)(-1) < (unsigned)SIZE) will take the "less" branch (live example).

The reason is that the signed type now has a higher rank than the unsigned type, and can also represent all values of the unsigned type. Therefore clause 2 applies: the unsigned value is converted to the signed type, and the comparison then selects the "less" branch.

Of course, you would never write such a contrived if() with explicit casts, but the same effect can occur when you compare variables of types long long and unsigned. This illustrates that mixed signed/unsigned arithmetic is very subtle and depends on the relative sizes ("rank" in the words of the Standard). In particular, there is no fixed rule saying that signed will always be converted to unsigned.

+11
Aug 15 '13 at 9:07

When you compare a signed and an unsigned value, and the unsigned type has a rank at least equal to that of the signed type (see TemplateRex's answer for the exact rules), the signed value is converted to the unsigned type.

In your case, assuming a 32-bit unsigned type, -1 converted to unsigned is 4294967295. So you are checking whether 4294967295 is less than 8 (it is not).

If you had enabled warnings, the compiler would have warned you that something suspicious is going on:

warning: comparison between signed and unsigned integer expressions [-Wsign-compare]

Since the discussion has shifted a bit towards whether the use of unsigned is appropriate, let me quote James Gosling on the lack of unsigned types in Java (and I will shamelessly link to another post of mine on the topic):

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't, and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

+7
Aug 15 '13 at 7:16

This is a historical design mistake of C that has been repeated in C++.

It dates back to 16-bit computers, and the mistake was the decision to use all 16 bits to represent sizes up to 65535, giving up the ability to represent negative sizes.

This in itself would not have been a mistake if unsigned meant "non-negative integer" (a size cannot logically be negative), but it is a problem given the conversion rules of the language.

Given those conversion rules, an unsigned type in C does not represent a non-negative number; it is instead more like a bitmask (the mathematical term is actually "a member of the ℤ/n ring"). To see why, consider that in C and C++

  • unsigned - unsigned gives unsigned result
  • signed + unsigned gives an unsigned result

Both of them obviously make no sense if you read unsigned as "non-negative number".

Of course, saying that the size of an object is a member of the ℤ/n ring makes no sense at all, and that is where the error lies.

Practical implications:

Every time you deal with the size of an object, be careful, because that value is unsigned, and in C/C++ this type has many properties that are illogical for a number. Always remember that unsigned does not mean "non-negative integer" but "a member of the algebraic ring ℤ/n" and, most dangerous of all, that in a mixed operation an int is converted to unsigned int and not vice versa.

For example:

  void drawPolyline(const std::vector<P2d>& pts)
  {
      for (int i=0; i<pts.size()-1; i++) {
          drawLine(pts[i], pts[i+1]);
      }
  }

is buggy because, if it is passed an empty vector of points, it performs illegal operations (UB). The reason is that pts.size() is unsigned.

The rules of the language convert 1 (an integer) to 1{mod n}, perform the subtraction in ℤ/n, resulting in (size-1){mod n}, then also convert i to a {mod n} representation and perform the comparison in ℤ/n.

C/C++ actually defines a < operator in ℤ/n (rarely done in mathematics), and you end up accessing pts[0], pts[1] ... and so on up to huge numbers, even though the input vector was empty.

A correct loop could be

  void drawPolyline(const std::vector<P2d>& pts)
  {
      for (int i=1; i<pts.size(); i++) {
          drawLine(pts[i-1], pts[i]);
      }
  }

but I usually prefer

  void drawPolyline(const std::vector<P2d>& pts)
  {
      for (int i=0,n=pts.size(); i<n-1; i++) {
          drawLine(pts[i], pts[i+1]);
      }
  }

In other words, get rid of unsigned as soon as possible and just work with regular ints.

Never use unsigned to represent the size of containers or counters, because unsigned means "a member of ℤ/n" and a container size is not one of those things. Unsigned types are useful, but NOT for representing the size of objects.

The standard C/C++ library unfortunately made this wrong choice, and it is too late to fix it. However, you are not required to make the same mistake.

In the words of Bjarne Stroustrup:

Using an unsigned instead of an int to gain one more bit to represent positive integers is almost never a good idea. Attempts to ensure that some values are positive by declaring variables unsigned will typically be defeated by the implicit conversion rules.

+6
Aug 15 '13 at 8:02

Well, I'm not going to repeat the strong words of Paul R, but when you compare unsigned values and plain integers, you are going to run into some bad things.

Use if ((-1) < (int)SIZE)

instead of your if condition.

+2
Aug 15 '13 at 7:18

Convert the unsigned type returned by the sizeof operator to signed.

When you compare an unsigned and a signed number, the compiler implicitly converts the signed one to unsigned. -1 in a 4-byte int is 11111111 11111111 11111111 11111111; when converted to a 32-bit unsigned type, this representation stands for 2^32 - 1.
So basically you are comparing 2^32 - 1 > SIZE, which is true.
You must override this by explicitly casting the unsigned value to a signed type. Since the sizeof operator returns an unsigned type that may be as wide as unsigned long long, cast it to long long:

 if((-1)<(signed long long)SIZE) 

Use this if condition in your code.

-1
Aug 15 '13 at 7:21