What are the consequences of returning -1 from a function whose return type is size_t in C?

I am reading a tutorial, and one example does this. Below I reproduced the example in abbreviated form:

    #include <stdio.h>

    #define SIZE 100

    size_t linearSearch(const int array[], int searchVal, size_t size);

    int main(void)
    {
        int myArray[SIZE];
        int mySearchVal;
        size_t returnValue;

        // populate array with data & prompt user for the search value

        // call linear search function
        returnValue = linearSearch(myArray, mySearchVal, SIZE);

        if (returnValue != -1)
            puts("Value Found");
        else
            puts("Value Not Found");
    }

    size_t linearSearch(const int array[], int key, size_t size)
    {
        for (size_t i = 0; i < size; i++) {
            if (key == array[i])
                return i;
        }
        return -1;
    }

Are there any potential issues with this? I know that size_t is defined as an unsigned integral type, so it seems that returning -1 from a function declared to return size_t might cause problems at some point.

3 answers

Several APIs come to mind that use the maximum signed or unsigned integer value as a sentinel value. For example, the C++ method std::string::find() returns std::string::npos if the value passed to find() is not found in the string, and std::string::npos is (std::string::size_type)-1.

Similarly, on iOS and OS X, the NSArray indexOfObject: method returns NSNotFound when the object cannot be found in the array. Surprisingly, NSNotFound is actually defined as NSIntegerMax, which is INT_MAX on 32-bit platforms or LONG_MAX on 64-bit platforms, even though NSArray indexes are usually NSUInteger (which is unsigned int on 32-bit platforms or unsigned long on 64-bit platforms).

This means that you cannot tell the difference between "not found" and "element number 18,446,744,073,709,551,615" (on 64-bit systems); whether that is an acceptable compromise is up to you.
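A minimal sketch of the sentinel approach in C, modelled on std::string::npos. The names NOT_FOUND and sentinelDemo are hypothetical and do not come from the tutorial; linearSearch keeps the signature from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX */

/* Hypothetical sentinel name, in the spirit of std::string::npos. */
#define NOT_FOUND ((size_t)-1)

/* Returns the index of key in array, or NOT_FOUND if it is absent. */
size_t linearSearch(const int array[], int key, size_t size)
{
    for (size_t i = 0; i < size; i++) {
        if (array[i] == key)
            return i;
    }
    return NOT_FOUND;
}

/* Hypothetical self-check showing how a caller tests the result. */
void sentinelDemo(void)
{
    const int data[] = { 4, 8, 15, 16, 23, 42 };
    size_t n = sizeof data / sizeof data[0];

    assert(linearSearch(data, 15, n) == 2);
    assert(linearSearch(data, 99, n) == NOT_FOUND);
    assert(NOT_FOUND == SIZE_MAX);   /* the sentinel is the largest size_t */
}
```

Comparing against a constant that already has type size_t keeps both operands unsigned, so the comparison does not mix signed and unsigned values.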

An alternative is to have the function return the index through a pointer argument and use the function's return value to indicate success or failure, for example:

    #include <stdbool.h>
    #include <stddef.h>

    bool linearSearch(const int array[], int val, size_t size, size_t *index)
    {
        for (size_t i = 0; i < size; i++) {
            if (array[i] == val) {
                *index = i;   // report the position through the pointer argument
                return true;
            }
        }
        *index = 0;   // optional; in some cases, better to leave *index untouched
        return false;
    }
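As a sketch of how a caller might use this interface, the following assumes the variant that leaves *index untouched on failure. The helper reportSearch is hypothetical and only there to show the calling pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Variant that writes *index only when the value is found. */
bool linearSearch(const int array[], int val, size_t size, size_t *index)
{
    assert(index != NULL);   /* the caller must supply somewhere to store the index */
    for (size_t i = 0; i < size; i++) {
        if (array[i] == val) {
            *index = i;
            return true;
        }
    }
    return false;
}

/* Hypothetical caller: prints the outcome and returns whether val was found. */
bool reportSearch(const int array[], size_t size, int val)
{
    size_t pos;
    if (linearSearch(array, val, size, &pos)) {
        printf("Value %d found at index %zu\n", val, pos);
        return true;
    }
    puts("Value Not Found");
    return false;
}
```

With this design every size_t value remains a valid index, at the cost of an extra argument at each call site.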

Your compiler may decide to complain about comparing signed with unsigned values (GCC or Clang will, if provoked*), but otherwise it "works". On two's-complement machines (most machines today), (size_t)-1 is the same as SIZE_MAX. Indeed, as discussed in extenso in the comments, the same holds on ones'-complement or sign-magnitude machines because of the wording in section 6.3.1.3 of the C99 and C11 standards.
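The equivalence can be checked at compile time. This is a small sketch, assuming a C11 compiler for static_assert; the helper name isNotFound is hypothetical.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>    /* SIZE_MAX */

/* Compile-time check: converting -1 to size_t yields SIZE_MAX. */
static_assert((size_t)-1 == SIZE_MAX, "(size_t)-1 must equal SIZE_MAX");

/* Hypothetical helper: compares against the sentinel with both operands
 * already of type size_t, so no signed/unsigned comparison takes place. */
bool isNotFound(size_t result)
{
    return result == (size_t)-1;
}
```

Writing the cast explicitly, as in isNotFound, states the intent and avoids the sign-compare warning that a bare -1 can trigger.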

Using (size_t)-1 to indicate "not found" means that you cannot distinguish between the last element of the largest possible array and "not found", but this is rarely a real problem.

So, is this just the one edge case in which I might have a problem?

The array would have to be an array of char, though, to be large enough to cause trouble. Although you can have 4 GiB of memory on a 32-bit machine, it is pretty implausible to have all of that memory committed to a single array of characters (and it is much less likely to be a problem on 64-bit machines; most of them do not run to 16 exbibytes of memory). So this is not a practical edge case.

On POSIX, there is a type ssize_t, a signed type of the same size as size_t. You could use it instead of size_t. However, in my experience it causes the same unease that (size_t)-1 causes. In addition, on a 32-bit machine you could have a 3 GiB block of memory treated as a char array, but with ssize_t as the return type you could not usefully index more than 2 GiB, or you would need to use SSIZE_MIN (if it exists; I'm not sure it does) instead of -1 as the signal value.
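For comparison, a sketch of the ssize_t variant. It assumes a POSIX system (ssize_t comes from <sys/types.h>, not ISO C) and that ssize_t has the same width as size_t, as described above. The name linearSearchSigned is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>      /* SIZE_MAX */
#include <sys/types.h>   /* ssize_t (POSIX) */

/* Returns the index of key in array, or -1 if it is absent. */
ssize_t linearSearchSigned(const int array[], int key, size_t size)
{
    /* Assumption: indexes above SIZE_MAX / 2 would not fit in ssize_t. */
    assert(size <= SIZE_MAX / 2);

    for (size_t i = 0; i < size; i++) {
        if (array[i] == key)
            return (ssize_t)i;
    }
    return -1;
}
```

The caller can then test for a negative result, but half of the size_t range is no longer usable as an index.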


* GCC or Clang has to be provoked fairly hard. Just using -Wall is not enough; it takes -Wextra (or the specific -Wsign-compare option) to trigger the warning. Since I usually compile with -Wextra, I know about this problem; not everyone is as vigilant.

Comparing signed and unsigned values is fully defined by the standard, but it can lead to counter-intuitive results (because small negative numbers look very large when converted to unsigned values), which is why compilers complain when asked to do so.
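A short sketch of the counter-intuitive result. The function names are hypothetical. The explicit cast in lessAfterConversion spells out the conversion that the compiler applies implicitly when an int is compared with an unsigned int.

```c
#include <assert.h>
#include <stdbool.h>

/* What "a < b" means for int a and unsigned int b: a is converted to
 * unsigned int first, so a negative a becomes a very large value. */
bool lessAfterConversion(int a, unsigned int b)
{
    return (unsigned int)a < b;
}

/* Compares by mathematical value: any negative a is below any unsigned b. */
bool lessByValue(int a, unsigned int b)
{
    return a < 0 || (unsigned int)a < b;
}

/* Hypothetical self-check contrasting the two comparisons. */
void comparisonDemo(void)
{
    assert(!lessAfterConversion(-1, 1u));   /* -1 converts to UINT_MAX */
    assert(lessByValue(-1, 1u));            /* -1 really is less than 1 */
    assert(lessAfterConversion(0, 1u));     /* non-negative values agree */
    assert(lessByValue(0, 1u));
}
```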


Usually, if you want to return negative values and still have some notion of a size type, you use ssize_t. gcc and clang both complain, but the following compiles. Please note that some of the following is undefined behavior ...

    #include <stdio.h>
    #include <stdint.h>

    size_t foo() { return -1; }

    void print_bin(uint64_t num, size_t bytes);

    void print_bin(uint64_t num, size_t bytes)
    {
        int i = 0;
        for (i = bytes * 8; i > 0; i--) {
            (i % 8 == 0) ? printf("|") : 1;
            (num & 1) ? printf("1") : printf("0");
            num >>= 1;
        }
        printf("\n");
    }

    int main(void)
    {
        long int x = 0;
        printf("%zu\n", foo());
        printf("%ld\n", foo());       // undefined: %ld with a size_t argument
        printf("%zu\n", ~(x & 0));    // undefined: %zu with a long argument
        printf("%ld\n", ~(x & 0));
        print_bin((~(x & 0)), 8);
    }

Output

    18446744073709551615
    -1
    18446744073709551615
    -1
    |11111111|11111111|11111111|11111111|11111111|11111111|11111111|11111111

I'm on a 64-bit machine. In binary, the pattern

 |11111111|11111111|11111111|11111111|11111111|11111111|11111111|11111111 

may mean -1 or 18446744073709551615, depending on the context, i.e. on the type through which this binary representation is interpreted.
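The same point can be shown without relying on mismatched printf formats. This is a sketch that uses the fixed-width types from <stdint.h> and copies the bits with memcpy; the function name reinterpretAsSigned is hypothetical.

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdint.h>
#include <string.h>    /* memcpy */

static_assert(sizeof(int64_t) == sizeof(uint64_t),
              "both types must occupy the same number of bytes");

/* Copies the 64 bits of an unsigned value into a signed object, so the
 * same bit pattern is read back through a different type. */
int64_t reinterpretAsSigned(uint64_t bits)
{
    int64_t value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Passing UINT64_MAX (all 64 bits set) yields -1, because int64_t is required to use a two's-complement representation.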

