Why are unsigned integers error prone?

I watched this video. Bjarne Stroustrup says that unsigned ints are error prone and lead to bugs, so you should only use them when you really need them. I also read in one of the Stack Overflow questions (but I don't remember which one) that using unsigned ints can lead to security bugs.

How do they lead to security bugs? Can someone explain this clearly with a suitable example?

+60
c++ unsigned-integer
May 22 '15 at 11:11
8 answers

One possible aspect is that unsigned integers can lead to tricky problems in loops, because underflow wraps around to a large number. I can't count (even with an unsigned integer!) how many times I have made a variant of this error:

for(size_t i = foo.size(); i >= 0; --i) ... 

Note that, by definition, i >= 0 is always true for an unsigned type, so the loop never terminates. (This is mitigated somewhat by the fact that if i were signed, the compiler would warn about the possible overflow when assigning it the size_t returned by size().)
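One way to write the descending loop correctly with an unsigned index is to test before decrementing. A minimal sketch (the vector type and the loop body are just placeholders):

    #include <cstddef>
    #include <vector>

    void walk_backwards(const std::vector<int>& foo) {
        // i-- > 0 tests the old value and then decrements, so the body sees
        // the indices size()-1 down to 0 and the loop stops after i == 0.
        // An empty vector skips the loop entirely.
        for (std::size_t i = foo.size(); i-- > 0; ) {
            // ... use foo[i] here ...
        }
    }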

There are other reasons, mentioned in "Danger – unsigned types used here!", the most compelling of which, in my opinion, is the implicit type conversion between signed and unsigned.

+49
May 22 '15 at 11:19

One big factor is that it complicates loop logic: imagine you want to iterate over all but the last element of an array (which does happen in the real world). So you write your function:

    void fun (const std::vector<int> &vec) {
        for (std::size_t i = 0; i < vec.size() - 1; ++i)
            do_something(vec[i]);
    }

It looks fine, right? It even compiles cleanly with very high warning levels! (Live) So you put this into your code base, all tests pass, and you forget about it.

Now, later on, someone comes along and passes an empty vector to your function. With a signed integer, you would hopefully have noticed the sign-compare compiler warning, introduced the appropriate cast, and not shipped the buggy code in the first place.

But with your unsigned size type, vec.size() - 1 wraps around and the loop condition effectively becomes i < SIZE_MAX. Disaster, UB, and most likely a crash!
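One sketch of a fix is to move the arithmetic to the side that cannot wrap, comparing i + 1 < vec.size() instead of subtracting from the unsigned size:

    #include <cstddef>
    #include <vector>

    void do_something(int);   // assumed to exist, as in the snippet above

    void fun(const std::vector<int> &vec) {
        // i + 1 never underflows, so an empty vector simply skips the loop
        // instead of turning the condition into i < SIZE_MAX.
        for (std::size_t i = 0; i + 1 < vec.size(); ++i)
            do_something(vec[i]);
    }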

I want to know how they lead to security errors?

This is also a security problem, specifically a buffer overflow. One way this could potentially be exploited is if do_something does something an attacker can observe. The attacker might then be able to find out what input was fed into do_something, and in that way data the attacker should not be able to access could be leaked from memory. This would be a scenario similar to the Heartbleed bug. (Thanks to ratchet freak for pointing this out in a comment.)

+36
May 22 '15 at 11:18

I'm not going to watch the video just to answer this question, but one issue is the confusing conversions that can happen when you mix signed and unsigned values. For example:

    #include <iostream>

    int main() {
        unsigned n = 42;
        int i = -42;
        if (i < n) {
            std::cout << "All is well\n";
        } else {
            std::cout << "ARITHMETIC IS BROKEN!\n";
        }
    }

The usual arithmetic conversions mean that i is converted to unsigned for the comparison, yielding a large positive number and a surprising result.
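A sketch of two ways to get the numerically expected result: an explicit cast (assuming n is known to fit in an int), or the C++20 std::cmp_less, which compares the mathematical values:

    #include <iostream>
    #include <utility>   // std::cmp_less, C++20

    int main() {
        unsigned n = 42;
        int i = -42;

        if (i < static_cast<int>(n))   // cast: compare both as signed
            std::cout << "Cast: all is well\n";

        if (std::cmp_less(i, n))       // C++20: compares the actual values
            std::cout << "cmp_less: all is well\n";
    }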

+23
May 22 '15 at 11:23

Although this may only be considered a variant of the existing answers: referring to "Signed and Unsigned Types in Interfaces", C++ Report, September 1995, by Scott Meyers, it is especially important to avoid unsigned types in interfaces.

The problem is that it becomes impossible to detect certain errors that clients of the interface can make (and if they can make them, they will make them).

The given example:

    template <class T>
    class Array {
    public:
        Array(unsigned int size);
        ...

and a possible instantiation of this class:

    int f();                     // f and g are functions that return
    int g();                     // ints; what they do is unimportant

    Array<double> a(f()-g());    // array size is f()-g()

The difference between the values returned by f() and g() can be negative for any number of reasons. The constructor of the Array class will receive this difference as a value that is implicitly converted to unsigned. As the implementer of the Array class you therefore cannot distinguish between an erroneously passed value of -1 and a request for a very large array allocation.
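A sketch of the kind of interface Meyers argues for: take a signed size, so an erroneously negative value remains visible and can be rejected (the exception shown here is only one possible way to handle it, not part of the original article):

    #include <cstddef>
    #include <stdexcept>

    template <class T>
    class Array {
    public:
        explicit Array(int size) {                 // signed: -1 stays -1
            if (size < 0)
                throw std::invalid_argument("Array: negative size");
            size_ = static_cast<std::size_t>(size);
            // ... allocate size_ elements ...
        }
    private:
        std::size_t size_ = 0;
    };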

+11
May 22 '15 at 15:04

The big problem with unsigned int is that if you subtract 1 from an unsigned int whose value is 0, the result is not a negative number, and it is not smaller than the number you started with; it is the largest possible unsigned int value.

    unsigned int x = 0;
    unsigned int y = x - 1;

    if (y > x)
        printf ("What a surprise!\n");

And this is what makes unsigned int error prone. Of course unsigned int works exactly the way it is designed to work. It is absolutely safe if you know what you are doing and make no mistakes. But most people make mistakes.
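If you really do need to step an unsigned value downwards, a minimal sketch of the defensive pattern is to check for zero before subtracting:

    #include <cstdio>

    int main() {
        unsigned int x = 0;

        if (x > 0) {
            unsigned int y = x - 1;   // only reached when x >= 1, so no wraparound
            printf("y = %u\n", y);
        } else {
            printf("x is already 0, not decrementing\n");
        }
    }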

If you use a good compiler, turn on all the warnings it can generate, and it will tell you when you are doing dangerous things that are likely to be mistakes.

+4
May 22 '15 at 11:59

The problem with unsigned integer types is that, depending on their size, they can represent one of two different things:

  • Unsigned types smaller than int (for example, uint8_t) hold numbers in the range 0..2ⁿ-1, and calculations with them behave according to the rules of integer arithmetic, provided they do not exceed the range of type int. Under those rules, if such a calculation exceeds the range of int, the compiler is allowed to do anything it likes with the code, even things that would seem to negate the laws of time and causality (some compilers will do precisely that!), and even if the result of the calculation would be assigned back to an unsigned type smaller than int.
  • Unsigned types of size unsigned int and larger hold members of the abstract wrapping algebraic ring of integers congruent mod 2ⁿ; this effectively means that if a calculation falls outside the range 0..2ⁿ-1, the system will add or subtract whatever multiple of 2ⁿ is needed to bring the value back into range.

Consequently, given uint32_t x=1, y=2; the expression x-y may have one of two meanings depending on whether int is larger than 32 bits (a sketch after the list below illustrates the typical case):

  • If int is larger than 32 bits, the expression will subtract the number 2 from the number 1, yielding the number -1. Note that although a variable of type uint32_t cannot hold the value -1 regardless of the size of int, and storing -1 into such a variable would make it hold 0xFFFFFFFF, unless or until the value is coerced to an unsigned type it will behave like the signed quantity -1.
  • If int is 32 bits or smaller, the expression will yield a uint32_t value which, when added to the uint32_t value 2, will yield the uint32_t value 1 (i.e. the uint32_t value 0xFFFFFFFF).
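A sketch of the second case on a typical platform where int is 32 bits, together with a cast that pins down the wrapped value regardless of how wide int is:

    #include <cstdint>
    #include <iostream>

    int main() {
        std::uint32_t x = 1, y = 2;

        // With 32-bit int there is no promotion: x - y wraps to 0xFFFFFFFF
        // and this prints 4294967295. If int were wider than 32 bits, x - y
        // would instead be the int value -1.
        std::cout << x - y << "\n";

        // Coercing the result back to uint32_t yields 0xFFFFFFFF either way.
        std::cout << static_cast<std::uint32_t>(x - y) << "\n";
    }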

IMHO, this problem could be solved cleanly if C and C++ were to define new unsigned types [e.g. unum32_t and uwrap32_t], such that a unum32_t would always behave as a number, regardless of the size of int (possibly requiring that the right-hand operand of a subtraction or unary minus be promoted to the next larger signed type if int is 32 bits or smaller), while a uwrap32_t would always behave as a member of an algebraic ring (blocking promotions even if int were larger than 32 bits). In the absence of such types, however, it is often impossible to write code that is both portable and clean, since portable code often requires type coercions all over the place.

+2
May 22 '15 at 17:45

The integer conversion rules in C and C++ are a byzantine mess. Using unsigned types exposes you to that mess to a much greater degree than using purely signed types.

Take, for example, the simple case of comparing two variables, one signed and the other unsigned (a short sketch follows the list below):

  • If both operands are smaller than int, they will both be converted to int, and the comparison will give numerically correct results.
  • If the unsigned operand is smaller than the signed operand, then both will be converted to the type of the signed operand, and the comparison will give numerically correct results.
  • If the unsigned operand is greater than or equal in size to the signed operand, and also greater than or equal in size to int, then both will be converted to the type of the unsigned operand. If the value of the signed operand is less than zero, this will lead to numerically incorrect results.
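A short sketch of the first and third cases, assuming the common 16-bit short and 32-bit int:

    #include <iostream>

    int main() {
        short          ss = -1;
        unsigned short us = 1;
        int            si = -1;
        unsigned int   ui = 1;

        // Both operands smaller than int: each is promoted to int,
        // so the comparison is numerically correct.
        std::cout << (ss < us) << "\n";   // prints 1 (true)

        // Unsigned operand as large as int: the signed operand is converted
        // to unsigned, -1 becomes a huge value, and the result is wrong.
        std::cout << (si < ui) << "\n";   // prints 0 (false)
    }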

As another example, consider the multiplication of two unsigned integers of the same size.

  • If the operand size is greater than or equal to the size of int, the multiplication has well-defined wraparound semantics.
  • If the operand size is smaller than int, but greater than or equal to half the size of int, there is the possibility of undefined behavior (see the sketch after this list).
  • If the operand size is less than half the size of int, the multiplication will give numerically correct results. Assigning that result back to a variable of the original unsigned type will produce well-defined wraparound semantics.
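A sketch of the middle case, assuming the common 16-bit unsigned short and 32-bit int:

    #include <cstdint>

    std::uint32_t square(std::uint16_t a) {
        // Without the cast, a * a would promote both operands to (signed) int;
        // for a = 65535 the product 4294836225 overflows 32-bit int, which is
        // undefined behavior. The cast makes the multiplication happen in
        // unsigned int instead, which wraps mod 2^32 by definition.
        return static_cast<std::uint32_t>(a) * a;
    }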
+2
May 22 '17 at 12:07 a.m.

In addition to the range/wraparound issue with unsigned types, using a mix of unsigned and signed integer types has a significant processor performance impact. It is smaller than the cost of floating-point casts, but too large to ignore. Additionally, the compiler may place a range check on the value and change the behavior of further checks.

-3
May 22 '15 at 13:31


