Where does C++ really store a string if the char array meant to hold it is smaller than the string?

I am testing an example about strings in C++ from the book "C++ Premiere".

    const int size = 9;
    char name1[size];
    char name2[size] = "C++owboy"; // 8 characters here
    cout << "Howdy! I'm " << name2 << "! What's your name?" << endl;
    cin >> name1; // I input "Qwertyuiop" - 11 chars. It is more than the size of the name1 array
    // now I do cout
    cout << "Well, your name has " << strlen(name1) << " letters"; // "Your name has 11 letters".
    cout << " and is stored in an array of " << sizeof(name1) << " bytes"; // ...stored in an array of 9 bytes.

How can 11 characters be stored in an array that only has room for 8 characters plus the '\0' char? Does the array get wider during compilation? Or is the string stored somewhere else?

Also, I cannot:

    const int size = 9;
    char name2[size] = "C++owboy_12345"; // assign 14 characters to a 9-char array

But I can do what I wrote above:

    cin >> name1; // a string of any length into a smaller array

What's the trick? I use the NetBeans IDE with Cygwin g++.

+4
4 answers

This is a typical buffer overflow. It is why you should always check the size of the input before you put it in a buffer. Here's what happens:

In C++ (and C), an array name decays to a pointer to the first element of the array in most expressions, including when it is passed to a function. The compiler knows the size of the array and performs some compile-time checks, but at run time the callee sees only a char*.

When you did cin >> name1, you passed a char* to cin. cin does not know how large the allocated space is; all it has is a pointer to some memory. So it assumes you have allocated enough space, writes everything, and runs past the end of the array. Here is a picture:

    Bytes    1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
    Before  |------ name1 array -----| |-- other data -|
    After    Q  w  e  r  t  y  u  i  o  p \0 |-er data-|

As you can see, you have overwritten the other data that was stored after the array. Sometimes this other data is just garbage, but in other cases it is important, and the result can be a bug that is very hard to track down. Not to mention that this is a security vulnerability, since an attacker could overwrite program memory using user input.

The confusion about sizes comes from the fact that strlen counts bytes until it finds '\0' (the null terminator), so it reports the length of what was actually written (10 characters for "Qwertyuiop"), no matter how small the array is. sizeof(name1), on the other hand, is the actual size of the array as known to the compiler.

Because of these problems, a well-designed C function that takes an array as an argument usually takes the size of the array as well. Otherwise, it has no way to tell how large the array is. To avoid these problems, it is much better to use C++ objects such as std::string.

+3

Writing more elements to an array than it can hold invokes undefined behavior. The computer may store the extra data anywhere, or not at all.

Typically, the data ends up in whatever happens to come next in memory. That could be another variable, the instruction stream, or even the control register for a bomb under your chair.

Simply put: you have coded a buffer overflow bug. Do not do this.


Just for fun: undefined behavior is behavior for which the C++ standard imposes no requirements. Anything can happen, since the standard places no restrictions on it.

In one specific case, the behavior increases my bank balance from $10 to $1.8 billion: http://ideone.com/35FQW

Can you understand why this program can behave this way?

+9

name1 is assigned an address in memory. If you write 80 bytes to it, the write will overwrite 80 bytes of memory starting at that address. If a variable is stored at address name1 + 20, its data will be overwritten by your 80-byte write to name1. This is how things work in C and C++; such writes are called buffer overflows, and they can be used to exploit programs.

+5

No trick here :) You are writing to memory outside the buffer, and that is undefined behaviour.

+3