The string is not NUL-terminated, but it still behaves normally. Why?

In the following code, I copy a string into a char * buffer of 10 characters using strncpy().

Now, according to the strncpy() manual: "Warning: if there is no null byte among the first n bytes of src, the string placed in dest will not be null-terminated." This is exactly what is happening here.

The source string is 26 characters long and I copy 10 characters, so no null character is placed at the end of the destination string.

But when I print the contents of the string, starting from index 0 until I reach '\0', it behaves fine.

Why? When there is no '\0' at the end, why does the loop stop at the right place?

As I understand it, it should cause a segmentation fault, or at least it should not stop there and should continue to print some garbage values.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define SIZE 10

    int main()
    {
        char *str ;
        str = malloc( sizeof( char ) * SIZE );
        if( str == NULL )
            exit( 1 );
        memset( str, 0, sizeof( char ) * SIZE );

        strncpy( str, "abcdefghijklmnopqrstuvwxyz", sizeof( char ) * SIZE );

        unsigned int index;
        for( index = 0; str[ index ] != '\0' ; index++ ) {
            printf( "str[ %u ] has got : %c \n ", index, str[ index ] );
        }

        return 0;
    }

Here is the result:

    str[ 0 ] has got : a
    str[ 1 ] has got : b
    str[ 2 ] has got : c
    str[ 3 ] has got : d
    str[ 4 ] has got : e
    str[ 5 ] has got : f
    str[ 6 ] has got : g
    str[ 7 ] has got : h
    str[ 8 ] has got : i
    str[ 9 ] has got : j

Any help would be appreciated.

EDIT

Is there a way to check whether a string ends with '\0' or not? I always thought the loop above was the definitive test, but now it doesn't seem to be.

Suppose we get a string from some function written by another programmer. How do we know that it ends at the right place with '\0'? If it doesn't, reading will run past the actual size until we hit some '\0'. We will never know the actual size of the string.

So how do we solve this situation?

Any suggestion?

+6
c string pointers
6 answers

Regarding your edit, I think being pedantic will help clear up some issues.

In C there is no such thing as a string. There is the concept of a "C string" that the C standard library works with, which is defined as nothing more than a NUL-terminated sequence of characters, so there really is no such thing as a "non-NUL-terminated string" in C. So your question is better worded as "How do I determine whether an arbitrary character buffer is a valid C string?" or "How do I determine whether the string I found was the intended string?"

The answer to the first question, unfortunately, is to simply scan the buffer until you come across a NUL byte, as you are doing. This gives you the length of the C string.

The second question has no easy answer. Because C does not have an actual string type with length metadata (or the ability to carry the size of arrays across function calls), there is no real way to determine whether the string length found above is the length of the intended string. It may be obvious if we start seeing segfaults in the program or garbage in the output, but in general we are stuck performing string operations by scanning to the first NUL byte (usually with an upper bound on the string length, to avoid nasty buffer overrun errors).
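The bounded scan described here can be sketched as follows. This is only an illustration under assumptions: the helper name `bounded_length` is made up, and it presumes the caller knows the capacity `cap` of the buffer, which the question's scenario does not guarantee. It uses the standard `memchr` to look for a NUL byte within the first `cap` bytes and never reads past them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the question): returns 1 and stores the
   length in *len if a NUL byte is found within the first cap bytes of
   buf; returns 0 otherwise. Never reads beyond buf[cap - 1]. */
int bounded_length(const char *buf, size_t cap, size_t *len)
{
    const char *nul = memchr(buf, '\0', cap);
    if (nul == NULL)
        return 0;                   /* no terminator inside the known capacity */
    *len = (size_t)(nul - buf);
    return 1;
}
```

A return value of 0 only says that no terminator was found inside the capacity you passed in; it cannot tell you whether a terminator that was found is the one the producer intended.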

+6

It just so happens that there is a null byte right past the end of the allocated block.

Most likely, malloc() allocates more memory than requested and either puts so-called guard values there that contain null bytes, or puts some metadata there to be used by free() later, and this metadata happens to contain a null byte at that position.

In any case, you should not rely on this behavior. You must request (malloc()) one more byte for the null character, so that the location of the null character is also legally allocated to you.
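As a minimal sketch of that advice (the helper name `copy_prefix` is hypothetical and not part of the question's code), one can allocate `n + 1` bytes and write the terminator explicitly after `strncpy`, since `strncpy` itself does not add one when the source is `n` characters or longer:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies at most n characters of src into a freshly
   allocated buffer of n + 1 bytes and always terminates it.
   Returns NULL if malloc fails; the caller must free the result. */
char *copy_prefix(const char *src, size_t n)
{
    char *dst = malloc(n + 1);      /* one extra byte for the terminator */
    if (dst == NULL)
        return NULL;
    strncpy(dst, src, n);           /* may leave dst without a '\0' ... */
    dst[n] = '\0';                  /* ... so terminate it explicitly */
    return dst;
}
```

With this, a loop like the one in the question stops at index 10 because of a byte that belongs to the allocation, not because of whatever happens to follow it in memory.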

There is no portable way to verify that a string is properly null-terminated. It may happen that as soon as you read past the allocated block, your program simply crashes. Or it may happen that a null character is somewhere outside the block, and later you overwrite memory outside the block when you manipulate the misinterpreted string.

Ideally, you would need some function that checks whether a given address is allocated to you and belongs to the same allocation as another given address (possibly the beginning of the block). That would be slow and not worth it, and there is no standard way to do it.

In other words, if you come across a string that is supposed to be null-terminated but actually isn't, you are in serious trouble: your program runs into undefined behavior.

+15

Why does it work?

The memory you allocate happens to have a '\0' byte in the right place. (For example, if you use Visual C++ in debug mode, the heap manager may zero allocated memory before handing it to your program. But it might just as well be plain luck.)

Is there a way to check if a string ends at '\0' or not?

No. You need your strings either to be zero-terminated (which is what the standard library string functions expect), or you need to carry their length around in an additional variable. If you have neither, you have a bug.

Now, how do we know that some string from some function written by some other programmer ends at the right place with '\0'? If it doesn't, reading will run past the actual size until we hit some '\0'. We will never know the actual size of the string.
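The second option, carrying the length alongside the data, might look like the sketch below. The `struct char_span` type and the `span_equals` function are invented for illustration; they are not a standard library facility.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: character data with an explicit length, so no
   terminator is needed to know where the data ends. */
struct char_span {
    const char *data;
    size_t      len;
};

/* Compares a span against a C string without reading past s.len bytes
   of the span. Returns 1 if they hold the same characters, else 0. */
int span_equals(struct char_span s, const char *cstr)
{
    size_t n = strlen(cstr);
    return s.len == n && memcmp(s.data, cstr, n) == 0;
}
```

Such a span can also be printed safely with a precision argument, for example `printf("%.*s\n", (int)s.len, s.data);`, which stops after `s.len` characters even if no '\0' follows.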

So how do we solve this situation?

You can't. If the other function messes up that badly, you are out of luck.

+4

Sharptooth explained the likely cause of the behavior, so I'm not going to repeat it.

When allocating buffers, I always over-allocate by one byte, for example:

    #define SIZE 10

    char* buf = malloc(sizeof(char)*(SIZE+1));
    /* error-check the malloc call here */
    buf[SIZE] = '\0';
0

You are just lucky to have a zero byte right past the allocated region.

Try running this code on other platforms and you will see that it may behave differently.

0

I think the answer about extra allocated space is correct: more space is allocated than was requested. I modified the program as follows:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define SIZE 10

    int main()
    {
        char *str ;
        int *p;
        int actual_length;

        str = malloc( sizeof( char ) * SIZE );
        if( str == NULL )
            exit( 1 );

        /* Peek at the allocator's bookkeeping just before the block.
           This is undefined behavior and depends entirely on the
           malloc implementation in use. */
        actual_length = (int)*(str - 4) - 1 - 4;
        printf("actual length of str is %d\n", actual_length);

        p = (int*) malloc(sizeof(int));
        if (p == NULL)
            exit(1);
        *p = -1;

        /* Overwrite the four bytes just before the second block. */
        char* pc = (char*)(p - 1);
        pc [0] = 'z';
        pc [1] = 'z';
        pc [2] = 'z';
        pc [3] = 'z';

        memset( str, 0, sizeof( char ) * SIZE );
        memcpy( str, "abcdefghijklmnopqrstuvwxyz", sizeof( char ) * SIZE );

        /* Fill the slack bytes beyond the requested SIZE. */
        int i;
        for (i = SIZE; i < actual_length; i++)
            str[i] = 'y';

        unsigned int index;
        for( index = 0; str[ index ] != '\0' ; index++ ) {
            printf( "str[ %u ] has got : %c \n ", index, str[ index ] );
        }

        return 0;
    }

Output:

    actual length of str is 12
    str[ 0 ] has got : a
    str[ 1 ] has got : b
    str[ 2 ] has got : c
    str[ 3 ] has got : d
    str[ 4 ] has got : e
    str[ 5 ] has got : f
    str[ 6 ] has got : g
    str[ 7 ] has got : h
    str[ 8 ] has got : i
    str[ 9 ] has got : j
    str[ 10 ] has got : y
    str[ 11 ] has got : y
    str[ 12 ] has got : z
    str[ 13 ] has got : z
    str[ 14 ] has got : z
    str[ 15 ] has got : z
    str[ 16 ] has got : \377
    str[ 17 ] has got : \377
    str[ 18 ] has got : \377
    str[ 19 ] has got : \377

My OS is Debian Squeeze / sid.

0
