Is "std::string can contain the character '\0'" by design?

The fact that std::string can actually contain '\0' characters comes up all the time. This, of course, is incompatible with C-style strings.

So, I wonder, is it by design, or is it an omission, or is it just a fact that the standard does not prohibit it, and compilers allow this to happen?

+7
5 answers

I wonder what your objection is. '\0' is just another character. There is no efficient way to forbid it in a general-purpose string of char. That the same character has a special meaning in C is unfortunate, but it has to be treated like any other restriction imposed by legacy code as soon as you interoperate with it.

This should not be a problem if you stick with code that uses only std::string .

To answer your comment, we need to look at the constructor that takes a char*, which is basic_string(const charT* s, const Allocator& a = Allocator()) in 21.4.2, paragraphs 9/10, of N3242. It says that the size of the internal string is determined through traits::length(s), which in the case of std::string amounts to strlen, which requires its argument to be null-terminated. So yes, if you construct a std::string from a const char*, that pointer has to refer to a null-terminated string.
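To make the constructor behaviour concrete, here is a minimal sketch. The two function names are made up for illustration; only the std::string constructors themselves are standard. The first constructor measures the input with traits::length and so stops at the first '\0', while the (pointer, count) overload copies exactly as many characters as it is told to.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds from a const char* only, so the length is
// found via traits::length and stops at the first '\0'.
std::size_t length_from_c_string() {
    std::string s("ab\0cd");
    return s.size();
}

// Hypothetical helper: builds from (pointer, count), so all 5 characters
// are copied, including the embedded '\0'.
std::size_t length_from_pointer_and_count() {
    std::string s("ab\0cd", 5);
    return s.size();
}
```

With this literal, the first function should report 2 and the second 5.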

+11

There is a set of functions that take char* arguments and assume that the string is null-terminated. If you are careful about when you use them, you can have strings with 0 in them.

STL strings, in contrast, deliberately allow null bytes, since they do not use 0 as a terminator. So the simple answer to your question is: "yes, by design."
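A small sketch of what that means in practice (the function names are invented for this example): the std::string keeps track of its own length, while a C function that is handed c_str() only sees the part before the first 0.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: builds "abc", then a '\0', then "def".
std::string make_with_embedded_nul() {
    std::string s = "abc";
    s.push_back('\0');  // '\0' is stored like any other character
    s += "def";
    return s;
}

// Hypothetical helper: length as seen by a C function that relies on the
// terminator rather than on the stored size.
std::size_t c_view_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

Here make_with_embedded_nul().size() is 7, while c_view_length() of the same string is 3, which is exactly the incompatibility with C-style strings mentioned in the question.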

+2

The standard does not say that '\0' is a special character in the case of std::string. Therefore, a conforming implementation of std::string should not treat '\0' as a special character. Unless, of course, a const char* is passed to a member function of the string, in which case it is assumed to be null-terminated.

+1

By design.

C can also have non-NUL-terminated strings:

 char sFoo[4]; strncpy(sFoo,"Test",sizeof(sFoo)); 

where sFoo contains a non-NUL-terminated string.

And it can have non-NUL-terminated strings that may contain 0, such as

 struct String { char *str; size_t length; size_t capacity; }; 
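As a rough sketch of how such a counted string can carry a 0 byte: string_append below is a hypothetical helper, not part of any library, and it assumes the caller has already pointed str at a buffer of capacity bytes.

```cpp
#include <cstddef>
#include <cstring>

struct String { char *str; std::size_t length; std::size_t capacity; };

// Hypothetical helper: appends n raw bytes, which may include 0 bytes.
// Returns false and leaves the string untouched if they do not fit.
bool string_append(String& s, const char* data, std::size_t n) {
    if (n > s.capacity - s.length) return false;
    std::memcpy(s.str + s.length, data, n);  // no terminator is written
    s.length += n;                           // the length field marks the end
    return true;
}
```

For example, with an 8-byte buffer, appending the 3 bytes of "a\0b" leaves length at 3 with a 0 in the middle; nothing here depends on a terminator.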

String literals have a terminating NUL, but that is not always true of strings.

So NUL-terminating a string is a convention, but it means that 0 is an invalid character within such a string.

+1

strncpy vs. strncat

However, strncpy and strncat etc. add a null terminator if there is room.

Actually, strncpy and strncat are very different:

strncpy writes a "NUL-padded n-byte string" into an n-byte buffer: a string of length l, at most n, such that the last n - l bytes are filled with NUL. Note the plural: all the trailing bytes are zeroed, not just one. Also note that the maximum allowed value of l is indeed n, so there may be zero NUL bytes: the buffer may not contain a NUL-terminated string. (A function for measuring such a "NUL-padded n-byte string", strnlen, is available as an extension in glibc and is specified by POSIX.1-2008.)

By contrast, strncat writes a NUL-terminated string to the buffer. In both cases the string is truncated if it is too long, but with strncpy a string of n characters fits in an n-byte buffer, whereas with strncat a result of n characters only fits in an (n + 1)-byte buffer.
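The difference is easy to check. This is only a sketch with n fixed at 4, and both function names are invented for the example; the calls to strncpy, strncat and strlen are the standard ones.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into a 4-byte buffer with strncpy and
// counts how many of those 4 bytes end up as NUL.
std::size_t nul_bytes_after_strncpy(const char* src) {
    char buf[4];
    std::strncpy(buf, src, sizeof buf);  // pads with NULs, may not terminate
    std::size_t count = 0;
    for (char c : buf) {
        if (c == '\0') ++count;
    }
    return count;
}

// Hypothetical helper: appends at most 4 characters of src to an empty
// string with strncat and returns the length of the result.
std::size_t length_after_strncat(const char* src) {
    char buf[5] = "";           // n + 1 bytes for n = 4
    std::strncat(buf, src, 4);  // always writes a terminating NUL
    return std::strlen(buf);
}
```

nul_bytes_after_strncpy("a") gives 3 (all trailing bytes are padded), while nul_bytes_after_strncpy("Test") gives 0 (no terminator at all). length_after_strncat("Testing") gives 4, and the fifth byte of the buffer holds the terminator. Some compilers may warn about the truncating strncpy call, which is rather the point.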

This difference causes a lot of confusion among beginners, and even among non-beginners. I have even seen tutorials and books that teach "safe C programming" and give confused and contradictory information about these standard functions.

These so-called "safe" C string handling functions ("strn*") have been heavily criticized in the "safe C programming" community, and better designed (but non-standard) alternatives have been invented (notably the "strl*" family: strlcpy ...).

Summary:

  • strncpy will add a null terminator if there is room;
  • strncat will always add a null terminator.
0
