How is std::string implemented?

I am curious to know how std::string is implemented and how it differs from a C string. If the standard does not specify an implementation, then any implementation, with an explanation of how it satisfies the string requirements given by the standard, would be great.

+43
c++ string std cstring
Sep 23 '09 at 13:39
5 answers

Almost every compiler I have used ships the source code for its runtime library, so whether you use GCC, MSVC, or something else, you can look at the implementation yourself. However, most or all of std::string is implemented as template code, which can make it very hard to read.

Scott Meyers's book, Effective STL, contains a chapter on std::string implementations that gives a good overview of the common variations: "Item 15: Be aware of variations in string implementations".

He talks about 4 variations:

  • several variations on a reference-counted implementation (commonly known as copy on write): when a string object is copied unchanged, the refcount is incremented but the actual string data is not copied. Both objects point to the same refcounted data until one of them modifies it, which triggers a "copy on write" of the data. The variations differ in where things like the refcount, locks, etc. are stored.

  • a "short string optimization" (SSO) implementation. In this variant, the object contains the usual pointer to the data, the length, the size of the dynamically allocated buffer, etc. But if the string is short enough, it uses that area to hold the string itself instead of dynamically allocating a buffer.
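One way to see whether a given standard library uses the short string optimization is to check whether a string's character data lives inside the string object itself. The following is a minimal sketch; `stored_inline` is a hypothetical helper written for this illustration, not a standard function, and what it reports for short strings depends on the library in use.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true if the character data of `s` lives inside the std::string
// object itself, i.e. no separate heap buffer is in use.
// Hypothetical helper for illustration; not part of the standard library.
inline bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    assert(data != nullptr);  // data() is never null, even for an empty string
    // std::less gives a well-defined total order even for unrelated pointers.
    std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}
```

A string much longer than `sizeof(std::string)` can never be stored inline, so `stored_inline` returns false for it on any implementation. For a short string such as "hi", the result is true on implementations with SSO and false on a purely reference-counted one.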

In addition, Herb Sutter's "More Exceptional C++" has an appendix (Appendix A: "Optimizations That Aren't (in a Multithreaded World)") that discusses why copy-on-write refcounted implementations often have performance problems in multithreaded applications because of synchronization issues. That article is also available online (though I'm not sure it is exactly the same as what is in the book).

Both of these chapters would be helpful to read.
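The copy-on-write idea described above can be sketched with a toy class. This is an illustration only, under the assumption of single-threaded use; `CowString` and its member names are hypothetical, and no real standard library implements its string this way.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Toy copy-on-write string (hypothetical class, single-threaded use only).
class CowString {
public:
    explicit CowString(const std::string& text)
        : data_(std::make_shared<std::string>(text)) {}

    // Copying a CowString uses the compiler-generated copy operations:
    // the buffer is shared and only the reference count changes.

    char get(std::size_t i) const { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }

    // Writing first detaches from any other owners: the "copy on write".
    void set(std::size_t i, char c) {
        assert(i < data_->size());
        if (data_.use_count() > 1) {
            data_ = std::make_shared<std::string>(*data_);
        }
        (*data_)[i] = c;
    }

    // True if both objects currently point at the same buffer.
    bool shares_buffer_with(const CowString& other) const {
        return data_ == other.data_;
    }

private:
    std::shared_ptr<std::string> data_;
};
```

After `CowString b = a;` both objects share one buffer, and the first `b.set(...)` gives `b` its own copy while `a` is left unchanged. The reference count inside `std::shared_ptr` is typically updated with atomic operations, which hints at the kind of synchronization cost the appendix discusses.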

+66
Sep 23 '09 at 15:15

std::string is a class that wraps some kind of internal buffer and provides methods for manipulating that buffer.

A string in C is just an array of characters, terminated by a null character.
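A small sketch of that difference: std::string tracks its length separately from the contents, while the C view of the same buffer stops at the first null character. `c_view_length` is a hypothetical helper name used only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length of `s` as C functions see it: a scan up to the first '\0'.
inline std::size_t c_view_length(const std::string& s) {
    // c_str() is always null-terminated at index size().
    assert(s.c_str()[s.size()] == '\0');
    return std::strlen(s.c_str());
}

// Builds a std::string holding the 3 characters 'a', '\0', 'b'.
inline std::string with_embedded_null() {
    return std::string("a\0b", 3);
}
```

For the string returned by `with_embedded_null()`, `size()` reports 3 while `c_view_length` reports 1, because the object stores its length and the C functions rely on the terminator.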

Explaining all the nuances of how std::string works would take too long. Perhaps have a look at the GCC source code at http://gcc.gnu.org to see how they do it.

+11
Sep 23 '09 at 13:46

There is an example implementation in an answer on this page.

Alternatively, you can look at the GCC implementation, assuming you have GCC installed. If not, you can access the source code via SVN. Most of std::string is implemented by basic_string, so start there.

Another possible source of information is the Watcom compiler.

+6
Sep 23 '09 at 13:45

The C++ approach to strings is quite different from the C one. The first and most important difference is that while C uses the ASCIIZ (null-terminated) convention, std::string and std::wstring, in many implementations, use two iterators (pointers) to delimit the actual string. In basic use the string classes allocate their storage dynamically, so at the cost of some CPU overhead for dynamic memory management, string handling becomes more convenient.

As you probably already know, C has no built-in string type; it only provides a set of string operations through the standard library. One of the main differences between C and C++ is that C++ wraps this functionality in a class, so the string behaves almost like a built-in type.

In C, you have to walk the whole string to find its length, whereas the member function std::string::size() is a single operation (end minus start). You can safely append strings to one another as long as you have memory, so there is no need to worry about buffer overflow bugs (and hence exploits), because appending creates a larger buffer when necessary.
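A minimal sketch of the appending behaviour described above; `repeat` is a hypothetical helper written for this example. No buffer size is computed by hand, and the string grows its own storage as needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends `piece` to the result `times` times. No manual buffer sizing is
// needed: std::string moves to a larger buffer whenever capacity runs out.
inline std::string repeat(const std::string& piece, std::size_t times) {
    std::string out;
    for (std::size_t i = 0; i < times; ++i) {
        out += piece;
        assert(out.size() <= out.capacity());  // invariant kept by the library
    }
    return out;
}
```

Calling `repeat("abc", 1000)` yields a string whose `size()` is 3000, with the length available directly from the object rather than by scanning for a terminator.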

As someone said earlier, string provides vector-like functionality in a templated way, which makes it easier to work with multi-byte character systems. You can define your own string type with a typedef std::basic_string<CharT> specific_str_t; declaration, where CharT stands for the character-like type passed as the template parameter.
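As a sketch of such a typedef, the example below instantiates basic_string over `char32_t`, chosen here only because the standard library provides `std::char_traits` for it. The name `specific_str_t` follows the text above, and `greeting` and `first_char` are hypothetical helpers; the code assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// std::string itself is just a typedef of the basic_string template.
static_assert(std::is_same<std::string, std::basic_string<char> >::value,
              "std::string is basic_string<char>");

// A user-defined string type over 32-bit characters.
typedef std::basic_string<char32_t> specific_str_t;

// Five code points: 'h', U+00E9, 'l', 'l', 'o'.
inline specific_str_t greeting() {
    return specific_str_t(U"h\u00e9llo");
}

// First character of a non-empty string.
inline char32_t first_char(const specific_str_t& s) {
    assert(!s.empty());
    return s[0];
}
```

The resulting type supports the same member functions as std::string (`size()`, `operator+=`, indexing, and so on), only with a different element type.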

I think there are enough pros and cons on both sides:

C++ string pros:

  • Faster iteration in certain cases (when the size is known up front, checking whether you have reached the end of the string only requires comparing two pointers rather than reading data from memory, which can make a difference for caching).

  • Buffer management is bundled with the string functionality, so there are fewer worries about buffer problems.

C++ string cons:

  • Because of the dynamic memory allocation, basic use can hurt performance. (Fortunately, you can tell the string object what its initial buffer size should be, so as long as you do not exceed it, it will not allocate further blocks dynamically.)

  • Often odd and inconsistent names compared to other languages. This is a weakness of all the STL, but you can get used to it, and it gives the code a somewhat C++-specific feel.

  • Heavy use of templates forces the standard library into header-based solutions, which has a big impact on compile time.
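The point about telling the string its buffer size up front can be sketched with `reserve()`. The helper `fill_without_realloc` is hypothetical; the check relies on the usual behaviour that a string does not move its buffer while its size stays within its capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Fills a string with `n` characters after reserving space up front.
// Returns true if the buffer was never reallocated while appending.
inline bool fill_without_realloc(std::size_t n) {
    std::string s;
    s.reserve(n);
    assert(s.capacity() >= n);  // guaranteed after reserve(n)
    const char* before = s.data();
    for (std::size_t i = 0; i < n; ++i) {
        s.push_back('x');
    }
    return s.data() == before && s.size() == n;
}
```

With the reservation in place, the loop performs no further allocations, which is the situation the parenthetical remark in the cons list describes.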

+4
Sep 23 '09 at 17:32

It depends on the standard library that you use.

STLPort, for example, is an implementation of the C++ standard library that implements strings, among other things.

+3
Sep 23 '09 at 13:49


