std::vector and std::string reallocation strategy

What is the reallocation strategy used for std::string and std::vector in GCC's implementation?

I am interested in the specific strategy used: when I add elements to a vector (or characters to a string), the size can exceed the reserved capacity, and a reallocation occurs. What will the new capacity be as a function of the old one? And when erasing elements, at what threshold does an actual reallocation happen that frees memory (and again, what will the new capacity be)?

Answers for other compilers would also be appreciated.

+6
4 answers

Take a look at the _M_check_len function in bits/stl_vector.h. It contains:

    const size_type __len = size() + std::max(size(), __n);

So the basic strategy when adding n elements is either to double the capacity or to grow by n, whichever is larger. pop_back never frees memory. libc++ (LLVM) does the same thing.
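
For instance, here is a standalone sketch of that growth rule (the free function new_capacity is illustrative, not part of libstdc++; the real _M_check_len additionally clamps the result to max_size()):

    #include <algorithm>
    #include <cstddef>

    // Sketch of the quoted rule: inserting n elements into a vector of the
    // given size yields a new capacity of size + max(size, n), i.e. the
    // capacity at least doubles, or grows by exactly n when n dominates.
    std::size_t new_capacity(std::size_t size, std::size_t n) {
        return size + std::max(size, n);
    }

    // new_capacity(8, 1)  == 16  (single push_back: doubling)
    // new_capacity(8, 20) == 28  (large range insert: grow by n)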

Reading the code is the only way to learn the strategies for string, for freeing memory, and so on.

+3

Both vector and string guarantee that push_back has "amortized constant" complexity, meaning that the time taken by n calls to push_back, divided by n, is bounded by a constant as n grows. The only way to achieve this is to reallocate with geometric growth, i.e. to make the new capacity some fixed multiple of the old size. That way you only need a "small" number of reallocations: with a factor of 2, pushing n elements causes only about log2(n) reallocations and fewer than 2n element copies in total.

Typical growth factors in real implementations are 2 or 1.5, although any factor strictly greater than 1 satisfies the requirement.

Note that this does not play well with reserve: if you call reserve(size() + 1) before every push_back, you can get a reallocation every time.
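
A minimal sketch of that pitfall, counting reallocations via capacity changes (the exact count is implementation-specific; on libstdc++, where reserve allocates exactly what is requested, this reports a reallocation for essentially every push):

    #include <iostream>
    #include <vector>

    int main() {
        std::vector<int> v;
        std::size_t last_cap = v.capacity();
        int reallocations = 0;
        for (int i = 0; i < 1000; ++i) {
            v.reserve(v.size() + 1);          // request exactly one more slot
            v.push_back(i);
            if (v.capacity() != last_cap) {   // capacity change => reallocation
                ++reallocations;
                last_cap = v.capacity();
            }
        }
        std::cout << reallocations << " reallocations for 1000 push_backs\n";
    }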

+3

All bets are off, apart from how reserve works. You use it something like this:

    std::vector<T> myV;
    myV.reserve(<space known to be needed>);

so that you know that no reallocation needs to happen until that reserved space is exceeded.
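
A small sketch of that guarantee (assuming the element count, 1000 here, is known up front):

    #include <cassert>
    #include <vector>

    int main() {
        std::vector<int> v;
        v.reserve(1000);                  // one up-front allocation
        const std::size_t cap = v.capacity();
        for (int i = 0; i < 1000; ++i)
            v.push_back(i);               // guaranteed not to reallocate
        assert(v.capacity() == cap);      // capacity is unchanged
    }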

+1

std::vector has size() and capacity() methods. It shouldn't be too hard to write a simple program that determines how memory is allocated. Strategies can differ between implementations, and even change from version to version.

One strategy I've seen is to grow by ever-larger increments, an adaptive strategy: give more to the hungry, to avoid frequent data shuffling. But the right growth factor is open to debate; simple doubling can grow too fast.

Edit:

Out of curiosity, I wrote a program to do exactly that. The original listing wasn't preserved; below is a minimal sketch of that kind of capacity-tracking loop (a reconstruction, not the exact original):
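
    #include <iostream>
    #include <vector>

    // Push elements one at a time and report every capacity change.
    int main() {
        std::vector<int> v;
        std::size_t cap = v.capacity();
        for (std::size_t i = 0; i < 1000000; ++i) {
            v.push_back(0);
            if (v.capacity() != cap) {
                std::cout << "capacity from " << cap << " to " << v.capacity()
                          << " increased by " << (v.capacity() - cap)
                          << " at size " << v.size() << "\n";
                cap = v.capacity();
            }
        }
    }

Here is the output (g++ 4.3.3):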

    capacity from 0 to 1 increased by 1 at size 1
    capacity from 1 to 2 increased by 1 at size 2
    capacity from 2 to 4 increased by 2 at size 3
    capacity from 4 to 8 increased by 4 at size 5
    capacity from 8 to 16 increased by 8 at size 9
    capacity from 16 to 32 increased by 16 at size 17
    capacity from 32 to 64 increased by 32 at size 33
    capacity from 64 to 128 increased by 64 at size 65
    capacity from 128 to 256 increased by 128 at size 129
    capacity from 256 to 512 increased by 256 at size 257
    capacity from 512 to 1024 increased by 512 at size 513
    capacity from 1024 to 2048 increased by 1024 at size 1025
    capacity from 2048 to 4096 increased by 2048 at size 2049
    capacity from 4096 to 8192 increased by 4096 at size 4097
    capacity from 8192 to 16384 increased by 8192 at size 8193
    capacity from 16384 to 32768 increased by 16384 at size 16385
    capacity from 32768 to 65536 increased by 32768 at size 32769
    capacity from 65536 to 131072 increased by 65536 at size 65537
    capacity from 131072 to 262144 increased by 131072 at size 131073
    capacity from 262144 to 524288 increased by 262144 at size 262145
    capacity from 524288 to 1048576 increased by 524288 at size 524289

Specifying an initial size in the constructor produces the same progression, starting from that initial value instead of 1.

+1
