I am going to answer the question I think you really wanted to ask, which is "should push_back() be avoided in the inner loops of heavy algorithms?" rather than what others seem to have read into your post, namely "does it matter if I call push_back before doing an unrelated scan over a large vector?" Also, I'm going to answer from my own experience rather than waste time chasing citations and peer-reviewed articles.
Your example basically does two things that add up to its total CPU cost: it reads and operates on the elements of the input vector, and then it has to insert the elements into the output vector. You are worried about the cost of inserting the elements because:
- push_back() is constant time (practically instant) when the vector has enough space pre-reserved for an additional element, but slow when you have run out of reserved space.
- Allocating memory is expensive (malloc() is just slow, even when pedants pretend that new is something different).
- Copying the vector's data from one region to another after a reallocation is also slow: when push_back() finds it does not have enough space, it has to allocate a bigger buffer and then copy all the elements across. (Theoretically, for vectors spanning many OS pages, a magical STL implementation could use the VMM to move them around in the virtual address space without copying; in practice I have never seen one that could.)
- Over-allocating the output vectors causes problems: it causes fragmentation, which slows down future allocations; it blows the data cache, making everything slower; and if it persists, it ties up scarce free memory, leading to paging on a PC and to crashes on embedded platforms.
- Under-allocating the output vectors causes problems, because reallocating a vector is an O(n) operation, so reallocating it m times is O(m × n). If the default STL allocator uses exponential reallocation (making the vector reserve double its previous size each time it reallocates), that turns your linear algorithm into O(n + n log m), as the reallocation counter in the sketch after this list illustrates.
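To make the reallocation cost above concrete, here is a minimal sketch (the helper name `count_reallocations` is mine, not from the question) that counts how many times push_back has to move the buffer, with and without an up-front reserve. The exact counts depend on your STL's growth factor and are implementation-defined:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Count how many times push_back triggers a reallocation by watching
// capacity() change. A capacity change means the buffer was moved.
static std::size_t count_reallocations(std::size_t n, std::size_t reserved) {
    std::vector<int> v;
    v.reserve(reserved);
    std::size_t reallocs = 0;
    std::size_t cap = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        if (v.capacity() != cap) {
            ++reallocs;
            cap = v.capacity();
        }
    }
    return reallocs;
}

int main() {
    const std::size_t n = 1000000;
    std::cout << "no reserve:   " << count_reallocations(n, 0) << " reallocations\n";
    std::cout << "full reserve: " << count_reallocations(n, n) << " reallocations\n";
}
```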
So your instinct is correct: pre-reserve space for your vectors wherever possible, not because push_back is slow, but because it can trigger a slow reallocation. Also, if you look at the shrink_to_fit implementation, you will see that it, too, does a reallocate-and-copy, temporarily doubling the memory cost and causing further fragmentation.
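Note that shrink_to_fit is only a non-binding request in the standard. A small sketch, under the assumption that your implementation honors it the typical way (by allocating a right-sized buffer and copying, so both buffers briefly coexist):

```cpp
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(1000);                 // one up-front allocation
    for (int i = 0; i < 600; ++i) v.push_back(i);

    std::cout << "capacity before shrink: " << v.capacity() << "\n"; // >= 1000
    v.shrink_to_fit();               // typically reallocates down to size()
                                     // elements and copies into the new buffer
    std::cout << "capacity after shrink:  " << v.capacity() << "\n"; // typically 600
}
```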
Your problem is that you do not always know exactly how much space you will need for your output vectors; the usual answer is to use a heuristic and possibly a custom allocator. By default, reserve n/2 + k of your input size for each of your output vectors, where k is some safety margin. That way you will usually have enough room for the output, so long as your input is reasonably balanced, and push_back can reallocate in the rare cases where it is not. If you find that push_back's exponential behavior wastes too much memory (forcing you to reserve 2n elements when you really only need n + 2), you can give it a custom allocator that grows the vector in smaller, linear chunks; but of course that will be much slower in cases where the vectors really are unbalanced and you end up doing lots of reallocations.
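As a concrete example of the n/2 + k heuristic, here is a sketch of a hypothetical even/odd split; the function name, the predicate, and the default k = 16 are my inventions, and in practice you would tune k to your data:

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical even/odd split of one input vector into two outputs,
// reserving n/2 + k elements in each up front.
void split(const std::vector<int>& in,
           std::vector<int>& evens, std::vector<int>& odds,
           std::size_t k = 16) {
    const std::size_t guess = in.size() / 2 + k;
    evens.reserve(guess);  // usually enough when the input is balanced,
    odds.reserve(guess);   // so push_back reallocates only in rare unbalanced cases
    for (int x : in) {
        (x % 2 == 0 ? evens : odds).push_back(x);
    }
}

int main() {
    std::vector<int> in{1, 2, 3, 4, 5, 6, 7};
    std::vector<int> evens, odds;
    split(in, evens, odds);
    std::cout << evens.size() << " evens, " << odds.size() << " odds\n";
}
```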
It is impossible to always reserve exactly the right amount of space without knowing the input elements in advance; but if you know what the balance usually looks like, you can use a heuristic to make a good guess at it, for a statistical performance gain over many iterations.