Problem with invalidated STL iterators when calling erase

The C++ standard specifies that when an erase occurs in containers such as std::deque, std::list, etc., iterators are invalidated.

My question is this: given a sequence of integers held in a std::deque, and a pair of indices delimiting a range of elements within that std::deque, what is the correct way to remove all even elements?

So far I have the following, but the problem is that the intended end iterator is no longer valid after an erase:

#include <cstddef>
#include <deque>
#include <utility>

int main()
{
    std::deque<int> deq;
    for (int i = 0; i < 100; deq.push_back(i++));

    // range, 11th to 51st element
    std::pair<std::size_t, std::size_t> r(10, 50);

    std::deque<int>::iterator it  = deq.begin() + r.first;
    std::deque<int>::iterator end = deq.begin() + r.second;

    while (it != end)               // problem: erase() below invalidates 'end'
    {
        if (*it % 2 == 0)
        {
            it = deq.erase(it);
        }
        else ++it;
    }
    return 0;
}

Studying how std::remove_if is implemented, it looks like a very expensive copy/shift process is going on.

  • Is there a more efficient way to achieve the above without all the copying/shifting?

  • In general, is erasing an element more expensive than overwriting it with the value n positions further along the sequence (where n is the number of elements erased so far)?

Note (in response to the answers): the sequence is quite large, 1 million+ elements, and on average about 1/3 of the elements will be erased.

4 answers

I would use the erase-remove idiom. I think the related Wikipedia article even shows exactly what you are doing: removing odd elements.

The copying that remove_if does is no more expensive than what happens when you erase elements from the middle of the container one at a time. It can even be more efficient.
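
For concreteness, here is a minimal sketch of the idiom applied to the question's sub-range (not from the original answer; the 10/50 bounds and the is_even predicate are lifted from the question's code):

#include <algorithm>
#include <deque>

// predicate matching the question: true for even values
bool is_even(int x) { return x % 2 == 0; }

int main()
{
    std::deque<int> deq;
    for (int i = 0; i < 100; ++i)
        deq.push_back(i);

    std::deque<int>::iterator first = deq.begin() + 10;
    std::deque<int>::iterator last  = deq.begin() + 50;

    // remove_if compacts the elements to keep toward the front of [first, last)
    // and returns the new logical end of that range; erase then discards the
    // leftover slots in a single call, so 'last' is never used after an erase.
    deq.erase(std::remove_if(first, last, is_even), last);

    return 0;   // deq now holds 80 elements: the 20 even values in the range are gone
}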

Calling .erase() in a loop also leads to a "very expensive copy/shift process." When you erase an element from the middle of the container, every element after that point has to be shifted back one place into the freed space. If you erase several elements, you incur this cost once for each erased element. Elements that are not erased may end up moving several positions, but they are forced to move one place at a time rather than all at once. That is very inefficient.

The standard library algorithms std::remove and std::remove_if optimize this work. They use a clever trick to ensure that each element is moved only once. This is much, much faster than what you are doing yourself, contrary to your intuition.

The pseudocode is as follows:

read_location  <- beginning of range
write_location <- beginning of range

while read_location != end of range:
    if the element at read_location should be kept in the container:
        copy the element at read_location to write_location
        increment write_location
    increment read_location

As you can see, each element in the original sequence is examined exactly once, and if it needs to be kept, it is copied exactly once to the current write_location. It never needs to be looked at again, because write_location can never get ahead of read_location.
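
Translated into C++ under the question's assumption that even values are removed, the pseudocode might look roughly like this; remove_even and the variable names are purely illustrative, not the standard library's actual implementation:

#include <deque>

// Sketch of the pseudocode above: a single pass in which every kept element
// is copied at most once. Returns the new logical end of the range, like
// std::remove_if; the caller then erases [write_location, end_of_range).
std::deque<int>::iterator remove_even(std::deque<int>::iterator read_location,
                                      std::deque<int>::iterator end_of_range)
{
    std::deque<int>::iterator write_location = read_location;
    while (read_location != end_of_range)
    {
        if (*read_location % 2 != 0)           // element should be kept
        {
            *write_location = *read_location;  // copy it to the write position
            ++write_location;
        }
        ++read_location;
    }
    return write_location;
}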

Remember that a deque stores its elements in contiguous blocks of memory (much like a vector in that respect), so removing elements from the middle of the container necessarily means copying the subsequent elements down over the hole. You just want to make sure you do a single pass and copy each surviving object directly to its final position, rather than shifting whole runs of objects one place at a time for every individual erase. remove_if is efficient and appropriate in this regard; your erase loop is not: it does a huge amount of unnecessary copying.

FWIW - alternatives:

  • add a "removed" flag to your objects and mark them in place, but then every time you work with the container you have to check that flag yourself.
  • use a list, which is implemented with pointers to the previous and next elements, so erasing a list element just relinks the adjacent nodes around it: no copying and cheap erasure, but no random access, lots of small (i.e. inefficient) heap allocations, and per-node pointer overhead (see the sketch after this list).
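
A minimal sketch of the list alternative, assuming the data lived in a std::list from the start; std::list even has a member remove_if that relinks nodes instead of copying elements:

#include <list>

bool is_even(int x) { return x % 2 == 0; }

int main()
{
    std::list<int> seq;
    for (int i = 0; i < 100; ++i)
        seq.push_back(i);

    // list::remove_if unlinks matching nodes in place: no element copying,
    // but there is no random access, so a sub-range like begin() + 10
    // cannot be formed directly.
    seq.remove_if(is_even);
    return 0;
}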

Which to choose depends on the nature, relative frequency, and performance requirements of the specific operations (for example, perhaps you can afford slow erasure if it happens at non-critical times, but need the fastest possible iteration). Whatever it is, make sure you understand your needs and the consequences of the various data structures.

One alternative that has not been mentioned yet: create a new deque, copy the elements you want to keep into it, and swap it with the old deque.

#include <algorithm>
#include <cstddef>
#include <deque>
#include <iterator>
#include <utility>

void filter(std::deque<int>& in, std::pair<std::size_t, std::size_t> range)
{
    std::deque<int> out;
    std::deque<int>::const_iterator first = in.begin();
    std::deque<int>::const_iterator curr  = first + range.first;
    std::deque<int>::const_iterator last  = first + range.second;

    // note: std::deque has no reserve(), so elements are simply appended as needed
    std::copy(first, curr, std::back_inserter(out));
    while (curr != last)
    {
        if (*curr & 1)              // odd values are kept, i.e. even values are dropped
        {
            out.push_back(*curr);
        }
        ++curr;
    }
    std::copy(last, in.end(), std::back_inserter(out));
    in.swap(out);
}

I'm not sure whether you can spare the memory to create a copy, but it is usually simpler and cleaner to build a new container than to try to erase elements in place from a large collection. If the repeated allocations worry you, find out how many elements you are going to keep by calling std::count_if and reserve that many up front (this requires a container with reserve(), such as std::vector). That way there is only one memory allocation.
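
A rough sketch of that count_if-plus-reserve variant, assuming the output goes into a std::vector since std::deque has no reserve(); the name keep_odd is purely illustrative:

#include <algorithm>
#include <vector>

bool is_odd(int x) { return x % 2 != 0; }

// Sketch only: count the survivors first, reserve once, then copy them over.
std::vector<int> keep_odd(const std::vector<int>& in)
{
    std::vector<int> out;
    out.reserve(std::count_if(in.begin(), in.end(), is_odd)); // single allocation
    for (std::vector<int>::const_iterator it = in.begin(); it != in.end(); ++it)
    {
        if (is_odd(*it))
            out.push_back(*it);
    }
    return out;
}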
