C ++ python equivalent difference_update?

s1 and s2 are sets (Python or C ++ set std :: set)
To add s2 elements to s1 (set the union) you can do

Python: s1.update(s2) C++: s1.insert(s2.begin(), s2.end()); 

To remove s2 elements from s1 (set the difference) you can do

 Python: s1.difference_update(s2) 

What is the C ++ equivalent? The code

 s1.erase(s2.begin(), s2.end()); 

does not work, s1.erase () requires iterators from s1.Code

 std::set<T> s3; std::set_difference(s1.begin(), s1.end(), s2.begin(), s2.end(), std::inserter(s3, s3.end()); s1.swap(s3); 

works, but seems too complicated, at least compared to Python.

Is there an easier way?

+7
source share
5 answers

Using std::set_difference is an idiomatic way to do this in C ++. You came across one of the main differences (pun intended) between C ++ / STL and many other languages. STL does not associate operations directly with data structures. This is why std::set does not execute the difference procedure.

Basically, algorithms like std::set_difference write the result of the operation to another object. It is interesting to note that the algorithm does not require that both or both operands be actually std::set . Algorithm Definition:

Effects : copies elements of the range [first1, last1) that are not present in the range [first2, last2) , to the range starting with result . Elements in the constructed range are sorted.

Required : The resulting range must not overlap with any of the source ranges. Input ranges must be of the same operator< order.

Returns : end of the constructed range.

Difficulty : no more than 2 * ((last1 - first1) + (last2 - first2)) - 1 comparison

An interesting difference is that the C ++ version is applicable to any two sorted ranges. In most languages, you are forced to force or translate the caller (left operand) into a set before you gain access to the established difference algorithm.

This is not particularly relevant to your question, but this is the reason that various dialing algorithms are modeled as stand-alone algorithms instead of member methods.

+5
source

You should iterate over the second set:

 for( set< T >::iterator iter = s2.begin(); iter != s2.end(); ++iter ) { s1.erase( *iter ); } 

This one might be cheaper than using std::set_difference - set_difference copies unique objects to a new container, but takes linear time, and .erase doesn't copy anything, but O(n * log( n ) ) .

In other words, depending on the container, you can choose a method that will be faster for your business.

Thanks David Rodríguez - dribeas for the comment! (


EDIT : Doh! I thought about BOOST_FOREACH at the very beginning, but I was wrong that it could not be used .. - you do not need an iterator, but just a value. As user 763305 said himself.

+4
source

In C ++, there is no difference method in the set. set_difference looks much more inconvenient, since it is more general than applying the difference on two sets. Of course, you can implement your own version of the difference in place:

 template <typename T, typename Compare, typename Allocator> void my_set_difference( std::set<T,Compare,Allocator>& lhs, std::set<T,Compare,Allocator> const & rhs ) { typedef std::set<T,Comapre,Allocator> set_t; typedef typename set_t::iterator iterator; typedef typename set_t::const_iterator const_iterator; const_iterator rit = rhs.begin(), rend = rhs.end(); iterator it = lhs.begin(), end = lhs.end(); while ( it != end && rit != rend ) { if ( lhs.key_comp( *it, *rit ) ) { ++it; } else if ( lhs.key_comp( *rit, *it ) ) { ++rit; } else { ++rit; lhs.erase( it++ ); } } } 

The performance of this algorithm will be linear in size of the arguments and will not require additional copies, since it changes the first argument in place.

+4
source

You can also do this with remove_if , writing your own functor to test for existence in the set, for example

 std::remove_if(s1.begin(), s1.end(), ExistIn(s2)); 

I believe set_difference more efficient, although it probably scans both sets only once

+1
source

The Python set is unordered and is rather the equivalent of C ++ std :: unordered_set than the std :: set, which is ordered.

David Rodriguez's algorithm relies on the fact that std :: set is ordered, so the sets Lhs and rhs can be traversed as artifacts in the algorithm.

For a more general solution that works for both ordered and disordered sets, the Kirill Kirov algorithm should be safe if you use / keep the "disorder" of the Python set.

+1
source

All Articles