Does an empty destructor cause slow code?

I am implementing a custom iterator for a non-STL container type and came across the following behavior, which at this stage seems a little unexpected to me.

It seems that there is a significant performance hit when you define an "empty" dtor. Why?

To try to figure this out, I used a simple iterator for std::vector so that I could compare performance directly with the standard STL iterator. For a fair test, I simply copied a simplified implementation from "vector.hpp" and experimented with adding an additional "empty" dtor:

    template <typename _Myvec>
    class my_slow_iterator // not inheriting from anything!!
    {
    public:
        typename _Myvec::pointer _ptr; // pointer to vector element

        /* All of the standard stuff - essentially from "vector.hpp" */

        /* An additional empty dtor */
        ~my_slow_iterator() {}
    };

I then modified std::vector so that it returns the new iterator type, and used the following for comparison: sorting a vector of 2,000,000 random numbers, averaged over three runs:

    std::vector<int> vec; // fill via rand()

    int tt = clock();
    std::sort(vec.begin(), vec.end());
    tt = clock() - tt; // elapsed time in ms
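For anyone who wants to reproduce the measurement, here is a minimal, self-contained sketch of the timing procedure described above. It uses the stock std::vector iterators only; plugging in my_slow_iterator requires the modified vector, which is not shown. The helper names make_random_vector and mean_sort_ms are hypothetical, and the fixed seed is an assumption added for repeatability.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdlib>
#include <ctime>
#include <vector>

// Hypothetical helper: fill a vector with n pseudo-random ints.
// The fixed seed is an assumption so that every run sorts the same data.
std::vector<int> make_random_vector(std::size_t n) {
    std::srand(12345);
    std::vector<int> vec(n);
    for (std::size_t i = 0; i < n; ++i) {
        vec[i] = std::rand();
    }
    return vec;
}

// Hypothetical helper: sort a fresh copy of `input` `runs` times and
// return the mean CPU time per sort in milliseconds.
double mean_sort_ms(const std::vector<int>& input, int runs) {
    double total_ms = 0.0;
    for (int r = 0; r < runs; ++r) {
        std::vector<int> vec(input);  // same unsorted data for every run
        std::clock_t start = std::clock();
        std::sort(vec.begin(), vec.end());
        std::clock_t stop = std::clock();
        total_ms += 1000.0 * static_cast<double>(stop - start) / CLOCKS_PER_SEC;
    }
    return total_ms / runs;
}
```

Calling mean_sort_ms(make_random_vector(2000000), 3) matches the setup in the question. Copying the input inside the loop keeps each run from sorting data that is already sorted.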

I got the following results (VS2010, Release build, _ITERATOR_DEBUG_LEVEL 0, etc.):

  • Using the standard STL iterator: 550 ms.
  • Using my_slow_iterator with the empty dtor removed: 560 ms.
  • Using my_slow_iterator with the empty dtor included: 900 ms.

It seems that the empty dtor causes a slowdown of about 40% in this case.

Obviously, if the dtor is empty there is no need for it, but I expected that simple "empty" functions like this would be inlined and optimized away at compile time. If that is not the case, I would like to understand what is going on, since this type of issue could have consequences in more complex cases.

EDIT: compiled with O2 optimization.

EDIT: Digging a little further, it seems that a similar effect occurs with the copy ctor. Originally (and in the tests above) my_slow_iterator had no user-defined copy ctor, so it used the compiler-generated default.

If I define the following copy ctor (which does nothing more than I would expect the compiler-generated one to do):

 my_slow_iterator ( const my_slow_iterator<_Myvec> &_src ) : _ptr(_src._ptr) {} 

I see the following results for the same test as above:

  • Using my_slow_iterator, dtor removed, copy ctor included: 690 ms.
  • Using my_slow_iterator, dtor included, copy ctor included: 980 ms.

This is another (albeit less severe) performance hit.

Why/how are the compiler-generated default functions so much more efficient? Does defining a user ctor/dtor implicitly cause something else to happen in the background?

1 answer

I remember experiencing something similar with GCC (-O3) on Linux. Code was emitted for the user-defined destructor, even though it was empty and defined in the header file, while the default destructor generated by the compiler produced no instructions at all. This puzzled me, and in the end I made the code work without an explicit destructor (although an explicit one was desirable so that I could add assert() to it; it was empty only in release builds, not in debug builds).
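One language-level difference that can be checked directly: a destructor or copy constructor written by the user, even with an empty or member-wise body, is no longer "trivial", and the standard type traits report this. The sketch below assumes a C++11 compiler (these traits may not be available in VS2010's library), and the struct names are hypothetical stand-ins for the iterator in the question. Whether this distinction is what drives the measured slowdown is an assumption, not something the sketch proves.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the iterator variants in the question.
struct implicit_members {
    int* _ptr;  // dtor and copy ctor are generated by the compiler
};

struct empty_user_dtor {
    int* _ptr;
    ~empty_user_dtor() {}  // user-provided, even though it does nothing
};

struct user_copy_ctor {
    int* _ptr;
    user_copy_ctor() : _ptr(nullptr) {}
    user_copy_ctor(const user_copy_ctor& src) : _ptr(src._ptr) {}
};

// Compiler-generated members leave the type trivial.
static_assert(std::is_trivially_destructible<implicit_members>::value,
              "implicit dtor is trivial");
static_assert(std::is_trivially_copyable<implicit_members>::value,
              "implicit copy is trivial");

// An empty user-provided dtor makes the type non-trivially destructible,
// which also means it is no longer trivially copyable.
static_assert(!std::is_trivially_destructible<empty_user_dtor>::value,
              "user-provided dtor is not trivial");
static_assert(!std::is_trivially_copyable<empty_user_dtor>::value,
              "non-trivial dtor rules out trivial copyability");

// A user-provided copy ctor removes trivial copyability,
// but the implicit dtor stays trivial.
static_assert(!std::is_trivially_copyable<user_copy_ctor>::value,
              "user-provided copy ctor is not trivial");
static_assert(std::is_trivially_destructible<user_copy_ctor>::value,
              "implicit dtor is still trivial");
```

If all the assertions compile, the compiler does classify the three types differently, which is consistent with it being free to treat them differently during optimization.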
