C ++ 11 moving semantics against pointers - performance measurement

for my use case, I have to very quickly insert and remove data packets from the list.

In my opinion, there are two general ways to solve this problem:

  • insert / delete pointers to these packages
  • insert / delete copies using move semantics

In any case, I thought that a solution with pointers should be the most efficient, even with the disadvantage of your own garbage collection.

I implemented 3 test cases for a better comparison:

  • using copies
  • using move semantics
  • using pointers

To measure the elapsed time for each test case, I also implemented a timer class:

class HighPerformanceTimer { public: enum TimerResolution { SECONDS = 1, MILLISECONDS = SECONDS * 1000, MICROSECONDS = MILLISECONDS * 1000, NANOSECONDS = MICROSECONDS * 1000 }; explicit HighPerformanceTimer(const TimerResolution resolution = NANOSECONDS) : m_resolution(resolution) , m_frequency(0.0) , m_startTime({0}) , m_stopTime({0}) {} ~HighPerformanceTimer(void) {} bool Init(void) { LARGE_INTEGER frequency; if(0 == ::QueryPerformanceFrequency(&frequency)) { return false; } /* Check for zero divisor. */ if(0 == frequency.QuadPart) { return false; } /* Change frequency to double for internal timer resolution. */ switch(m_resolution) { case NANOSECONDS: m_frequency = frequency.QuadPart / (NANOSECONDS * 1.0); break; case MICROSECONDS: m_frequency = frequency.QuadPart / (MICROSECONDS * 1.0); break; case MILLISECONDS: m_frequency = frequency.QuadPart / (MILLISECONDS * 1.0); break; default: /** * SECONDS * m_frequency has a resolution in seconds by default */ m_frequency = frequency.QuadPart * 1.0; break; } return true; } void Start(void) { ::QueryPerformanceCounter(&m_startTime); } void Stop(double& intervall) { ::QueryPerformanceCounter(&m_stopTime); intervall = ((m_stopTime.QuadPart - m_startTime.QuadPart) / m_frequency); } private: const TimerResolution m_resolution; double m_frequency; LARGE_INTEGER m_startTime; LARGE_INTEGER m_stopTime; CRITICAL_SECTION m_timerLock; }; 

And here is my main class:

 class Packet { public: Packet(const uint8* data, uint16 length) { if(0 != length) { if(nullptr == data) { m_data = std::vector<uint8>(length, 0x00); } else { m_data.assign(data, data + length); } } } Packet(const Packet& rhs) : m_data(rhs.m_data) {} Packet& operator=(const Packet& rhs) { m_data = rhs.m_data; return *this; } Packet(const Packet&& rhs) : m_data(rhs.m_data) {} Packet& operator=(const Packet&& rhs) { m_data = rhs.m_data; return *this; } std::vector<uint8> m_data; }; void Measurement_1(void); void Measurement_2(void); void Measurement_3(void); constexpr uint16 payloadLength = 15000; uint8 payload[payloadLength]; /* Initialize high performance timer. */ HighPerformanceTimer hpt(HighPerformanceTimer::MICROSECONDS); int main(void) { hpt.Init(); /* Fill packet data. */ for(unsigned int j = 0; j < payloadLength; ++j) { payload[j] = 0xFF; } Measurement_1(); Measurement_2(); Measurement_3(); return EXIT_SUCCESS; } void Measurement_1(void) { /* Measurement with copies. */ double result[25]; for(unsigned int k = 0; k < 25; ++k) { /* Start measurement. */ double timeElapsed = 0.0; std::list<Packet> mylist; hpt.Start(); /* Begin insertion. */ for(unsigned int i = 0; i < 1000; ++i) { Packet f(payload, payloadLength); mylist.push_back(f); } /* End insertion. */ /* Begin removal. */ for(unsigned int i = 0; i < 1000; ++i) { Packet f = mylist.front(); mylist.pop_front(); } /* End removal. */ hpt.Stop(timeElapsed); result[k] = timeElapsed; /* Stop measurement. */ } for(unsigned int i = 0; i < 25; ++i) { std::cout << "with copies: " << std::setprecision(3) << std::fixed << result[i] << std::endl; } } void Measurement_2(void) { /* Measurement with move semantics. */ double result[25]; for(unsigned int k = 0; k < 25; ++k) { /* Start measurement. */ double timeElapsed = 0.0; std::list<Packet> mylist; hpt.Start(); /* Begin insertion. */ for(unsigned int i = 0; i < 1000; ++i) { Packet f(payload, payloadLength); mylist.push_back(std::move(f)); } /* End insertion. */ /* Begin removal. */ for(unsigned int i = 0; i < 1000; ++i) { Packet f = std::move(mylist.front()); mylist.pop_front(); } /* End removal. */ hpt.Stop(timeElapsed); result[k] = timeElapsed; /* Stop measurement. */ } for(unsigned int i = 0; i < 25; ++i) { std::cout << "with moves: " << std::setprecision(3) << std::fixed << result[i] << std::endl; } } void Measurement_3(void) { /* Measurement with pointers. */ double result[25]; for(unsigned int k = 0; k < 25; ++k) { /* Start measurement. */ double timeElapsed = 0.0; std::list<Packet*> mylist; hpt.Start(); /* Begin insertion. */ for(unsigned int i = 0; i < 1000; ++i) { mylist.push_back(new Packet(payload, payloadLength)); } /* End insertion. */ /* Begin removal. */ Packet* f = nullptr; for(unsigned int i = 0; i < 1000; ++i) { f = mylist.front(); if(nullptr != f) { mylist.pop_front(); delete f; } } /* End removal. */ hpt.Stop(timeElapsed); result[k] = timeElapsed; /* Stop measurement. */ } for(unsigned int i = 0; i < 25; ++i) { std::cout << "with pointers: " << std::setprecision(3) << std::fixed << result[i] << std::endl; } } 

At the moment, I have a problem when I get almost the same measurement result for the pointer version, as well as for the semantics of movement and even the copy version.

Did I make a mistake somewhere?

Congratulations!

+6
source share
1 answer

The move constructor actually copies another vector. It calls the vector copy constructor.

Here's the fix:

 Packet(Packet&& rhs) : m_data(std::move(rhs.m_data)) {} 

As for the timings, then on my pointers the PC was the fastest.

  • Directions: 5500 to avg
  • Move before fix: 8100 on avg
  • Move after fix: 5500 on avg
  • Copies: 8700 on avg

The results were pretty consistent (in microseconds).

Edit: After fixing, the move operator with sufficient run time showed similar performance with the pointer time. The increase in testing time showed similar ratios.

Results in debug mode:

  • Pointers: 16500 on avg
  • Move after fix: 19000 on avg
  • Copies: 30,000 on avg

In debug builds, pointers are faster, probably due to implementation (number of built-in functions).

+4
source

All Articles