std::deque amazingly slow

I just found out that the standard std::deque is really slow compared to my "home-made" stack, which uses a pre-allocated array.
This is the code of my stack:

    template <class T>
    class FastStack
    {
    public:
        T* st;
        int allocationSize;
        int lastIndex;

    public:
        FastStack(int stackSize);
        FastStack();
        ~FastStack();

        inline void resize(int newSize);
        inline void push(T x);
        inline void pop();
        inline T getAndRemove();
        inline T getLast();
        inline void clear();
    };

    template <class T>
    FastStack<T>::FastStack()
    {
        lastIndex = -1;
        st = NULL;
    }

    template <class T>
    FastStack<T>::FastStack(int stackSize)
    {
        st = NULL;
        this->allocationSize = stackSize;
        st = new T[stackSize];
        lastIndex = -1;
    }

    template <class T>
    FastStack<T>::~FastStack()
    {
        delete [] st;
    }

    template <class T>
    void FastStack<T>::clear()
    {
        lastIndex = -1;
    }

    template <class T>
    T FastStack<T>::getLast()
    {
        return st[lastIndex];
    }

    template <class T>
    T FastStack<T>::getAndRemove()
    {
        return st[lastIndex--];
    }

    template <class T>
    void FastStack<T>::pop()
    {
        --lastIndex;
    }

    template <class T>
    void FastStack<T>::push(T x)
    {
        st[++lastIndex] = x;
    }

    template <class T>
    void FastStack<T>::resize(int newSize)
    {
        if (st != NULL)
            delete [] st;
        st = new T[newSize];
    }
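
For reference, a minimal usage sketch of the class above (assuming it is in the same translation unit; the capacity passed to the constructor must cover the deepest point, since push() never grows the array):

    #include <cstdio>

    int main()
    {
        FastStack<int> s(8);              // fixed capacity, chosen up front
        for (int i = 0; i < 5; ++i)
            s.push(i);                    // no bounds check: caller must stay within capacity
        printf("%d\n", s.getLast());      // prints 4, leaves it on the stack
        printf("%d\n", s.getAndRemove()); // prints 4 and removes it
        s.pop();                          // drops 3 without returning it
        s.clear();                        // resets the index, keeps the buffer
        return 0;
    }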

I ran this simple test for deque:

    std::deque<int> aStack;
    int x;
    HRTimer timer;
    timer.Start();
    for (int i = 0; i < 2000000000; i++)
    {
        aStack.push_back(i);
        x = aStack.back();
        if (i % 100 == 0 && i != 0)
            for (int j = 0; j < 100; j++)
                aStack.pop_back();
    }
    double totalTime = timer.Stop();
    stringstream ss;
    ss << "std::deque " << totalTime;
    log(ss.str());

Using std::stack with a vector as the underlying container (as suggested by Michael Cohn):

    std::stack<int, std::vector<int>> bStack;
    int x;
    HRTimer timer;
    timer.Start();
    for (int i = 0; i < 2000000000; i++)
    {
        bStack.push(i);
        x = bStack.top();
        if (i % 100 == 0 && i != 0)
            for (int j = 0; j < 100; j++)
                bStack.pop();
    }
    double totalTime = timer.Stop();
    stringstream ss;
    ss << "std::stack " << totalTime;
    log(ss.str());

And the same for my FastStack:

    FastStack<int> fstack(200);
    int x;
    HRTimer timer;
    timer.Start();
    for (int i = 0; i < 2000000000; i++)
    {
        fstack.push(i);
        x = fstack.getLast();
        if (i % 100 == 0 && i != 0)
            for (int j = 0; j < 100; j++)
                fstack.pop();
    }
    double totalTime = timer.Stop();
    stringstream ss;
    ss << "FastStack " << totalTime;
    log(ss.str());

The results after 4 runs are as follows:
deque 15.529
deque 15.3756
deque 15.429
deque 15.4778

stack 6.19099
stack 6.1834
stack 6.19315
stack 6.19841

FastStack 3.01085
FastStack 2.9934
FastStack 3.02536
FastStack 3.00937

Results are in seconds; I ran the code on an Intel i5 3570K (stock clock) and compiled with VS2010 with all available optimizations enabled. I know my FastStack has many limitations, but there are many situations where it can be used and where it gives a good boost! (I used it in one project, where I got a 2x speedup compared to std::queue.)
So now my question is:
Are there other such hidden "performance killers" in C++ that everyone uses but hardly anyone knows about? EDIT: I don't mean to be offensive, I'm just wondering whether you know of other little-known "speedups" like this one.

2 answers

You are comparing apples and oranges. A deque is DOUBLE-ENDED, which requires it to work quite differently from the simple stack you implemented. Try running the test against std::stack<int, std::vector<int>> instead and see how you do. std::stack is a container adapter, and if you use a vector as the underlying container it should be nearly as fast as your home-grown implementation.

One downside is that the std::stack interface does not let you set the size in advance, so in situations where you KNOW what the maximum size will be, it will be a little slower at first, since the underlying vector has to reallocate a few times as it grows. After that, it will be almost the same.
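
As a hedged illustration of a possible workaround not spelled out in the answer (names like buf and s are just placeholders): with a C++11-capable library you can reserve capacity in a vector first and then hand it to the adapter, so the stack starts out with the buffer already allocated.

    #include <stack>
    #include <utility>
    #include <vector>

    int main()
    {
        std::vector<int> buf;
        buf.reserve(200);  // reserve the expected maximum depth up front
        // std::stack can be constructed from an existing container; moving the
        // vector in transfers its already-reserved buffer to the adapter
        std::stack<int, std::vector<int>> s(std::move(buf));
        for (int i = 0; i < 200; ++i)
            s.push(i);     // no reallocation expected up to the reserved size
        return 0;
    }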


If you know the maximum number of elements that will be on the stack at any given time, you should use a std::vector on which you reserve that capacity. Even if you don't know the capacity, for a stack you should still use std::vector: it will grow a few times (which costs more than a pre-allocated vector while it is growing), but it never shrinks, so after a while it stops allocating memory.
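
A minimal sketch of that suggestion (variable names are illustrative), using std::vector directly as a stack via push_back / back / pop_back with the capacity reserved up front:

    #include <cstdio>
    #include <vector>

    int main()
    {
        std::vector<int> st;
        st.reserve(200);          // known maximum depth: no reallocation after this
        for (int i = 0; i < 200; ++i)
            st.push_back(i);      // push
        int x = st.back();        // top
        st.pop_back();            // pop; capacity is kept, the vector never shrinks
        printf("top was %d\n", x);
        return 0;
    }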

The performance issue is that std::deque allocates blocks as needed and releases them when they are no longer needed (following some strategy), so if you keep filling and draining a std::deque, blocks end up being allocated and freed continuously.

