Why do I need to enable optimization in g++ for simple array access?

Question


I wrote a simple Gaussian elimination algorithm using a std::vector of doubles in C++ (gcc / Linux). I then noticed that the run time depends on the compiler's optimization level (up to 5 times faster with -O3). I wrote a small test program and got similar results. The problem is neither the allocation of the vector nor resizing, etc.

The simple fact is that the statement:

 v[i] = x + y / z;

(or something like it) is much slower without optimization. I think the problem is the index operator. Without compiler optimization, std::vector is slower than a raw double *v, but when I turn on optimization the performance is equal and, to my surprise, even access to the raw double *v gets faster.

Is there an explanation for this behavior? I'm really not a professional developer, but I thought the compiler should be able to translate statements like the one above fairly directly into hardware instructions. Why is optimization necessary and, more importantly, what is the disadvantage of optimization? (If there is none, I wonder why optimization is not the default.)

Here is my vector test code:

 #include <cstdio>   // printf, scanf
 #include <cstdlib>  // malloc, free
 #include <ctime>    // time
 #include <vector>

 const long int count = 100000;
 const double pi = 3.1416;

 // Fill a malloc'ed C array `count` times and print the elapsed seconds.
 void C_array (long int size) {
     long int start = time(0);
     double *x = (double*) malloc (size * sizeof(double));
     for (long int n = 0; n < count; n++)
         for (long int i = 0; i < size; i++)
             x[i] = i; //x[i] = pi * (in);
     printf ("C array : %li s\n", time(0) - start);
     free (x);
 }

 // Same loop, but through std::vector's operator[].
 void CPP_vector (long int size) {
     long int start = time(0);
     std::vector<double> x(size);
     for (long int n = 0; n < count; n++)
         for (long int i = 0; i < size; i++)
             x[i] = i; //x[i] = pi * (in);
     printf ("C++ vector: %li s\n", time(0) - start);
 }

 int main () {
     printf ("Size of vector: ");
     long int size;
     scanf ("%li", &size);
     C_array (size);
     CPP_vector (size);
     return 0;
 }

I got some weird results. With the default g++ settings (no optimization), the run time is 8 s (C array) or 18 s (std::vector) for a vector size of 20,000. If I use the more complex line from the //... comment, the run time is 8/15 s (yes, faster). If I turn on -O3, the run time is 5/5 s for a vector size of 40,000.


c++ performance c compiler-optimization

Professional_Amateur Oct 28 '14 at 19:09


1 answer

fjardon · Answer 1 · 2014-10-28T19:23:54+0000

Why do we need separate optimized (release) and debug builds?

Optimization can completely reorder instructions, eliminate variables, inline function calls, and make the executable code so different from the source code that you can no longer debug it. So one reason not to use optimization is to keep the ability to debug the code. Once your code is (or once you think your code is) fully debugged, you can turn on optimization to produce a release build.

Why is debug code slow?

Keep in mind that the debug version of the STL may contain additional bounds checks and iterator validity checks. This can slow the code down by a factor of 10. This is a known issue with the Visual C++ STL, but you are not using it in your case. I do not know the current state of the gcc STL.
Another possibility is that you are accessing memory in a non-linear sequence and causing many cache misses. In debug mode, the compiler follows your code literally and produces this inefficient access pattern. With optimization turned on, it may rewrite your accesses to be sequential and avoid the cache misses.

What to do?

You could post a simple compilable example that demonstrates the behavior. Then we could compile it and look at the assembly to explain what is really happening. The size of the data you process matters if you are running into a caching problem.

References

Visual C++ STL is slow in debug mode: http://marknelson.us/2011/11/28/vc-10-hash-table-performance-problems/
What the debug version of the Visual C++ STL does: http://channel9.msdn.com/Series/C9-Lectures-Stephan-T-Lavavej-Advanced-STL/C9-Lectures-Stephan-T-Lavavej-Advanced-STL-3-of-n
Cache misses and their impact: http://channel9.msdn.com/Events/Build/2014/2-661, especially from 29'27"
Caches again: https://www.youtube.com/watch?v=fHNmRkzxHWs at 36'34"
