C ++ vector array operator with high computational cost?

I always knew that rich C ++ abstractions have certain computational overheads, but I got the impression that these overheads would be close to insignificant if the correct compiler optimizers were applied. I was curious what exactly would be the size of this overhead, so I wrote a simple test to determine this. A test is a template function that takes a container variable, assigns a value to each element in the container, and then sums the values ​​in the container in a separate loop. This process is repeated for a given number of cycles.

What I discovered, to my great concern, was that the implementation of the vector took almost 3 times the standard implementation of the array. Going through a huge selection of compiler optimizers without any success, I decided to bite the bullet and the eyeball with the assembly code directly to try to see what caused the fine for the time. I included some assembly directives that allowed me to pinpoint where the array indexing operation took place, and analyzed the code in detail. What I discovered, to the point of complete confusion, was that the difference between the implementation of the vector and the implementation of the array was completely insignificant. Assembly code can be found here .

This is the command I used to create the binary:

g++ -O3 vectorArrayOp.cpp -o vectorArrayOp 

This is the command I used to build the assembly:

 g++ -O3 -DTAGASM vectorArrayOp.cpp -S -o vectorArrayOp.s 

This is the result that I observe when running the binary:

 gmurphy@interloper:Reference$ ./vectorArrayOp Duration 0.027678 Duration 0.090212 

The results do not change when the calculated value is included in the stdout stream, I deleted them for clarity. My system specifications are as follows (I saw the same results on my AMD):

 Linux 3.2.0-32-generic x86_64 GNU/Linux Intel(R) Xeon(R) CPU X5550 @ 2.67GH g++ (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3 

The code follows, I would appreciate it if someone could give me some idea of ​​why the timings are so different when the assembly is so similar.

 #include <vector> #include <iostream> #include <sys/time.h> #ifdef TAGASM #define ASMTAG(X) asm(X) #else #define ASMTAG(X) #endif enum { DataSize=1024, NumTests=(1<<16) } ; struct ReturnValue {ReturnValue(float _d, int _t):d(_d), t(_t){} float d; int t;} ; template <typename Container, typename Type> ReturnValue runTest(Container &c, Type value) { int tagValue(0); timeval startTime; gettimeofday(&startTime, NULL); for(int i=0; i<NumTests; i++) { for(int j=0; j<DataSize; j++) { ASMTAG("preassign"); c [j] = value ; ASMTAG("postassign"); } for(int j=0; j<DataSize; j++) { ASMTAG("preadd"); tagValue += c [j] ; ASMTAG("postadd"); } } timeval endTime; gettimeofday(&endTime, NULL); float duration((endTime.tv_sec-startTime.tv_sec)+ (endTime.tv_usec-startTime.tv_usec)/1000000.0); //tagValue is returned in case the optimising compiler might try to remove the loops return ReturnValue(duration, tagValue) ; } int main() { int *arrayData = new int [DataSize]; std::vector <int> vectorData(DataSize, 0) ; ReturnValue ad = runTest(arrayData, 1); ReturnValue vd = runTest(vectorData, 1); std::cout<<"Duration "<<ad.d<<std::endl; std::cout<<"Duration "<<vd.d<<std::endl; delete [] arrayData; return 0 ; } 
+8
c ++ profiling linux g ++
source share
1 answer
 % g++-4.4 -O3 vectorArrayOp.cpp -o vectorArrayOp % ./vectorArrayOp Duration 0.008581 Duration 0.008775 % g++-4.5 -O3 vectorArrayOp.cpp -o vectorArrayOp % ./vectorArrayOp Duration 0.008634 Duration 0.008588 % g++-4.6 -O3 vectorArrayOp.cpp -o vectorArrayOp % ./vectorArrayOp Duration 0.01731 Duration 0.081696 % g++-4.7 -O3 vectorArrayOp.cpp -o vectorArrayOp % ./vectorArrayOp Duration 0.008618 Duration 0.008612 % clang++ -O3 vectorArrayOp.cpp -o vectorArrayOp % ./vectorArrayOp Duration 0.066484 Duration 0.066435 

Based on these results, this is probably a regression of compiler performance in g ++ 4.6.

+11
source share

All Articles