Starting from this article - Igor Ostrovsky's gallery of processor cache effects - I wanted to play with his examples on my machine. Here is my code for the first example, which examines how touching different cache lines affects runtime:
    #include <iostream>
    #include <time.h>

    using namespace std;

    int main(int argc, char* argv[])
    {
        int step = 1;
        const int length = 64 * 1024 * 1024;
        int* arr = new int[length];

        timespec t0, t1;
        clock_gettime(CLOCK_REALTIME, &t0);

        // Touch every step-th element of the array.
        for (int i = 0; i < length; i += step)
            arr[i] *= 3;

        clock_gettime(CLOCK_REALTIME, &t1);

        // Nanosecond difference, corrected for wrap-around at a second boundary.
        long int duration = (t1.tv_nsec - t0.tv_nsec);
        if (duration < 0)
            duration = 1000000000 + duration;

        cout << step << ", " << duration / 1000 << endl;
        return 0;
    }
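The nanosecond-only difference works here only because each run stays well under one second; for longer runs I would fold in tv_sec as well, roughly like the sketch below (elapsed_us is just an illustrative helper name, not something from the article):

    #include <time.h>

    // Sketch: full duration in microseconds, using both tv_sec and tv_nsec,
    // so it stays correct even for runs longer than one second.
    static long long elapsed_us(const timespec& t0, const timespec& t1)
    {
        long long sec  = t1.tv_sec  - t0.tv_sec;
        long long nsec = t1.tv_nsec - t0.tv_nsec;
        return sec * 1000000LL + nsec / 1000LL;
    }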
Using different values for the step, I do not see a jump in the running time:
    step, microseconds
      1, 451725
      2, 334981
      3, 287679
      4, 261813
      5, 254265
      6, 246077
     16, 215035
     32, 207410
     64, 202526
    128, 197089
    256, 195154
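Each row comes from a separate run; roughly, I build and run it like this (cache_lines.cpp is just an illustrative file name, and -lrt is only needed for clock_gettime on older glibc versions):

    # build without optimizations
    g++ -O0 -o cache_lines cache_lines.cpp
    # one run prints one row of the table
    ./cache_lines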
I would expect to see something like the graph in Igor's article: the running time stays roughly flat for small steps, but starting at step 16 it is halved every time the step doubles.
I am testing this on Ubuntu 13 with a Xeon X5450 and compiling it with g++ -O0. Is something wrong with my code, or are these results actually to be expected? Any insight into what I am missing would be greatly appreciated.