I made a program that does matrix multiplication (without optimization).
for(i=0; i<a_r; i++) for(j=0;j<b_c; j++) for(k=0;k<a_c; k++) c[i][j]=c[i][j]+a[i][k]*b[k][j];
Depending on how I allocate the memory, the calculation time is different.
It is important to note three points:
- I record only the calculation time, not the allocation time (and the allocation time is negligible compared to the calculation time).
- I initialize the matrices with random numbers before calculating. I use huge matrices (2500 × 2500 ints). I chose this size so that the program uses RAM but not swap.
- This phenomenon disappears if the program does not use RAM.
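To make the first point concrete, here is a minimal sketch of timing the triple loop alone. The names `elapsed_seconds` and `timed_multiply` and the choice of `clock_gettime(CLOCK_MONOTONIC, ...)` are assumptions made for illustration; the full program may measure time differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L  /* exposes clock_gettime under strict -std=c99/c11 */
#endif
#include <assert.h>
#include <time.h>

/* Hypothetical helper: seconds elapsed between two timestamps. */
static double elapsed_seconds(struct timespec start, struct timespec end)
{
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Hypothetical wrapper: times only the triple loop. The caller must have
   allocated and initialized a and b, and zeroed c, before calling, so that
   neither allocation nor initialization is included in the measurement. */
static double timed_multiply(int n, int **a, int **b, int **c)
{
    struct timespec start, end;
    int i, j, k;

    assert(n > 0);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            for (k = 0; k < n; k++)
                c[i][j] = c[i][j] + a[i][k] * b[k][j];
    clock_gettime(CLOCK_MONOTONIC, &end);
    return elapsed_seconds(start, end);
}
```

On glibc versions older than 2.17 (which includes the stock CentOS 6 glibc), `clock_gettime` lives in librt, so the program would need to be linked with `-lrt`.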
I am testing this program with three different allocation schemes:
Test 1: global allocation
int a[2500][2500], b[2500][2500], c[2500][2500]; main { matrix multiplication a*b=c }
The calculation time is fairly constant (about 41 s).
Test 2: dynamic allocation (array of row pointers)
main { int **a, **b, **c; a = (int **) malloc(sizeof(int *) * 2500); for (i = 0; i < a_c; i++) a[i] = (int *) malloc(sizeof(int) * 2500); … matrix multiplication a*b=c }
When I run the program several times in a row, I get these run times: 260 s, 180 s, 110 s, 110 s ... If I wait about 5 seconds and run the program again, I get the same results.
Test 3: dynamic allocation (one contiguous block)
main { int *a, *b, *c; a = (int *) malloc(sizeof(int) * 2500 * 2500); … (same for b and c) matrix multiplication a*b=c }
The calculation time is quite constant (about 44 s).
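The difference between the layouts in tests 2 and 3 can be sketched as follows. The helper names (`alloc_rows`, `free_rows`, `alloc_flat`, `flat_index`) are hypothetical and only illustrate the two schemes; `calloc` is used here to get zeroed memory, which the snippets above do not do.

```c
#include <assert.h>
#include <stdlib.h>

/* Test 2 style: n separate row allocations reached through an array of
   pointers. The rows may end up anywhere on the heap. */
static int **alloc_rows(int n)
{
    int **m;
    int i;

    assert(n > 0);
    m = malloc((size_t)n * sizeof *m);      /* n pointers, not n ints */
    if (m == NULL)
        return NULL;
    for (i = 0; i < n; i++) {
        m[i] = calloc((size_t)n, sizeof **m);
        if (m[i] == NULL) {                 /* undo the rows allocated so far */
            while (i-- > 0)
                free(m[i]);
            free(m);
            return NULL;
        }
    }
    return m;
}

static void free_rows(int **m, int n)
{
    int i;

    for (i = 0; i < n; i++)
        free(m[i]);
    free(m);
}

/* Test 3 style: a single contiguous block of n * n ints. */
static int *alloc_flat(int n)
{
    assert(n > 0);
    return calloc((size_t)n * (size_t)n, sizeof(int));
}

/* Element (i, j) of a flat n x n matrix lives at offset i * n + j. */
static size_t flat_index(int n, int i, int j)
{
    return (size_t)i * (size_t)n + (size_t)j;
}
```

With the flat layout, `c[i][j]` in the loop becomes `c[flat_index(n, i, j)]`, and the whole matrix is released with a single `free`.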
I think test 2 is less efficient because of how the data is stored in memory, as explained in this article (the site is down today) or in this question. Some ways of traversing memory are more efficient than others, and as a result some ways of allocating memory give a more efficient program.
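As an illustration of what "traversing memory" means here, the sketch below contrasts the loop order used above with an interchanged order on flat n × n matrices. The function names are hypothetical, and whether the second version is actually faster on a given machine is an assumption that would have to be measured.

```c
#include <assert.h>
#include <stddef.h>

/* Order (i, j, k), as in the loop above: the inner loop reads b down a
   column, so consecutive reads of b are n ints apart in memory. */
static void multiply_ijk(int n, const int *a, const int *b, int *c)
{
    int i, j, k;

    assert(n > 0);
    for (i = 0; i < n; i++)
        for (j = 0; j < n; j++)
            for (k = 0; k < n; k++)
                c[(size_t)i * n + j] += a[(size_t)i * n + k] * b[(size_t)k * n + j];
}

/* Order (i, k, j): the inner loop walks one row of b and one row of c,
   so consecutive accesses are adjacent in memory. The result is the same
   because each c element still receives the same set of products. */
static void multiply_ikj(int n, const int *a, const int *b, int *c)
{
    int i, j, k;

    assert(n > 0);
    for (i = 0; i < n; i++)
        for (k = 0; k < n; k++) {
            int aik = a[(size_t)i * n + k];
            for (j = 0; j < n; j++)
                c[(size_t)i * n + j] += aik * b[(size_t)k * n + j];
        }
}
```

Both functions accumulate into `c`, so `c` must be zeroed before the call.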
But in test 2, I don't know why the program gets faster over successive runs. Does anyone have an idea that would explain this phenomenon? Thanks in advance.
P.S. I ran these tests on an Intel Xeon E5-2625 with CentOS 6.3 and Linux kernel 3.11.1. EDIT: Frequency scaling is disabled on my computer; the frequency is constant.
Here is the code:
#include <stdlib.h>