Is reading “zero” from memory faster than reading other values?

I am running a memory access experiment that uses a 2D matrix whose rows are each the size of a memory page. The experiment reads every element in row-major and then column-major order, and likewise writes every element in row-major and then column-major order. The matrix is declared at global scope to keep the code simple.
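Roughly, the setup looks like this (a simplified sketch rather than my exact code; the dimensions, the 4 KiB row size and the timing helper are placeholders, only testArray, ROW_COUNT, COL_COUNT and the function names match what I actually use):

    #define _POSIX_C_SOURCE 199309L
    #include <stdio.h>
    #include <time.h>

    #define ROW_COUNT 4096                 /* number of rows (placeholder value)     */
    #define COL_COUNT 4096                 /* one row spans a 4 KiB page (assumed)   */

    char testArray[ROW_COUNT][COL_COUNT];  /* global, so statically zero-initialized */

    /* One of the four access functions; the others swap the loop order
       and/or read instead of write. */
    void rowMajor_write(void){
        int row, col;
        for(row = 0; row < ROW_COUNT; row++)
            for(col = 0; col < COL_COUNT; col++)
                testArray[row][col] = 1;
    }

    /* Time a single access function, in seconds. */
    double time_it(void (*fn)(void)){
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        fn();
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    }

    int main(void){
        printf("rowMajor_write: %.6f s\n", time_it(rowMajor_write));
        return 0;
    }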

The point of this question is that, because the test matrix is declared globally, its values are statically initialized to zero, and the results I got were very interesting. When I run the read operations first, i.e.

    rowMajor_read(); colMajor_read(); rowMajor_write(); colMajor_write();

then my colMajor_read operation completes very quickly.

However, if I do the write operations before the reads:

    rowMajor_write(); colMajor_write(); rowMajor_read(); colMajor_read();


then the column-major read time increases by almost an order of magnitude.

I assume this has something to do with how the compiler optimizes the code. Since every element of the global matrix is identically zero, does the compiler optimize the reads away entirely? Or is it somehow "easier" to read a value from memory that happens to be zero?

I do not pass any special compiler flags for optimization, but I declared my functions like this:

    inline void colMajor_read(){
        register int row, col;
        /* volatile so the compiler cannot discard the otherwise-unused read */
        register volatile char temp __attribute__((unused));
        for(col = 0; col < COL_COUNT; col++)
            for(row = 0; row < ROW_COUNT; row++)
                temp = testArray[row][col];
    }

I did this because I was having problems with the compiler removing the temp variable from the function entirely, since it is never used. I think having both volatile and __attribute__((unused)) is redundant, but I left both in anyway; my understanding is that optimizations are not applied to a volatile variable.

Any ideas?


I looked at the generated assembly, and it is identical for the colMajor_read function in both cases. The assembly is here: http://pastebin.com/C8062fYB

1 answer

Check the memory usage of your process before and after writing values to the matrix. If the matrix is stored in the .bss section on Linux, its zeroed pages will initially all be mapped to a single read-only zero page with copy-on-write semantics. So even though you read a bunch of different virtual addresses, you may be reading the same page of physical memory over and over again.
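To see this from inside the process, something like the following works on Linux (just a sketch, not part of the original test code; it parses VmRSS out of /proc/self/status):

    #include <stdio.h>
    #include <string.h>

    /* Print the resident-set size (VmRSS) of the current process.
       Linux-specific: reads /proc/self/status. */
    void print_rss(const char *label){
        FILE *f = fopen("/proc/self/status", "r");
        char line[256];
        if(!f) return;
        while(fgets(line, sizeof line, f)){
            if(strncmp(line, "VmRSS:", 6) == 0){
                printf("%s %s", label, line);  /* e.g. "after writes VmRSS:  16920 kB" */
                break;
            }
        }
        fclose(f);
    }

Called before and after the write passes, e.g. print_rss("before writes") and print_rss("after writes"), the reported RSS should jump by roughly the size of the matrix once the copy-on-write pages get their own physical copies.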

This page http://madalanarayana.wordpress.com/2014/01/22/bss-segment/ has a good explanation.

If this is the case, zero out the matrix again after the writes and repeat the read test: it will no longer be faster, because the copy-on-write sharing has already been broken and every page has its own physical copy.
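A concrete version of that check, using the function names from the question (memset is just the obvious way to re-zero the matrix; any write of zeros will do):

    #include <string.h>

    /* Run after the write passes, when the copy-on-write sharing of the zero
       page has already been broken. */
    void rerun_read_test(void){
        memset(testArray, 0, sizeof testArray);  /* every element is zero again      */
        colMajor_read();                         /* still slow: each row now has its
                                                    own physical page, even though
                                                    all the values read are zero     */
    }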
