Why is the number of instructions not deterministic in Linux performance counters?

To profile the running time of applications whose binaries will actually run under a simulator (NS-3 / DCE), I wanted to use Linux performance counters. I expected the instruction count of an application that has no non-determinism to be deterministic. According to the Linux perf tools, I could not have been more wrong. Take a simple example:

$ (perf stat -c -- sleep 1 2>&1 && perf stat -c -- sleep 1 2>&1) |grep instructions
        669218 instructions              #    0,61  insns per cycle
        682286 instructions              #    0,58  insns per cycle

1) What is the source of this non-determinism? Is it due to branch prediction and other low-level CPU mechanisms?

2) A second question: is there a way to find out the number of instructions issued to the CPU (as opposed to the instruction count reported in the example output), so that the amount of code executed can be measured deterministically?

1 answer

Summary

1) The non-determinism is caused by the context switching of the sleep 1 command, not by branch prediction or other microarchitectural features.

2) You can find the number of instructions issued using a hardware event counter, if your processor supports one. However, this will differ from the number of instructions retired (which is what perf usually reports for instructions).

Details:

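Calling sleep makes the process block in the kernel, which forces a context switch. That work is done by the kernel, and the number of kernel instructions attributed to the command varies from run to run.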

You can see this by counting user-level and kernel-level instructions separately with instructions:u and instructions:k. For example:

perf stat -e instructions:k,instructions:u,instructions sleep 1

Output from two runs:

Performance counter stats for 'sleep 1':

       373,044 instructions:k            #    0.00  insns per cycle        
       199,795 instructions:u            #    0.00  insns per cycle        
       572,839 instructions              #    0.00  insns per cycle        

   1.001018153 seconds time elapsed

Performance counter stats for 'sleep 1':

       379,722 instructions:k            #    0.00  insns per cycle        
       199,970 instructions:u            #    0.00  insns per cycle        
       579,519 instructions              #    0.00  insns per cycle        

   1.000986201 seconds time elapsed

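As the output shows, the user-level instruction count of sleep 1 is nearly identical between the two runs (199,795 vs. 199,970), while the kernel-level count differs by several thousand (373,044 vs. 379,722). Almost all of the variation in the total comes from the kernel side.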
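
To repeat the user/kernel comparison over more than two runs, something like the following sketch could be used. The function names parse_counts and measure_runs are made up for illustration. It assumes perf stat prints each count in the first column and the event name in the second, as in the output above, and that thousands separators can simply be stripped.

```shell
#!/bin/sh
# Sketch only: parse_counts and measure_runs are hypothetical helper names.

# Read `perf stat` text on stdin and print "kernel user" instruction counts.
# Non-digit characters (thousands separators) are stripped from the count.
parse_counts() {
    awk '
        $2 == "instructions:k" { gsub(/[^0-9]/, "", $1); k = $1 }
        $2 == "instructions:u" { gsub(/[^0-9]/, "", $1); u = $1 }
        END { print k, u }
    '
}

# Run a command N times under perf stat, printing one "kernel user" line per run.
# perf stat writes its report to stderr, so stderr is sent into the pipe
# and the command's own stdout is discarded.
measure_runs() {
    runs=$1; shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        perf stat -e instructions:k,instructions:u -- "$@" 2>&1 >/dev/null | parse_counts
        i=$((i + 1))
    done
}

# Example (requires perf and permission to count kernel events):
#   measure_runs 5 sleep 1
```

If the second column stays close to constant while the first one moves around, the variation is coming from kernel-side work rather than from the program itself.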
