I am trying to measure/monitor the utilization of all 60 cores on a Xeon Phi (Knights Corner, an in-order processor) at a relatively high frequency: at least once every 0.1 s, i.e. 10 Hz or more.
I tried the latest PAPI library, but it only supports PAPI_TOT_INS, which counts completed instructions. That does not work for me, because I need something that reflects the instructions issued in each 0.1 s interval rather than the instructions completed. Several instructions issued in different cycles can complete in the same cycle, and whether instructions are issued depends on whether the core is stalled or not.
Other available tools, such as "top" and "perf", sample at 1 Hz, which is too slow for my measurement. I also need to synchronize the measurement with the critical phases of my code, so the Intel VTune profiler does not work for me either.
Is there a way to monitor instruction issue on the Xeon Phi, or any other activity related to its utilization? I understand that there are hardware counters, but reading them seems very difficult to me. Maybe I could derive the utilization by measuring the CPU time of each thread?
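To make the CPU-time idea concrete, here is a minimal sketch of what I have in mind. It assumes the Linux running on the coprocessor exposes the standard /proc/stat file; the function names (parse_cpu_times, utilization, sample) are hypothetical and only for illustration. It takes two snapshots 0.1 s apart and reports the busy fraction of each core in between.

```python
import time


def parse_cpu_times(stat_text):
    """Return {cpu_name: (busy_ticks, total_ticks)} from /proc/stat text."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Keep only per-core lines ("cpu0", "cpu1", ...), skip the "cpu" aggregate.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Field order: user nice system idle iowait irq softirq steal ...
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])  # guest fields are already included in user/nice
        times[fields[0]] = (total - idle, total)
    return times


def utilization(before, after):
    """Busy fraction of each core between two parsed snapshots."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (busy1, total1))
        elapsed = total1 - total0
        result[cpu] = (busy1 - busy0) / elapsed if elapsed > 0 else 0.0
    return result


def sample(interval=0.1, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart (assumes a Linux /proc)."""
    with open(path) as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_times(f.read())
    return utilization(before, after)


if __name__ == "__main__":
    for cpu, busy in sorted(sample().items()):
        print(f"{cpu}: {busy:.0%} busy")
```

Two caveats with this approach: the tick counters in /proc/stat typically advance in 10 ms steps, so a 0.1 s window only resolves about ten levels per core, and a core that is scheduled but stalled still counts as busy, so this measures OS-level utilization and not instruction issue.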
Thanks.