I want to see which pages are accessed by my program.
You can simulate a CPU and collect this data. Options:
- 1) valgrind is a dynamic user-space binary translator with good instrumentation support. Try the cachegrind tool, which emulates even the L1/L2 caches. You can also write a new tool that records all memory accesses (for example, at page granularity).
- 2) qemu is a dynamic binary translator with both system-wide and user-mode emulation. As far as I know, stock qemu has no instrumentation.
- 3) bochs is a system-wide CPU emulator (very slow). You can easily hack its memory access code to produce a memory log.
- 4) PTLsim - www.ptlsim.org/papers/PTLsim-ISPASS-2007.pdf
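As a rough sketch of the page-granularity idea, a memory trace can be folded into per-page access counts. The example assumes trace lines in the format printed by Valgrind's Lackey tool with `--trace-mem=yes` (an access kind `I`, `L`, `S` or `M`, then a hex address and a size) and assumes 4 KiB pages. The function name `pages_touched` and the sample addresses are made up for illustration.

```python
import re
from collections import Counter

PAGE_SHIFT = 12  # assumption: 4 KiB pages

# One access per line: kind (I, L, S or M), hex address, size in bytes.
ACCESS_RE = re.compile(r"^\s*([ILSM])\s+([0-9a-fA-F]+),(\d+)\s*$")


def pages_touched(trace_lines, include_instructions=False):
    """Count accesses per page number from Lackey-style trace lines."""
    counts = Counter()
    for line in trace_lines:
        m = ACCESS_RE.match(line)
        if m is None:
            continue  # skip banner lines such as "==1234== ..."
        kind = m.group(1)
        addr = int(m.group(2), 16)
        size = max(int(m.group(3)), 1)
        if kind == "I" and not include_instructions:
            continue  # by default only data accesses are counted
        first = addr >> PAGE_SHIFT
        last = (addr + size - 1) >> PAGE_SHIFT  # an access may straddle two pages
        for page in range(first, last + 1):
            counts[page] += 1
    return counts


# Hypothetical trace fragment, not real program output.
sample = [
    "==1234== Lackey, an example Valgrind tool",
    "I  04000000,3",
    " S 1ffeffe8,8",
    " L 1ffefff0,8",
    " M 04a01ffc,8",
]

for page, n in sorted(pages_touched(sample).items()):
    print(hex(page << PAGE_SHIFT), n)
```

To feed it real data, capture the output of something like `valgrind --tool=lackey --trace-mem=yes ./prog` (Valgrind logs to stderr by default) and pass the lines to `pages_touched`.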
However, this incurs the overhead of setting protection bits for all memory pages.
Is this too much overhead?
Now the question is how to handle TLB misses in user space for a Linux program.
You cannot handle a TLB miss in user space or in kernel space (on x86 and many other popular platforms). This is because most platforms handle TLB misses in hardware: the MMU (part of the CPU/chipset) walks the page tables and obtains the physical address transparently. A page fault interrupt is generated and delivered to the kernel only if certain bits are set in the page table entry or the address region is not mapped.
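To make the hardware walk a bit more concrete, here is a small sketch of how a virtual address is cut into the indices that the MMU uses at each level. It assumes x86-64 with 4-level paging and 4 KiB pages (48-bit virtual addresses, 9 index bits per level, 12 offset bits); huge pages and 5-level paging are ignored. The function name and the example address are illustrative only.

```python
def split_virtual_address(vaddr):
    """Split a 48-bit x86-64 virtual address into 4-level page walk indices.

    Assumes 4 KiB pages: four 9-bit table indices plus a 12-bit page offset.
    """
    return {
        "pml4": (vaddr >> 39) & 0x1FF,  # index into the top-level table
        "pdpt": (vaddr >> 30) & 0x1FF,  # index into the page directory pointer table
        "pd": (vaddr >> 21) & 0x1FF,    # index into the page directory
        "pt": (vaddr >> 12) & 0x1FF,    # index into the page table
        "offset": vaddr & 0xFFF,        # byte offset inside the 4 KiB page
    }


# Build an address from known indices and split it again.
vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 5
print(split_virtual_address(vaddr))
```

On a TLB miss the hardware performs these four lookups itself, which is why no software handler runs unless the walk ends in a page fault.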
In addition, there seems to be no way to dump the TLB on modern processors (though the 386DX could do this).
You can try to detect a TLB miss by the delay it introduces. But this delay may be hidden when out-of-order execution starts the TLB lookup early.
In addition, most hardware events (memory accesses, TLB accesses, TLB hits, TLB misses) are counted by the hardware performance monitoring unit (this part of the processor is used by VTune, CodeAnalyst and oprofile). Unfortunately, these are only global event counters, and you cannot activate more than 2-4 events at a time. The good news is that you can set a perfmon counter to raise an interrupt when a given count is reached. You then receive (via the interrupt) the address of the instruction ($eip) where the count was reached. Thus, you can find TLB-miss-heavy hot spots with this hardware (it is present in every modern x86 processor, both Intel and AMD).
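The sampling idea can be illustrated with a toy simulation. This is not an interface to the real counters: it only models a counter that "interrupts" every `period` misses and records the instruction address at that moment. The event stream, the addresses and the function name `sample_on_overflow` are all hypothetical.

```python
from collections import Counter


def sample_on_overflow(events, period):
    """Simulate a miss counter that interrupts every `period` misses.

    `events` is an iterable of (eip, missed) pairs, one per memory access.
    Returns a Counter of the eip values recorded at each simulated interrupt.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    samples = Counter()
    remaining = period
    for eip, missed in events:
        if not missed:
            continue  # only misses advance the counter
        remaining -= 1
        if remaining == 0:
            samples[eip] += 1   # the "interrupt": record where we were
            remaining = period  # re-arm the counter
    return samples


# Hypothetical workload: one loop causes 9 of every 10 misses.
block = [(0x401000, True)] * 9 + [(0x402000, True)]
events = block * 100

samples = sample_on_overflow(events, period=7)
hot_eip, hits = samples.most_common(1)[0]
print(hex(hot_eip), hits, sum(samples.values()))
```

Only one sample is taken per `period` misses, yet the histogram of sampled addresses still points at the code responsible for most of them. A larger period means fewer interrupts and lower overhead, at the cost of a coarser picture.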
osgx