I would like to create a complete instruction trace of a program's execution, to collect some statistics, etc. At first I tried to use Linux's ptrace functionality to single-step through the program (using the tutorial here). This creates two processes, the traced one and the debugger, and they communicate through signals. I only get about 16 thousand instructions per second (on a 1.6 GHz Atom), so this is too slow for anything non-trivial.
I thought the inter-process communication through signals was the bottleneck, so I tried to set up the debugging in the same process as the execution: set the trap flag and install a signal handler. When a software interrupt is used to make a syscall, I assumed the trap flag would be saved and the kernel would run with its own flags. But my program somehow gets killed by SIGTRAP.
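For context, a ptrace single-stepping loop of this kind can be sketched as follows. This is a minimal illustration, not the code from the tutorial: the helper name `count_single_steps` and its `max_steps` cap are hypothetical, and it assumes Linux with ptrace permitted for the calling process.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that runs a small loop, single-step it
 * with ptrace, and return the number of steps observed (at most max_steps).
 * Returns -1 if the child could not be set up for tracing. */
long count_single_steps(long max_steps)
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        /* Tracee: ask to be traced, then stop so the parent can take over. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(1);
        raise(SIGSTOP);

        /* Some computation to step through. */
        volatile int x = 0;
        for (int i = 0; i < 1000; i++)
            x += i * 2;
        _exit(0);
    }

    /* Tracer: wait for the child's initial stop. */
    int status;
    if (waitpid(child, &status, 0) < 0)
        return -1;
    if (!WIFSTOPPED(status)) {
        /* The child exited or was killed before it could be traced. */
        if (!WIFEXITED(status) && !WIFSIGNALED(status)) {
            kill(child, SIGKILL);
            waitpid(child, &status, 0);
        }
        return -1;
    }

    /* Execute one instruction at a time; every step is a round trip
     * between the two processes. */
    long steps = 0;
    while (steps < max_steps) {
        if (ptrace(PTRACE_SINGLESTEP, child, NULL, NULL) < 0)
            break;
        if (waitpid(child, &status, 0) < 0)
            break;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            return steps;            /* child finished on its own */
        steps++;
    }

    /* Step budget exhausted (or stepping failed): clean up the child. */
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return steps;
}
```

Calling `count_single_steps(100000)` and timing it gives a rough steps-per-second figure. Every step requires the tracer to be woken up and the tracee to be resumed, which is the overhead described above.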
Here is what I created:
    #include <stdio.h>
    #include <unistd.h>
    #include <signal.h>

    int cycle = 0;

    void trapHandler(int signum) {
        if (cycle % 262144 == 0) {
            write(STDOUT_FILENO, " trap\n", 6);
        }
        cycle += 1;
    }

    void startTrace() {
        // set up signal handler
        signal(SIGTRAP, trapHandler);

        // set trap flag
        asm volatile("pushfl\n"
                     "orl $0x100, (%esp)\n"
                     "popfl\n"
                    );
    }

    void printRock() {
        char* s = "Rock\n";
        asm(
            "movl $5, %%edx\n"  // message length
            "movl %0, %%ecx\n"  // message to write
            "movl $1, %%ebx\n"  // file descriptor (stdout)
            "movl $4, %%eax\n"  // system call number (sys_write)
            "int $0x80\n"       // syscall
            :                   // no output regs
            : "r"(s)            // input text
            : "edx", "ecx", "ebx", "eax"
        );
    }

    int main() {
        startTrace();

        // some computation
        int x = 0;
        int i;
        for (i = 0; i < 100000; i++) {
            x += i*2;
        }

        printRock();
        write(STDOUT_FILENO, "Paper\n", 6);
        write(STDOUT_FILENO, "Scissors\n", 9);
    }
When run, this gives:
     trap
     trap
     trap
    Rock
    Paper
     trap
    Trace/breakpoint trap (core dumped)
So now we get about 250 thousand instructions per second. That is still slow, but non-trivial executions can be traced. However, there is that core dump, which seems to occur between the two write calls. In GDB, we can see where it happens:
    Dump of assembler code for function __kernel_vsyscall:
       0xb76f3414 <+0>:     push   %ecx
       0xb76f3415 <+1>:     push   %edx
       0xb76f3416 <+2>:     push   %ebp
       0xb76f3417 <+3>:     mov    %esp,%ebp
       0xb76f3419 <+5>:     sysenter
       0xb76f341b <+7>:     nop
       0xb76f341c <+8>:     nop
       0xb76f341d <+9>:     nop
       0xb76f341e <+10>:    nop
       0xb76f341f <+11>:    nop
       0xb76f3420 <+12>:    nop
       0xb76f3421 <+13>:    nop
       0xb76f3422 <+14>:    int    $0x80
    => 0xb76f3424 <+16>:    pop    %ebp
       0xb76f3425 <+17>:    pop    %edx
       0xb76f3426 <+18>:    pop    %ecx
       0xb76f3427 <+19>:    ret
And backtrace:
Program terminated with signal SIGTRAP, Trace/breakpoint trap.
It seems that syscalls made with int 0x80 are fine, but the write calls somehow go through the kernel's VDSO / vsyscall mechanism (I did not know about this functionality, described in more detail here). This may be related to the use of sysenter rather than int 0x80; perhaps the trap flag is preserved when entering the kernel. I do not quite understand what is going on with the recursive __kernel_vsyscall calls. I also don't understand why there is an int 0x80 instruction inside the __kernel_vsyscall function.
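One way to check from inside the process whether a vDSO is mapped, and where its syscall entry point lives, is to query the auxiliary vector. The following is only a sketch under assumptions: it relies on glibc's `getauxval` (glibc 2.16 or later), and the helper names `vdso_base`, `vsyscall_entry`, and `print_vdso_info` are made up for illustration. As far as I know, `AT_SYSINFO` is only supplied on 32-bit x86, where it should hold the address of `__kernel_vsyscall`; on other architectures the lookup returns 0.

```c
#include <elf.h>
#include <stdio.h>
#include <sys/auxv.h>

/* Base address of the vDSO ELF image, or 0 if none was reported. */
unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Syscall entry point advertised by the kernel (expected to be
 * __kernel_vsyscall on 32-bit x86), or 0 if the entry is absent. */
unsigned long vsyscall_entry(void)
{
    return getauxval(AT_SYSINFO);
}

/* Print both values so they can be compared with addresses seen in GDB. */
void print_vdso_info(void)
{
    printf("vDSO base (AT_SYSINFO_EHDR): %#lx\n", vdso_base());
    printf("syscall entry (AT_SYSINFO):  %#lx\n", vsyscall_entry());
}
```

If the address printed for `AT_SYSINFO` matches the start of `__kernel_vsyscall` in the disassembly, that would support the idea that the libc write calls are routed through the vDSO rather than issuing int 0x80 directly.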
Does anyone have a suggestion as to what is happening and how to fix it? Is it possible to disable the VDSO / vsyscall? Or is it possible to override the __kernel_vsyscall function with one that uses int 0x80 rather than sysenter?