Get the instruction pointer on a segmentation fault or crash (for an x86 JIT compiler project)?

I am implementing a backend for a JavaScript JIT compiler that generates x86 code. Sometimes, because of bugs in the generated code, I get segmentation faults, and it is hard to track down what caused them. I was wondering whether there is a "simple" way to catch segmentation faults and similar failures and get the address of the instruction that caused them. That way, I could map the address back to the compiled x86 assembly, or even back to the source code.

This should work on Linux, but ideally on any POSIX-compatible system. In the worst case, if I cannot catch the segmentation fault and get the instruction pointer from inside my running JIT, I would like to capture it from the outside (the kernel log?) and have the compiler dump a large file mapping addresses to instructions, which I could then process with a Python script or something similar.

Any ideas / suggestions are welcome. Feel free to share your debugging tips if you have ever worked on your own compiler project.

+4
2 answers

If you use sigaction, you can define a signal handler that takes 3 arguments:

 void (*sa_sigaction)(int signum, siginfo_t *info, void *ucontext) 

The third argument passed to the signal handler is a pointer to a data structure specific to the OS and architecture. On Linux, it points to a ucontext_t, which is defined in the header file <sys/ucontext.h>. Its uc_mcontext field is an mcontext_t (machine context), which on x86 holds all the register values at the time of the signal in gregs. After casting the pointer, you can access

 ((ucontext_t *)ucontext)->uc_mcontext.gregs[REG_EIP]   (32-bit mode)
 ((ucontext_t *)ucontext)->uc_mcontext.gregs[REG_RIP]   (64-bit mode)

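to get the address of the faulting instruction. With glibc, the REG_EIP and REG_RIP constants are only visible if _GNU_SOURCE is defined before the headers are included.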
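As a rough illustration of the approach above, here is a minimal sketch for Linux with glibc on x86. The names on_fault, read_faults, make_unreadable_page, fault_ip, fault_addr and recover_point are made up for this example and are not part of any API. It installs a three-argument handler with SA_SIGINFO, records the instruction pointer from gregs, and jumps back out of the handler with siglongjmp instead of returning into the faulting instruction.

```c
#define _GNU_SOURCE           /* glibc: makes REG_RIP / REG_EIP visible */
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <ucontext.h>
#include <unistd.h>

/* Hypothetical globals filled in by the handler. */
static volatile uintptr_t fault_ip;    /* address of the faulting instruction */
static volatile uintptr_t fault_addr;  /* memory address that was accessed    */
static sigjmp_buf recover_point;

static void on_fault(int signum, siginfo_t *info, void *context)
{
    ucontext_t *uc = (ucontext_t *)context;
    (void)signum;
#if defined(__x86_64__)
    fault_ip = (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__i386__)
    fault_ip = (uintptr_t)uc->uc_mcontext.gregs[REG_EIP];
#else
    (void)uc;
    fault_ip = 0;             /* other architectures are not covered here */
#endif
    /* For SIGSEGV, si_addr is the address of the bad memory access. */
    fault_addr = (uintptr_t)info->si_addr;
    /* Leave the handler without returning into the faulting instruction. */
    siglongjmp(recover_point, 1);
}

/* Map one page with no access rights so that reading it always faults.
   Returns NULL if the mapping could not be created. */
static void *make_unreadable_page(void)
{
    void *p = mmap(NULL, (size_t)sysconf(_SC_PAGESIZE), PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Returns 1 if reading *addr raised SIGSEGV, 0 if the read succeeded. */
static int read_faults(const volatile char *addr)
{
    struct sigaction sa, old;
    int faulted = 0;
    int rc;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_fault;
    sa.sa_flags = SA_SIGINFO;  /* request the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGSEGV, &sa, &old);
    assert(rc == 0);

    /* savemask = 1 so SIGSEGV is unblocked again after siglongjmp. */
    if (sigsetjmp(recover_point, 1) == 0) {
        char c = *addr;        /* may fault */
        (void)c;
    } else {
        faulted = 1;
    }
    sigaction(SIGSEGV, &old, NULL);  /* restore the previous handler */
    return faulted;
}
```

After read_faults() returns 1 for a bad address, fault_ip holds the address to look up in the JIT's address-to-instruction map. A real JIT would probably log that value from the handler and call _exit() rather than try to keep running.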

+3

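Use sigaction with the SA_SIGINFO flag and a signal handler with the prototype void (*handler)(int signum, siginfo_t *info, void *data). When the signal handler is called for SIGSEGV, info->si_addr will contain the memory address whose access caused the fault. For SIGILL and SIGFPE it is the address of the faulting instruction. For SIGSEGV it matches the instruction pointer only when the fault came from fetching an instruction at a bad address, for example after a jump to an invalid target.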

Keep in mind that the behavior of the process is undefined if the handler returns normally from a SIGSEGV that was not generated by raise() or kill(). If you can use

0
