Debug LLVM IR

I built a frontend targeting LLVM that produces some IR. As was fully expected, the IR it emits is incorrect in some cases (that is, it looks correct, but the resulting program crashes when executed). However, I have not found many useful tools for tracking down the problem.

I tried using lli, but its error output is impressively useless, given that you would expect an interpreter to be able to give very precise error information.

I looked into converting the IR to C code and then debugging that with Visual Studio, but it seems this functionality has been removed from LLVM.

I also looked into working with GDB. However, the DWARF debugging information format is quite specific to a handful of existing languages. In addition, the source that my frontend translates is correct; it is the generated IR that is wrong, so debug symbols for the original source would not be very useful. For example, I would need to see a bunch of intermediate register values that do not correspond to any source variable, or set breakpoints in compiler-generated functions.

What tools and methods exist for debugging LLVM IR output?

+8
llvm
2 answers

I am not sure I fully understand your problem. You say that your compiler (from X to LLVM IR) produces the wrong output (invalid LLVM IR) and you don’t know how to debug it? In other words, there are two possibilities:

  • The IR generated by your compiler is incorrect - you can point at specific instructions and say: this is not what I wanted to generate.
  • The IR seems correct, but does not produce the results you expected from it.

I assume it is (1) you are talking about (because that is what the question described before you updated it).

Then this is not really an LLVM-specific problem. Suppose you were writing a compiler from language X to native code, and the native code came out wrong: how would you debug that? Well, you debug your compiler, obviously. You try to find the last place where the compiler's understanding of the input was still correct, or the first place where it became incorrect. How you do this depends on the architecture of your compiler. However, one thing that helps a lot is having a printable representation of the other intermediate layers in your compiler.
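As a rough illustration of that idea, here is a minimal Python sketch of a toy three-stage compiler that records a printable dump after every stage and reports the first stage whose dump fails a sanity check. Everything in it (the stage names, `run_with_dumps`, `first_bad_stage`, the deliberately buggy `codegen`) is hypothetical and only stands in for whatever layers your own compiler has.

```python
def tokenize(text):
    # Stage 1: split the source into tokens.
    return text.split()


def parse(tokens):
    # Stage 2: build a tiny AST of the form (operator, left, right).
    left, op, right = tokens
    return (op, int(left), int(right))


def codegen(tree):
    # Stage 3: emit one IR-like line.
    # Deliberate bug for the demonstration: "+" is lowered to "sub".
    op, left, right = tree
    mnemonic = {"+": "sub", "-": "sub"}[op]
    return f"%r = {mnemonic} i32 {left}, {right}"


STAGES = [("tokens", tokenize), ("ast", parse), ("ir", codegen)]

# One sanity check per stage, applied to that stage's printed dump.
CHECKS = {
    "tokens": lambda dump: "+" in dump,
    "ast": lambda dump: dump.startswith("('+'"),
    "ir": lambda dump: " add " in dump,
}


def run_with_dumps(stages, source):
    """Run the stages in order, keeping a printable dump after each one."""
    dumps = [("source", repr(source))]
    value = source
    for name, transform in stages:
        value = transform(value)
        dumps.append((name, repr(value)))
    return value, dumps


def first_bad_stage(dumps, checks):
    """Return the name of the first stage whose dump fails its check, or None."""
    for name, text in dumps:
        check = checks.get(name)
        if check is not None and not check(text):
            return name
    return None


if __name__ == "__main__":
    _, dumps = run_with_dumps(STAGES, "1 + 2")
    for name, text in dumps:
        print(f"{name}: {text}")
    print("first bad stage:", first_bad_stage(dumps, CHECKS))
```

The tokens and the AST still look right for the input `1 + 2`, so the search stops at the last stage, which tells you the bug is in code generation and not in parsing.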

For example, Clang (which produces LLVM IR from C, C++, and Objective-C) can dump its full AST. By looking at the AST for the miscompiled code, you can narrow the problem down and determine whether it lies in the frontend (C source → AST) or in the code generator (AST → LLVM IR). The LLVM backend (which compiles LLVM IR to native code) also has several intermediate layers (most notably SelectionDAG and MI) that can be inspected for debugging. These are just examples from other existing compilers; YMMV with yours.

+3

Will Dietz described how he implemented this:
https://groups.google.com/d/msg/llvm-dev/O4Dj9FW1gtM/ovnm6dqoJJsJ

Hello to all,

For my own purposes, I wrote a pass that does essentially what you are all describing: it adds debug metadata to LLVM IR.

As a pass, it had to tackle the problem of "this file needs to exist on disk somewhere so gdb can find it", which I solved by dumping it to /tmp/ somewhere. Not a great solution (who deletes them?), but it worked well enough.

Another interesting problem is how to coexist with any existing debug metadata, which can be useful for debugging IR transformations inline with the C source at the same time, for instrumentation passes like SAFECode or ASan/TSan.

Quick example:

(gdb) break main
Breakpoint 1 at 0x4010b1: file /home/wdietz2/magic/test/unit/test_loop.c, line 9.
(gdb) r
Starting program: /home/wdietz2/llvm/32-obj-make/projects/magic/test/Output/test_loop
Breakpoint 1, main (argc=<value optimized out>, argv=<value optimized out>) at /home/wdietz2/magic/test/unit/test_loop.c:9
9 unsigned k = 0;
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.80.el6_3.5.x86_64 libgcc-4.4.6-4.el6.x86_64 libstdc++-4.4.6-4.el6.x86_64
(gdb) n
10 source(argc != 0, &k);
(gdb) n
14 %and.iiii104 = and i64 %4, 70368744177660
(gdb) n
15 %5 = load i8** @global, align 8
(gdb) n
18 store i32 16843009, i32* %6, align 1
(gdb) n
19 store i8 1, i8* getelementptr inbounds ([1 x i8]* @array, i64 0, i64 0), align 1
(gdb) n
20 call coldcc void @runtime_func() nounwind
(gdb) n
11 while(i-- > argc)
(gdb) n
23 %and.iiii85 = and i64 %7, 70368744177660
(gdb) n
14 while(j++ < i) k += j;
(gdb) n
11 while(i-- > argc)
(gdb) n
14 while(j++ < i) k += j;
(gdb) n
102 %77 = load i8** @global, align 8
(gdb) n
105 %79 = load i32* %78, align 4
(gdb) n
106 %cmp7.iii = icmp ne i32 %79, 0
(gdb) n
108 call void @llvm.memset.p0i8.i64(i8* %add.ptr.iiii86, i8 %conv8.iii, i64 4, i32 1, i1 false) nounwind
(gdb) n
14 while(j++ < i) k += j;
(gdb) n
15 while(j-- > 0) k *= k + j;
(gdb) n
95 %69 = load i8** @global, align 8
(gdb) n
98 %71 = load i32* %70, align 4
(gdb)

The pass itself is rather simple. The hard problems it solves are emitting the IR to disk and working out which Instruction* is on which line, which really should not be a problem if this were done properly within LLVM. I can make the code available on request.

In short, this approach worked well for me, and having it in LLVM itself would be great!

Unfortunately, the code seems to be unavailable.
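Since the original pass is not available, the core bookkeeping it describes (write the IR to a file that gdb can find, and know which instruction sits on which line of that file) can be sketched at the text level. This is only a minimal Python illustration under stated assumptions: the helper name `dump_ir_with_line_map` and the sample IR are hypothetical, the real pass worked on LLVM's in-memory instructions and not on text, and the parsing here assumes the usual printed layout (a `define ... {` line, one instruction per line, a closing `}` on its own line, and no `;` inside string constants).

```python
import os
import tempfile

SAMPLE_IR = """\
define i32 @main() {
entry:
  %k = alloca i32, align 4
  store i32 0, i32* %k, align 4
  ret i32 0
}
"""


def dump_ir_with_line_map(ir_text, directory=None):
    """Write the IR to a .ll file and map 1-based line numbers to instructions.

    Returns (path, line_map). Only lines inside a function body that are
    neither blank, nor comments, nor block labels are treated as instructions.
    """
    fd, path = tempfile.mkstemp(suffix=".ll", dir=directory, text=True)
    with os.fdopen(fd, "w") as handle:
        handle.write(ir_text)

    line_map = {}
    in_function = False
    for number, raw in enumerate(ir_text.splitlines(), start=1):
        # Drop trailing comments such as "; preds = %entry" (assumption: no
        # ';' appears inside a string constant on an instruction line).
        code = raw.split(";", 1)[0].strip()
        if code.startswith("define "):
            in_function = True
            continue
        if code == "}":
            in_function = False
            continue
        if not in_function or not code or code.endswith(":"):
            continue
        line_map[number] = code
    return path, line_map


if __name__ == "__main__":
    path, line_map = dump_ir_with_line_map(SAMPLE_IR)
    print("IR written to", path)
    for number in sorted(line_map):
        print(number, line_map[number])
    os.remove(path)  # the caller owns the cleanup, unlike files left in /tmp
```

A real pass would then attach the file path and line number to each instruction as debug metadata, which is what lets gdb step through the IR as in the session quoted above. Returning the path to the caller also makes the "who deletes them?" question explicit.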

+1