Call Python code from LLVM JIT

I am writing a language lexer / parser / compiler in Python, whose output should later run in the LLVM JIT VM (using llvm-py). The first two steps are fairly straightforward, but even though I haven't started on the compiler yet, I see a problem once my code wants to call Python code in general, or interact with the Python lexer / parser / compiler in particular. My main problem is that the code should be able to load additional code into the VM dynamically at runtime, so it must be able to trigger the whole lexer / parser / compiler chain in Python from within the VM.

First of all: is this possible, or is the VM "immutable" once it has started?

At the moment I see 3 possible solutions (I am open to other suggestions):

  • "Break out" of the VM so that Python functions of the main process can be called directly (probably by registering them as LLVM functions that are somehow redirected to the main process). I have not found anything about this, and in any case I am not sure it is a good idea (security, etc.).
  • Compile the runtime (statically, or dynamically at run time) to LLVM assembly / IR. This requires that the IR code can modify the VM it runs in.
  • Compile the runtime (statically) into a library and load it directly into the VM. Again, it must be able to add functions (etc.) to the VM it runs in.
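
The first option, exposing a Python function to native code, can be sketched with ctypes alone. This is only an illustration under assumptions: `load_more_code` is a hypothetical stand-in for the real runtime hook, and the code is written for Python 3. How the resulting address is registered with the JIT depends on the binding and is not shown here.

```python
import ctypes

# Hypothetical runtime hook: a plain Python function the compiled code should reach.
def load_more_code(n):
    return n * 2

# C signature of the hook as native code would see it: int (*)(int)
CALLBACK_TYPE = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int)

# Keep a reference to the wrapper for as long as native code may call it,
# otherwise the generated thunk can be garbage collected.
callback = CALLBACK_TYPE(load_more_code)

# Raw address of the thunk; this is what a JIT would need as an external function.
address = ctypes.cast(callback, ctypes.c_void_p).value

# Simulate a native caller that only knows the address.
native_view = CALLBACK_TYPE(address)
result = native_view(21)
print(result)
```

The call through `native_view` goes through a real C function pointer and back into the Python function, which is the same path JIT-ed code would take if it were given that address.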
2 answers

As Eli said, nothing stops you from calling into the Python C-API. When you call an external function from the LLVM JIT, it effectively just uses dlopen() on the process space, so if you are running from inside llvmpy you already have all the Python interpreter symbols accessible. You can even interact with the active interpreter that invoked the ExecutionEngine, or you can spin up a new Python interpreter if needed.
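
That the interpreter's own C-API is already reachable from inside the process can be checked without compiling anything. The following is a minimal sketch for Python 3 that uses `ctypes.pythonapi` and `PyRun_SimpleString`; the variable name `jit_demo_value` is made up for the illustration.

```python
import ctypes
import __main__

# ctypes.pythonapi exposes the C-API of the interpreter running this script.
run_string = ctypes.pythonapi.PyRun_SimpleString
run_string.argtypes = [ctypes.c_char_p]
run_string.restype = ctypes.c_int

# The source is executed in the __main__ namespace of the active interpreter.
# A return value of 0 means it ran without raising.
status = run_string(b"jit_demo_value = 6 * 7")

# The assignment made through the C-API is visible from ordinary Python code.
value = __main__.__dict__["jit_demo_value"]
print(status, value)
```

JIT-ed code that calls the same C function ends up in the same interpreter state, which is what makes it possible to interact with the interpreter that started the JIT.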

To get started, create a new C file with our evaluator.

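    #include <Python.h>

    /* Compile and run a string of Python source in the __main__ namespace. */
    void python_eval(const char* s)
    {
        PyCodeObject* code = (PyCodeObject*) Py_CompileString(s, "example", Py_file_input);
        if (code == NULL) {
            /* Report syntax errors instead of crashing in PyEval_EvalCode. */
            PyErr_Print();
            return;
        }

        PyObject* main_module = PyImport_AddModule("__main__");  /* borrowed */
        PyObject* global_dict = PyModule_GetDict(main_module);   /* borrowed */
        PyObject* local_dict = PyDict_New();
        PyObject* obj = PyEval_EvalCode(code, global_dict, local_dict);
        if (obj == NULL)
            PyErr_Print();

        PyObject* result = PyObject_Str(obj);
        // Print the result if you want.
        // PyObject_Print(result, stdout, 0);

        /* Release the references we own. */
        Py_XDECREF(result);
        Py_XDECREF(obj);
        Py_XDECREF(local_dict);
        Py_XDECREF(code);
    }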

Here is a little Makefile to compile:

    CC = gcc
    LPYTHON = $(shell python-config --includes)
    CFLAGS = -shared -fPIC -lpthread $(LPYTHON)

    .PHONY: all clean

    all:
    	$(CC) $(CFLAGS) cbits.c -o cbits.so

    clean:
    	-rm cbits.so

Then we start with the usual LLVM boilerplate, but use ctypes to load our cbits.so shared library into the global process space so that the python_eval symbol is available. Then we create a simple LLVM module with one function, allocate a string holding some Python source with ctypes, and pass the pointer to the ExecutionEngine. The engine runs the JIT'd function from our module, which in turn passes the Python source to the C function that calls the Python C-API, and then control returns to the LLVM JIT.

    import llvm.core as lc
    import llvm.ee as le

    import ctypes
    import inspect

    ctypes._dlopen('./cbits.so', ctypes.RTLD_GLOBAL)

    pointer = lc.Type.pointer

    i32 = lc.Type.int(32)
    i64 = lc.Type.int(64)

    char_type = lc.Type.int(8)
    string_type = pointer(char_type)

    zero = lc.Constant.int(i64, 0)

    def build():
        mod = lc.Module.new('call python')
        evalfn = lc.Function.new(mod,
            lc.Type.function(lc.Type.void(), [string_type], False),
            "python_eval")

        funty = lc.Type.function(lc.Type.void(), [string_type])

        fn = lc.Function.new(mod, funty, "call")
        fn_arg0 = fn.args[0]
        fn_arg0.name = "input"

        block = fn.append_basic_block("entry")
        builder = lc.Builder.new(block)

        builder.call(evalfn, [fn_arg0])
        builder.ret_void()

        return fn, mod

    def run(fn, mod, buf):
        tm = le.TargetMachine.new(features='', cm=le.CM_JITDEFAULT)
        eb = le.EngineBuilder.new(mod)
        engine = eb.create(tm)

        ptr = ctypes.cast(buf, ctypes.c_voidp)
        ax = le.GenericValue.pointer(ptr.value)

        print 'IR'.center(80, '=')
        print mod

        mod.verify()
        print 'Assembly'.center(80, '=')
        print mod.to_native_assembly()

        print 'Result'.center(80, '=')
        engine.run_function(fn, [ax])

    if __name__ == '__main__':
        # If you want to evaluate the source of an existing function
        # source_str = inspect.getsource(mypyfn)

        # If you want to pass a source string
        source_str = "print 'Hello from Python C-API inside of LLVM!'"

        buf = ctypes.create_string_buffer(source_str)
        fn, mod = build()
        run(fn, mod, buf)

You should see the following output:

    =======================================IR=======================================
    ; ModuleID = 'call python'

    declare void @python_eval(i8*)

    define void @call(i8* %input) {
    entry:
      call void @python_eval(i8* %input)
      ret void
    }
    =====================================Result=====================================
    Hello from Python C-API inside of LLVM!
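
The string hand-off used in run() can be tried in isolation. The sketch below covers only the ctypes part and assumes Python 3, where create_string_buffer expects bytes rather than str. It does not touch llvmpy.

```python
import ctypes

source_str = "print('Hello from Python C-API inside of LLVM!')"

# Under Python 3 the buffer must be built from bytes; ctypes appends a trailing NUL.
buf = ctypes.create_string_buffer(source_str.encode("utf-8"))

# Integer address of the buffer. This is the kind of value that gets wrapped
# into a pointer argument for the JIT'd function.
address = ctypes.cast(buf, ctypes.c_void_p).value

# Reading the address back as a C string recovers the original source.
round_trip = ctypes.string_at(address).decode("utf-8")
print(round_trip == source_str)
```

The buffer object has to stay alive while native code uses the address, because ctypes frees the memory once `buf` is garbage collected.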

You can call external C functions from LLVM JIT-ed code. What else do you need?

These external functions are resolved at run time, which means that if you link Python into your VM, you can call the Python C API functions.

The "VM" is probably less magical than you think :-) In the end, it is just machine code that is emitted into a buffer at run time and executed from there. To the extent that this code has access to other symbols in the process in which it runs, it can do everything that any other code in that process can do.
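
A quick way to see that Python C API symbols are ordinary symbols in the running process is to look one up through ctypes. This sketch assumes Python 3 and uses `Py_GetVersion` as the example symbol. It does not involve LLVM at all.

```python
import ctypes
import sys

# Look up a C-API function by name among the symbols of the running process.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.argtypes = []
get_version.restype = ctypes.c_char_p

# The symbol has a concrete address in this process, like any other C function.
symbol_address = ctypes.cast(get_version, ctypes.c_void_p).value

# Calling it returns the same version string the interpreter reports itself.
c_api_version = get_version().decode("utf-8", "replace")
print(hex(symbol_address), c_api_version.split()[0])
```

Code emitted by a JIT in the same process can call that address in the same way, provided the symbol resolver can find it.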
