I was asked what an ABI mismatch is. I think it’s best to explain with a simple example.
Consider a slightly dumb function:
int f(int a, int b, int (*g)(int, int)) { return g(a * 2, b * 3) * 4; }
Compile it for x64 / Windows and for x64 / Linux.
For x64 / Windows, the compiler emits something like:
f:
    sub   rsp,28h
    lea   edx,[rdx+rdx*2]
    add   ecx,ecx
    call  r8
    shl   eax,2
    add   rsp,28h
    ret
For x64 / Linux, something like:
f:
    sub    $0x8,%rsp
    lea    (%rsi,%rsi,2),%esi
    add    %edi,%edi
    callq  *%rdx
    add    $0x8,%rsp
    shl    $0x2,%eax
    retq
Even allowing for the different assembly notations traditionally used on Windows (Intel syntax) and Linux (AT&T syntax), it is obvious that the code itself differs significantly.
The Windows version expects a in ECX (the lower half of the RCX register), b in EDX (the lower half of the RDX register) and g in R8. This is dictated by the x64 / Windows calling convention, which is part of the application binary interface (ABI). The code also prepares the arguments for g in ECX and EDX.
The Linux version expects a in EDI (the lower half of the RDI register), b in ESI (the lower half of the RSI register) and g in RDX. This is prescribed by the calling convention of the System V AMD64 ABI (used by Linux and other Unix-like x64 operating systems). The code prepares the arguments for g in EDI and ESI.
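To make the two register assignments easier to compare, here is a small lookup sketch. It only covers functions whose arguments are all integers or pointers, as in f, and the names (enum abi, int_arg_register) are made up for this illustration.

```c
#include <stddef.h>

/* Registers that carry the first integer/pointer arguments, in order. */
static const char *const WIN64_INT_ARGS[] = { "RCX", "RDX", "R8", "R9" };
static const char *const SYSV_INT_ARGS[]  = { "RDI", "RSI", "RDX", "RCX", "R8", "R9" };

enum abi { ABI_WIN64, ABI_SYSV };   /* hypothetical names for this sketch */

/* Returns the register holding integer argument `index` (0-based),
   or NULL when that argument is passed on the stack instead. */
const char *int_arg_register(enum abi abi, size_t index)
{
    if (abi == ABI_WIN64)
        return index < 4 ? WIN64_INT_ARGS[index] : NULL;
    return index < 6 ? SYSV_INT_ARGS[index] : NULL;
}
```

For the third argument (index 2) this gives R8 on Windows and RDX on Linux, which is exactly where the two versions of f look for g.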
Now imagine that we are running a Windows program that somehow extracts the body of f from a module built for Linux and calls it:
int g(int a, int b);
typedef int (*G)(int, int);
typedef int (*F)(int, int, G);

F f = (F) load_linux_module_and_get_symbol("module.so", "f");
int result = f(3, 4, &g);
What will happen? Since Windows functions expect their arguments in ECX, EDX and R8, the compiler places the actual arguments in these registers:
    mov   edx,4
    lea   r8,[g]
    lea   ecx,[rdx-1]
    call  qword ptr [f1]
But the Linux-oriented version of f looks for its values elsewhere. In particular, it looks for the address of g in RDX. We just loaded 4 into EDX, and writing a 32-bit register also clears the upper half of the 64-bit one, so RDX holds 4 rather than anything that makes sense as a code address. Most likely, the program crashes.
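The mismatch can be modelled in plain C without touching real registers. This is only a toy sketch: struct regs, the leftover values 0xAAAA and 0xBBBB, and the address 0x401000 standing in for g are all invented for the illustration.

```c
#include <stdint.h>

/* Toy model of the registers involved; not real machine state. */
struct regs {
    uint64_t rcx, rdx, r8;   /* Windows x64: 1st, 2nd, 3rd integer argument */
    uint64_t rdi, rsi;       /* System V: 1st and 2nd (the 3rd is RDX) */
};

/* The values f ends up treating as a, b and g. */
struct seen_by_f { uint64_t a, b, g; };

/* What the Windows-compiled caller does before the call instruction. */
void windows_caller_setup(struct regs *r, uint64_t a, uint64_t b, uint64_t g_addr)
{
    r->rcx = a;
    r->rdx = b;
    r->r8  = g_addr;
}

/* Where the Linux-compiled f reads its three arguments from. */
struct seen_by_f linux_callee_view(const struct regs *r)
{
    struct seen_by_f v = { r->rdi, r->rsi, r->rdx };
    return v;
}

/* Runs the scenario from the text: f(3, 4, &g) across the ABI boundary. */
struct seen_by_f simulate_mismatch(void)
{
    /* RDI and RSI hold whatever earlier code left there (made-up values). */
    struct regs r = { .rdi = 0xAAAA, .rsi = 0xBBBB };
    windows_caller_setup(&r, 3, 4, 0x401000 /* pretend address of g */);
    return linux_callee_view(&r);
}
```

In this model f sees the leftovers as a and b, and 4 as the address of g. The real address of g sits in R8, where the Linux version never looks.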
Running Windows-oriented code on a Linux system will have the same effect.
Thus, we cannot run foreign code without using a thunk. A thunk is a piece of low-level code that rearranges arguments to allow calls between pieces of code that follow different sets of rules. (Thunks may have to do more than that, because the effects of the ABI are not limited to the calling convention.) Normally, you cannot write a thunk in a high-level programming language.
Note that in our scenario we need to provide thunks for both f ('host-to-foreign') and g ('foreign-to-host').
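As a hedged illustration of the two thunks: GCC and Clang offer the ms_abi and sysv_abi function attributes on x86-64 as a compiler extension, which lets the compiler generate the register shuffling for you. The sketch below relies on that extension and is not standard C. All names (WIN64_ABI, SYSV_ABI, foreign_f, host_g, g_thunk, f_thunk, call_foreign_f) are made up, foreign_f is a local stand-in for the function that would really come from module.so, and the body of host_g is an arbitrary choice. It handles only the calling convention, not other ABI differences such as type sizes.

```c
/* ms_abi / sysv_abi are GCC and Clang extensions for x86-64 targets.
   Elsewhere the macros expand to nothing and the thunks are plain wrappers. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#  define WIN64_ABI __attribute__((ms_abi))
#  define SYSV_ABI  __attribute__((sysv_abi))
#else
#  define WIN64_ABI
#  define SYSV_ABI
#endif

typedef int (WIN64_ABI *HostG)(int, int);              /* host (Windows) convention */
typedef int (SYSV_ABI  *ForeignG)(int, int);           /* foreign (Linux) convention */
typedef int (SYSV_ABI  *ForeignF)(int, int, ForeignG);

/* Stand-in for the foreign function f from the text. */
static SYSV_ABI int foreign_f(int a, int b, ForeignG g) { return g(a * 2, b * 3) * 4; }

/* Host callback; the body is an arbitrary choice for this sketch. */
static WIN64_ABI int host_g(int a, int b) { return a + b; }

/* Foreign-to-host thunk for g: entered with System V registers,
   forwards to the host function stored in a single global slot
   (so this sketch is neither reentrant nor thread-safe). */
static HostG thunk_target_g;
static SYSV_ABI int g_thunk(int a, int b) { return thunk_target_g(a, b); }

/* Host-to-foreign thunk for f: entered with Windows registers,
   calls the foreign f and hands it g_thunk instead of g itself. */
static WIN64_ABI int f_thunk(ForeignF f, int a, int b, HostG g)
{
    thunk_target_g = g;
    return f(a, b, g_thunk);
}

/* Convenience entry point for trying the sketch out. */
int call_foreign_f(int a, int b) { return f_thunk(foreign_f, a, b, host_g); }
```

With these definitions call_foreign_f(3, 4) goes through both thunks: f calls g(6, 12), which returns 18, and f multiplies that by 4. A real cross-ABI loader would need hand-written or generated thunks per signature; the global slot here is just the shortest way to show both directions.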