Asm Call instruction - how does it work?

I would like a clear explanation of how CALL XXXXXXXXXXXXXXX instructions work in a Windows environment (PE executables). I have been studying the PE format, but I am confused about the relationship between the CALL ADDRESS instruction, importing a function from a DLL, and how CALL ADDRESS reaches the code in the DLL. In addition, ASLR and other security features can move DLLs around in memory; how do executables cope with this?

3 answers

It (i.e., directly calling an import with a normal relative call) doesn't work, which is why that isn't how it's done.

To call an imported function, you go through the Import Address Table (IAT). In short, entries in the IAT initially point to function names (i.e., the IAT starts out as a copy of the Import Name Table), and the loader changes those pointers to point to the actual functions.

The IAT is at a fixed address, but it can be relocated if the image has been rebased, so calling through it involves only a single indirection. That is why call r/m is used with a memory operand (which is just a simple constant) to call imported functions, for example call [0x40206C].


Jan 22, 2013: added simpler, more concrete examples and discussion, since (A) an incorrect answer was selected as the solution, and (B) my original answer was apparently not understood by some readers, including the OP. Sorry, mea culpa. I posted that answer in a hurry, adding a code example that I already had at hand.


How I interpret the question.

You are asking,

"I studied the PE format, but I was rather confused about the relationship between the CALL ADDRESS statement, importing a function from a DLL, and how CALL ADDRESS accesses the code in a DLL."

The term CALL ADDRESS does not make much sense at the C++ level, so I assume that you mean CALL ADDRESS at the assembly language or machine code level.

The question is then: when a DLL is loaded at an address other than its preferred one, how do call instructions get connected to the DLL's functions?


In short.

  • At the machine code level, a call with a specified address works by calling a minimal forwarding routine that consists of a single jmp instruction. The jmp instruction transfers control to the DLL function via a table lookup. Typically, the import library for a DLL provides both a pointer to the DLL function itself, with an __imp__ name prefix, and a wrapper without that prefix, e.g. __imp__MessageBoxA@16 and _MessageBoxA@16.

I.e., except that I've invented the names below, the assembler typically translates

call MessageBox

into

call MessageBox_forwarder
; whatever here
MessageBox_forwarder: jmp ds:[MessageBox_tableEntry]

When the DLL is loaded, the loader places the relevant addresses in the table(s).

  • At the assembly language level, a call with the routine specified as just an identifier can map either to a call of a forwarder or to a call directly of the DLL function via a table lookup, depending on the type declared for the identifier.

  • There can be more than one table of DLL function addresses, even for imports from a single DLL. But in general they are regarded as one big table, which is then called the “import address table”, or IAT for short. The IAT (or, more precisely, the tables) is at a fixed location within the image, i.e. it moves with the code when the image is loaded at an address other than the preferred one; it is not at a fixed address.

The currently selected solution answer is incorrect in that:

  • The answer states that “it doesn't work, which is why that isn't how it's done”, where “it” apparently refers to using CALL ADDRESS. But using CALL ADDRESS in assembly, or at the machine code level, works just fine for calling a DLL function, if it's done correctly.

  • The answer claims that the IAT is at a fixed address. But it isn't.


CALL ADDRESS works fine.

Let's look at a concrete CALL ADDRESS instruction where the address is that of a very well known DLL function, namely a call of the MessageBoxA Windows API function from the [user32.dll] DLL:

 call MessageBoxA 

There is no problem using this instruction.

As you will see below, at the machine code level this call instruction itself contains just an offset, which sends the call to a jmp instruction. That jmp looks up the DLL routine's address in the import address table of function pointers, which is normally fixed up by the loader when it loads the DLL in question.

To be able to inspect the machine code, here is a complete 32-bit x86 assembly language program using that concrete example instruction:

        .model flat, stdcall
        option casemap :none        ; Case sensitive identifiers, please.

        _as32bit textequ <DWord ptr>

        public start

        ExitProcess proto stdcall :DWord

        MessageBoxA_t typedef proto stdcall :DWord, :DWord, :DWord, :DWord
        extern MessageBoxA : MessageBoxA_t
        extern _imp__MessageBoxA@16 : ptr MessageBoxA_t

        MB_ICONINFORMATION  equ 0040h
        MB_SETFOREGROUND    equ 00010000h
        infoBoxOptions      equ MB_ICONINFORMATION or MB_SETFOREGROUND

        .const
    boxtitle_1  db "Just FYI 1 (of 3):", 0
    boxtitle_2  db "Just FYI 2 (of 3):", 0
    boxtitle_3  db "Just FYI 3 (of 3):", 0
    boxtext     db "There's intelligence somewhere in the universe", 0

        .code
    start:
        push infoBoxOptions
        push offset boxtitle_1
        push offset boxtext
        push 0
        call MessageBoxA                    ; Call #1 - to jmp to DLL-func.

        push infoBoxOptions
        push offset boxtitle_2
        push offset boxtext
        push 0
        call ds:[ _imp__MessageBoxA@16 ]    ; Call #2 - directly to DLL-func.

        push infoBoxOptions
        push offset boxtitle_3
        push offset boxtext
        push 0
        call _imp__MessageBoxA@16           ; Call #3 - same as #2, due to type of identifier.

        push 0                              ; Exit code, 0 indicates success.
        call ExitProcess

        end

Assembling and linking using Microsoft's toolchain, where the /debug linker option asks the linker to produce a PDB debug info file for use with the Visual Studio debugger:

 [d:\dev\test\call]
 > ml /nologo /c asm_call.asm
  Assembling: asm_call.asm

 [d:\dev\test\call]
 > link /nologo asm_call.obj kernel32.lib user32.lib /entry:start /subsystem:windows /debug

 [d:\dev\test\call]
 > dir asm* /b
 asm_call.asm
 asm_call.exe
 asm_call.ilk
 asm_call.obj
 asm_call.pdb

 [d:\dev\test\call]
 > _

One easy way to debug this is to start Visual Studio (the [devenv.exe] program) and, in Visual Studio, choose [Debug → Step Into], or just press F11:

 [d:\dev\test\call]
 > devenv asm_call.exe

 [d:\dev\test\call]
 > _

[Image: the Visual Studio 2012 debugger stepping through asm_call.exe]

In the figure above, which shows the Visual Studio 2012 debugger in action, the leftmost big red arrow points at the address information inside the machine code instruction, namely 0000004E hex (note: the least significant byte is at the lowest address, first in memory). The other big red arrow shows that, incredibly, this rather small magic number somehow designates the _MessageBoxA@16 function, which as far as the debugger knows resides at address 01161064h.

  • The address data in the CALL ADDRESS instruction is an offset relative to the address of the next instruction, so it needs no fixup for the DLL's placement.

  • The address that the call goes to contains just a jmp ds:[IAT_entry_for_MessageBoxA].

  • This forwarder code comes from the import library, not from the DLL, so it doesn't need a fixup for the DLL's placement either (but it apparently gets some special treatment, as does the DLL function address).

The second call instruction does directly what the jmp does for the first one, namely looking up the DLL function address in the IAT.

The third call instruction can now be seen to be identical to the second one at the machine code level. It is apparently not well known how to emulate Visual C++'s declspec( dllimport ) in assembly. The above kind of declaration is one way, perhaps combined with a textequ.


The IAT is not at a fixed address.

The following C++ program reports the address where it has been loaded, which DLL functions it imports from which modules, and where the various IAT tables are located.

When it's built with a modern version of Microsoft's toolchain, just using the defaults, it is typically loaded at a different address each time it is run.

You can prevent this behavior by using the /dynamicbase:no linker option.

    #include <assert.h>         // assert
    #include <stddef.h>         // ptrdiff_t
    #include <sstream>
    using std::ostringstream;

    #undef UNICODE
    #define UNICODE
    #include <windows.h>

    template< class Result, class SomeType >
    Result as( SomeType const p ) { return reinterpret_cast<Result>( p ); }

    template< class Type >
    class OffsetTo
    {
    private:
        ptrdiff_t offset_;
    public:
        ptrdiff_t asInteger() const { return offset_; }
        explicit OffsetTo( ptrdiff_t const offset ): offset_( offset ) {}
    };

    template< class ResultPointee, class SourcePointee >
    ResultPointee* operator+(
        SourcePointee* const            p,
        OffsetTo<ResultPointee> const   offset
        )
    {
        return as<ResultPointee*>( as<char const*>( p ) + offset.asInteger() );
    }

    int main()
    {
        auto const pImage =
            as<IMAGE_DOS_HEADER const*>( ::GetModuleHandle( nullptr ) );
        assert( pImage->e_magic == IMAGE_DOS_SIGNATURE );

        auto const pNTHeaders =
            pImage + OffsetTo<IMAGE_NT_HEADERS const>( pImage->e_lfanew );
        assert( pNTHeaders->Signature == IMAGE_NT_SIGNATURE );

        auto const& importDir =
            pNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
        auto const pImportDescriptors = pImage + OffsetTo<IMAGE_IMPORT_DESCRIPTOR const>(
            importDir.VirtualAddress //+ importSectionHeader.PointerToRawData
            );

        ostringstream stream;
        stream << "I'm loaded at " << pImage << ", and I'm using...\n";
        for( int i = 0; pImportDescriptors[i].Name != 0; ++i )
        {
            auto const pModuleName =
                pImage + OffsetTo<char const>( pImportDescriptors[i].Name );
            DWORD const offsetNameTable = pImportDescriptors[i].OriginalFirstThunk;
            DWORD const offsetAddressTable = pImportDescriptors[i].FirstThunk;  // The module "IAT"
            auto const pNameTable =
                pImage + OffsetTo<IMAGE_THUNK_DATA const>( offsetNameTable );
            auto const pAddressTable =
                pImage + OffsetTo<IMAGE_THUNK_DATA const>( offsetAddressTable );

            stream << "\n* '" << pModuleName << "'";
            stream << " with IAT at " << pAddressTable << "\n";
            stream << "\t";
            for( int j = 0; pNameTable[j].u1.AddressOfData != 0; ++j )
            {
                auto const pFuncName =
                    pImage + OffsetTo<char const>( 2 + pNameTable[j].u1.AddressOfData );
                stream << pFuncName << " ";
            }
            stream << "\n";
        }

        MessageBoxA(
            0,
            stream.str().c_str(),
            "FYI:",
            MB_ICONINFORMATION | MB_SETFOREGROUND
            );
    }



A self-reproducing Windows native code program.

Finally, from my original answer, here is a Microsoft assembler (MASM) program that I made for another purpose, and which illustrates some of the issues. By its nature (it produces as output source code that, when assembled and run, produces that same source code, and so on) it has to be completely relocatable code, with only the barest minimum of help from the ordinary program loader:

    .model flat, stdcall
    option casemap :none                    ; Case sensitive identifiers, please.

    dword_aligned textequ <4>               ; Just for readability.

        ; Windows API functions:
        extern  ExitProcess@4: proc         ; from [kernel32.dll]
        extern  GetStdHandle@4: proc        ; from [kernel32.dll]
        extern  WriteFile@20: proc          ; from [kernel32.dll]
        extern  wsprintfA: proc             ; from [user32.dll]

        STD_OUTPUT_HANDLE   equ -11

        ; The main code.

        GlobalsStruct struct dword_aligned
            codeStart           dword ?
            outputStreamHandle  dword ?
        GlobalsStruct ends
        globals textequ <(GlobalsStruct ptr [edi])>

        .code
    startup:
        jmp     code_start

    ; Trampolines to add references to these functions.
    myExitProcess:  jmp ExitProcess@4
    myGetStdHandle: jmp GetStdHandle@4
    myWriteFile:    jmp WriteFile@20
    mywsprintfA:    jmp wsprintfA

    ;------------------------------------------------------------------
    ;
    ; The code below is reproduced, so it's all relative.

    code_start:
        jmp     main

    prologue:
        byte    ".model flat, stdcall", 13, 10
        byte    "option casemap :none", 13, 10
        byte    13, 10
        byte    "    extern  ExitProcess@4: proc", 13, 10
        byte    "    extern  GetStdHandle@4: proc", 13, 10
        byte    "    extern  WriteFile@20: proc", 13, 10
        byte    "    extern  wsprintfA: proc", 13, 10
        byte    13, 10
        byte    "    .code", 13, 10
        byte    "startup:", 13, 10
        byte    "    jmp     code_start", 13, 10
        byte    13, 10
        byte    "jmp ExitProcess@4", 13, 10
        byte    "jmp GetStdHandle@4", 13, 10
        byte    "jmp WriteFile@20", 13, 10
        byte    "jmp wsprintfA", 13, 10
        byte    13, 10
        byte    "code_start:", 13, 10
    prologue_nBytes equ $ - prologue

    epilogue:
        byte    "code_end:", 13, 10
        byte    "    end startup", 13, 10
    epilogue_nBytes equ $ - epilogue

    dbDirective         byte 4 dup( ' ' ), "byte       "
    dbDirective_nBytes  equ $ - dbDirective

    numberFormat        byte " 0%02Xh", 0
    numberFormat_nBytes equ $ - numberFormat

    comma               byte ","
    windowsNewline      byte 13, 10

    write:
        push    0                           ; space for nBytesWritten
        mov     ecx, esp                    ; &nBytesWritten
        push    0                           ; lpOverlapped
        push    ecx                         ; &nBytesWritten
        push    ebx                         ; nBytes
        push    eax                         ; &s[0]
        push    globals.outputStreamHandle
        call    myWriteFile
        pop     eax                         ; nBytesWritten
        ret

    displayMachineCode:
        dmc_LocalsStruct struct dword_aligned
            numberStringLen     dword ?
            numberString        byte 16*4 DUP( ? )
            fileHandle          dword ?
            nBytesWritten       dword ?
            byteIndex           dword ?
        dmc_LocalsStruct ends
        dmc_locals textequ <[ebp - sizeof dmc_LocalsStruct].dmc_LocalsStruct>

        mov     ebp, esp
        sub     esp, sizeof dmc_LocalsStruct

        ; Output prologue that makes MASM happy (placing machine code data in context):
        ; lea     eax, prologue
        mov     eax, globals.codeStart
        add     eax, prologue - code_start
        mov     ebx, prologue_nBytes
        call    write

        ; Output the machine code bytes.
        mov     dmc_locals.byteIndex, 0
    dmc_lineLoop:
        ; loop start
            ; Output a db directive
            ;lea     eax, dbDirective
            mov     eax, globals.codeStart
            add     eax, dbDirective - code_start
            mov     ebx, dbDirective_nBytes
            call    write

    dmc_byteIndexingLoop:
            ; loop start
                ; Create string representation of a number
                mov     ecx, dmc_locals.byteIndex
                mov     eax, 0
                ;mov     al, byte ptr [code_start + ecx]
                mov     ebx, globals.codeStart
                mov     al, [ebx + ecx]
                push    eax
                ;push    offset numberFormat
                mov     eax, globals.codeStart
                add     eax, numberFormat - code_start
                push    eax
                lea     eax, dmc_locals.numberString
                push    eax
                call    mywsprintfA
                add     esp, 3*(sizeof dword)
                mov     dmc_locals.numberStringLen, eax

                ; Output string representation of number
                lea     eax, dmc_locals.numberString
                mov     ebx, dmc_locals.numberStringLen
                call    write

                ; Are we finished looping yet?
                inc     dmc_locals.byteIndex
                mov     ecx, dmc_locals.byteIndex
                cmp     ecx, code_end - code_start
                je      dmc_finalNewline
                and     ecx, 07h
                jz      dmc_after_byteIndexingLoop

                ; Output a comma
                ; lea     eax, comma
                mov     eax, globals.codeStart
                add     eax, comma - code_start
                mov     ebx, 1
                call    write
            jmp     dmc_byteIndexingLoop
            ; loop end

    dmc_after_byteIndexingLoop:
            ; New line
            ; lea     eax, windowsNewline
            mov     eax, globals.codeStart
            add     eax, windowsNewline - code_start
            mov     ebx, 2
            call    write
        jmp     dmc_lineLoop;
        ; loop end

    dmc_finalNewline:
        ; New line
        ; lea     eax, windowsNewline
        mov     eax, globals.codeStart
        add     eax, windowsNewline - code_start
        mov     ebx, 2
        call    write

        ; Output epilogue that makes MASM happy:
        ; lea     eax, epilogue
        mov     eax, globals.codeStart
        add     eax, epilogue - code_start
        mov     ebx, epilogue_nBytes
        call    write

        mov     esp, ebp
        ret

    main:
        sub     esp, sizeof GlobalsStruct
        mov     edi, esp

        call    main_knownAddress
    main_knownAddress:
        pop     eax
        sub     eax, main_knownAddress - code_start
        mov     globals.codeStart, eax

        push    STD_OUTPUT_HANDLE
        call    myGetStdHandle
        mov     globals.outputStreamHandle, eax

        call    displayMachineCode

        ; Well behaved process exit:
        push    0                           ; Process exit code, 0 indicates success.
        call    myExitProcess

    code_end:
        end startup

And here is the self-reproducing output:

    .model flat, stdcall
    option casemap :none

        extern  ExitProcess@4: proc
        extern  GetStdHandle@4: proc
        extern  WriteFile@20: proc
        extern  wsprintfA: proc

        .code
    startup:
        jmp     code_start

    jmp ExitProcess@4
    jmp GetStdHandle@4
    jmp WriteFile@20
    jmp wsprintfA

    code_start:
        byte        0E9h, 03Bh, 002h, 000h, 000h, 02Eh, 06Dh, 06Fh
        byte        064h, 065h, 06Ch, 020h, 066h, 06Ch, 061h, 074h
        byte        02Ch, 020h, 073h, 074h, 064h, 063h, 061h, 06Ch
        byte        06Ch, 00Dh, 00Ah, 06Fh, 070h, 074h, 069h, 06Fh
        byte        06Eh, 020h, 063h, 061h, 073h, 065h, 06Dh, 061h
        byte        070h, 020h, 03Ah, 06Eh, 06Fh, 06Eh, 065h, 00Dh
        byte        00Ah, 00Dh, 00Ah, 020h, 020h, 020h, 020h, 065h
        byte        078h, 074h, 065h, 072h, 06Eh, 020h, 020h, 045h
        byte        078h, 069h, 074h, 050h, 072h, 06Fh, 063h, 065h
        byte        073h, 073h, 040h, 034h, 03Ah, 020h, 070h, 072h
        byte        06Fh, 063h, 00Dh, 00Ah, 020h, 020h, 020h, 020h
        byte        065h, 078h, 074h, 065h, 072h, 06Eh, 020h, 020h
        byte        047h, 065h, 074h, 053h, 074h, 064h, 048h, 061h
        byte        06Eh, 064h, 06Ch, 065h, 040h, 034h, 03Ah, 020h
        byte        070h, 072h, 06Fh, 063h, 00Dh, 00Ah, 020h, 020h
        byte        020h, 020h, 065h, 078h, 074h, 065h, 072h, 06Eh
        byte        020h, 020h, 057h, 072h, 069h, 074h, 065h, 046h
        byte        069h, 06Ch, 065h, 040h, 032h, 030h, 03Ah, 020h
        byte        070h, 072h, 06Fh, 063h, 00Dh, 00Ah, 020h, 020h
        byte        020h, 020h, 065h, 078h, 074h, 065h, 072h, 06Eh
        byte        020h, 020h, 077h, 073h, 070h, 072h, 069h, 06Eh
        byte        074h, 066h, 041h, 03Ah, 020h, 070h, 072h, 06Fh
        byte        063h, 00Dh, 00Ah, 00Dh, 00Ah, 020h, 020h, 020h
        byte        020h, 02Eh, 063h, 06Fh, 064h, 065h, 00Dh, 00Ah
        byte        073h, 074h, 061h, 072h, 074h, 075h, 070h, 03Ah
        byte        00Dh, 00Ah, 020h, 020h, 020h, 020h, 06Ah, 06Dh
        byte        070h, 020h, 020h, 020h, 020h, 020h, 063h, 06Fh
        byte        064h, 065h, 05Fh, 073h, 074h, 061h, 072h, 074h
        byte        00Dh, 00Ah, 00Dh, 00Ah, 06Ah, 06Dh, 070h, 020h
        byte        045h, 078h, 069h, 074h, 050h, 072h, 06Fh, 063h
        byte        065h, 073h, 073h, 040h, 034h, 00Dh, 00Ah, 06Ah
        byte        06Dh, 070h, 020h, 047h, 065h, 074h, 053h, 074h
        byte        064h, 048h, 061h, 06Eh, 064h, 06Ch, 065h, 040h
        byte        034h, 00Dh, 00Ah, 06Ah, 06Dh, 070h, 020h, 057h
        byte        072h, 069h, 074h, 065h, 046h, 069h, 06Ch, 065h
        byte        040h, 032h, 030h, 00Dh, 00Ah, 06Ah, 06Dh, 070h
        byte        020h, 077h, 073h, 070h, 072h, 069h, 06Eh, 074h
        byte        066h, 041h, 00Dh, 00Ah, 00Dh, 00Ah, 063h, 06Fh
        byte        064h, 065h, 05Fh, 073h, 074h, 061h, 072h, 074h
        byte        03Ah, 00Dh, 00Ah, 063h, 06Fh, 064h, 065h, 05Fh
        byte        065h, 06Eh, 064h, 03Ah, 00Dh, 00Ah, 020h, 020h
        byte        020h, 020h, 065h, 06Eh, 064h, 020h, 073h, 074h
        byte        061h, 072h, 074h, 075h, 070h, 00Dh, 00Ah, 020h
        byte        020h, 020h, 020h, 062h, 079h, 074h, 065h, 020h
        byte        020h, 020h, 020h, 020h, 020h, 020h, 020h, 030h
        byte        025h, 030h, 032h, 058h, 068h, 000h, 02Ch, 00Dh
        byte        00Ah, 06Ah, 000h, 08Bh, 0CCh, 06Ah, 000h, 051h
        byte        053h, 050h, 0FFh, 077h, 004h, 0E8h, 074h, 0FEh
        byte        0FFh, 0FFh, 058h, 0C3h, 08Bh, 0ECh, 083h, 0ECh
        byte        050h, 08Bh, 007h, 005h, 005h, 000h, 000h, 000h
        byte        0BBh, 036h, 001h, 000h, 000h, 0E8h, 0D7h, 0FFh
        byte        0FFh, 0FFh, 0C7h, 045h, 0FCh, 000h, 000h, 000h
        byte        000h, 08Bh, 007h, 005h, 057h, 001h, 000h, 000h
        byte        0BBh, 00Fh, 000h, 000h, 000h, 0E8h, 0BFh, 0FFh
        byte        0FFh, 0FFh, 08Bh, 04Dh, 0FCh, 0B8h, 000h, 000h
        byte        000h, 000h, 08Bh, 01Fh, 08Ah, 004h, 019h, 050h
        byte        08Bh, 007h, 005h, 066h, 001h, 000h, 000h, 050h
        byte        08Dh, 045h, 0B4h, 050h, 0E8h, 02Ah, 0FEh, 0FFh
        byte        0FFh, 083h, 0C4h, 00Ch, 089h, 045h, 0B0h, 08Dh
        byte        045h, 0B4h, 08Bh, 05Dh, 0B0h, 0E8h, 08Fh, 0FFh
        byte        0FFh, 0FFh, 0FFh, 045h, 0FCh, 08Bh, 04Dh, 0FCh
        byte        081h, 0F9h, 068h, 002h, 000h, 000h, 074h, 02Bh
        byte        083h, 0E1h, 007h, 074h, 013h, 08Bh, 007h, 005h
        byte        06Eh, 001h, 000h, 000h, 0BBh, 001h, 000h, 000h
        byte        000h, 0E8h, 06Bh, 0FFh, 0FFh, 0FFh, 0EBh, 0AAh
        byte        08Bh, 007h, 005h, 06Fh, 001h, 000h, 000h, 0BBh
        byte        002h, 000h, 000h, 000h, 0E8h, 058h, 0FFh, 0FFh
        byte        0FFh, 0EBh, 086h, 08Bh, 007h, 005h, 06Fh, 001h
        byte        000h, 000h, 0BBh, 002h, 000h, 000h, 000h, 0E8h
        byte        045h, 0FFh, 0FFh, 0FFh, 08Bh, 007h, 005h, 03Bh
        byte        001h, 000h, 000h, 0BBh, 01Ch, 000h, 000h, 000h
        byte        0E8h, 034h, 0FFh, 0FFh, 0FFh, 08Bh, 0E5h, 0C3h
        byte        083h, 0ECh, 008h, 08Bh, 0FCh, 0E8h, 000h, 000h
        byte        000h, 000h, 058h, 02Dh, 04Ah, 002h, 000h, 000h
        byte        089h, 007h, 06Ah, 0F5h, 0E8h, 098h, 0FDh, 0FFh
        byte        0FFh, 089h, 047h, 004h, 0E8h, 023h, 0FFh, 0FFh
        byte        0FFh, 06Ah, 000h, 0E8h, 084h, 0FDh, 0FFh, 0FFh
    code_end:
        end startup

The linker. When your executable is linked, the linker relocates and bases all the DLLs. Because of virtual memory, all processes are loaded at the same base address, which simplifies addressing. Because the DLL is PIC (position-independent code), the loader can rebase the DLL for the application. Since the code refers to symbols that the loader can move, it never has to worry about its own location.

EDIT: I just realized that this is not so: Linux dynamic libraries are PIC, and Windows DLLs are not (which is why they have to be rebased at all).
