If you call CreateProcess, it internally calls ZwCreateThread[Ex] to create the first thread in the process.
When a thread is created, either you (if you call ZwCreateThread directly) or the system initializes the CONTEXT record for the new thread. In it, Eip (i386) or Rip (amd64) is the entry point of the thread. If you fill it in yourself, you can specify any address. But when you call, say, Create[Remote]Thread[Ex], the system fills in CONTEXT and sets its own routine as the thread entry point. Your entry point is saved in the Eax (i386) or Rcx (amd64) register.
The name of this procedure depends on the version of Windows.
It used to be BaseThreadStartThunk, or BaseProcessStartThunk (in the case of CreateProcess), from kernel32.dll.
Now the system points to RtlUserThreadStart from ntdll.dll. RtlUserThreadStart usually calls BaseThreadInitThunk from kernel32.dll (except in native (boot-time) applications such as smss.exe and chkdsk.exe, which do not have kernel32.dll in their address space at all). BaseThreadInitThunk in turn calls your original thread entry point, and after (if) it returns, RtlExitUserThread is called.
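To make the wrapper idea concrete, here is a small portable C toy model. It is not Windows code: every `model_*` name is made up for illustration, and the struct only mimics the role of the CONTEXT fields described above (an instruction pointer that targets the system wrapper, and a register that carries the caller's entry point).

```c
#include <stddef.h>

/* Toy model only: none of these types or functions are real Windows APIs. */
typedef int (*model_thread_fn)(void *param);

typedef struct {
    void (*ip)(void *ctx);   /* plays the role of Eip/Rip: where execution starts */
    model_thread_fn start;   /* plays the role of Eax/Rcx: the caller's entry point */
    void *param;             /* thread parameter */
    int exit_code;
    int exited;
} model_context;

/* Stands in for RtlExitUserThread: just records the exit code. */
void model_exit_thread(model_context *ctx, int code) {
    ctx->exit_code = code;
    ctx->exited = 1;
}

/* Stands in for RtlUserThreadStart / BaseThreadInitThunk:
   call the caller's entry point, then exit the thread with its result. */
void model_thread_start(void *p) {
    model_context *ctx = (model_context *)p;
    int code = ctx->start(ctx->param);
    model_exit_thread(ctx, code);
}

/* What a Create[Remote]Thread[Ex]-style call does in this model:
   ip is the wrapper, the caller's entry point goes into a "register". */
model_context model_create_thread(model_thread_fn start, void *param) {
    model_context ctx;
    ctx.ip = model_thread_start;
    ctx.start = start;
    ctx.param = param;
    ctx.exit_code = 0;
    ctx.exited = 0;
    return ctx;
}

/* Sample caller-supplied entry point. */
int model_user_entry(void *param) {
    return *(int *)param + 1;
}

/* "Run" the thread by jumping to ip, the way the restored context would. */
int model_run(int value) {
    model_context ctx = model_create_thread(model_user_entry, &value);
    ctx.ip(&ctx);
    return ctx.exited ? ctx.exit_code : -1;
}
```

Calling `model_run(41)` goes through the wrapper, which calls `model_user_entry` and then records the returned value as the exit code. The caller's function is never the address stored in `ip`.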
The main purpose of this common thread start wrapper is to install the top-level SEH filter. This is the only reason we can use the SetUnhandledExceptionFilter function. If a thread starts directly at your entry point, without the wrapper, the top-level exception filter functionality becomes unavailable.
But whatever the entry point of the thread is, a thread in user space NEVER begins execution at that point!
In earlier versions, when a user-mode thread started to execute, the system inserted an APC into the thread with LdrInitializeThunk as the APC routine. This was done by copying (saving) the thread CONTEXT to the user stack and then calling KiUserApcDispatcher, which called LdrInitializeThunk. When LdrInitializeThunk finished, control returned to KiUserApcDispatcher, which called NtContinue with the saved thread CONTEXT. Only after this did the thread entry point begin to execute.
Now the system has optimized this process: it copies (saves) the thread CONTEXT to the user stack and calls LdrInitializeThunk directly. At the end of this function NtContinue is called, and the thread entry point is executed.
Therefore, EVERY thread begins execution in user mode in LdrInitializeThunk. (A function with this exact name exists and is called in all versions of Windows from NT4 to Win10.)
What does this function do? What is it for? You may have heard of the DLL_THREAD_ATTACH notification. When a new thread in a process starts (with the exception of special system worker threads, for example LdrpWorkCallback), the loader walks the loaded DLL list and calls the DLL entry points with DLL_THREAD_ATTACH (of course, only if the DLL has an entry point and DisableThreadLibraryCalls was not called for it). But how is this implemented? Through LdrInitializeThunk, which calls LdrpInitialize -> LdrpInitializeThread -> LdrpCallInitRoutine (for each DLL EP).
When the first thread in the process starts, this is a special case. Many additional tasks must be done to initialize the process. At this time only two modules are loaded in the process: the EXE and ntdll.dll. LdrInitializeThunk calls LdrpInitializeProcess for this job. Very briefly:
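The filtering rule for DLL_THREAD_ATTACH can be sketched as a portable C toy model. This is not the real loader code: the `model_*` names and the module struct are hypothetical, and only the numeric value of the notification (2) matches DLL_THREAD_ATTACH from the Windows headers.

```c
#include <stddef.h>

/* Same numeric value as DLL_THREAD_ATTACH in the Windows headers. */
#define MODEL_DLL_THREAD_ATTACH 2

/* Hypothetical stand-in for a DLL entry point (DllMain-like). */
typedef void (*model_dll_entry)(int reason, int *attach_count);

typedef struct {
    const char *name;
    model_dll_entry entry;      /* NULL if the DLL has no entry point */
    int thread_calls_disabled;  /* set if DisableThreadLibraryCalls was used */
    int attach_count;           /* DLL_THREAD_ATTACH notifications received */
} model_module;

/* Sample entry point: counts thread-attach notifications. */
void model_counting_entry(int reason, int *attach_count) {
    if (reason == MODEL_DLL_THREAD_ATTACH) {
        ++*attach_count;
    }
}

/* Models the walk done for a new thread: notify every module that has
   an entry point and has not opted out. Returns how many were notified. */
size_t model_initialize_thread(model_module *mods, size_t count) {
    size_t notified = 0;
    for (size_t i = 0; i < count; ++i) {
        if (mods[i].entry == NULL || mods[i].thread_calls_disabled) {
            continue;  /* no EP, or thread notifications disabled */
        }
        mods[i].entry(MODEL_DLL_THREAD_ATTACH, &mods[i].attach_count);
        ++notified;
    }
    return notified;
}

/* Three modules: a normal one, one without an EP, one that opted out. */
size_t model_thread_attach_demo(void) {
    model_module mods[] = {
        { "a.dll", model_counting_entry, 0, 0 },
        { "b.dll", NULL,                 0, 0 },
        { "c.dll", model_counting_entry, 1, 0 },
    };
    return model_initialize_thread(mods, sizeof mods / sizeof mods[0]);
}
```

In the demo only `a.dll` is notified, because `b.dll` has no entry point and `c.dll` has thread notifications disabled.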
- various process structures are initialized
- all DLLs to which the EXE is statically linked (and their dependencies) are loaded, but their entry points (EPs) are not called yet!
- LdrpDoDebuggerBreak is called. This function checks whether a debugger is attached to the process, and if so executes int 3, so the debugger receives an exception message, STATUS_BREAKPOINT. Most debuggers can start UI debugging only from this point. However, there are debuggers that let you debug a process starting from LdrInitializeThunk; all my screenshots are from this kind of debugger.
- an important point: up to now, only code from ntdll.dll (and maybe from kernel32.dll) has executed in the process. Code from other DLLs, any third-party code, has not yet been executed in the process.
- optionally a shim DLL is loaded into the process and the Shim Engine is initialized. But this is optional.
- the loaded DLL list is walked and each DLL's EP is called with DLL_PROCESS_ATTACH
- TLS initialization is done and TLS callbacks are called (if any exist)
- ZwTestAlert is called. This call checks whether an APC exists in the thread's queue and executes it. This point exists in all versions from NT4 to Win10. It allows you, for example, to create a process in a suspended state and then queue an APC (QueueUserAPC) to its thread (PROCESS_INFORMATION.hThread). As a result, this call is executed in the context of the first thread of the process, after the process is fully initialized and all DLL_PROCESS_ATTACH notifications have been delivered, but before the EXE entry point.
- finally NtContinue is called: it restores the saved thread context, and we finally jump to the thread EP
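The order of the steps above, and in particular where a queued APC runs, can be captured in a small portable C toy model. It is only a sketch of the sequence as described here: the `model_*` names and the step enum are invented for illustration and nothing in it calls the real loader.

```c
#include <stddef.h>

/* Hypothetical labels for the steps listed above. */
typedef enum {
    STEP_INIT_STRUCTURES,
    STEP_LOAD_STATIC_DLLS,
    STEP_DEBUGGER_BREAK,      /* only if a debugger is attached */
    STEP_SHIM_ENGINE,         /* optional */
    STEP_DLL_PROCESS_ATTACH,
    STEP_TLS_CALLBACKS,
    STEP_RUN_QUEUED_APC,      /* the ZwTestAlert point, only if an APC is queued */
    STEP_ENTRY_POINT          /* reached through NtContinue */
} model_step;

typedef struct {
    model_step steps[8];
    size_t count;
} model_trace;

void model_record(model_trace *t, model_step s) {
    if (t->count < sizeof t->steps / sizeof t->steps[0]) {
        t->steps[t->count++] = s;
    }
}

/* Replays first-thread initialization in the order described above. */
model_trace model_init_first_thread(int debugger_attached,
                                    int shim_enabled,
                                    int apc_queued) {
    model_trace t = { { STEP_INIT_STRUCTURES }, 0 };
    model_record(&t, STEP_INIT_STRUCTURES);
    model_record(&t, STEP_LOAD_STATIC_DLLS);
    if (debugger_attached) model_record(&t, STEP_DEBUGGER_BREAK);
    if (shim_enabled)      model_record(&t, STEP_SHIM_ENGINE);
    model_record(&t, STEP_DLL_PROCESS_ATTACH);
    model_record(&t, STEP_TLS_CALLBACKS);
    if (apc_queued)        model_record(&t, STEP_RUN_QUEUED_APC);
    model_record(&t, STEP_ENTRY_POINT);
    return t;
}

/* Position of a step in the trace, or -1 if it did not happen. */
int model_index_of(const model_trace *t, model_step s) {
    for (size_t i = 0; i < t->count; ++i) {
        if (t->steps[i] == s) return (int)i;
    }
    return -1;
}
```

With `model_init_first_thread(0, 0, 1)` the APC step lands after STEP_DLL_PROCESS_ATTACH and before STEP_ENTRY_POINT, which is the property that makes the suspended-process plus QueueUserAPC trick work.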
See also: Flow of CreateProcess