Process Initiation


Overview

Taking a simple C program, we look at its initiation using NTSD. There are four segments to the program's life cycle:
  1. Loading, up to NTSD's entry breakpoint.
  2. From the entry breakpoint to the start of the actual module code, during which it runs as an APC.
  3. The running of the program, which consists of a compiler preamble, the code body, and (perhaps) a compiler postamble, until NTSD's exit breakpoint is hit.
  4. Process teardown from the exit breakpoint until the removal of the thread as a schedulable entity.

Exhibits

The Details

Predawn

Process creation occurs on two levels: NT and Windows. CreateProcess is the Windows call which creates both a process and the initial thread in the process. This uses NT calls to create the process on the OS level, and talks to csrss to "register" the process with the Windows-subsystem server.

The CreateProcess routine is a #define for either CreateProcessA or CreateProcessW, depending on whether the calling code is compiled for ANSI or Unicode (the UNICODE macro). This routine is visible to Win32 users as an export of Kernel32.dll, a dll which is built in nt/private/windows/base. The code for CreateProcess is in nt/private/windows/base/client/create.c.
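
The A/W selection is plain C preprocessor work. The sketch below mimics the pattern with stand-in functions; DemoCreateProcessA, DemoCreateProcessW, DemoCreateProcess and DEMO_TEXT are hypothetical names invented for illustration and are not part of kernel32. Only the convention of keying the choice on the UNICODE macro is taken from the Windows headers.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two real entry points. */
const char *DemoCreateProcessA(const char *cmd)
{
    (void)cmd;
    return "ANSI";
}

const char *DemoCreateProcessW(const wchar_t *cmd)
{
    (void)cmd;
    return "Unicode";
}

/* Same shape as the header: one generic name, bound at compile time. */
#ifdef UNICODE
#define DemoCreateProcess DemoCreateProcessW
#define DEMO_TEXT(s) L##s
#else
#define DemoCreateProcess DemoCreateProcessA
#define DEMO_TEXT(s) s
#endif
```

A caller writes DemoCreateProcess(DEMO_TEXT("prog.exe")) and gets the narrow or the wide routine depending on whether UNICODE was defined when the caller was compiled. The choice is made per compilation unit, not at run time.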

CreateProcess does several things, the goal of which is to leave us with an APC queued for the initial thread of the new process. To achieve this:

  1. CreateProcess opens and verifies the exe file. Before loading the image, it calls LdrQueryImageFileExecutionOptions. This looks for the registry key \Registry\Machine\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\ using NtOpenKey, handing it an Object Attributes block created by the macro InitializeObjectAttributes.
  2. A section object is wrapped around this exe (NtCreateSection). The section object is queried for entry point and stack information. CreateProcess then creates a process object wrapped around this section object (NtCreateProcess), which apparently creates the stack and code segments. BasePushProcessParameters is called on the process object to push arguments onto the stack of the newly created process (this is a trans-context operation). BaseCreateStack (nt/private/windows/base/client/support.c) is called in preparation for thread creation, to allocate stack space for the initial thread (each thread has its own stack). BaseCreateStack manipulates the TEB to reflect the stack creation. It calls NtAllocateVirtualMemory to allocate and commit the stack and, optionally, a guard page.
  3. BaseInitializeContext (nt/private/windows/base/client/i386/context.c) is called to fill in a CONTEXT block, and is given the starting address learned from the earlier query of the section object (a long time ago!). NtCreateThread is passed this block when the thread is being set up. The actual push onto the thread's stack in the thread's context might be in KiInitializeContextThread in nt/private/ntos/ke/i386/thredini.c. Although we are in kernel mode, perhaps the memory for the new thread is mapped into the address space of the old process, since a full context switch does not seem to intervene.
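
The stack bookkeeping in step 2 can be pictured with a little address arithmetic. This is only a sketch under assumptions: DemoStackLayout and demo_layout_stack are hypothetical names, no memory is actually allocated, and placing the guard page directly below the committed pages is my simplified reading, not something taken from support.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified picture of an initial thread stack. */
typedef struct {
    uintptr_t stack_base;   /* highest address; the stack grows down from here */
    uintptr_t stack_limit;  /* lowest committed, usable address */
    uintptr_t guard_page;   /* start of the guard page, or 0 if there is none */
} DemoStackLayout;

/* Round n up to a multiple of page (page must be a power of two). */
static size_t demo_round_up(size_t n, size_t page)
{
    return (n + page - 1) & ~(page - 1);
}

/* Compute the layout for a reserved region starting at `region`. */
DemoStackLayout demo_layout_stack(uintptr_t region, size_t reserve,
                                  size_t commit, size_t page)
{
    DemoStackLayout s;

    reserve = demo_round_up(reserve, page);
    commit = demo_round_up(commit, page);
    if (commit > reserve) {
        commit = reserve;
    }

    s.stack_base = region + reserve;
    s.stack_limit = s.stack_base - commit;
    s.guard_page = 0;

    /* Assumption: if reserved space remains below the committed part,
       one guard page sits directly underneath it. */
    if (commit < reserve) {
        s.guard_page = s.stack_limit - page;
    }
    return s;
}
```

With a 64K reservation at 0x100000, a 5000 byte commit request and 4K pages, this gives a base of 0x110000, a limit of 0x10E000 and a guard page at 0x10D000. When the whole reservation is committed there is no room left for a guard page, which is the "optionally" in step 2.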

Back in CreateProcess, on the Windows level, the procedure ends with a registration of the new process with csrss. (Some LPC magic here!)

Dawn

The queued APC is picked up by the kernel mode routine KiDeliverApc (nt/private/ntos/ke/apcsup.c). It individually calls out kernel and user mode APCs. The user mode part is handed off to KiUserApcDispatcher for handling (nt/private/ntos/rtl/i386/userdisp.asm). It's not clear what intervenes between KiDeliverApc and this routine, but this routine's documentation says that it is already in user mode (run "on return from kernel mode"). Indeed, this function is located in user memory, as part of ntdll.dll. Furthermore, this function is passed CONTEXT, the continuation environment, as its argument.

Although APCs can do whatever their function pointer points to, this APC has been set up with certain specific function pointers to carry out its task of dll initialization. We are now in the context of the new process. At the bottom of the stack is the return to KiUserApcDispatcher and a pointer to KiUserApcDispatcher's important argument: the continuation environment built to start the user's code. This CONTEXT structure (see nt/public/sdk/inc/nti386.h) is also in the user stack, put there by NtCreateThread.

  1. KiUserApcDispatcher calls LdrpInitialize, IPL 0, user mode,
  2. If Peb->BeingDebugged is true, a call to DbgBreakPoint in LdrpInitializeProcess stops the action before dll initialization, for debugging,
  3. on return to KiUserApcDispatcher, it int 2e's to NtContinue, passing on its own passed-in argument (of type CONTEXT), which should be a continuation onto the start of user-executable code.
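
The three steps above amount to "run the initialization routine, then resume wherever the context says". A toy model in portable C, with every name hypothetical (the demo_ functions only stand in for KiUserApcDispatcher, LdrpInitialize and NtContinue, and the real CONTEXT holds registers, not a function pointer):

```c
#include <stddef.h>

/* Hypothetical, much reduced continuation record. */
typedef struct {
    int initialized;        /* set by the initialization routine */
    int (*resume)(void);    /* where execution continues afterwards */
} DemoContext;

typedef void (*DemoApcRoutine)(DemoContext *ctx);

/* Stand-in for LdrpInitialize: loader work done before user code runs. */
void demo_ldr_initialize(DemoContext *ctx)
{
    ctx->initialized = 1;
}

/* Stand-in for NtContinue: resume at the point recorded in the context. */
int demo_continue(const DemoContext *ctx)
{
    return ctx->resume();
}

/* Stand-in for KiUserApcDispatcher: run the APC routine, then continue
   with the same context that was passed in. */
int demo_apc_dispatcher(DemoApcRoutine routine, DemoContext *ctx)
{
    routine(ctx);
    return demo_continue(ctx);
}

/* Stand-in for the start of user-executable code. */
int demo_user_entry(void)
{
    return 42;
}
```

Calling demo_apc_dispatcher(demo_ldr_initialize, &ctx) with ctx.resume set to demo_user_entry runs the initialization first and the user entry second. The model returns to its caller, whereas the real NtContinue does not return to the dispatcher.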

Morning sickness

The CONTEXT record was created by CreateProcess, in the case of a Windows create process, and contains the start address according to the loaded exe, and also a thunking location according to the kernel32.dll visible when CreateProcess was run. In fact, the NtContinue thunks on through BaseProcessStartThunk; see nt/private/windows/base/client/context.c. The true starting address is in eax of the CONTEXT; BaseProcessStartThunk is found in eip.
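
The eip/eax arrangement can be sketched as follows. This is a toy, not the real CONTEXT layout: DemoThunkContext carries only the two fields discussed here, and demo_start_thunk, demo_initialize_context and demo_resume are hypothetical stand-ins for BaseProcessStartThunk, BaseInitializeContext and the resumption done by NtContinue.

```c
typedef int (*DemoEntry)(void);
typedef int (*DemoThunk)(DemoEntry entry);

/* Hypothetical two-field context: where to resume, and what to hand over. */
typedef struct {
    DemoThunk eip;   /* resumption point: the thunk */
    DemoEntry eax;   /* the true starting address of the program */
} DemoThunkContext;

/* Stand-in for BaseProcessStartThunk: picks up the real entry point
   and transfers control to it. */
int demo_start_thunk(DemoEntry entry)
{
    return entry();
}

/* Stand-in for BaseInitializeContext: record the thunk in eip and the
   start address in eax. */
void demo_initialize_context(DemoThunkContext *ctx, DemoEntry start)
{
    ctx->eip = demo_start_thunk;
    ctx->eax = start;
}

/* Stand-in for resuming the context: jump to eip with eax available. */
int demo_resume(const DemoThunkContext *ctx)
{
    return ctx->eip(ctx->eax);
}

/* Stand-in for the exe's entry point. */
int demo_main_entry(void)
{
    return 7;
}
```

The point of the indirection is that the creator never jumps to the program's entry directly. It names a fixed thunk, and the thunk finds the real entry in a register.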

This causes problems when the location of kernel32.dll does not agree in the two contexts:

  1. Where CONTEXT was created: the caller of CreateProcess,
  2. Where CONTEXT was used: the context of the newly created process.

There is no intrinsic reason why kernel32.dll should not move. This must be enforced outside of the Windows NT system.
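
A small worked example of the mismatch, assuming the thunk lives at a fixed offset inside kernel32.dll. The function names and any base addresses or offsets fed to them are hypothetical, chosen only to show the arithmetic.

```c
#include <stdint.h>

/* Address of the thunk in a process where the dll is loaded at dll_base. */
uintptr_t demo_thunk_address(uintptr_t dll_base, uintptr_t thunk_offset)
{
    return dll_base + thunk_offset;
}

/* Returns 1 if the eip computed in the creating process also points at
   the thunk in the new process, 0 otherwise. */
int demo_thunk_matches(uintptr_t creator_base, uintptr_t child_base,
                       uintptr_t thunk_offset)
{
    return demo_thunk_address(creator_base, thunk_offset) ==
           demo_thunk_address(child_base, thunk_offset);
}
```

If both processes load the dll at, say, 0x77F00000, the eip written by the creator is good in the child. If the child has it at 0x77E00000 instead, the same eip lands 0x100000 bytes away from the thunk, in whatever happens to be mapped there.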

Dawn to Dusk

The change from APC to normal thread is reflected in the return location on the stack. Now the top of the stack is BaseProcessStart, which calls ExitThread should the user code return at this point. However, this apparently does not survive, as the stack trace at the exit breakpoint indicates. The top of this stack has a return address inside the user's program, and it calls into system exit as a Kernel32 function. I do not think BaseProcessStart expects to regain control.

It is difficult to say whether the APC and the user code operate in the same thread or in two different threads. Although there are kernel mode operations which punctuate the two code runs, they are a single, contiguous run entity, sharing a stack, and so forth.

Author

Burton Rosenberg
16 August 1998


burtonr@citrix.com
11 Feb 1999