( Original text by dtm )
About This Paper
The following document is the result of independent research into malicious software (malware) and its interaction with the Windows Application Programming Interface (WinAPI). It details the fundamental concepts behind how malware implants malicious payloads into other processes and how such functionality can be detected by monitoring communication with the Windows operating system. The notion of observing calls to the API is also illustrated by hooking certain functions that are used to achieve the code injection techniques.
Disclaimer: Since this was a relatively accelerated project due to time constraints, I would like to apologise in advance for any misinformation that may be presented, and ask that I be notified as soon as possible so that it may be revised. On top of this, the accompanying code may be under-developed for practical purposes and have unforeseen design flaws.
Introduction
In the present day, malware is developed by cyber-criminals with the intent of compromising machines that can be leveraged to perform activities from which they can profit. For many of these activities, the malware must be able to survive in the wild: it must operate covertly, avoid drawing the attention of its victims and thwart detection by anti-virus software. Stealth via code injection emerged as a solution to this problem.
Section I: Fundamental Concepts
Inline Hooking
Inline hooking is the act of detouring the flow of code via hotpatching. Hotpatching is defined as the modification of code during the runtime of an executable image[1]. The purpose of inline hooking is to capture the moment at which the program calls a function so that the call can be observed and/or manipulated. Here is a visual representation of how normal execution works:
Normal Execution of a Function Call
| Program | ------ calls function -----> | Function | (execution of function)
versus execution of a hooked function:

This can be separated into three steps. To demonstrate this process, the WinAPI function MessageBox will be used.
- Hooking the function
To hook the function, we first require the intermediate function, which must replicate the parameters of the targeted function. Microsoft Developer Network (MSDN) defines
int WINAPI MessageBox(
_In_opt_ HWND hWnd,
_In_opt_ LPCTSTR lpText,
_In_opt_ LPCTSTR lpCaption,
_In_ UINT uType
);
So the intermediate function may be defined like so:
int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
// our code in here
}
Once this exists, execution flow has somewhere for the code to be redirected. To actually hook the
; MessageBox
8B FF mov edi, edi
55 push ebp
8B EC mov ebp, esp
versus the hooked function:
; MessageBox
68 xx xx xx xx push <HookedMessageBox> ; our intermediate function
C3 ret
Here I have opted to use the
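The six-byte patch shown in the listing above can be assembled programmatically. The following is a minimal sketch, assuming a 32-bit x86 target; `makePushRetPatch` is a hypothetical helper name that does not appear in the original code, and it only builds the bytes rather than writing them into a live process.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original paper): builds the 6-byte
// "push <address>; ret" sequence for a 32-bit x86 target.
std::array<std::uint8_t, 6> makePushRetPatch(std::uint32_t hookAddress) {
    std::array<std::uint8_t, 6> patch{};
    patch[0] = 0x68;  // opcode for push imm32
    for (int i = 0; i < 4; ++i) {
        // the immediate operand is stored in little-endian byte order
        patch[1 + i] = static_cast<std::uint8_t>((hookAddress >> (8 * i)) & 0xFF);
    }
    patch[5] = 0xC3;  // ret pops the pushed address and jumps to it
    assert(patch[0] == 0x68 && patch[5] == 0xC3);
    return patch;
}
```

Note that the patch is six bytes long, whereas the three prologue instructions shown above occupy five bytes, so the patch also overwrites the first byte of whatever instruction follows. The original bytes therefore need to be saved before patching if the function is ever to be called normally again.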
- Capturing the function call
When the program calls
int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
TCHAR szMyText[] = TEXT("This function has been hooked!");
}
- Resuming normal execution
To forward this parameter, execution needs to continue to the original
int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
TCHAR szMyText[] = TEXT("This function has been hooked!");
// restore the original bytes of MessageBox
// ...
// continue to MessageBox with the replaced parameter and return the return value to the program
return MessageBox(hWnd, szMyText, lpCaption, uType);
}
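The "restore the original bytes" step is left as a comment in the listing above. One way to keep track of it is sketched below, under the assumption that the patch is the six-byte push/ret sequence; `PatchRecord` is a hypothetical type, and it operates on a plain byte buffer standing in for the start of the hooked function rather than on real process memory.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bookkeeping type (not from the original paper). The byte
// buffer passed in stands in for the first bytes of the hooked function.
class PatchRecord {
public:
    static constexpr std::size_t kPatchSize = 6;  // push imm32 (5) + ret (1)

    // Saves the bytes that are about to be overwritten, then writes the patch.
    void install(std::vector<std::uint8_t>& code,
                 const std::array<std::uint8_t, kPatchSize>& patch) {
        assert(code.size() >= kPatchSize && !installed_);
        std::copy_n(code.begin(), kPatchSize, original_.begin());
        std::copy(patch.begin(), patch.end(), code.begin());
        installed_ = true;
    }

    // Puts the saved bytes back so the original function can run unmodified.
    void restore(std::vector<std::uint8_t>& code) {
        assert(code.size() >= kPatchSize && installed_);
        std::copy(original_.begin(), original_.end(), code.begin());
        installed_ = false;
    }

    bool installed() const { return installed_; }

private:
    std::array<std::uint8_t, kPatchSize> original_{};
    bool installed_ = false;
};
```

In a real process the code page would typically have to be made writable first (for example with VirtualProtect) before either write. A restore-call-rehook cycle like this also leaves a window in which calls from other threads pass through unhooked.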
If rejecting the call to
int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
return IDNO; // IDNO defined as 7
}
API Monitoring
The concept of API monitoring follows on from function hooking. Because gaining control of function calls is possible, observation of all of the parameters is also possible, as previously mentioned, hence the name API monitoring. However, there is a small issue caused by the availability of different high-level API calls that are unique but operate using the same set of APIs at a lower level. This is called function wrapping, where wrappers are subroutines whose purpose is to call a secondary subroutine. Returning to the
+---------+
| Program |
+---------+
/ \
| |
+------------+ +------------+
| Function A | | Function B |
+------------+ +------------+
| |
+-------------------------------+
| user32.dll, kernel32.dll, ... |
+-------------------------------+
+---------+ +-------- hook -----------------> |
| API | <---- + +-------------------------------------+
| Monitor | <-----+ | ntdll.dll |
+---------+ | +-------------------------------------+
+-------- hook -----------------> | User mode
-----------------------------------------------------
Kernel mode
Here is what the
Here is
user32!MessageBoxA -> user32!MessageBoxExA -> user32!MessageBoxTimeoutA -> user32!MessageBoxTimeoutW
and
user32!MessageBoxW -> user32!MessageBoxExW -> user32!MessageBoxTimeoutW
Both call hierarchies funnel into
int WINAPI MessageBoxTimeoutW(
HWND hWnd,
LPCWSTR lpText,
LPCWSTR lpCaption,
UINT uType,
WORD wLanguageId,
DWORD dwMilliseconds
);
To log the usage:
int WINAPI HookedMessageBoxTimeoutW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType, WORD wLanguageId, DWORD dwMilliseconds) {
    std::wofstream logfile; // declare wide stream because of wide parameters
    logfile.open(L"log.txt", std::ios::out | std::ios::app);
    // lpText and lpCaption are optional, so guard against null pointers
    logfile << L"Caption: " << (lpCaption ? lpCaption : L"(null)") << L"\n";
    logfile << L"Text: " << (lpText ? lpText : L"(null)") << L"\n";
    logfile << L"Type: " << uType << L"\n";
    logfile.close();
    // restore the original bytes
    // ...
    // pass execution to the normal function and save the return value
    int ret = MessageBoxTimeoutW(hWnd, lpText, lpCaption, uType, wLanguageId, dwMilliseconds);
    // rehook the function for next calls
    // ...
    return ret; // return the value of the original function
}
Once the hook has been placed into
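The benefit of monitoring the lowest common function can be illustrated with a portable toy model. None of the functions below are the real user32 routines; `ansiWrapper`, `wideWrapper` and `lowestCommon` are hypothetical stand-ins for the wrapper chain, and the "hook" is simply a logging statement at the point where both paths converge.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy log that plays the role of the API monitor's output.
std::vector<std::wstring>& monitorLog() {
    static std::vector<std::wstring> log;
    return log;
}

// Stand-in for the lowest common function that both wrappers reach.
int lowestCommon(const std::wstring& text) {
    monitorLog().push_back(text);  // the single monitoring point
    return 1;
}

// Stand-in for a wide-character wrapper: forwards its argument directly.
int wideWrapper(const std::wstring& text) { return lowestCommon(text); }

// Stand-in for an ANSI wrapper: widens its argument, then forwards it.
int ansiWrapper(const std::string& text) {
    // naive widening, adequate for ASCII-only input in this sketch
    return lowestCommon(std::wstring(text.begin(), text.end()));
}

// Returns the most recently captured text.
const std::wstring& lastLogged() {
    assert(!monitorLog().empty());
    return monitorLog().back();
}
```

Calls through either wrapper end up in the same log, which is why a single hook at the convergence point is enough to observe both entry points.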
Code Injection Primer
For the purposes of this paper, code injection will be defined as the insertion of executable code into an external process. The possibility of injecting code is a natural result of the functionality allowed by the WinAPI. If certain functions are strung together, it is possible to access an existing process, write data to it and then execute it remotely under its context. In this section, the relevant techniques of code injection that were covered in the research will be introduced.
DLL Injection
Code can come in a variety of forms, one of which is a Dynamic Link Library (DLL). DLLs are libraries designed to offer extended functionality to an executable program, which is made available by exporting subroutines. Here is an example DLL that will be used for the remainder of the paper:
extern "C" void __declspec(dllexport) Demo() {
::MessageBox(nullptr, TEXT("This is a demo!"), TEXT("Demo"), MB_OK);
}
BOOL APIENTRY DllMain(HINSTANCE hInstDll, DWORD fdwReason, LPVOID lpvReserved) { // DllMain returns BOOL, not bool
    if (fdwReason == DLL_PROCESS_ATTACH)
        ::CreateThread(nullptr, 0, (LPTHREAD_START_ROUTINE)Demo, nullptr, 0, nullptr);
    return TRUE;
}
When a DLL is loaded into a process and initialised, the loader will call
CreateRemoteThread
DLL injection via the CreateRemoteThread function executes a remote thread in the virtual address space of another process. As mentioned above, all that is required to execute a DLL is to have it load into the process by forcing it to execute the
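void injectDll(const HANDLE hProcess, const std::string& dllPath) {
    // copy the terminating null as well so LoadLibraryA receives a complete C string
    const SIZE_T nSize = dllPath.length() + 1;
    LPVOID lpBaseAddress = ::VirtualAllocEx(hProcess, nullptr, nSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    SIZE_T dwWritten = 0; // receives the number of bytes written
    ::WriteProcessMemory(hProcess, lpBaseAddress, dllPath.c_str(), nSize, &dwWritten);
    HMODULE hModule = ::GetModuleHandle(TEXT("kernel32.dll"));
    LPVOID lpStartAddress = (LPVOID)::GetProcAddress(hModule, "LoadLibraryA"); // LoadLibraryA for ASCII string
    HANDLE hThread = ::CreateRemoteThread(hProcess, nullptr, 0, (LPTHREAD_START_ROUTINE)lpStartAddress, lpBaseAddress, 0, nullptr);
    if (hThread != nullptr)
        ::CloseHandle(hThread); // the remote thread keeps running after its handle is closed
}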
MSDN defines LoadLibrary as:
HMODULE WINAPI LoadLibrary(
_In_ LPCTSTR lpFileName
);
It takes a single parameter which is the path name to the desired library to load. The
- Allocating virtual memory in the target process
Using
Virtual Address Space of Target Process
+--------------------+
| |
VirtualAllocEx +--------------------+
Allocated memory ---> | Empty space |
+--------------------+
| |
+--------------------+
| Executable |
| Image |
+--------------------+
| |
| |
+--------------------+
| kernel32.dll |
+--------------------+
| |
+--------------------+
- Writing the DLL path to allocated memory
Once memory has been initialised, the path to the DLL can be injected into the allocated memory returned by
Virtual Address Space of Target Process
+--------------------+
| |
WriteProcessMemory +--------------------+
Inject DLL path ----> | "..\..\myDll.dll" |
+--------------------+
| |
+--------------------+
| Executable |
| Image |
+--------------------+
| |
| |
+--------------------+
| kernel32.dll |
+--------------------+
| |
+--------------------+
- Getting the address of LoadLibrary
Since all system DLLs are mapped to the same address across all processes, the address of
- Loading the DLL
The address of
Virtual Address Space of Target Process
+--------------------+
| |
+--------------------+
+--------- | "..\..\myDll.dll" |
| +--------------------+
| | |
| +--------------------+ <---+
| | myDll.dll | |
| +--------------------+ |
| | | | LoadLibrary
| +--------------------+ | loads
| | Executable | | and
| | Image | | initialises
| +--------------------+ | myDll.dll
| | | |
| | | |
CreateRemoteThread v +--------------------+ |
LoadLibraryA("..\..\myDll.dll") --> | kernel32.dll | ----+
+--------------------+
| |
+--------------------+
SetWindowsHookEx
Windows offers developers the ability to monitor certain events by installing hooks with the SetWindowsHookEx function. While this function is very common in the monitoring of keystrokes for keylogger functionality, it can also be used to inject DLLs. The following code demonstrates a process injecting a DLL into itself:
int main() {
HMODULE hMod = ::LoadLibrary(DLL_PATH);
HOOKPROC lpfn = (HOOKPROC)::GetProcAddress(hMod, "Demo");
HHOOK hHook = ::SetWindowsHookEx(WH_GETMESSAGE, lpfn, hMod, ::GetCurrentThreadId());
::PostThreadMessageW(::GetCurrentThreadId(), WM_RBUTTONDOWN, (WPARAM)0, (LPARAM)0);
// message queue to capture events
MSG msg;
while (::GetMessage(&msg, nullptr, 0, 0) > 0) {
::TranslateMessage(&msg);
::DispatchMessage(&msg);
}
return 0;
}
HHOOK WINAPI SetWindowsHookEx(
_In_ int idHook,
_In_ HOOKPROC lpfn,
_In_ HINSTANCE hMod,
_In_ DWORD dwThreadId
);
takes a
QueueUserAPC
DLL injection with QueueUserAPC works similarly to that of
int injectDll(const std::string& dllPath, const DWORD dwProcessId, const DWORD dwThreadId) {
    HANDLE hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, false, dwProcessId);
    HANDLE hThread = ::OpenThread(THREAD_ALL_ACCESS, false, dwThreadId);
    // copy the terminating null as well so LoadLibraryA receives a complete C string
    const SIZE_T nSize = dllPath.length() + 1;
    LPVOID lpLoadLibraryParam = ::VirtualAllocEx(hProcess, nullptr, nSize, MEM_COMMIT, PAGE_READWRITE);
    SIZE_T dwWritten = 0; // receives the number of bytes written
    ::WriteProcessMemory(hProcess, lpLoadLibraryParam, dllPath.c_str(), nSize, &dwWritten);
    ::QueueUserAPC((PAPCFUNC)::GetProcAddress(::GetModuleHandle(TEXT("kernel32.dll")), "LoadLibraryA"), hThread, (ULONG_PTR)lpLoadLibraryParam);
    ::CloseHandle(hThread);
    ::CloseHandle(hProcess);
    return 0;
}
One major difference between this and
Process Hollowing
Process hollowing, AKA RunPE, is a popular method used to evade anti-virus detection. It allows entire executable files to be loaded into a target process and executed under its context. Often seen in crypted applications, a file on disk that is compatible with the payload is selected as the host and created as a process, after which its main executable module is hollowed out and replaced. This procedure can be broken up into four stages.
- Creating a host process
In order for the payload to be injected, the bootstrap must first locate a suitable host. If the payload is a .NET application, the host must also be a .NET application. If the payload is a native executable defined to use the console subsystem, the host must also reflect the same attributes. The same applies to x86 and x64 programs. Once the host has been chosen, it is created as a suspended process using
Executable Image of Host Process
+--- +--------------------+
| | PE |
| | Headers |
| +--------------------+
| | .text |
| +--------------------+
CreateProcess + | .data |
| +--------------------+
| | ... |
| +--------------------+
| | ... |
| +--------------------+
| | ... |
+--- +--------------------+
- Hollowing the host process
For the payload to work correctly after injection, it must be mapped to a virtual address space that matches its
typedef struct _IMAGE_OPTIONAL_HEADER {
WORD Magic;
BYTE MajorLinkerVersion;
BYTE MinorLinkerVersion;
DWORD SizeOfCode;
DWORD SizeOfInitializedData;
DWORD SizeOfUninitializedData;
DWORD AddressOfEntryPoint; // <---- this is required later
DWORD BaseOfCode;
DWORD BaseOfData;
DWORD ImageBase; // <----
DWORD SectionAlignment;
DWORD FileAlignment;
WORD MajorOperatingSystemVersion;
WORD MinorOperatingSystemVersion;
WORD MajorImageVersion;
WORD MinorImageVersion;
WORD MajorSubsystemVersion;
WORD MinorSubsystemVersion;
DWORD Win32VersionValue;
DWORD SizeOfImage; // <---- size of the PE file as an image
DWORD SizeOfHeaders;
DWORD CheckSum;
WORD Subsystem;
WORD DllCharacteristics;
DWORD SizeOfStackReserve;
DWORD SizeOfStackCommit;
DWORD SizeOfHeapReserve;
DWORD SizeOfHeapCommit;
DWORD LoaderFlags;
DWORD NumberOfRvaAndSizes;
IMAGE_DATA_DIRECTORY DataDirectory[IMAGE_NUMBEROF_DIRECTORY_ENTRIES];
} IMAGE_OPTIONAL_HEADER, *PIMAGE_OPTIONAL_HEADER;
This is important because it is more than likely that absolute addresses are involved within the code, which are entirely dependent on its location in memory. To safely map the executable image, the virtual memory space starting at the described
Executable Image of Host Process
+--- +--------------------+
| | |
| | |
| | |
| | |
| | |
NtUnmapViewOfSection + | |
| | |
| | |
| | |
| | |
| | |
| | |
+--- +--------------------+
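The three fields marked in the optional header above can be read straight out of a PE file held in memory. The sketch below is a portable illustration that avoids the Windows headers: `Pe32Fields`, `readLe` and `readPe32Fields` are hypothetical names, and the hard-coded offsets assume the 32-bit (PE32) optional header layout shown in the structure above, not the 64-bit PE32+ variant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container for the fields the hollowing procedure needs.
struct Pe32Fields {
    std::uint32_t addressOfEntryPoint = 0;
    std::uint32_t imageBase = 0;
    std::uint32_t sizeOfImage = 0;
};

// Reads an n-byte little-endian value (n <= 4) at the given offset.
std::uint32_t readLe(const std::vector<std::uint8_t>& b, std::size_t off, std::size_t n) {
    assert(n <= 4 && off + n <= b.size());
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<std::uint32_t>(b[off + i]) << (8 * i);
    }
    return v;
}

// Returns false if the buffer does not look like a PE32 file.
bool readPe32Fields(const std::vector<std::uint8_t>& file, Pe32Fields& out) {
    if (file.size() < 0x40 || file[0] != 'M' || file[1] != 'Z') return false;
    const std::size_t nt = readLe(file, 0x3C, 4);  // e_lfanew: offset of the NT headers
    const std::size_t optOffset = 4 + 20;          // signature + IMAGE_FILE_HEADER
    if (nt > file.size() || file.size() - nt < optOffset + 60) return false;
    if (readLe(file, nt, 4) != 0x00004550u) return false;  // "PE\0\0"
    const std::size_t opt = nt + optOffset;
    if (readLe(file, opt, 2) != 0x10Bu) return false;      // PE32 magic
    out.addressOfEntryPoint = readLe(file, opt + 16, 4);
    out.imageBase = readLe(file, opt + 28, 4);
    out.sizeOfImage = readLe(file, opt + 56, 4);
    return true;
}
```

On Windows the same values would normally be obtained by casting the buffer to the IMAGE_DOS_HEADER and IMAGE_NT_HEADERS structures; the manual offsets are used here only to keep the sketch portable.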
- Injecting the payload
To inject the payload, the PE file must be parsed manually to transform it from its disk form to its image form. After allocating virtual memory with
Executable Image of Host Process
+--- +--------------------+
| | PE |
| | Headers |
+--- +--------------------+
| | |
| | |
WriteProcessMemory + | |
| |
| |
| |
| |
| |
| |
+--------------------+
To convert the PE file to an image, all of the sections must be individually read from their file offsets and then placed at their correct virtual offsets using
typedef struct _IMAGE_SECTION_HEADER {
BYTE Name[IMAGE_SIZEOF_SHORT_NAME];
union {
DWORD PhysicalAddress;
DWORD VirtualSize;
} Misc;
DWORD VirtualAddress; // <---- virtual offset
DWORD SizeOfRawData;
DWORD PointerToRawData; // <---- file offset
DWORD PointerToRelocations;
DWORD PointerToLinenumbers;
WORD NumberOfRelocations;
WORD NumberOfLinenumbers;
DWORD Characteristics;
} IMAGE_SECTION_HEADER, *PIMAGE_SECTION_HEADER;
Executable Image of Host Process
+--------------------+
| PE |
| Headers |
+--- +--------------------+
| | .text |
+--- +--------------------+
WriteProcessMemory + | .data |
+--- +--------------------+
| | ... |
+---- +--------------------+
| | ... |
+---- +--------------------+
| | ... |
+---- +--------------------+
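The file-to-image conversion can be sketched locally before any remote writes are involved. In the following portable example, `SectionInfo` is a hypothetical, trimmed-down stand-in for the section header fields marked above, and `mapToImage` builds the image layout in a local buffer; it is an illustration of the copying logic only, not the paper's implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IMAGE_SECTION_HEADER fields used here.
struct SectionInfo {
    std::uint32_t virtualAddress;    // offset of the section within the image
    std::uint32_t sizeOfRawData;     // number of bytes stored in the file
    std::uint32_t pointerToRawData;  // offset of the section within the file
};

// Lays the file out as it would appear in memory: headers first, then each
// section copied from its file offset to its virtual offset.
std::vector<std::uint8_t> mapToImage(const std::vector<std::uint8_t>& file,
                                     const std::vector<SectionInfo>& sections,
                                     std::uint32_t sizeOfHeaders,
                                     std::uint32_t sizeOfImage) {
    assert(sizeOfHeaders <= file.size() && sizeOfHeaders <= sizeOfImage);
    std::vector<std::uint8_t> image(sizeOfImage, 0);  // gaps stay zero-filled
    std::copy_n(file.begin(), sizeOfHeaders, image.begin());
    for (const SectionInfo& s : sections) {
        // skip any section whose source or destination range is out of bounds
        if (s.pointerToRawData > file.size() ||
            s.sizeOfRawData > file.size() - s.pointerToRawData ||
            s.virtualAddress > image.size() ||
            s.sizeOfRawData > image.size() - s.virtualAddress) {
            continue;
        }
        std::copy_n(file.begin() + s.pointerToRawData, s.sizeOfRawData,
                    image.begin() + s.virtualAddress);
    }
    return image;
}
```

An injector would perform the same copies against the host process, one remote write for the headers and one per section, with the destination being the allocated base address plus each section's virtual offset.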
- Execution of payload
The final step is to point the starting address of execution to the payload’s aforementioned
typedef struct _CONTEXT
{
ULONG ContextFlags;
ULONG Dr0;
ULONG Dr1;
ULONG Dr2;
ULONG Dr3;
ULONG Dr6;
ULONG Dr7;
FLOATING_SAVE_AREA FloatSave;
ULONG SegGs;
ULONG SegFs;
ULONG SegEs;
ULONG SegDs;
ULONG Edi;
ULONG Esi;
ULONG Ebx;
ULONG Edx;
ULONG Ecx;
ULONG Eax; // <----
ULONG Ebp;
ULONG Eip;
ULONG SegCs;
ULONG EFlags;
ULONG Esp;
ULONG SegSs;
UCHAR ExtendedRegisters[512];
} CONTEXT, *PCONTEXT;
To modify the starting address, the
Atom Bombing
Atom bombing is a code injection technique that takes advantage of global data storage via Windows’s global atom table. The global atom table’s data is accessible across all processes, which is what makes it a viable approach. The data stored in the table is a null-terminated C-string and is represented by a 16-bit integer key called the atom, similar to a map data structure. To add data, the WinAPI provides the GlobalAddAtom function, which MSDN defines as:
ATOM WINAPI GlobalAddAtom(
_In_ LPCTSTR lpString
);
where
UINT WINAPI GlobalGetAtomName(
_In_ ATOM nAtom,
_Out_ LPTSTR lpBuffer,
_In_ int nSize
);
Passing in the identifying atom returned from
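The map-like behaviour of the atom table can be modelled in a few lines. The class below is a toy model only: `ToyAtomTable` is a hypothetical name, it lives entirely inside one process, and it makes no claim about how Windows implements the real table. The starting key of 0xC000 mirrors the documented range for string atoms.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <string>

// Toy model of an atom table: 16-bit keys mapped to strings.
class ToyAtomTable {
public:
    // Loosely mirrors GlobalAddAtom: returns the key, or 0 on failure.
    std::uint16_t add(const std::string& s) {
        for (const auto& kv : table_) {
            if (kv.second == s) return kv.first;  // string already present
        }
        if (next_ == 0) return 0;  // counter wrapped: key space exhausted
        table_[next_] = s;
        return next_++;
    }

    // Loosely mirrors GlobalGetAtomName: copies the string into the caller's
    // buffer and returns the number of characters copied, excluding the null.
    std::size_t getName(std::uint16_t atom, char* buffer, std::size_t size) const {
        assert(buffer != nullptr);
        const auto it = table_.find(atom);
        if (it == table_.end() || size == 0) return 0;
        const std::size_t n = std::min(it->second.size(), size - 1);
        std::memcpy(buffer, it->second.data(), n);
        buffer[n] = '\0';
        return n;
    }

private:
    std::map<std::uint16_t, std::string> table_;
    std::uint16_t next_ = 0xC000;
};
```

The important property for what follows is that the caller of the retrieval function chooses the destination buffer, so whoever controls that call controls where the stored bytes end up.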
Atom bombing works by forcing the target process to load and execute code placed within the global atom table and this relies on one other crucial function,
VOID CALLBACK APCProc(               UINT WINAPI GlobalGetAtomName(
                                         _In_  ATOM   nAtom,
    _In_ ULONG_PTR dwParam     ->        _Out_ LPTSTR lpBuffer,
                                         _In_  int    nSize
);                                   );
However, the underlying implementation of
NTSTATUS NTAPI NtQueueApcThread(                                  UINT WINAPI GlobalGetAtomName(
    _In_     HANDLE           ThreadHandle,       // target process's thread
    _In_     PIO_APC_ROUTINE  ApcRoutine,         // APCProc (GlobalGetAtomName)
    _In_opt_ PVOID            ApcRoutineContext,    ->                _In_  ATOM   nAtom,
    _In_opt_ PIO_STATUS_BLOCK ApcStatusBlock,                         _Out_ LPTSTR lpBuffer,
    _In_opt_ ULONG            ApcReserved                             _In_  int    nSize
);                                                                );
Here is a visual representation of the code injection procedure:
Atom bombing code injection
+--------------------+
| |
+--------------------+
| lpBuffer | <-+
| | |
+--------------------+ |
+---------+ | | | Calls
| Atom | +--------------------+ | GlobalGetAtomName
| Bombing | | Executable | | specifying
| Process | | Image | | arbitrary
+---------+ +--------------------+ | address space
| | | | and loads shellcode
| | | |
| NtQueueApcThread +--------------------+ |
+---------- GlobalGetAtomName ----> | ntdll.dll | --+
+--------------------+
| |
+--------------------+
This is a very simplified overview of atom bombing but should be adequate for the remainder of the paper. For more information on atom bombing, please refer to enSilo’s AtomBombing: Brand New Code Injection for Windows[3].
Section II: UnRunPE
UnRunPE is a proof-of-concept (PoC) tool that was created for the purposes of applying API monitoring theory to practice. It aims to create a chosen executable file as a suspended process into which a DLL will be injected to hook specific functions utilised by the process hollowing technique.
Code Injection Detection
From the code injection primer, the process hollowing method was described with the following WinAPI call chain:
- CreateProcess
- NtUnmapViewOfSection
- VirtualAllocEx
- WriteProcessMemory
- GetThreadContext
- SetThreadContext
- ResumeThread
A few of these calls do not have to be in this specific order, for example,
Following the theory of API monitoring, it is best to hook the lowest common point, but when it comes to malware, it should ideally be the lowest possible point that is accessible. Assuming a worst-case scenario, the author may attempt to skip the higher-level WinAPI functions and directly call the lowest function in the call hierarchy, usually found in the
- NtCreateUserProcess
- NtUnmapViewOfSection
- NtAllocateVirtualMemory
- NtWriteVirtualMemory
- NtGetContextThread
- NtSetContextThread
- NtResumeThread
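Since some of these calls may arrive in a different order, a monitor can track which of them have been seen rather than enforcing a strict sequence. The following sketch is not UnRunPE's actual code: `HollowingTracker` is a hypothetical class, and the choice of which calls must precede NtResumeThread is an assumption made purely for illustration.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical tracker fed by the hooks with the name of each function hit.
class HollowingTracker {
public:
    // Returns true when NtResumeThread arrives after the key hollowing
    // steps have all been observed, in any order.
    bool observe(const std::string& api) {
        assert(!api.empty());
        if (api == "NtResumeThread") {
            return seen_.count("NtUnmapViewOfSection") &&
                   seen_.count("NtWriteVirtualMemory") &&
                   seen_.count("NtSetContextThread");
        }
        seen_.insert(api);
        return false;
    }

private:
    std::set<std::string> seen_;  // unordered record of observed calls
};
```

A set is used instead of a fixed sequence precisely because the order of the intermediate calls is not guaranteed; only the final resume is treated as the point of no return.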
Code Injection Dumping
Once the necessary functions are hooked, the target process is executed and each of the hooked functions’ parameters are logged to keep track of the current progress of the process hollowing and the host process. The most significant hooks are
UnRunPE Demonstration
For the demonstration, I have chosen to use a trojanised binary that I had previously created as an experiment. It consists of the main executable

Section III: Dreadnought
Dreadnought is a PoC tool that was built upon UnRunPE to support a wider variety of code injection detection, namely, those listed in Code Injection Primer. To engineer such an application, a few augmentations are required.
Detecting Code Injection Method
Because there are so many methods of code injection, differentiating each technique was a necessity. The first approach to this was to recognise a “trigger” API call, that is, the API call which would perform the remote execution of the payload. Using this would do two things: identify the completion of and, to an extent, the type of the code injection. The type can be categorised into four groups:
- Section: Code injected as/into a section
- Process: Code injected into a process
- Code: Generic code injection or shellcode
- DLL: Code injected as DLLs

Process Injection Info Graphic[4] by Karsten Hahn
Each trigger API is listed underneath Execute. When any of these APIs is reached, Dreadnought will perform a code dumping method that matches the assumed injection type, in a similar fashion to what occurs with process hollowing in UnRunPE. Relying on this alone is not enough because API calls can still be mixed around to achieve the same functionality, as shown by the branching arrows.
Heuristics
For Dreadnought to determine code injection methods more accurately, a heuristic should be used to assist. In development, a very simplistic heuristic was applied. Following the process injection infographic, every time a hooked API was called, it would increase the weight of one or more of the associated code injection types stored within a map data structure. As it traces each API call, it will start to favour a certain type. Once the trigger API has been entered, it will identify and compare the weights of the relevant types and proceed with an appropriate action.
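A weight map of this kind can be sketched as follows. This is not Dreadnought's actual code: `InjectionHeuristic` and its methods are hypothetical, and the associations between APIs and injection types are supplied by the caller, so any mapping used with it is an assumption rather than something taken from the infographic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// The four categories listed earlier in this section.
enum class InjectionType { Section, Process, Code, Dll };

class InjectionHeuristic {
public:
    // Registers which injection types a given API hints at.
    void associate(const std::string& api, std::vector<InjectionType> types) {
        hints_[api] = std::move(types);
    }

    // Called whenever a hooked API is entered: bumps each associated type.
    void record(const std::string& api) {
        const auto it = hints_.find(api);
        if (it == hints_.end()) return;  // unknown APIs carry no weight
        for (InjectionType t : it->second) ++weights_[t];
    }

    // Current weight of a type; zero if it was never bumped.
    int weightOf(InjectionType t) const {
        const auto it = weights_.find(t);
        return it == weights_.end() ? 0 : it->second;
    }

    // Called at a trigger API: picks the heaviest of the candidate types.
    // Ties are resolved in favour of the earliest candidate.
    InjectionType decide(const std::vector<InjectionType>& candidates) const {
        assert(!candidates.empty());
        InjectionType best = candidates.front();
        for (InjectionType t : candidates) {
            if (weightOf(t) > weightOf(best)) best = t;
        }
        return best;
    }

private:
    std::map<std::string, std::vector<InjectionType>> hints_;
    std::map<InjectionType, int> weights_;
};
```

Restricting the comparison to the candidate types of the trigger API keeps an unrelated but heavily weighted type from being selected when the trigger could not have produced it.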
Dreadnought Demonstration
Process Injection — Process Hollowing

DLL Injection — SetWindowsHookEx

DLL Injection — QueueUserAPC

Code Injection — Atom Bombing



Conclusion
This paper aimed to bring a technical understanding of code injection and its interaction with the WinAPI. Furthermore, the concept of API monitoring in userland was entertained with the malicious use of injection methods utilised by malware to bypass anti-virus detection. The following presents the current status of Dreadnought as of this writing.
Limitations
Dreadnought’s current heuristic and detection design is incredibly poor but was sufficient for theoretical demonstration purposes. Practical use may not be ideal since there is a high possibility of collateral with respect to the hooked API calls during regular operations with the operating system. Because benign behaviour cannot be reliably discerned from malicious behaviour, false positives and negatives may arise as a result.
With regard to Dreadnought and its operations within userland, it may not be ideal for use against sophisticated malware, especially malware which has direct access to the kernel or which has the capability to evade hooks in general.
PoC Repositories
References
- [1] https://www.blackhat.com/presentations/bh-usa-06/BH-US-06-Sotirov.pdf
- [2] https://www.codeproject.com/Articles/7914/MessageBoxTimeout-API
- [3] https://blog.ensilo.com/atombombing-brand-new-code-injection-for-windows
- [4] http://struppigel.blogspot.com.au/2017/07/process-injection-info-graphic.html
- ReactOS
- NTAPI Undocumented Functions
- ntcoder
- GitHub: Process Hacker
- YouTube: MalwareAnalysisForHedgehogs
- YouTube: OALabs