M1ssing Register Access Controls Leak EL0 State
M1RACLES (CVE-2021-30747) is a covert channel vulnerability in the Apple Silicon “M1” chip.
A flaw in the design of the Apple Silicon “M1” chip allows any two applications running under an OS to covertly exchange data between them, without using memory, sockets, files, or any other normal operating system features. This works between processes running as different users and under different privilege levels, creating a covert channel for surreptitious data exchange.
The vulnerability is baked into Apple Silicon chips, and cannot be fixed without a new silicon revision.
Watch video piped in real time through the covert channel!
The ARM system register encoded as s3_5_c15_c10_1 is accessible from EL0, and contains two implemented bits that can be read or written (bits 0 and 1). This is a per-cluster register that can be simultaneously accessed by all cores in a cluster. This makes it a two-bit covert channel that any arbitrary process can use to exchange data with another cooperating process. A demo app to access this register is available here.
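For readers unfamiliar with names like s3_5_c15_c10_1: this is the generic form assemblers accept for system registers that have no architectural name, and its fields are op0, op1, CRn, CRm and op2. Here is a minimal sketch of how such a name maps onto an instruction word, assuming the standard A64 MRS/MSR layout. The helper names `parse_sysreg` and `encode` are made up for illustration and are not part of the demo app.

```python
import re

# Base opcodes for the A64 system-register move instructions, with all
# register-selection fields and Rt set to zero.
MRS_BASE = 0xD5300000  # MRS Xt, <sysreg>  (read)
MSR_BASE = 0xD5100000  # MSR <sysreg>, Xt  (write)


def parse_sysreg(name):
    """Split a generic name like 's3_5_c15_c10_1' into its five fields."""
    m = re.fullmatch(r"s([23])_([0-7])_c(\d+)_c(\d+)_([0-7])", name.lower())
    if m is None:
        raise ValueError(f"not a generic system register name: {name!r}")
    op0, op1, crn, crm, op2 = (int(g) for g in m.groups())
    if crn > 15 or crm > 15:
        raise ValueError("CRn and CRm must be in the range 0..15")
    return op0, op1, crn, crm, op2


def encode(base, name, rt=0):
    """Build the 32-bit instruction word for an MRS/MSR on the named register."""
    op0, op1, crn, crm, op2 = parse_sysreg(name)
    # Only the low bit of op0 is stored separately (op0 is always 2 or 3 here);
    # the high bit is already part of the base opcode.
    return (base | ((op0 & 1) << 19) | (op1 << 16) | (crn << 12)
            | (crm << 8) | (op2 << 5) | (rt & 0x1F))
```

Under these assumptions, `hex(encode(MRS_BASE, "s3_5_c15_c10_1"))` gives `0xd53dfa20`, i.e. a read of the register into x0. Running that instruction only does anything interesting on affected hardware, of course.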
A malicious pair of cooperating processes may build a robust channel out of this two-bit state, by using a clock-and-data protocol (e.g. one side writes 1x to send data, the other side writes 00 to request the next bit). This allows the processes to exchange an arbitrary amount of data, bounded only by CPU overhead. CPU core affinity APIs can be used to ensure that both processes are scheduled on the same CPU core cluster. A PoC demonstrating this approach to achieve high-speed, robust data transfer is available here. This approach, without much optimization, can achieve transfer rates of over 1 MB/s (less with data redundancy).
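To make the clock-and-data handshake concrete, here is a toy simulation of it. This is not the PoC: the register is a plain Python object rather than real hardware, the two "processes" are generators stepped in turn, and all names (`TwoBitRegister`, `sender`, `receiver`, `transfer`) are invented for this sketch. It only assumes the encoding described above, where bit 1 acts as the clock and bit 0 carries the data.

```python
class TwoBitRegister:
    """Stand-in for the shared register: only bits 0 and 1 are implemented."""

    def __init__(self):
        self.value = 0

    def read(self):
        return self.value

    def write(self, value):
        self.value = value & 0b11


def sender(reg, data):
    """Send each bit as 1x, waiting for 00 before writing the next one."""
    for byte in data:
        for shift in range(7, -1, -1):
            while reg.read() != 0b00:   # receiver has not asked for a bit yet
                yield
            reg.write(0b10 | ((byte >> shift) & 1))
            yield


def receiver(reg, count, out):
    """Collect bits whenever the clock bit is set, then write 00 to ask for more."""
    for _ in range(count):
        byte = 0
        for _ in range(8):
            while not reg.read() & 0b10:  # nothing new from the sender yet
                yield
            byte = (byte << 1) | (reg.read() & 1)
            reg.write(0b00)
            yield
        out.append(byte)


def transfer(data):
    """Interleave both sides one step at a time until all bytes have arrived."""
    reg = TwoBitRegister()
    out = bytearray()
    tx = sender(reg, data)
    rx = receiver(reg, len(data), out)
    while len(out) < len(data):
        next(tx, None)  # returns None once the sender has finished
        next(rx, None)
    return bytes(out)
```

Calling `transfer(b"Bad Apple!!")` returns the same bytes on the other side. On real hardware the two loops would run in separate processes and poll the register directly, and you would want some redundancy on top in case either side gets descheduled mid-transfer.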
The original purpose of this register is unknown, but it is not believed to have been made accessible to EL0 intentionally, thus making this a silicon erratum.
Who is affected?
All Apple M1 users, running any operating system on bare metal.
Am I affected?
- macOS users: At least versions 11.0 and onwards are affected.
- Linux users: Versions 5.13 and onwards are affected.
- OpenBSD users: Hi Mark!
- AmigaOS users: Look, Apple bought PASemi but the AmigaOne X1000 CPU doesn’t count as Apple Silicon, sorry.
- Newton OS users: I guess those are technically Apple Silicon but…
- iOS users: See below
Are other Apple CPUs affected?
Maybe, but I don’t have an iPhone or a DTK to test it. Feel free to report back if you try it. The A14 has been confirmed as also affected, which is expected, as it is a close relative of the M1.
Are non-Apple CPUs affected?
Are VMs affected?
No. Correctly implemented hypervisors should disable guest accesses to this register by default, and this feature works correctly on the M1, which mitigates the issue. Both Hypervisor.framework (on macOS) and KVM (on Linux) do this, and are not affected.
How can I protect myself?
The only mitigation available to users is to run your entire OS as a VM.
Does the mitigation have a performance impact?
Yes, running your entire OS as a VM has a performance impact.
That sounds bad.
Well, yeah. Don’t do that, it’d be silly.
Is there really no other way?
Mitigating this under macOS properly would require turning its entire VM hypervisor framework design on its head. We are not aware of any plans by Apple to do so at this time.
It’s a bit easier on Linux, but it still requires fairly intrusive changes due to the design of the M1, and comes at a performance cost for guest VMs. We’re not in a huge rush to do this. Sorry.
How was this bug found?
I was working on figuring out how the M1 CPU works to port Linux to it. Not understanding Apple proprietary features could lead to this sort of vulnerability. I found something, and it turned out to be an Apple proprietary bug, instead of an Apple proprietary feature, that they themselves also weren’t aware of.
How was this bug disclosed?
I e-mailed firstname.lastname@example.org. They acknowledged the vulnerability and assigned it CVE-2021-30747. I published this disclosure 90 days after the initial disclosure to Apple.
Was this responsibly disclosed?
I tried, but I also talked about it on public IRC before I knew it was a bug and not a feature, so I couldn’t do much about that part.
Is the vulnerability fixed in future Apple Silicon chips?
We do not have information on Apple’s plans for silicon mitigations. An educated guess based on silicon design timelines would be that the flaw will likely affect the next generation of Apple Silicon after M1, but might be fixed in the subsequent one.
Can malware use this vulnerability to take over my computer?
Can malware use this vulnerability to steal my private information?
Can malware use this vulnerability to rickroll me?
Yes. I mean, it could also rickroll you without using it.
Can this be exploited from Java apps?
Wait, people still use Java?
Can this be exploited from Flash applets?
Can I catch BadBIOS from this vulnerability?
Wait, is this even real?
So what’s the real danger?
If you already have malware on your computer, that malware can communicate with other malware on your computer in an unexpected way.
Chances are it could communicate in plenty of expected ways anyway.
That doesn’t sound too bad.
Honestly, I would expect advertising companies to try to abuse this kind of thing for cross-app tracking, more than criminals. Apple could catch them if they tried, though, for App Store apps (see below).
Wait. Oh no. Some game developer somewhere is going to try to use this as a synchronization primitive, aren’t they. Please don’t. The world has enough cursed code already. Don’t do it. Stop it. Noooooooooooooooo
What about iOS?
iOS is affected, like all other OSes. There are unique privacy implications to this vulnerability on iOS, as it could be used to bypass some of its stricter privacy protections. For example, keyboard apps are not allowed to access the internet, for privacy reasons. A malicious keyboard app could use this vulnerability to send text that the user types to another malicious app, which could then send it to the internet.
However, since iOS apps distributed through the App Store are not allowed to build code at runtime (JIT), Apple can automatically scan them at submission time and reliably detect any attempts to exploit this vulnerability using static analysis (which they already use). We do not have further information on whether Apple is planning to deploy these checks (or whether they have already done so), but they are aware of the potential issue and it would be reasonable to expect they will. It is even possible that the existing automated analysis already rejects any attempts to use system registers directly.
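As a rough idea of what such a static check could look like, here is a sketch that scans raw AArch64 code for instruction words that read or write the register. It is a guess at the general technique, not a description of Apple's tooling, which we know nothing about. The constants assume the standard A64 MRS/MSR encoding for s3_5_c15_c10_1, and `find_register_accesses` is a hypothetical name.

```python
import struct

# Assumed A64 encoding of an MSR/MRS on s3_5_c15_c10_1, with Rt = 0 and the
# read/write bit (bit 21) cleared.
TARGET = 0xD51DFA20
# Ignore bit 21 (read vs. write) and bits 4:0 (which general register is used).
MASK = 0xFFFFFFFF & ~((1 << 21) | 0x1F)


def find_register_accesses(code):
    """Return byte offsets of little-endian 32-bit words that touch the register."""
    hits = []
    for offset in range(0, len(code) - 3, 4):
        (word,) = struct.unpack_from("<I", code, offset)
        if word & MASK == TARGET:
            hits.append(offset)
    return hits
```

A real scanner would also have to tell code from data and deal with anything that generates instructions at runtime, which is exactly why the no-JIT rule matters here.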
What about APTs?
They have better exploits anyway. They don’t care.
So you’re telling me I shouldn’t worry?
Really, nobody’s going to actually find a nefarious use for this flaw in practical circumstances. Besides, there are already a million side channels you can use for cooperative cross-process communication (e.g. cache stuff), on every system. Covert channels can’t leak data from uncooperative apps or systems.
Actually, that one’s worth repeating: Covert channels are completely useless unless your system is already compromised.
So how is this a vulnerability if you can’t exploit it?
It violates the OS security model. You’re not supposed to be able to send data from one process to another secretly. And even if harmless in this case, you’re not supposed to be able to write to random CPU system registers from userspace either.
It was fairly lucky that the bug can be mitigated in VMs (as the register still responds to VM-related access controls); had this not been the case, the impact would have been more severe.
How did this happen anyway?
Someone in Apple’s silicon design team made a boo-boo. It happens. Engineers are human.
But Bloomberg says China hacked TSMC and put this in?!
Good time to buy TSMC stock then!*
* This site is for informational purposes only and is not intended to be a solicitation, offering or recommendation of any security, commodity, derivative, investment management service or advisory service and is not commodity trading advice. This site does not intend to provide investment, tax or legal advice on either a general basis or specific to any client accounts or portfolios. This website does not represent that the securities, products, or services discussed on this site are suitable or appropriate for any or all investors.
Wait, didn’t you say on Twitter that this could be mitigated really easily?
Yeah, but originally I thought the register was per-core. If it were, then you could just wipe it on context switches. But since it’s per-cluster, sadly, we’re kind of screwed, since you can do cross-core communication without going into the kernel. Other than running in EL1/0 with TGE=0 (i.e. inside a VM guest), there’s no known way to block it.
Can’t the OS just write garbage to the register to break apps using it?
No. It would have to do it so fast that it would peg a CPU core continuously, and you’d still get data through even with such noise. Lowering the signal-to-noise ratio almost never works for covert channels, and this case is particularly futile due to its high bandwidth.
Aren’t bugs like this rare and critical?
No, all CPUs have silly errata like this, you just don’t hear about it most of the time. Some vendors even occasionally hide some of these errata and don’t disclose them properly, because it makes them look bad. I hear some of them rhyme with “doorbell”.
But I’ve only heard about Spectre and Meltdown and…?
Because those are the ones that the discoverers chose to hype up. To be fair, those were kind of bad.
So what’s the point of this website?
Poking fun at how ridiculous infosec clickbait vulnerability reporting has become lately. Just because it has a flashy website or it makes the news doesn’t mean you need to care.
If you’ve read all the way to here, congratulations! You’re one of the rare people who doesn’t just retweet based on the page title 🙂
But how are journalists supposed to know which bugs are bad and which bugs aren’t?
Talk to people. In particular, talk to people other than the people who discovered the bug. The latter may or may not be honest about the real impact.
If you hear the words “covert channel”… it’s probably overhyped. Most of these come from paper mills who are endlessly recycling the same concept with approximately zero practical security impact. The titles are usually clickbait, and sometimes downright deceptive.
I came here from a news site and they didn’t tell me any of this at all!
Then perhaps you should stop reading that news site, just like they stopped reading this site after the first 2 paragraphs.
Are all news sites bad?
Nah, a few actually contacted me before running stories and got the facts and did a good job.
If this bug doesn’t matter, why did you go through all the trouble of putting this site and the demo together?
Honestly, I just wanted to play Bad Apple!! over an M1 vulnerability. You have to admit that’s kind of cool.
Can you go into more details about the possible mitigations and why you can’t just fix this in, like, 5 lines of code?
Sure. ARMv8 was originally designed to support Type 1 hypervisors. That’s like Xen for you x86 people: a small hypervisor (EL2) runs both the “host” OS and the “guest” OSes under it (EL1). Later, it was extended to support Type 2 hypervisors (“Virtualization Host Extensions”). That’s like KVM for you x86 people: the hypervisor is the host OS and both run at EL2.
Mitigating the problem requires running your OS at EL1, where the problem register can be disabled, and then having at least some kind of minimal hypervisor at EL2 to deal with those traps (otherwise running an app that uses the register would just crash your machine instead).
The macOS virtualization framework only supports running as a Type 2 hypervisor. So, to fix this, they’d have to re-design the entire thing to work as a Type 1 hypervisor.
Linux supports both modes, where KVM on ARMv8 can run as a little Type 1 hypervisor built into the OS, or as a Type 2 hypervisor like on x86. Running in Type 1 mode (“non-VHE”) would make mitigating the vulnerability possible. However, in their infinite wisdom, Apple decided to only support Type 2 (VHE) mode on Apple Silicon chips, in violation of the ARM architecture specification which requires Type 1 support (non-VHE). So you can’t actually run Linux in Type 1 mode on Apple Silicon. In fact, we had to patch Linux to work around this violation of the spec, because on every other ARM chip, it’ll always start in non-VHE mode and only switch to VHE mode later.
Nothing actually stops you from making a Type 1 hypervisor work in VHE mode (VHE mode adds features required to run as Type 2, but doesn’t remove anything), so it is possible to do Type 1 virtualization on Apple Silicon and work around this. However, because VHE mode changes the way virtualization works, Type 1 hypervisors meant to work in non-VHE mode won’t work in VHE mode without changes. So Linux would need a bunch of rework of its non-VHE Type 1 code to make it possible to use in VHE mode, where it was never intended to work because the ARM specification requires non-VHE mode to always be available.
Basically, Apple decided to break the ARM spec by removing a mandatory feature, because they figured they’d never need to use that feature for macOS. And then it turned out that removing that feature made it much harder for existing OSes to mitigate this vulnerability. Yay.
If you want to play around with this, you should know that setting the Apple-proprietary register bit HACR_EL2 to 1 will make accesses to the problem register (and a few others) trap to EL2, but only when running in VHE guest mode (with HCR_EL2.TGE = 0). They won’t trap in EL2/EL0 (VHE host) mode.
Who are you, anyway?
Any closing thoughts?
If you want to help me work on porting and upstreaming Linux for Apple Silicon, I have a Patreon and a GitHub Sponsors. I promise most of the time I’m working on Linux, not writing silly vulnerability PoCs! 🙂
Oh yeah, this vulnerability was found using m1n1. It’s cool, you should check it out! I’m also turning it into a minimal hypervisor to investigate macOS’s usage of the M1 hardware… but thanks to this bug being mitigated by VMs, that will also turn m1n1 into a (somewhat) practical mitigation for this bug, without the overhead of a “full” VM. You lose virtualization features in the guest OS, though, as the M1 does not support nested virtualization.