Execute only memory

In February 2016, Dave Hansen sent a patch adding execute-only memory support for Linux userland on amd64, making use of Memory Protection Keys:

Memory Protection Keys provide a mechanism for enforcing page-based protections, but without requiring modification of the page tables when an application changes protection domains.

Rick Edgecombe from Intel added support for this method in the Linux kernel for kernel-land, as well as in KVM and QEMU, and gave a talk about it at the Linux Plumbers Conference 2019: Touch but don’t look: Running the kernel in execute-only memory

Android 10, released in 2019, came with execute-only memory (XOM) for AArch64, but it was disabled in Android 11, since PAN (Privileged Access Never) was shown to be completely broken by Siguza. XOM support was thus removed from the Linux kernel in 2020, but re-enabled in March 2021 in the form of Enhanced Privileged Access Never (EPAN) by Vladimir Murzin from ARM, allowing Privileged Access Never to be used with execute-only mappings.

In 2020, the PlayStation 5, based on FreeBSD 11 on x86, was released. It implemented execute-only memory via a custom hypervisor.

In January 2023, Theo De Raadt sent a call for testing for execute-only on amd64:

[…] the idea here is to have code (text segments) not be readable. Or in a more generic sense, if you mprotect a region with only PROT_EXEC, it is not readable.

This has a number of nice characteristics. It makes BROP techniques not work as well (when accompanied by the effects of many other migitations [sic]), it makes complex gadgets containing side effects harder to use (if the registers involved in the side effect contain pointers to code), etc etc.

It doesn’t make “complex gadgets containing side effects harder to use”: I have yet to see a real-world gadget chain that has to read data from the .text segment for anything other than leaks. What it does is prevent attackers from building ROP gadgets on the fly. Moreover, ASLR and exec-on-fork are already taking care of BROP.

There is also a kernel side to this mitigation: syscalls like write and read will detect if the memory they’re trying to access is either the main text segment, the libc mapping, or ld.so, and abort if that’s the case. De Raadt claimed in his CanSecWest 2023 talk that this kills BROP, but it doesn’t, if only because d206297a49fb3e9af7f0b077338e0a2d62400bdd2e2866ef56c566b3ec45dc04 and de58dae456e48bfb0dc645d5e3f06b634d1af45102d6dd3ade99fe746f0303ed. Not that it matters much anyway, since BROP wasn’t reproduced, and is already mitigated by exec on fork.
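As a rough illustration of that kernel-side check, here is a toy Python model. It is not OpenBSD’s actual code: the range table, the addresses, and the function names are all made up for the example. It simply refuses a request whenever the user buffer overlaps a range registered as execute-only.

```python
# Toy model of the kernel-side check; every name and address is hypothetical.
from typing import List, Tuple

# Hypothetical execute-only ranges as (start, end) pairs, end exclusive.
XONLY_RANGES: List[Tuple[int, int]] = [
    (0x0040_0000, 0x0048_0000),  # main program text
    (0x7F00_0000, 0x7F20_0000),  # libc text
    (0x7F80_0000, 0x7F84_0000),  # ld.so text
]


def overlaps_xonly(addr: int, length: int) -> bool:
    """Return True if [addr, addr + length) touches any execute-only range."""
    if length <= 0:
        return False
    end = addr + length
    return any(addr < r_end and r_start < end for r_start, r_end in XONLY_RANGES)


def sys_write_check(buf: int, length: int) -> str:
    """Model the decision a write(2)-like syscall takes for a user buffer."""
    return "abort" if overlaps_xonly(buf, length) else "ok"
```

In this model, a buffer that ends exactly where the text segment starts is accepted, while one that spills a single byte into it is rejected.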

A PKU memory key is instantiated for all memory which is PROT_EXEC-only, and that key is told to block memory reads. Instruction reads are still permitted. Now some of you may know how PKU works, so you will say that userland can change the behaviour and make the memory readable. That is true. Until a system call happens. Then we force it back to blocking read. Or a memory fault, we force it back. Or an interrupt, even a clock interrupt. We force it back. Generally if an attacker is trying to read code it is because they don’t have a lot of (turing complete or a subset) of flexibility and want more information. Imagine they are able to generate a the [sic] “wrpkru” sequence to disable it, and then do something else? My guess is if they can do two things in a row, then they already have power, and won’t be doing this. So this is a protection method against a lower-level attack construction. The concept is this: If you can bypass this to gain a tiny foothold, you would not have bothered, because you have more power and would be reaching higher.
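To make the quoted mechanism concrete, here is a small Python model of the PKRU register. It only relies on the documented layout (two bits per key, with the Access Disable bit at position 2 × key) and on the fact that protection keys never apply to instruction fetches. The key number and all function names are assumptions for the sketch; this is not how OpenBSD implements it.

```python
# Toy model of PKRU; XONLY_KEY and all function names are hypothetical.
XONLY_KEY = 1  # assumed protection key tagging execute-only pages


def access_disable_bit(key: int) -> int:
    """PKRU holds two bits per key; bit 2*key is Access Disable (AD)."""
    return 1 << (2 * key)


def can_read(pkru: int, key: int) -> bool:
    """Data reads through pages tagged with `key` fault when AD is set."""
    return (pkru & access_disable_bit(key)) == 0


def can_fetch(pkru: int, key: int) -> bool:
    """Protection keys only govern data accesses, never instruction fetches."""
    return True


def wrpkru(new_value: int) -> int:
    """Userland may load any 32-bit value into PKRU."""
    return new_value & 0xFFFF_FFFF


def on_kernel_entry(pkru: int) -> int:
    """Syscall, fault, or interrupt: force the execute-only key back to no-read."""
    return pkru | access_disable_bit(XONLY_KEY)
```

Starting from `on_kernel_entry(0)`, reads through the execute-only key are denied while fetches still work; after `wrpkru(0)` reads are allowed again, but only until the next call to `on_kernel_entry`.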

This isn’t a new idea: it was presented by Mingwei Zhang and Ravi Sahita (Intel Labs) and Daiping Liu (University of Delaware) at Black Hat 2018, and an implementation was provided, successfully running on CentOS and Ubuntu.

Surprisingly, tests have been written for this feature.

Unfortunately, this usage of PKU will collide with V8’s Control Flow Integrity scheme; only one of them can be used at a time.

That blocks the classic “BROP” attack method of trying to write the text segments out a socket for offline gadget study.

BROP is not a “classic attack method”: it’s exotic at best, has never been seen in the wild, and was hardly reproduced in a lab.

Other architectures also have enforced execute-only code in userland:

powerpc64 via Virtual Page Class Key
sparc64 using split software TLB
riscv64 and aarch64 via MMU
x86 via nothing (best effort)
mips via the Read Inhibit bit on recent CPUs
hppa via the GATEWAY instruction

In 2023, in his CanSecWest talk, De Raadt said “On every kernel entry, if the PKRU register has been changed kill the process”, which is way more efficient than simply turning it back on.
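The difference between the two behaviours can be sketched with a toy Python model. The key number, the expected register value, and the function names are assumptions made for the example, not OpenBSD’s implementation: one policy silently resets the register, the other treats any change as an attack.

```python
# Toy comparison of the two kernel-entry policies; all names are hypothetical.
XONLY_KEY = 1                          # assumed key for execute-only pages
EXPECTED_PKRU = 1 << (2 * XONLY_KEY)   # Access Disable bit set for that key


class ProcessKilled(Exception):
    """Stands in for the kernel terminating the offending process."""


def restore_policy(pkru: int) -> int:
    """Original behaviour: whatever userland did, force the value back."""
    return EXPECTED_PKRU


def kill_policy(pkru: int) -> int:
    """Behaviour from the talk: any change to PKRU is fatal."""
    if pkru != EXPECTED_PKRU:
        raise ProcessKilled("PKRU was modified by userland")
    return pkru
```

With `restore_policy`, a process that cleared the register simply carries on with the protection re-armed; with `kill_policy`, the same process does not survive its next kernel entry.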

Amusingly, during the talk, De Raadt pointed out two low-hanging bypasses for execute-only memory:

ELF headers and relocation tables are still readable, providing locations of all the symbols.
Use ptrace.
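To illustrate the first bypass: the ELF header is ordinary readable data, so an attacker with a read primitive can parse it like any other structure. The sketch below decodes the fixed 64-byte little-endian ELF64 header from a byte string; `parse_elf64_header` and the `Elf64Header` tuple are hypothetical names for the example, and how the bytes are obtained is left out.

```python
# Minimal ELF64 header parser; function and type names are hypothetical.
import struct
from typing import NamedTuple

# e_ident[16], e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")  # 64 bytes, little-endian


class Elf64Header(NamedTuple):
    entry: int      # entry point virtual address
    phoff: int      # offset of the program header table
    phentsize: int  # size of one program header
    phnum: int      # number of program headers


def parse_elf64_header(data: bytes) -> Elf64Header:
    """Decode the fields needed to locate the program headers."""
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx,
     ) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles little-endian ELF64")
    return Elf64Header(entry, phoff, phentsize, phnum)
```

From the program headers, the PT_DYNAMIC segment leads to the dynamic symbol and relocation tables, none of which live in execute-only pages.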

This isn’t a silver bullet, but it’s a really nice low-cost defence that makes it harder to dump the .text segment. That said, relocations might be enough to find gadgets, since both compilation flags and source code are known, and some side-channel attacks are powerful enough to read execute-only memory. Anyway, coupled with library order randomization and libc symbol randomization, it makes ROP really annoying.

The “turn the protection back on every time the kernel gets control” approach is interesting, albeit not very useful: an attacker with enough control to execute the instructions that disable this mitigation surely has enough control to read the entire .text segment in one go right afterwards. It’s worth noting that execute-only memory might break some fine-grained CFI schemes, but this isn’t a concern here since OpenBSD doesn’t use one.

It does require some hacks to make debuggers/stacktraces work though.