Execute-only memory
In February 2016, Dave Hansen sent a patch adding execute-only memory support to Linux userland on amd64, making use of Memory Protection Keys:
Memory Protection Keys provide a mechanism for enforcing page-based protections, but without requiring modification of the page tables when an application changes protection domains.
Rick Edgecombe from Intel added support for this method in the Linux kernel for kernel-land, as well as in KVM and QEMU, and gave a talk about it at the Linux Plumbers Conference 2019: Touch but don’t look: Running the kernel in execute-only memory
Android 10, released in 2019, came with execute-only memory (XOM) for AArch64, but it was disabled in Android 11, since Siguza had shown PAN (Privileged Access Never) to be completely broken. Support was thus removed from the Linux kernel in 2020, then re-enabled in March 2021 in the form of Enhanced Privileged Access Never (EPAN) by Vladimir Murzin from ARM, allowing Privileged Access Never to be used with execute-only mappings.
In 2020, the PlayStation 5, based on FreeBSD 11 on x86, was released. It implemented execute-only memory via a custom hypervisor.
In January 2023, Theo de Raadt sent a call for testing for execute-only on amd64:
[…] the idea here is to have code (text segments) not be readable. Or in a more generic sense, if you mprotect a region with only PROT_EXEC, it is not readable.
This has a number of nice characteristics. It makes BROP techniques not work as well (when accompanied by the effects of many other migitations [sic]), it makes complex gadgets containing side effects harder to use (if the registers involved in the side effect contain pointers to code), etc etc.
It doesn’t make “complex gadgets containing side effects harder to use”: I have yet to see a real-world gadget chain that has to read data from the .text segment for anything other than leaks. What it does is prevent an attacker from building ROP gadgets on the fly. Moreover, ASLR and exec-on-fork are already taking care of BROP.
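As a rough illustration of the userland side described in the quote above, here is a minimal C sketch that maps a page and then leaves it with PROT_EXEC only. The helper name map_execute_only_page is made up for this example, and whether a later data read actually faults depends on the hardware and the kernel: on amd64 without PKU support, for instance, such a mapping typically stays readable.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one anonymous page, fill it with 0xc3 bytes
 * (`ret` on x86), then drop every permission but PROT_EXEC.
 * Returns the page, or NULL on failure.
 */
void *map_execute_only_page(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 0);
    size_t len = (size_t)page_size;

    /* The page has to be writable first so that it can be filled. */
    void *page = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;

    memset(page, 0xc3, len);

    /* Only PROT_EXEC: no PROT_READ, no PROT_WRITE. */
    if (mprotect(page, len, PROT_EXEC) != 0) {
        munmap(page, len);
        return NULL;
    }
    return page;
}
```

The call itself succeeds on most systems; what changes with execute-only support is that dereferencing the returned pointer as data is supposed to fault afterwards.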
There is also a kernel side to this mitigation: syscalls like write and read detect whether the memory they are asked to access belongs to the main text segment, the libc mapping, or ld.so, and abort if so. De Raadt claimed in his CanSecWest 2023 talk that this kills BROP, but it doesn’t, if only because
d206297a49fb3e9af7f0b077338e0a2d62400bdd2e2866ef56c566b3ec45dc04
and de58dae456e48bfb0dc645d5e3f06b634d1af45102d6dd3ade99fe746f0303ed
.
Not that it matters much anyway, since BROP wasn’t reproduced, and is already mitigated by exec-on-fork.
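The kernel-side check described above boils down to a range-overlap test on the user buffer. The sketch below is only a model of that idea, not OpenBSD’s actual code: the struct xonly_range type, the check_user_buffer function and the choice of EFAULT as the error are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one execute-only mapping: [start, end). */
struct xonly_range {
    uintptr_t start;
    uintptr_t end;
};

/*
 * Model of the check a syscall such as write(2) could perform before
 * touching a user buffer: return 0 if [addr, addr + len) stays clear of
 * every execute-only range, or EFAULT (an assumed error code) otherwise.
 */
int check_user_buffer(const struct xonly_range *ranges, size_t nranges,
                      uintptr_t addr, size_t len)
{
    assert(ranges != NULL || nranges == 0);
    if (len == 0)
        return 0;                 /* nothing is accessed */

    uintptr_t last = addr + (len - 1);
    if (last < addr)
        return EFAULT;            /* buffer wraps around the address space */

    for (size_t i = 0; i < nranges; i++) {
        /* The buffer overlaps a range when each starts before the other ends. */
        if (addr < ranges[i].end && last >= ranges[i].start)
            return EFAULT;
    }
    return 0;
}
```

With a range covering 0x1000 to 0x2000, a 16-byte buffer at 0x1800 is refused while one at 0x3000 is accepted.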
A PKU memory key is instantiated for all memory which is PROT_EXEC-only, and that key is told to block memory reads. Instruction reads are still permitted. Now some of you may know how PKU works, so you will say that userland can change the behaviour and make the memory readable. That is true. Until a system call happens. Then we force it back to blocking read. Or a memory fault, we force it back. Or an interrupt, even a clock interrupt. We force it back. Generally if an attacker is trying to read code it is because they don’t have a lot of (turing complete or a subset) of flexibility and want more information. Imagine they are able to generate a the [sic] “wrpkru” sequence to disable it, and then do something else? My guess is if they can do two things in a row, then they already have power, and won’t be doing this. So this is a protection method against a lower-level attack construction. The concept is this: If you can bypass this to gain a tiny foothold, you would not have bothered, because you have more power and would be reaching higher.
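To make the “force it back” behaviour concrete, here is a small C model of the PKRU register. On x86, PKRU holds two bits for each of the 16 protection keys: an access-disable bit and a write-disable bit, and neither affects instruction fetches. Everything else in the sketch is an assumption for illustration: the key number used for execute-only memory, the struct thread_model type and the function names are invented, and nothing here executes a real wrpkru instruction.

```c
#include <assert.h>
#include <stdint.h>

/* PKRU layout: bit 2k is access-disable (AD), bit 2k+1 is write-disable (WD). */
#define PKRU_AD(key) (UINT32_C(1) << (2 * (key)))
#define PKRU_WD(key) (UINT32_C(1) << (2 * (key) + 1))

/* Assumed key number for execute-only mappings, chosen for this example. */
#define XONLY_KEY 1u

/* Hypothetical per-thread state: just the PKRU value. */
struct thread_model {
    uint32_t pkru;
};

/* Data reads through a key are allowed only while its AD bit is clear. */
int data_read_allowed(const struct thread_model *t, unsigned key)
{
    assert(key < 16);
    return (t->pkru & PKRU_AD(key)) == 0;
}

/* Models userland running wrpkru: it may load any value it likes. */
void user_wrpkru(struct thread_model *t, uint32_t value)
{
    t->pkru = value;
}

/* Models a syscall, fault or interrupt: the AD bit is forced back on. */
void kernel_entry(struct thread_model *t)
{
    t->pkru |= PKRU_AD(XONLY_KEY);
}
```

In this model, a thread that clears PKRU with user_wrpkru can read through the execute-only key only until the next kernel_entry, which is the window the quote is describing.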
This isn’t a new idea: it was presented by Mingwei Zhang and Ravi Sahita (Intel Labs) and Daiping Liu (University of Delaware) at Black Hat 2018, along with an implementation that ran successfully on CentOS and Ubuntu.
Surprisingly, tests have been written for this feature.
Unfortunately, this usage of PKU will collide with V8’s Control Flow Integrity scheme; only one of them can be used at a time.
That blocks the classic “BROP” attack method of trying to write the text segments out a socket for offline gadget study.
BROP is not a “classic attack method”: it’s exotic at best, has never been seen in the wild, and has barely been reproduced in a lab.
Other architectures also have enforced execute-only code in userland:
- powerpc64 via Virtual Page Class Key
- sparc64 using split software TLB
- riscv64 and aarch64 via MMU
- x86 via nothing (best effort)
- mips via the Read Inhibit bit on recent CPUs
- hppa via the GATEWAY instruction
In 2023, in his CanSecWest talk, De Raadt said “On every kernel entry, if the RPKU register has been changed kill the process”, which is way more efficient than simply turning it back on.
Amusingly, during the talk, De Raadt pointed out some low-hanging bypasses for execute-only memory:
- ELF headers and relocation tables are still readable, providing locations of all the symbols.
- Use ptrace.
This isn’t a silver bullet, but it’s a really nice low-cost defence, making it harder to dump the .text segment; albeit relocations might be enough to find gadgets, since both compilation flags and source code are known, and some side-channel attacks are powerful enough to read execute-only memory. Anyway, coupled with library order randomization and libc symbol randomization, it makes ROP really annoying.
The idea of turning the protection back on every time the kernel gets control is interesting, albeit not very useful: an attacker with enough control to execute the right instructions to disable this mitigation surely has enough control to read the entire .text segment in one go directly after.
It’s worth noting that execute-only might break some fine-grained CFI schemes, but this isn’t a concern here since OpenBSD doesn’t use one. It does require some hacks to make debuggers/stacktraces work though.