Userland heap management
In 1996, OpenBSD imported phkmalloc, written by Poul-Henning Kamp, thanks to Thorsten Lockert. It used sbrk, growing a contiguous region of memory, and managed pages with a simple index and chunks with a bitmap.
In October 2003, Ted Unangst added support for guard pages, based on Bruce Perens’ Electric Fence. It was never enabled by default, since it murders performance and causes significant fragmentation.
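To make the guard page idea concrete, here is a minimal sketch in C, assuming a POSIX system with mmap and mprotect. The helper names guarded_alloc and guarded_free are hypothetical and this is not phkmalloc's or Electric Fence's actual code: it simply maps one extra page after the data, marks it PROT_NONE, and pushes the buffer against it so that the first byte written past the end faults.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round size up to a whole number of pages (hypothetical helper). */
static size_t round_to_pages(size_t size, size_t page)
{
    return (size + page - 1) / page * page;
}

/* Hypothetical helper: returns `size` writable bytes immediately
 * followed by one inaccessible guard page, or NULL on failure. */
void *guarded_alloc(size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    if (size == 0 || size > SIZE_MAX - 2 * page)
        return NULL;

    size_t data_len = round_to_pages(size, page);
    unsigned char *base = mmap(NULL, data_len + page,
                               PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANON, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The last page becomes the guard: any access to it faults. */
    if (mprotect(base + data_len, page, PROT_NONE) != 0) {
        munmap(base, data_len + page);
        return NULL;
    }

    /* Place the buffer flush against the guard page. This gives up
     * alignment guarantees, which a real allocator has to handle. */
    return base + (data_len - size);
}

/* The caller passes the size back; a real allocator would look it up
 * in its own metadata. Returns 0 on success. */
int guarded_free(void *ptr, size_t size)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data_len = round_to_pages(size, page);
    unsigned char *base = (unsigned char *)ptr - (data_len - size);

    assert(((uintptr_t)base % page) == 0);
    return munmap(base, data_len + page);
}
```

Every allocation costs at least two pages and two system calls here, which gives an idea of why this kind of protection hurts both performance and fragmentation.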
With linked-list-based allocators, unlink attacks were quite common. Apparently, Stefan Esser had a private version of safe-unlink around mid-2002, published it on bugtraq in December 2003, and (re?) added it in his Hardened-PHP patch 0.2.7 in April 2004.
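The safe-unlink check itself is tiny. The following is a minimal sketch in C, assuming a circular doubly linked free list; the struct chunk layout and the function names are hypothetical and are not taken from Esser's patch, glibc or Windows. Before unlinking a chunk, it verifies that both neighbours still point back at it, which is exactly what a corrupted fd/bk pair fails to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical free-list node, stored inline in a free chunk. */
struct chunk {
    struct chunk *fd; /* next free chunk */
    struct chunk *bk; /* previous free chunk */
};

/* Make `head` an empty circular list. */
void list_init(struct chunk *head)
{
    head->fd = head;
    head->bk = head;
}

/* Link `c` right after `pos`. */
void list_insert_after(struct chunk *pos, struct chunk *c)
{
    c->fd = pos->fd;
    c->bk = pos;
    pos->fd->bk = c;
    pos->fd = c;
}

/* Unlink `c` only if its neighbours agree that it is in the list.
 * Returns false (instead of aborting, as a real allocator would)
 * when the links look corrupted, and leaves the list untouched. */
bool safe_unlink(struct chunk *c)
{
    assert(c != NULL);
    if (c->fd->bk != c || c->bk->fd != c)
        return false;

    c->fd->bk = c->bk;
    c->bk->fd = c->fd;
    return true;
}
```

Without the check, the two pointer writes at the end turn an overwritten fd/bk pair into a write of an attacker-chosen value at an attacker-chosen address.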
On 1 August 2001, Thierry Deval added mmap-based randomization to malloc.
In August 2004, Windows XP SP2 was released, with safe-unlinking for the Windows heap allocator.
In April 2005, GLIBC 2.3.5 was released, also with safe-unlink.
Otto Moerbeek then wrote a new malloc, known as Otto malloc. Unlike phkmalloc, it uses mmap instead of sbrk, returns pages at random offsets, manages regions and their sizes in a hash table (instead of a linked list), and stores freed regions in a cache with randomized reuse, to make use-after-free (UAF) harder to exploit. All the metadata is kept completely out of band, making it harder to corrupt. It was also possible to mark pages in the cache as PROT_NONE, but this wasn’t enabled by default because of the monstrous performance impact.
It also came with optional support for guard pages, based on Ted Unangst’s
implementation: a zone of configurable length is mapped
PROT_NONE right after the buffer. It would be nice to be able to have guard
pages only between large chunks, as a middle ground between “no guards” and “no
Otto malloc landed in OpenBSD 4.4, in November 2008.
Around 2013, Google published PartitionAlloc as part of the WebKit Template Framework, for use in Google Chrome. It derives from the custom allocator of WebKit’s RenderArena, and builds on the cool idea of partitioning allocations by type, size and lifetime. It was kinda replaced by Oilpan in 2015.
In December 2015, Daniel Micay added a check for the integrity of the junk in freed chunks placed in quarantine, before actually freeing them, to catch use-after-frees. He also implemented canaries at the end of small allocations, to detect linear heap overflows, put behind an option since it slightly increases memory consumption. That implementation had the drawback that some metadata ended up next to the allocated region itself, so Moerbeek rewrote the canary code for small allocations to use out-of-band data only, and also added canaries for large objects.
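As an illustration of the canary idea, here is a minimal sketch in C layered on top of the system malloc. The struct allocation record, the function names and the canary value are all assumptions made for the example, and this is not OpenBSD's code: the requested size is kept in a separate record (out of band with respect to the buffer), the spare bytes after the buffer are filled with a known value, and that value is verified when the buffer is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CANARY_LEN  8
#define CANARY_BYTE 0xdb /* arbitrary value, chosen for this sketch */

/* Hypothetical bookkeeping record, stored away from the buffer. */
struct allocation {
    unsigned char *ptr;
    size_t size; /* size requested by the caller */
};

/* Allocate `size` bytes followed by CANARY_LEN canary bytes. */
bool canary_alloc(struct allocation *a, size_t size)
{
    if (size > SIZE_MAX - CANARY_LEN)
        return false;

    a->ptr = malloc(size + CANARY_LEN);
    if (a->ptr == NULL)
        return false;

    a->size = size;
    memset(a->ptr + size, CANARY_BYTE, CANARY_LEN);
    return true;
}

/* Release the buffer. Returns false if the canary was modified,
 * meaning something wrote past the end of the buffer. A real
 * allocator would abort here instead of returning. */
bool canary_free(struct allocation *a)
{
    bool intact = true;

    assert(a->ptr != NULL);
    for (size_t i = 0; i < CANARY_LEN; i++) {
        if (a->ptr[a->size + i] != CANARY_BYTE)
            intact = false;
    }

    free(a->ptr);
    a->ptr = NULL;
    return intact;
}
```

The overflow is only noticed at free time, so this detects linear overflows after the fact rather than preventing them, and the extra bytes are where the slightly higher memory consumption comes from.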
Amusingly, the canaries behind posix_memalign weren’t correctly initialised until April 2017.
In September 2017, Otto Moerbeek made delayed free mandatory. He also added optional kinda-deterministic double-free checks, with probabilistic checks remaining the default.
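Delayed free and double-free detection fit together naturally, as the following minimal C sketch tries to show. It is an illustration under assumptions, not Otto malloc's implementation: the quarantine size, the delayed_free name and the use of rand() are made up for the example, and a real allocator would use a proper random source such as arc4random. A freed pointer is parked in a random slot, the previous occupant of that slot is really freed, and a pointer that is already parked is reported as a double free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define DELAYED_SLOTS 16 /* must be a power of two; arbitrary here */

/* Quarantine of pointers whose release has been postponed. */
static void *delayed[DELAYED_SLOTS];

/* Queue `ptr` for a later free. Returns false when `ptr` is already
 * in the quarantine, which means it is being freed twice. */
bool delayed_free(void *ptr)
{
    if (ptr == NULL)
        return true;

    /* Deterministic variant: scan every slot. Checking only the
     * randomly chosen slot below would be the cheaper, probabilistic
     * variant. */
    for (size_t i = 0; i < DELAYED_SLOTS; i++) {
        if (delayed[i] == ptr)
            return false;
    }

    /* rand() is a stand-in: it is predictable and not suitable for
     * a real mitigation. */
    size_t slot = (size_t)rand() & (DELAYED_SLOTS - 1);
    void *victim = delayed[slot];

    assert(slot < DELAYED_SLOTS);
    delayed[slot] = ptr;
    free(victim); /* free(NULL) is a no-op when the slot was empty */
    return true;
}
```

Because the evicted slot is random, an attacker cannot easily predict when a freed chunk will actually be handed out again, which is what makes use-after-free harder to exploit.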
Scudo, mainly written by Kostya Kortchinsky, isn’t based on Otto malloc but on LLVM Sanitizer’s CombinedAllocator, and was born in 2016. It makes use of GWP-ASAN, as explained by Dmitry Vyukov during his Linux Plumbers 2019 presentation. It has partitions for its fast path (small allocations), guard pages on the slow one (large allocations), quarantines for delayed frees, and various other cool stuff. It isn’t as secure as OpenBSD’s allocator, but that’s a deliberate trade-off for massive performance gains, since it plans on becoming Android’s default allocator, as well as Google’s. For example, it uses inline metadata with checksumming, to improve data locality.
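The checksummed inline metadata idea can be sketched in a few lines of C. Everything below is an assumption made for illustration: the struct header layout, the function names, the fixed cookie and the mixing function (the 64-bit finalizer from MurmurHash3) are stand-ins, and Scudo's real header and checksum differ. The point is only that the checksum binds the header fields to a per-process secret and to the address of the chunk, so a blind overwrite or a header copied from another chunk fails verification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical inline header placed right before the user data. */
struct header {
    uint32_t size;     /* size of the allocation */
    uint32_t state;    /* e.g. 1 = allocated, 2 = quarantined */
    uint64_t checksum; /* covers size, state, address and cookie */
};

/* Would be drawn from a random source at startup; fixed here so the
 * sketch is reproducible. */
static uint64_t secret_cookie = 0x9e3779b97f4a7c15ULL;

/* Mix the header fields with the cookie and the chunk address. */
static uint64_t header_checksum(const struct header *h, const void *user_ptr)
{
    uint64_t x = secret_cookie ^ (uint64_t)(uintptr_t)user_ptr;

    x ^= ((uint64_t)h->size << 32) | h->state;

    /* 64-bit finalizer from MurmurHash3, used as a generic mixer. */
    x ^= x >> 33;
    x *= 0xff51afd7ed558ccdULL;
    x ^= x >> 33;
    x *= 0xc4ceb9fe1a85ec53ULL;
    x ^= x >> 33;
    return x;
}

/* Store the checksum after the other fields have been filled in. */
void header_seal(struct header *h, const void *user_ptr)
{
    assert(h != NULL);
    h->checksum = header_checksum(h, user_ptr);
}

/* Check the header before trusting it, e.g. on free. */
bool header_verify(const struct header *h, const void *user_ptr)
{
    return h->checksum == header_checksum(h, user_ptr);
}
```

Keeping the header next to the data is what buys the data locality, and the checksum is the price paid for keeping it in reach of a linear overflow.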
GrapheneOS’s hardened_malloc is heavily based on OpenBSD’s Otto malloc, and in August 2019, Shawn Webb announced that he was planning on integrating hardened_malloc in HardenedBSD to replace jemalloc.
Nowadays, all serious allocators (at least the ones from
Google Clang, …)
are implementing security mitigations, like junking on
free and on
randomisation, forced alignment, quarantine/delayed frees, …
but are usually less secure than Otto malloc, since they care more about
performance than security.
Interestingly, Otto malloc isn’t really tailored for environments where an attacker has scripting abilities, like web browsers, which are arguably one of the biggest/softest attack surfaces nowadays: it lacks defenses like WebKit’s Gigacage, or type-based isolation. But since browsers usually come with their own custom allocator, it wouldn’t change much anyway.
Otto malloc is an impressive piece of software, one of the first seriously hardened allocators, and still the most secure one publicly deployed in production.
It’ll be interesting to see what happens in the coming years, with the democratisation of things like memory tagging and GWP-ASAN. Something else that would be nice to have is comments on other allocator mitigations, like Apple’s IsoHeap, Gigacage, branch-less checks via pointer poisoning, … and why Otto malloc isn’t implementing them.