dequis.org

notes

debugging use after free

Summarizing the options available to debug use after free bugs (sometimes also covering double free)

Nothing new here, just organizing my own thoughts

Terms used below:

Junking is filling uninitialized/freed memory with non-zero values
Poisoning is crashing on bad memory reads, by mapping with PROT_NONE or similar
rr is https://rr-project.org/. It complements junking by letting you find the original malloc/free with a watchpoint and reverse-continue
Fast means faster than valgrind, I don't care about performance much more than that
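
To make the difference between junking and poisoning concrete, here is a toy model in Python. It is only a simulation: ToyHeap, JUNK_BYTE and the fake addresses are invented for this sketch and do not correspond to any real allocator.

```python
JUNK_BYTE = 0xDD  # arbitrary non-zero fill value, chosen for this sketch


class ToyHeap:
    """Toy model of a heap: blocks are bytearrays keyed by a fake address."""

    def __init__(self, junk=False, poison=False):
        self.junk = junk
        self.poison = poison
        self.blocks = {}       # fake address -> bytearray
        self.freed = set()     # fake addresses that have been freed
        self.next_addr = 0x1000

    def malloc(self, size):
        addr = self.next_addr
        self.next_addr += size  # never hand out the same fake address twice
        self.blocks[addr] = bytearray(size)
        return addr

    def free(self, addr):
        if self.junk:
            # Junking: overwrite the freed bytes with a recognizable pattern.
            self.blocks[addr][:] = bytes([JUNK_BYTE]) * len(self.blocks[addr])
        self.freed.add(addr)

    def read(self, addr):
        if self.poison and addr in self.freed:
            # Poisoning: the access itself fails, like a fault on PROT_NONE.
            raise MemoryError(f"read of freed block at {addr:#x}")
        return bytes(self.blocks[addr])


def use_after_free(heap):
    p = heap.malloc(4)
    heap.blocks[p][:] = b"data"
    heap.free(p)
    return heap.read(p)  # the bug: p is dangling here
```

With a plain ToyHeap() the stale read silently returns the old contents, with junk=True it returns four JUNK_BYTE bytes, and with poison=True the read itself raises. That is roughly the gap between "the bug goes unnoticed", "the bug produces obviously wrong data" and "the bug crashes at the faulty access".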

Written from the point of view of an x86_64 linux user. Most things work with macOS too.

The world is complicated and there's no clear winner for every situation.

Valgrind memcheck

Category: Emulator

URL: http://valgrind.org/

Pros:

Extremely reliable
No instrumentation
Covers everything that is executed
Does not reuse old addresses for new allocations
Catches many other bugs, and (almost) every bad memory access

Cons:

Slow
Not available on some platforms
False positives on some libraries, usually requiring suppression files.
It's an emulator, and sometimes it runs into unimplemented CPU instructions (but it's rare)
Does not play nicely with rr

tis-interpreter

Category: Source code interpreter

URL: https://github.com/TrustInSoft/tis-interpreter/

Pros:

Even more reliable than valgrind
Covers everything that is executed
Does not reuse old addresses for new allocations
- Because it doesn't have the concept of numeric address
Catches even more bugs than valgrind, even in places where code isn't emitted
Catches every bad memory access, for real this time
- You can't accidentally end up in a valid address if all your addresses are symbols

Cons:

Even slower than valgrind
Source-level interpreter, won't work with binaries
Using libraries (other than libc) is a pain in the ass
Crashes as soon as the first issue is found
Probably too good at finding errors
Unusable for large programs
Does not play nicely with rr

Address sanitizer (asan)

Category: Compile-time instrumentation

URL: https://github.com/google/sanitizers/wiki/AddressSanitizer

Pros:

Very reliable
Poisoning
Does not reuse old addresses for new allocations
Catches many other bugs
Fancy error reporting
Works with rr
Fast (average 2x slowdown)

Cons:

Build-time instrumentation
May require rebuilding libraries if you want to catch bugs in their code too
- But there are interceptors for libc malloc/free/etc, and the poisoned memory applies to everybody
Crashes as soon as the first issue is found
- Might find bugs you don't want to find. Might find yourself sending patches to chromium.
Allocates 20 terabytes of address space. Normally not an issue
Picky about library load ordering / LD_PRELOAD (but there are ways around it)

libasan.so runtime only

Category: memory allocator

URL: http://btorpey.github.io/blog/2014/03/27/using-clangs-address-sanitizer/ and https://github.com/google/sanitizers/wiki/AddressSanitizerAsDso (outdated)

Env vars: LD_PRELOAD=/usr/lib/libasan.so

This is basically just using the libc interceptors and none of the instrumentation. Seems to initialize new memory with 0xBE and freed with 0x01

Pros:

Does not reuse old addresses for new allocations
Can be added to anything with LD_PRELOAD
No instrumentation
Catches double free
Works with rr
Fast

Cons:

No poisoning
Doesn't catch many other bugs that asan normally catches
Allocates 20 terabytes of address space. Normally not an issue
Might break with large programs

libdislocator

Category: memory allocator

URL: https://github.com/mirrorer/afl/blob/master/libdislocator/README.dislocator

Env vars: LD_PRELOAD=./libdislocator.so

This is a poisoning allocator included as part of the AFL fuzzer.

Pros:

Very reliable
Poisoning
Does not reuse old addresses for new allocations
Can be added to anything with LD_PRELOAD
No instrumentation
Catches double free and some overflows
Works with rr

Cons:

Uses 4kb of physical memory and 8kb of virtual memory for even the smallest allocation.
Even if you have a lot of ram, each allocation is one memory mapping and there's a default limit of 65530 of them per process on linux (the vm.max_map_count sysctl).
Unusable for large programs
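
A rough back-of-the-envelope check of those costs, taking the figures above at face value (one page of physical memory, two pages of virtual memory and one mapping per allocation). The function names are made up for this sketch, and the fallback limit of 65530 is an assumption about the kernel default; the real value on your system is in /proc/sys/vm/max_map_count.

```python
PAGE_SIZE = 4096  # bytes; the 4kb page size mentioned above


def read_max_map_count(path="/proc/sys/vm/max_map_count", default=65530):
    """Return the per-process mapping limit, or `default` if it can't be read."""
    try:
        with open(path) as f:
            return int(f.read())
    except (OSError, ValueError):
        return default


def dislocator_cost(live_allocations, max_mappings):
    """Estimate (physical bytes, virtual bytes, fits under the mapping limit)."""
    physical = live_allocations * PAGE_SIZE
    virtual = live_allocations * 2 * PAGE_SIZE
    fits = live_allocations <= max_mappings
    return physical, virtual, fits


if __name__ == "__main__":
    limit = read_max_map_count()
    print(dislocator_cost(100_000, limit))
```

With a limit of 65530, a program holding 100,000 live allocations would not fit even though it only needs about 400 MB of physical memory. The process also needs mappings for its binary, libraries and stacks, so the real ceiling is somewhat lower than this estimate.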

glibc junking

Category: memory allocator

URL: https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html

Env vars: MALLOC_PERTURB_=255 MALLOC_MMAP_PERTURB_=255 MALLOC_CHECK_=3

Pros:

You probably already have it
No instrumentation
Catches double free
Works with rr
Fast

Cons:

No poisoning
Reuses old addresses for new allocations
Unreliable (only increases chances of finding bugs)
MALLOC_PERTURB_ is just a suggestion, doesn't mean every malloc/free will be filled with junk
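
Whether an allocator hands freed addresses back out can be probed empirically. The sketch below calls malloc/free through ctypes and checks whether an address comes back after being freed; reuses_addresses is a name invented here. Results are only indicative, since the Python interpreter itself may allocate between the calls. It never touches freed memory, it only compares pointer values.

```python
import ctypes

# CDLL(None) resolves symbols from the running process, so an allocator
# injected with LD_PRELOAD is the one that gets called here.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def reuses_addresses(size=64, rounds=16):
    """Return True if malloc returns an address that was freed earlier."""
    seen = set()
    for _ in range(rounds):
        p = libc.malloc(size)
        if p is None:  # ctypes maps a NULL pointer to None
            raise MemoryError("malloc failed")
        reused = p in seen
        seen.add(p)
        libc.free(p)
        if reused:
            return True
    return False


if __name__ == "__main__":
    print("reuses freed addresses:", reuses_addresses())
```

Running the script once as is and once with one of the LD_PRELOAD settings from this page gives a quick comparison between the default allocator and the preloaded one.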

jemalloc junking

Category: memory allocator

URL: http://jemalloc.net/jemalloc.3.html

Env vars: LD_PRELOAD=/usr/lib/libjemalloc.so MALLOC_CONF=junk:true,tcache:false

tcache:false allows it to crash on some instances of double free. Not reliably, but better than never.

FreeBSD default.

Pros:

Can be added to anything with LD_PRELOAD
No instrumentation
More reliable than glibc junking (at least it will write every time)
Works with rr
Fast

Cons:

No poisoning
Reuses old addresses for new allocations
Does not reliably catch double free (only some instances, with tcache:false)

omalloc junking

Category: memory allocator

URL: https://man.openbsd.org/malloc.conf.5 and https://github.com/emeryberger/Malloc-Implementations/tree/master/allocators/omalloc

Env vars: LD_PRELOAD=libomalloc.so MALLOC_OPTIONS=CFGJ

OpenBSD default. The second link seems to contain a linux port of an older version (which is what I tested this time)

Pros:

Does not reuse old addresses for new allocations... most of the time.
Partial poisoning through the freeguard (F) option of MALLOC_OPTIONS
- Bunch of other fun options in there too
More reliable than glibc junking (at least it will write every time)
No instrumentation
Catches double free and overflows
Fast

Cons:

Sometimes reuses old addresses.
Freeguard only poisons memory in completely freed pages
- Makes sense for performance/resource usage reasons. You don't want to end up like libdislocator

tcmalloc_debug

Category: memory allocator

URL: http://goog-perftools.sourceforge.net/doc/tcmalloc.html

Env vars: LD_PRELOAD=/usr/lib/libtcmalloc_debug.so

Pros:

Does not reuse old addresses for new allocations
Can be added to anything with LD_PRELOAD
No instrumentation
More reliable than glibc junking (at least it will write every time)
Catches double free
Works with rr
Fast

Cons:

No poisoning

Python debug hook junking

Category: memory allocator

URL: https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pythonmalloc

Not general purpose obviously but I happen to be dealing with python-related issues right now.

Python 3.6 or newer: set PYTHONMALLOC=debug. Otherwise, build python with --with-pydebug

This is about the default allocator, pymalloc, but one of the options above can be used with PYTHONMALLOC=malloc (or, combined with the debug hooks, PYTHONMALLOC=malloc_debug), with the caveat that it's ~5x slower than pymalloc.

Pros:

Catches overflows and a few python-specific checks
No instrumentation (on python 3.6 or newer)
Works with rr
Fast

Cons:

No poisoning
Reuses old addresses for new allocations
Not general purpose, only covers allocations done through python's API
The GIL checks mean your application won't start if it uses openssl
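
The environment settings listed throughout this page can be collected into a small launcher, sketched below. The values are copied from the "Env vars" lines above; the dictionary keys and the function names debug_env and run_with are made up for this sketch, and the library paths will differ between distributions.

```python
import os
import subprocess
import sys

# Environment settings copied from the sections of this page.
DEBUG_ENVS = {
    "libasan": {"LD_PRELOAD": "/usr/lib/libasan.so"},
    "libdislocator": {"LD_PRELOAD": "./libdislocator.so"},
    "glibc": {
        "MALLOC_PERTURB_": "255",
        "MALLOC_MMAP_PERTURB_": "255",
        "MALLOC_CHECK_": "3",
    },
    "jemalloc": {
        "LD_PRELOAD": "/usr/lib/libjemalloc.so",
        "MALLOC_CONF": "junk:true,tcache:false",
    },
    "omalloc": {"LD_PRELOAD": "libomalloc.so", "MALLOC_OPTIONS": "CFGJ"},
    "tcmalloc_debug": {"LD_PRELOAD": "/usr/lib/libtcmalloc_debug.so"},
    "python": {"PYTHONMALLOC": "debug"},
}


def debug_env(tool, base=None):
    """Return a copy of the environment with the settings for `tool` added."""
    env = dict(os.environ if base is None else base)
    env.update(DEBUG_ENVS[tool])
    return env


def run_with(tool, argv):
    """Run argv under the chosen settings and return its exit status."""
    return subprocess.run(argv, env=debug_env(tool)).returncode


if __name__ == "__main__" and len(sys.argv) > 2:
    # Example: python3 launcher.py jemalloc ./myprog arg1 arg2
    sys.exit(run_with(sys.argv[1], sys.argv[2:]))
```

Keeping the settings in one table makes it easy to run the same reproducer under several allocators in a row and compare which ones crash.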
