Summarizing the options available for debugging use-after-free bugs (some of them also cover double free).
Nothing new here, just organizing my own thoughts.
Terms used below:
- Junking is filling uninitialized/freed memory with non-zero values
- Poisoning is crashing on bad memory reads, by mapping with PROT_NONE or similar
- rr is https://rr-project.org/, which complements junking by letting you find the original malloc/free with a watchpoint and reverse-continue
- Fast means faster than valgrind, I don't care about performance much more than that
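To make the poisoning term concrete, here's a minimal sketch in Python (picked only because it's handy; the real allocators do this in C). A child process maps one page with no access rights and reads from it, and the parent reports how the child died. Assumptions: PROT_NONE has the value 0 (the mmap module doesn't export it), and run_poisoned_read is a made-up helper name.

```python
import signal
import subprocess
import sys
import textwrap

# Code run in a throwaway child process, so the crash doesn't take us down.
CHILD = textwrap.dedent("""
    import mmap
    PROT_NONE = 0  # assumption: numeric value of PROT_NONE, not exported by the mmap module
    page = mmap.mmap(-1, mmap.PAGESIZE, prot=PROT_NONE)  # anonymous page, no access rights
    page[0]  # any read of the page should fault
""")


def run_poisoned_read():
    """Run the child and return its exit status (negative means killed by a signal)."""
    proc = subprocess.run([sys.executable, "-c", CHILD])
    return proc.returncode


if __name__ == "__main__":
    rc = run_poisoned_read()
    if rc < 0:
        print("child killed by", signal.Signals(-rc).name)
    else:
        print("child exited with code", rc)
```

On linux I'd expect the child to be killed by SIGSEGV; other platforms may report SIGBUS instead. Junking alone would have let the same read succeed and return garbage.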
Written from the point of view of an x86_64 linux user. Most things work on macOS too.
The world is complicated and there's no clear winner for every situation.
Valgrind memcheck
Category: Emulator
URL: http://valgrind.org/
Pros:
- Extremely reliable
- No instrumentation
- Covers everything that is executed
- Does not reuse old addresses for new allocations
- Catches many other bugs, and (almost) every bad memory access
Cons:
- Slow
- Not available on some platforms
- False positives on some libraries, usually requiring suppression files.
- It's an emulator, and sometimes it runs into unimplemented CPU instructions (but it's rare)
- Does not play nicely with rr
tis-interpreter
Category: Source code interpreter
URL: https://github.com/TrustInSoft/tis-interpreter/
Pros:
- Even more reliable than valgrind
- Covers everything that is executed
- Does not reuse old addresses for new allocations
  - Because it doesn't have the concept of a numeric address
- Catches even more bugs than valgrind, even in places where code isn't emitted
- Catches every bad memory access, for real this time
  - You can't accidentally end up in a valid address if all your addresses are symbols
Cons:
- Even slower than valgrind
- Source-level interpreter, won't work with binaries
- Using libraries (other than libc) is a pain in the ass
- Crashes as soon as the first issue is found
- Probably too good at finding errors
- Unusable for large programs
- Does not play nicely with rr
Address sanitizer (asan)
Category: Compile-time instrumentation
URL: https://github.com/google/sanitizers/wiki/AddressSanitizer
Pros:
- Very reliable
- Poisoning
- Does not reuse old addresses for new allocations
- Catches many other bugs
- Fancy error reporting
- Works with rr
- Fast (average 2x slowdown)
Cons:
- Build-time instrumentation
- May require rebuilding libraries if you want to catch bugs in their code too
  - But there are interceptors for libc malloc/free/etc, and the poisoned memory applies to everybody
- Crashes as soon as the first issue is found
- Might find bugs you don't want to find. Might find yourself sending patches to chromium.
- Allocates 20 terabytes of address space. Normally not an issue
- Picky about library load ordering / LD_PRELOAD (but there are ways around it)
libasan.so runtime only
Category: memory allocator
URL: http://btorpey.github.io/blog/2014/03/27/using-clangs-address-sanitizer/ and https://github.com/google/sanitizers/wiki/AddressSanitizerAsDso (outdated)
Env vars: LD_PRELOAD=/usr/lib/libasan.so
This is basically just using the libc interceptors and none of the instrumentation. Seems to initialize new memory with 0xBE and freed memory with 0x01
Pros:
- Does not reuse old addresses for new allocations
- Can be added to anything with LD_PRELOAD
- No instrumentation
- Catches double free
- Works with rr
- Fast
Cons:
- No poisoning
- Doesn't catch many other bugs that asan normally catches
- Allocates 20 terabytes of address space. Normally not an issue
- Might break with large programs
libdislocator
Category: memory allocator
URL: https://github.com/mirrorer/afl/blob/master/libdislocator/README.dislocator
Env vars: LD_PRELOAD=./libdislocator.so
This is a poisoning allocator included as part of the AFL fuzzer.
Pros:
- Very reliable
- Poisoning
- Does not reuse old addresses for new allocations
- Can be added to anything with LD_PRELOAD
- No instrumentation
- Catches double free and some overflows
- Works with rr
Cons:
- Uses 4 KB of physical memory and 8 KB of virtual memory for even the smallest allocation.
- Even if you have a lot of RAM, each allocation is one memory mapping and there's a default limit of 65530 of them per process on linux (vm.max_map_count).
- Unusable for large programs
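A quick back-of-the-envelope sketch of where that ceiling sits, in Python. It takes the premises above at face value (one mapping, 4 KB physical and 8 KB virtual per allocation) and ignores the mappings the program already uses for its binary, libraries and stacks, so treat the result as an upper bound. The function names are made up for illustration, and the fallback limit is an assumption for systems where the /proc file can't be read.

```python
PAGE = 4096  # 4 KB pages, as on x86_64 linux


def read_max_map_count(path="/proc/sys/vm/max_map_count", fallback=65530):
    """Return the per-process mapping limit, or the fallback if it can't be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return fallback


def dislocator_budget(max_mappings, physical_per_alloc=PAGE, virtual_per_alloc=2 * PAGE):
    """Upper bound on live allocations and memory use, assuming one mapping per allocation."""
    return {
        "max_live_allocations": max_mappings,
        "physical_bytes": max_mappings * physical_per_alloc,
        "virtual_bytes": max_mappings * virtual_per_alloc,
    }


if __name__ == "__main__":
    budget = dislocator_budget(read_max_map_count())
    for key, value in budget.items():
        print(key, value)
```

With a limit of 65530 this works out to at most 65530 live allocations, about 256 MiB of physical memory and about 512 MiB of virtual memory. The mapping limit is hit long before RAM becomes the problem.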
glibc junking
Category: memory allocator
URL: https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html
Env vars: MALLOC_PERTURB_=255 MALLOC_MMAP_PERTURB_=255 MALLOC_CHECK_=3
Pros:
- You probably already have it
- No instrumentation
- Catches double free
- Works with rr
- Fast
Cons:
- No poisoning
- Reuses old addresses for new allocations
- Unreliable (only increases chances of finding bugs)
  - MALLOC_PERTURB_ is just a suggestion, it doesn't mean every malloc/free will be filled with junk
jemalloc junking
Category: memory allocator
URL: http://jemalloc.net/jemalloc.3.html
Env vars: LD_PRELOAD=/usr/lib/libjemalloc.so MALLOC_CONF=junk:true,tcache:false
tcache:false allows it to crash on some instances of double free. Not reliably, but better than never.
FreeBSD default.
Pros:
- Can be added to anything with LD_PRELOAD
- No instrumentation
- More reliable than glibc junking (at least it will write every time)
- Works with rr
- Fast
Cons:
- No poisoning
- Reuses old addresses for new allocations
- Does not reliably catch double free (only some instances, with tcache:false)
omalloc junking
Category: memory allocator
URL: https://man.openbsd.org/malloc.conf.5 and https://github.com/emeryberger/Malloc-Implementations/tree/master/allocators/omalloc
Env vars: LD_PRELOAD=libomalloc.so MALLOC_OPTIONS=CFGJ
OpenBSD default. The second link seems to contain a linux port of an older version (which is what I tested this time)
Pros:
- Does not reuse old addresses for new allocations... most of the time.
- Partial poisoning through the freeguard (F) option of MALLOC_OPTIONS
- Bunch of other fun options in there too
- More reliable than glibc junking (at least it will write every time)
- No instrumentation
- Catches double free and overflows
- Fast
Cons:
- Sometimes reuses old addresses.
- Freeguard only poisons memory in completely freed pages
  - Makes sense for performance/resource usage reasons. You don't want to end up like libdislocator
tcmalloc_debug
Category: memory allocator
URL: http://goog-perftools.sourceforge.net/doc/tcmalloc.html
Env vars: LD_PRELOAD=/usr/lib/libtcmalloc_debug.so
Pros:
- Does not reuse old addresses for new allocations
- Can be added to anything with LD_PRELOAD
- No instrumentation
- More reliable than glibc junking (at least it will write every time)
- Catches double free
- Works with rr
- Fast
Cons:
- No poisoning
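Since all the allocator options so far boil down to a handful of env vars, a small launcher saves retyping them. This is only a sketch: PRESETS, debug_env and run_with are names I made up, the values are copied from the "Env vars" lines in the sections above, and the library paths will differ between distributions.

```python
import os
import subprocess

# Env var presets copied from the "Env vars" lines of each section.
# Paths are the ones quoted there; adjust them for your system.
PRESETS = {
    "asan-runtime": {"LD_PRELOAD": "/usr/lib/libasan.so"},
    "libdislocator": {"LD_PRELOAD": "./libdislocator.so"},
    "glibc": {
        "MALLOC_PERTURB_": "255",
        "MALLOC_MMAP_PERTURB_": "255",
        "MALLOC_CHECK_": "3",
    },
    "jemalloc": {
        "LD_PRELOAD": "/usr/lib/libjemalloc.so",
        "MALLOC_CONF": "junk:true,tcache:false",
    },
    "omalloc": {"LD_PRELOAD": "libomalloc.so", "MALLOC_OPTIONS": "CFGJ"},
    "tcmalloc_debug": {"LD_PRELOAD": "/usr/lib/libtcmalloc_debug.so"},
}


def debug_env(preset, base=None):
    """Return a copy of the base environment with the preset's variables applied."""
    env = dict(os.environ if base is None else base)
    env.update(PRESETS[preset])
    return env


def run_with(preset, argv):
    """Run argv under the chosen preset and return its exit status."""
    return subprocess.run(argv, env=debug_env(preset)).returncode
```

Usage would look like run_with("jemalloc", ["./myprog", "--some-flag"]), where ./myprog stands in for whatever you're debugging. Prefix the command with rr record if you want to replay the crash later.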
Python debug hook junking
Category: memory allocator
URL: https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pythonmalloc
Not general purpose obviously but I happen to be dealing with python-related issues right now.
Python 3.6 or newer: PYTHONMALLOC=debug, otherwise build --with-pydebug.
This is about the default allocator, pymalloc, but one of the options above can be used with PYTHONMALLOC=malloc (or combined with the hooks, PYTHONMALLOC=malloc_debug), with the caveat that it's ~5x slower than pymalloc.
Pros:
- Catches overflows and a few python-specific checks
- No instrumentation (on python 3.6 or newer)
- Works with rr
- Fast
Cons:
- No poisoning
- Reuses old addresses for new allocations
- Not general purpose, only covers allocations done through python's API
- The GIL checks mean your application won't start if it uses openssl
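A sketch of how the PYTHONMALLOC choices described in this section could be picked from a script (Python 3.6 or newer). python_malloc_env and run_python are hypothetical helper names, and the preload argument is just the LD_PRELOAD path of whichever allocator from the earlier sections you want to combine it with.

```python
import os
import subprocess
import sys


def python_malloc_env(use_system_malloc=False, hooks=True, preload=None, base=None):
    """Build an environment selecting a PYTHONMALLOC mode.

    use_system_malloc=False keeps the default allocator, True switches to malloc().
    hooks=True adds the debug hooks on top of the chosen allocator.
    preload optionally sets LD_PRELOAD, e.g. to a junking allocator library.
    """
    if use_system_malloc:
        mode = "malloc_debug" if hooks else "malloc"
    else:
        mode = "debug" if hooks else "pymalloc"
    env = dict(os.environ if base is None else base)
    env["PYTHONMALLOC"] = mode
    if preload is not None:
        env["LD_PRELOAD"] = preload
    return env


def run_python(code, **env_options):
    """Run a snippet in a child interpreter under the selected allocator mode."""
    env = python_malloc_env(**env_options)
    return subprocess.run(
        [sys.executable, "-c", code], env=env, capture_output=True, text=True
    )


if __name__ == "__main__":
    proc = run_python("import os; print(os.environ['PYTHONMALLOC'])")
    print(proc.stdout.strip())
```

For example, python_malloc_env(use_system_malloc=True, preload="/usr/lib/libjemalloc.so") would give PYTHONMALLOC=malloc_debug with jemalloc underneath, using the path from the jemalloc section.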