Summarizing the options available for debugging use-after-free bugs (some of them also cover double free)

Nothing new here, just organizing my own thoughts

Terms used below:

  • Junking is filling uninitialized/freed memory with non-zero values
  • Poisoning is crashing on bad memory reads, by mapping with PROT_NONE or similar
  • rr is https://rr-project.org/, a record-and-replay debugger. It complements junking: put a watchpoint on the junked memory and reverse-continue to find the original malloc/free
  • Fast means faster than valgrind. I don't care about performance much beyond that
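
To make "junking" concrete, here is a minimal sketch of the idea as a wrapper around malloc/free. It is not how any of the allocators below implement it. The names junk_malloc, junk_free, junk_fill and the two fill bytes are made up for illustration, and the caller has to pass the size to junk_free because the sketch keeps no metadata of its own.

```c
#include <stdlib.h>

/* Hypothetical fill bytes, chosen only for this sketch. */
#define JUNK_ALLOC 0xAA
#define JUNK_FREE  0x55

/* Write through a volatile pointer so the compiler cannot discard the
 * fill as a dead store right before free(). */
void junk_fill(void *p, unsigned char byte, size_t n)
{
    volatile unsigned char *b = p;
    for (size_t i = 0; i < n; i++)
        b[i] = byte;
}

/* malloc() that hands out memory pre-filled with a non-zero pattern,
 * so reads of uninitialized memory do not see convenient zeroes. */
void *junk_malloc(size_t n)
{
    void *p = malloc(n);
    if (p != NULL)
        junk_fill(p, JUNK_ALLOC, n);
    return p;
}

/* free() that overwrites the block first, so a stale pointer sees junk
 * instead of the old, still plausible contents. */
void junk_free(void *p, size_t n)
{
    if (p == NULL)
        return;
    junk_fill(p, JUNK_FREE, n);
    free(p);
}
```

Note that junking only makes the bug more likely to misbehave visibly. The stale read still succeeds, and the underlying allocator may overwrite part of the pattern with its own bookkeeping, which is why a watchpoint under rr is a useful follow-up.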

Written from the point of view of an x86_64 Linux user. Most things work on macOS too.

The world is complicated and there's no clear winner for every situation.

Valgrind memcheck

Category: Emulator

URL: http://valgrind.org/

Pros:

  • Extremely reliable
  • No build-time instrumentation (runs unmodified binaries)
  • Covers everything that is executed
  • Delays reusing old addresses for new allocations (freed blocks wait in a queue, see --freelist-vol)
  • Catches many other bugs, and (almost) every bad memory access

Cons:

  • Slow
  • Not available on some platforms
  • False positives on some libraries, usually requiring suppression files.
  • It's an emulator, and sometimes it runs into unimplemented CPU instructions (but it's rare)
  • Does not play nicely with rr

tis-interpreter

Category: Source code interpreter

URL: https://github.com/TrustInSoft/tis-interpreter/

Pros:

  • Even more reliable than valgrind
  • Covers everything that is executed
  • Does not reuse old addresses for new allocations
    • Because it doesn't have the concept of numeric address
  • Catches even more bugs than valgrind, even in places where code isn't emitted
  • Catches every bad memory access, for real this time
    • You can't accidentally end up in a valid address if all your addresses are symbols

Cons:

  • Even slower than valgrind
  • Source-level interpreter, won't work with binaries
  • Using libraries (other than libc) is a pain in the ass
  • Crashes as soon as the first issue is found
  • Probably too good at finding errors
  • Unusable for large programs
  • Does not play nicely with rr

AddressSanitizer (ASan)

Category: Compile-time instrumentation

URL: https://github.com/google/sanitizers/wiki/AddressSanitizer

Pros:

  • Very reliable
  • Poisoning
  • Delays reusing old addresses for new allocations (freed memory sits in a quarantine first)
  • Catches many other bugs
  • Fancy error reporting
  • Works with rr
  • Fast (average 2x slowdown)

Cons:

  • Build-time instrumentation
  • May require rebuilding libraries if you want to catch bugs in their code too
    • But there are interceptors for libc malloc/free/etc, and the poisoned memory applies to everybody
  • Crashes as soon as the first issue is found
    • Might find bugs you don't want to find. Might find yourself sending patches to chromium.
  • Allocates 20 terabytes of address space. Normally not an issue
  • Picky about library load ordering / LD_PRELOAD (but there are ways around it)

libasan.so runtime only

Category: memory allocator

URL: http://btorpey.github.io/blog/2014/03/27/using-clangs-address-sanitizer/ and https://github.com/google/sanitizers/wiki/AddressSanitizerAsDso (outdated)

Env vars: LD_PRELOAD=/usr/lib/libasan.so

This is basically just using the libc interceptors without any of the compile-time instrumentation. It seems to fill new memory with 0xBE and freed memory with 0x01.

Pros:

  • Delays reusing old addresses for new allocations (same quarantine as regular asan)
  • Can be added to anything with LD_PRELOAD
  • No instrumentation
  • Catches double free
  • Works with rr
  • Fast

Cons:

  • No poisoning
  • Doesn't catch many other bugs that asan normally catches
  • Allocates 20 terabytes of address space. Normally not an issue
  • Might break with large programs

libdislocator

Category: memory allocator

URL: https://github.com/mirrorer/afl/blob/master/libdislocator/README.dislocator

Env vars: LD_PRELOAD=./libdislocator.so

This is a poisoning allocator included as part of the AFL fuzzer.

Pros:

  • Very reliable
  • Poisoning
  • Does not reuse old addresses for new allocations
  • Can be added to anything with LD_PRELOAD
  • No instrumentation
  • Catches double free and some overflows
  • Works with rr

Cons:

  • Uses 4 KB of physical memory and 8 KB of virtual memory for even the smallest allocation.
  • Even if you have a lot of RAM, each allocation is one memory mapping, and Linux limits those to 65530 per process by default (vm.max_map_count).
  • Unusable for large programs
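
The cost above follows from the page-per-allocation design. The following is a rough sketch of that design, not libdislocator's actual code (which also adds canaries and other checks). The names guard_alloc, guard_free and read_faults are hypothetical, guard_free takes the size because the sketch stores no metadata, and the returned pointer is deliberately left unaligned for sizes that are not a multiple of the platform alignment.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and sigaction on glibc */
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Size of the data region: n rounded up to whole pages, at least one. */
static size_t data_bytes(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t pages = (n + page - 1) / page;
    return (pages ? pages : 1) * page;
}

/* One mapping per allocation: data pages followed by a PROT_NONE page. */
void *guard_alloc(size_t n)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t data = data_bytes(n);
    unsigned char *base = mmap(NULL, data + page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    /* Poison the trailing page so an overflow faults immediately. */
    if (mprotect(base + data, page, PROT_NONE) != 0) {
        munmap(base, data + page);
        return NULL;
    }
    /* Place the buffer so that it ends right at the guard page. */
    return base + data - n;
}

/* Poison instead of unmapping: the address range is never handed out
 * again, and any later access through a stale pointer faults. */
void guard_free(void *p, size_t n)
{
    if (p == NULL)
        return;
    size_t data = data_bytes(n);
    unsigned char *base = (unsigned char *)p + n - data;
    mprotect(base, data, PROT_NONE);
}

static sigjmp_buf probe_env;

static void probe_handler(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Demo helper: returns 1 if reading one byte at p faults, else 0. */
int read_faults(const void *p)
{
    struct sigaction sa, old_segv, old_bus;
    int faulted;

    sa.sa_handler = probe_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGSEGV, &sa, &old_segv);
    sigaction(SIGBUS, &sa, &old_bus);
    if (sigsetjmp(probe_env, 1) == 0) {
        (void)*(const volatile unsigned char *)p;
        faulted = 0;
    } else {
        faulted = 1;
    }
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    return faulted;
}
```

With a 24-byte allocation, read_faults(p + 24) hits the guard page, and after guard_free(p, 24) even read_faults(p) faults. Every allocation costs at least two pages of address space and one mapping that is never returned, which is where the limits listed above come from.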

glibc junking

Category: memory allocator

URL: https://www.gnu.org/software/libc/manual/html_node/Memory-Allocation-Tunables.html

Env vars: MALLOC_PERTURB_=255 MALLOC_MMAP_PERTURB_=255 MALLOC_CHECK_=3

Pros:

  • You probably already have it
  • No instrumentation
  • Catches double free
  • Works with rr
  • Fast

Cons:

  • No poisoning
  • Reuses old addresses for new allocations
  • Unreliable (only increases chances of finding bugs)
  • MALLOC_PERTURB_ is just a suggestion, doesn't mean every malloc/free will be filled with junk

jemalloc junking

Category: memory allocator

URL: http://jemalloc.net/jemalloc.3.html

Env vars: LD_PRELOAD=/usr/lib/libjemalloc.so MALLOC_CONF=junk:true,tcache:false

tcache:false allows it to crash on some instances of double free. Not reliably, but better than never.

FreeBSD default.

Pros:

  • Can be added to anything with LD_PRELOAD
  • No instrumentation
  • More reliable than glibc junking (at least it will write every time)
  • Works with rr
  • Fast

Cons:

  • No poisoning
  • Reuses old addresses for new allocations
  • Does not reliably catch double free (tcache:false only helps in some cases)

omalloc junking

Category: memory allocator

URL: https://man.openbsd.org/malloc.conf.5 and https://github.com/emeryberger/Malloc-Implementations/tree/master/allocators/omalloc

Env vars: LD_PRELOAD=libomalloc.so MALLOC_OPTIONS=CFGJ

OpenBSD default. The second link seems to contain a Linux port of an older version (which is what I tested this time).

Pros:

  • Does not reuse old addresses for new allocations... most of the time.
  • Partial poisoning through the freeguard (F) option of MALLOC_OPTIONS
    • Bunch of other fun options in there too
  • More reliable than glibc junking (at least it will write every time)
  • No instrumentation
  • Catches double free and overflows
  • Fast

Cons:

  • Sometimes reuses old addresses.
  • Freeguard only poisons memory in completely freed pages
    • Makes sense for performance/resource usage reasons. You don't want to end up like libdislocator

tcmalloc_debug

Category: memory allocator

URL: http://goog-perftools.sourceforge.net/doc/tcmalloc.html

Env vars: LD_PRELOAD=/usr/lib/libtcmalloc_debug.so

Pros:

  • Does not reuse old addresses for new allocations
  • Can be added to anything with LD_PRELOAD
  • No instrumentation
  • More reliable than glibc junking (at least it will write every time)
  • Catches double free
  • Works with rr
  • Fast

Cons:

  • No poisoning

Python debug hook junking

Category: memory allocator

URL: https://docs.python.org/3/whatsnew/3.6.html#whatsnew36-pythonmalloc

Not general purpose, obviously, but I happen to be dealing with Python-related issues right now.

Python 3.6 or newer: PYTHONMALLOC=debug. Older versions: build Python with --with-pydebug.

This is about the default allocator, pymalloc, but one of the options above can be used with PYTHONMALLOC=malloc (or, combined with the debug hooks, PYTHONMALLOC=malloc_debug), with the caveat that it's ~5x slower than pymalloc.

Pros:

  • Catches overflows and runs a few Python-specific checks
  • No instrumentation (on Python 3.6 or newer)
  • Works with rr
  • Fast

Cons:

  • No poisoning
  • Reuses old addresses for new allocations
  • Not general purpose, only covers allocations done through Python's API
  • The GIL checks mean your application won't start if it uses OpenSSL

Options if you use Windows