Lesson 11:
Memory Management
Gist
- A silly tweet by Steve Blackburn, who is one of the world’s preeminent GC researchers.
- The calamitous failure of manual memory management.
- Some terminology:
- collector: Short for garbage collector. The runtime system that calls
free
for you.
- mutator: The actual program you’re trying to run, whose garbage is being collected.
- live: Will be accessed again in the future.
- dead: Otherwise.
- Approaches to garbage collection.
- Reference counting vs. mark/sweep a.k.a. tracing (sometimes just called “garbage collection” unto itself, but that’s kind of confusing).
- What metadata do they maintain, when do they run, what do they do, and what do they do when they run?
- Reference cycles.
- Refinements under the mark/sweep umbrella.
- Conservative GC.
- Its applicability to memory-unsafe languages like C/C++, where the compiler does not necessarily know which values are pointers.
- A cool blog post about Riptide, the GC in Apple’s JavaScript compiler, that mentions that it uses a conservative GC. In fact, counterintuitively, lots of implementations of safe languages use conservative GCs—try to think of reasons why this is the case!
- The wasted memory seems bad, but “Bounding Space Usage of Conservative Garbage Collectors” by Hans-J. Boehm shows how data structures often don’t have such a big problem.
- A thread on the Go mailing list demonstrates why conservative GC often works much better on 64-bit than 32-bit architectures: because valid pointers are far more sparse in the space of all integers.
- Parallel collectors use multithreaded tracing algorithms. (Useful but not that interesting.)
- Some aspects of “when” the GC runs:
- Concurrent collection: the mutator runs in parallel with the collector.
- Incremental collection.
- Moving/copying/compacting.
- Semispace.
- Generational.
- The “generational hypothesis”: pithily, that most objects die young. More usefully, when a given GC run occurs, recently-allocated objects are more likely to be unreachable than objects that have already lasted a long time.
Tasks
- Implement a garbage collector for the Bril interpreter and the Bril memory extension, eliminating the need for the
free
instruction. Stick with something simple, like reference counting or a semispace collector.
- Check that it works by running the benchmarks that use memory after deleting all their
free
instructions.