CS 6120: Advanced Compilers: The Self-Guided Online Course
CS 6120 is a PhD-level Cornell CS course by Adrian Sampson on programming language implementation. It covers universal compiler topics like intermediate representations, data flow, and “classic” optimizations as well as more research-flavored topics such as parallelization, just-in-time compilation, and garbage collection. The work consists of reading papers and completing open-source hacking tasks, which use LLVM and an educational IR invented just for this class.
This page lists the curriculum for following this course at the university of your imagination, for four imagination credits (ungraded). There’s a linear timeline of lessons interspersed with papers to read. Each lesson has videos and written notes, and some have implementation tasks for you to complete. Tasks are all open-ended, to one degree or another, and are meant to solidify your understanding of the abstract concepts by turning them into real code. The order represents a suggested interleaving of video-watching and paper-reading.
Some differences from the “real” CS 6120 are that you can ignore the task deadlines and you can’t participate in our discussion threads on Zulip. Real 6120 also has an end-of-semester course project; in the self-guided version, your end-of-semester assignment is to change the world through the magic of compilers.
The instructor is a video production neophyte, so please excuse the production values, especially in the early lessons. CS 6120 is open source and on GitHub, so please file bugs if you find problems.
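To give a flavor of what "turning abstract concepts into real code" can look like, here is a minimal sketch of one "classic" optimization, trivial dead code elimination. The instruction format below (dicts with `op`, `dest`, and `args` keys) and the function name `trivial_dce` are hypothetical and chosen only for illustration; they are not the course's IR or its reference solution. The sketch also assumes that every instruction with a `dest` is free of side effects.

```python
# Hypothetical toy instruction format, for illustration only:
# each instruction is a dict with an "op", an optional "dest" (the
# variable it writes), and an optional "args" list (the variables it reads).

def trivial_dce(instrs):
    """Repeatedly drop instructions whose destination is never read.

    Assumes any instruction that has a "dest" is side-effect free, so
    removing it cannot change the program's observable behavior.
    """
    while True:
        # Collect every variable that some instruction reads.
        used = {arg for instr in instrs for arg in instr.get("args", [])}
        # Keep effectful instructions (no "dest") and ones whose result is read.
        kept = [i for i in instrs if "dest" not in i or i["dest"] in used]
        if len(kept) == len(instrs):
            return kept  # Nothing was removed, so we have converged.
        instrs = kept


program = [
    {"op": "const", "dest": "a", "value": 4},
    {"op": "const", "dest": "b", "value": 2},
    {"op": "const", "dest": "c", "value": 1},
    {"op": "add", "dest": "d", "args": ["a", "b"]},
    {"op": "add", "dest": "e", "args": ["c", "d"]},  # e is never read
    {"op": "print", "args": ["d"]},
]

optimized = trivial_dce(program)
for instr in optimized:
    print(instr)
```

The loop runs to convergence because deleting one dead instruction can expose another: once the write to `e` is gone, nothing reads `c`, so its definition is removed on the next pass.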
-
Producing Wrong Data Without Doing Anything Obviously Wrong!
Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney. ASPLOS 2009.
-
SIGPLAN Empirical Evaluation Guidelines
Lesson 2: Representing Programs
Lesson 3: Local Analysis & Optimization
Lesson 4: Data Flow
Lesson 5: Global Analysis & SSA
-
Efficient Path Profiling
Thomas Ball and James R. Larus. MICRO 1996.
Lesson 6: LLVM
-
Provably Correct Peephole Optimizations with Alive
Nuno P. Lopes, David Menendez, Santosh Nagarakatte, and John Regehr. PLDI 2015.
Lesson 7: Loop Optimization
Lesson 8: Interprocedural Analysis
Lesson 9: Alias Analysis
-
Type-Based Alias Analysis
Amer Diwan, Kathryn S. McKinley, and J. Eliot B. Moss. PLDI 1998.
Lesson 10: Memory Management
-
A Unified Theory of Garbage Collection
David F. Bacon, Perry Cheng, and V. T. Rajan. OOPSLA 2004.
-
Fast Conservative Garbage Collection
Rifat Shahriyar, Stephen M. Blackburn, and Kathryn S. McKinley. OOPSLA 2014.
Lesson 11: Dynamic Compilers
-
An Efficient Implementation of SELF, a Dynamically-Typed Object-Oriented Language Based on Prototypes
C. Chambers, D. Ungar, and E. Lee. OOPSLA 1989.
-
Trace-Based Just-in-Time Type Specialization for Dynamic Languages
Andreas Gal, Brendan Eich, Mike Shaver, David Anderson, David Mandelin, Mohammad R. Haghighat, Blake Kaplan, Graydon Hoare, Boris Zbarsky, Jason Orendorff, Jesse Ruderman, Edwin W. Smith, Rick Reitmaier, Michael Bebenita, Mason Chang, and Michael Franz. PLDI 2009.
Lesson 12: Program Synthesis
-
Superoptimizer: A Look at the Smallest Program
Alexia Massalin. ASPLOS 1987.
-
Chlorophyll: Synthesis-Aided Compiler for Low-Power Spatial Architectures
Phitchaya Mangpo Phothilimthana, Tikhon Jelvis, Rohin Shah, Nishant Totla, Sarah Chasins, and Rastislav Bodik. PLDI 2014.
Lesson 13: Concurrency & Parallelism
-
Threads Cannot Be Implemented as a Library
Hans-J. Boehm. PLDI 2005.
-
Exploiting Superword Level Parallelism with Multimedia Instruction Sets
Samuel Larsen and Saman Amarasinghe. PLDI 2000.
-
A Type and Effect System for Deterministic Parallel Java
Robert L. Bocchino, Vikram S. Adve, Danny Dig, Sarita V. Adve, Stephen Heumann, Rakesh Komuravelli, Jeffrey Overbey, Patrick Simmons, Hyojin Sung, and Mohsen Vakilian. OOPSLA 2009.
-
Formal Verification of a Realistic Compiler
Xavier Leroy. CACM 2009.