Basics

Coordinated checkpointing

blocking

non-blocking

Uncoordinated checkpointing

Message logging

survey: Alvisi/Marzullo

optimistic:

sender-based logging:

causal logging:

Manetho:

FBL: Alvisi and Marzullo

Byzantine failures

Tennessee:

Prith Banerjee

self-checking programs

Replay and debugging

Netzer et al:

Objective CAML

Shared memory

MPI

Compiling

Other stuff

Reversible computations:

Systems:

NetSolve and Globus

Muller:

SETI@home