The National Science Foundation has funded a joint project between Computer Science professor Keshav Pingali (principal researcher), Research Associate Paul Stodghill, and researchers at the Pittsburgh Supercomputing Center. The project, "Mobile Applications in Computational Grids," uses the opportunities presented by grid computing to develop new classes of applications. Two such classes are applications that move between platforms to exploit computational resources, and applications that migrate between servers to achieve resilience to hardware faults. Both are instances of mobile applications in computational grids. To achieve full mobility, an application must be able to move its state from one platform to another that may have a different processor architecture (processor independence) and a different number of processors (platform independence).
The research project addresses this problem with a combination of compiler technology and runtime systems. The proposed solution is based on application-level checkpointing, an approach in which an application is instrumented so that it can save and restore its own state without any assistance from the operating system or architecture. To achieve the goal of mechanism transparency, the proposed approach uses compiler technology to instrument codes in this way automatically. Experimental evaluation of the resulting system will be carried out jointly with the Pittsburgh Supercomputing Center (PSC). The proposed system will enable many existing parallel codes to be transformed quickly and safely into mobile applications.
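To make the idea of application-level checkpointing concrete, here is a minimal hand-written sketch in Python. It is an illustration only, not the project's system: the project proposes having a compiler insert this kind of instrumentation automatically, whereas here it is written by hand. The function names (`save_checkpoint`, `load_checkpoint`, `run`), the JSON file format, and the `stop_after` parameter that simulates an interruption are all assumptions made for the example.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Save the application's own state to a file (hypothetical helper).

    JSON is a text format, so the file does not depend on the byte order
    or word size of the machine that wrote it.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    # Rename over the old checkpoint so a crash never leaves a half-written file.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        return json.load(f)


def run(path, n_steps, stop_after=None):
    """Sum the integers 0..n_steps-1, checkpointing after every step.

    stop_after (an assumption for this demo) simulates a fault or a
    migration by abandoning the computation once that step is reached.
    """
    state = load_checkpoint(path) or {"step": 0, "total": 0}
    while state["step"] < n_steps:
        if stop_after is not None and state["step"] >= stop_after:
            return None  # interrupted; progress so far is in the checkpoint
        state["total"] += state["step"]
        state["step"] += 1
        save_checkpoint(path, state)
    return state["total"]
```

Calling `run` with `stop_after` set and then calling it again with the same checkpoint path resumes from the saved step rather than starting over. The second call could in principle happen on a different machine, since the checkpoint contains only application-level values and no operating-system or hardware state.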