The Design and Architecture of the
Microsoft Cluster Service
-- A Practical Approach to High-Availability and Scalability
Werner Vogels, Dan Dumitriu, Ken
Dept. of Computer Science, Cornell University
Rod Gamache, Mike Massa, Rob Short, John Vert
Cluster group, Microsoft Corporation
Barrera, Jim Gray
Scalable Server group, Microsoft Research.
Microsoft Cluster Service (MSCS) extends the Windows NT operating system to support
high-availability services. The goal is to offer an execution environment where
off-the-shelf server applications can be continuously available, even in the presence of
node failures. Later versions of MSCS will provide scalability via a node and application
management system that allows applications to scale to hundreds of nodes. In this paper
we provide a detailed description of the MSCS architecture and the design decisions that
have driven the implementation of the service. The paper also describes how some major
applications use the MSCS features, as well as the features added to make it easier to
implement and manage fault-tolerant applications on MSCS.
Copyright 1998 IEEE. Published in the Proceedings of FTCS'98, June 23-25, 1998 in
Munich, Germany. Personal use of this material is permitted. However, permission to
reprint/republish this material for advertising or promotional purposes or for creating
new collective works for resale or redistribution to servers or lists, or to reuse any
copyrighted component of this work in other works, must be obtained from the IEEE.
Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane /
P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.
