State Transfer in HOT
This document is a part of the online
Horus Documentation,
under
Horus Object Tools.
The state-transfer functionality is provided in HOT by two classes:
HorusClSv
implements a state-transfer protocol (which is discussed in the sections below)
and offers a low-level interface to it. The
HorusCSX
class offers a higher-level interface to state transfer; users of HOT will want to
use it in most cases.
The implementation of state transfer in HOT follows the "pull" paradigm, in which a
joining server asks old server(s) for pieces of the global state, and decides by itself
when state transfer has been completed. In the "push" paradigm (used in Isis and
XFER layer of Horus), it is a responsibility of old servers to decide how to transfer
their state to the joining server, and how to handle failures during state transfer. It
appears that the "pull" approach is simpler to implement and is more flexible.
In the sections below, we show how the state is transferred in two interesting
situations (new server joining the group, and two group partitions merging), and
discuss the state transfer protocol.
The typical scenario is a new process X joining the group as a server. This occurs
in several stages. Initially, the X joins as a client (Stage 1). It then
sends a becomeServer request to the group coordinator (Stage 2), which flushes the
group and installs a new view. In that view, X is in the list of xfer-servers
(Stage 3). From that moment on, X has all "rights" of normal servers, but no
"responsibilities:"
While X is among xfer-servers, it will receive the same messages (casts
and scasts) as normal servers.
However, X is not required to do any service as long as it is in the XFER
stage.
When X becomes an xfer-server, it starts state transfer. The state is transfered in a
series of read requests to normal servers (Stage 4). When all the state has been
received, X sends an xferDone message to the coordinator (stage 5). The
coordinator flushes the group and installs a new view, in which X is in the list of
servers. At that point, X becomes a normal server.
When two partitions merge, the servers in one of them will suddenly become
clients. It is not clear how to conciliate the two different global states in a general
case (the actual policy is very much application dependent). In the current
implementation of HOT, one of the partitions formally loses its state, and all its
servers restart state transfer from the other partition. It is left up to the application
to implement state merge efficiently.
Here is what happens when a group merge occurs. Initially, there are
two partitions with possibly different global states (Stage 0). Eventually, the two
partitions merge. When that happens, all members of one of the partitions (servers,
xfer-servers, and clients) suddenly find themselves in the clients list (Stage 1). That
triggers a restart of the state transfer by all "disqualified" servers (Stage 2). When
state transfer completes, the servers list of the merged group includes servers from both
partitions (Stage 3).
The aim of the state transfer protocol of HOT is to provide some kind of
convergence properties to the global state. The global state in this context means
some (application-defined) data that is consistently maintained by a subset of the
group members, which are called servers. When a new member joins the group and
wants to become a server, it has to receive the state in a consistent way. Among
possible problems are:
Old servers may crash during state transfer (in particular, it may happen that all
servers but the joining one crash);
New messages may be sent/received during state transfer, which may result in
(potentially inconsistent) modifications to the global state;
The group may merge with another partition, with a different opinion on what
the current state is.
The implementation of state transfer in HOT attempts to deal with these problems
in the following way. Each group member has a membership type, which is either
Client or Server. After the join downcall is invoked, a group member
is always in one of the following states:
JOINING
CLIENT_NORMAL
BECOMING_SERVER
SERVER_XFER
SERVER_XFER_DONE
SERVER_NORMAL
It is guaranteed that if the membership type is Client, then the state of the
member will eventually become CLIENT_NORMAL, after which it will
always be CLIENT_NORMAL.
If the membership type is Server, and state transfer option is
NO_XFER, then
it is guaranteed that the state of the member will eventually become
SERVER_NORMAL. After that, whenever the state of the member becomes
different from SERVER_NORMAL, it will eventually become
SERVER_NORMAL again.
Suppose the membership type is Server, and state transfer option is other than
NO_XFER. Suppose also that whenever the state of the member changes to
SERVER_XFER (as specified by the VIEW upcall parameters), that is
eventually followed by an xferDone downcall.
Then it is guaranteed that the state of the member will eventually become
SERVER_NORMAL. After that, whenever the state of the member becomes
different from SERVER_NORMAL, it will eventually become
SERVER_NORMAL again.
Note that even after a joining server has received the state and become a normal
server, it may lose its server status as a result of a group merge. If that
happens, the state transfer will be restarted automatically.
If a group merge occurs during state transfer, the transfer will be terminated and
restarted again.
A state transfer will also be terminated if all normal servers crash before the
transfer completes.
send mail to
alexey@cs.cornell.edu