Need to coordinate distributed processes and support mutual exclusion. Like the critical section problem, but with no shared variables. Servers can implement locking for their own resources, but locks introduce state. Stateless servers like NFS support file locking via a separate service, lockd; in general, synchronization can be provided by a different service than the storage service.
Distributed mutual exclusion: a single process is given the right, temporarily, to access a resource.
Election algorithms are used to choose a unique process to serve as coordinator.
Server can be a bottleneck, and is a single point of failure. If it fails, a new server can be elected (e.g., from those needing the service). The new server must determine the state of all clients (who's waiting, who's in the CS). Ordering of requests in the new server may be different than old server's. Also, failure of a client must be considered.
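A minimal sketch of the state such a server holds, assuming a single token and FIFO service order (the notes do not fix either; the class and method names are hypothetical). The two fields are exactly what a newly elected server would have to rebuild by polling the clients: who is in the CS and who is waiting.

```python
from collections import deque


class LockServer:
    """Central server granting one token (the right to enter the CS)."""

    def __init__(self):
        self.holder = None       # client currently in the CS, if any
        self.waiting = deque()   # clients queued for the token, FIFO

    def request(self, client):
        """Return True if the token is granted now, else queue the client."""
        if self.holder is None:
            self.holder = client
            return True
        self.waiting.append(client)
        return False

    def release(self, client):
        """Take the token back and pass it to the next waiter, if any."""
        if client != self.holder:
            raise ValueError("only the holder can release the token")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder
```

Client failure is not handled here: if the holder crashes, the token is never released, which is why the notes say client failure must be considered.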
Assumptions: all processes p_1...p_n know each other's addresses, all messages sent are delivered, each process p_i keeps a logical clock. Messages to request the token: <T,p_i>, where T is the sender's timestamp and p_i is the sender's id. There is only one CS.
Each process records its state wrt the token: RELEASED, WANTED, HELD.
On initialization:
    state := RELEASED;
To request the token:
    state := WANTED;
    multicast request to all processes;
    T := request's timestamp;
    wait until (number of replies received = n-1);
    state := HELD;
    (processing of incoming requests is deferred while multicasting and setting T)
On receiving request <T_i,p_i> at p_j (j <> i):
    if (state = HELD) or (state = WANTED and (T,p_j) < (T_i,p_i))
    then queue request from p_i without replying;
    else reply immediately to p_i;
To release the token:
    state := RELEASED;
    reply to any queued requests;

Example: (Fig 10.9) p_1 and p_2 make concurrent requests, p_3 not interested. p_1's timestamp is 41, p_2's is 34. p_3 replies to both. p_2's request has the lower timestamp, so p_2 queues p_1's request while p_1 replies to p_2; p_2 gets the token first and replies to p_1 on release.
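The pseudocode and example above can be sketched as a single-threaded simulation. This is an illustration under stated assumptions, not a networked implementation: `Network` is a hypothetical in-memory FIFO message layer standing in for reliable delivery, and the starting clock values are chosen only so that the request timestamps come out as 41 and 34.

```python
from collections import deque

RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"


class Network:
    """Hypothetical reliable FIFO message layer shared by all processes."""

    def __init__(self):
        self.procs = {}        # pid -> Process
        self.queue = deque()   # undelivered (kind, src, dst, timestamp)
        self.sent = 0          # total messages sent
        self.entered = []      # order in which processes reached HELD

    def send(self, kind, src, dst, ts=None):
        self.sent += 1
        self.queue.append((kind, src, dst, ts))

    def run(self):
        """Deliver messages until none are left."""
        while self.queue:
            kind, src, dst, ts = self.queue.popleft()
            self.procs[dst].receive(kind, src, ts)


class Process:
    def __init__(self, pid, clock, net):
        self.pid, self.clock, self.net = pid, clock, net
        self.state = RELEASED
        self.ts = None         # T: timestamp of our outstanding request
        self.replies = 0
        self.deferred = []     # requesters we have not replied to yet
        net.procs[pid] = self

    def request(self):
        self.state = WANTED
        self.clock += 1
        self.ts = self.clock
        self.replies = 0
        for other in self.net.procs:
            if other != self.pid:
                self.net.send("request", self.pid, other, self.ts)

    def receive(self, kind, src, ts):
        if kind == "request":
            self.clock = max(self.clock, ts) + 1   # logical clock update
            # Tuple comparison breaks timestamp ties by process id.
            ours_first = self.state == WANTED and (self.ts, self.pid) < (ts, src)
            if self.state == HELD or ours_first:
                self.deferred.append(src)          # queue without replying
            else:
                self.net.send("reply", self.pid, src)
        else:  # a reply to our own request
            self.replies += 1
            if self.state == WANTED and self.replies == len(self.net.procs) - 1:
                self.state = HELD
                self.net.entered.append(self.pid)

    def release(self):
        self.state = RELEASED
        for pid in self.deferred:
            self.net.send("reply", self.pid, pid)
        self.deferred = []


def run_example():
    """Fig 10.9 scenario: p_1 (timestamp 41) and p_2 (34) request concurrently."""
    net = Network()
    p1, p2, p3 = Process(1, 40, net), Process(2, 33, net), Process(3, 0, net)
    p1.request()
    p2.request()
    net.run()        # p_2 collects both replies; p_1 is still waiting on p_2
    p2.release()
    net.run()        # p_2's deferred reply lets p_1 in
    return net.entered, net.sent
```

`run_example()` returns the order in which processes obtained the token and the total number of messages sent; each of the two requests costs 2(n-1) = 4 messages with n = 3.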
Obtaining the token takes 2(n-1) messages: n-1 requests and n-1 replies. With hardware multicast support, only n: one multicast request plus n-1 replies.
Messages are sent even when no process requests the token. Failures require reconfiguration of the ring. If the failed process held the token, an election is held to regenerate the token. If the process had not actually failed, the ring may wind up with 2 tokens.
Three types of messages: election to start an election, answer to respond to an election message, coordinator to announce the new coordinator.
To begin an election: send election to all processes with a larger id. If no replies come back, send a coordinator message to all processes with lower ids, announcing yourself as coordinator. Otherwise (you got a reply to your election message), wait for a coordinator message. If it doesn't come, start another election.
When the coordinator message is received, record the id of the coordinator.
When an election message is received, send back an answer message and start another election (unless you already did).
When a failed process restarts, it starts an election: it will become coord if it has largest id (bully).
(Fig 10.11)
Best case: election started by the process with the next-to-highest id: it makes itself coord and sends (n-2) coordinator messages. Worst case: the lowest-id process detects the failure: each higher-id process starts an election... O(n^2) messages.
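The bully rules and the two message counts can be checked with a small round-based sketch. Assumptions: timeouts are modelled by simply knowing which processes are alive, no process fails during the election, and the starter is itself alive. The function name and the shape of the returned counts are hypothetical.

```python
def bully_election(ids, alive, starter):
    """Simulate one bully election; return (coordinator, message counts)."""
    counts = {"election": 0, "answer": 0, "coordinator": 0}
    started = set()        # processes that have already started an election
    coordinator = None
    pending = [starter]
    while pending:
        p = pending.pop(0)
        if p in started:
            continue       # "unless you already did"
        started.add(p)
        higher = [q for q in ids if q > p]
        counts["election"] += len(higher)          # sent to all larger ids
        responders = [q for q in higher if q in alive]
        counts["answer"] += len(responders)        # only live processes answer
        if responders:
            pending.extend(responders)             # each starts its own election
        else:
            lower = [q for q in ids if q < p]
            counts["coordinator"] += len(lower)    # announce to all lower ids
            coordinator = p
    return coordinator, counts


# Four processes; process 4 (the old coordinator) has failed.
best = bully_election([1, 2, 3, 4], {1, 2, 3}, starter=3)
worst = bully_election([1, 2, 3, 4], {1, 2, 3}, starter=1)
```

In the best case the starter sends one unanswered election message and then n-2 = 2 coordinator messages. In the worst case every live process runs its own election, which is where the quadratic growth in election messages comes from.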
Init: each process marked as non-participant. To start an election: mark self as participant, put own id in an election message and send to neighbor.
When receiving an election message, compare id in message with own id. If id in message greater, forward message to neighbor. If own id greater and self is non-participant, replace id in message with own id and forward. In either case, mark self as participant. If own id greater and self is already a participant, do not forward the message.
If id in message matches own id, become the coordinator. Mark self as non-participant, send elected message containing own id to neighbor.
When receiving an elected message not sent yourself, mark self as non-participant and forward message.
Worst case: CW neighbor has the highest id and messages travel the other way around the ring. It takes (n-1) election messages to reach that process, which doesn't know it has won until its id goes around again (n messages); then n elected messages are sent: total of 3n-1 messages.
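The ring election and its message count can be sketched as follows, assuming unique ids, a single initiator, and no failures. `ring` lists the ids in the order messages travel; the function name is hypothetical.

```python
def ring_election(ring, starter):
    """Simulate a ring election; return (coordinator id, total messages).

    ring    -- unique process ids, listed in the direction messages travel
    starter -- index in ring of the process that starts the election
    """
    n = len(ring)
    participant = [False] * n
    participant[starter] = True
    msg_id = ring[starter]       # id carried by the election message
    pos = starter
    messages = 0

    # Election phase: the message circulates until a process sees its own id.
    while True:
        pos = (pos + 1) % n      # deliver to the next neighbor
        messages += 1
        own = ring[pos]
        if msg_id == own:
            break                # own id came back: this process has won
        if msg_id < own and not participant[pos]:
            msg_id = own         # substitute the larger id before forwarding
        # With a single initiator a participant never sees a smaller id,
        # so the "do not forward" case cannot arise here.
        participant[pos] = True

    coordinator = ring[pos]
    participant[pos] = False
    winner = pos

    # Elected phase: the announcement travels once around the ring.
    while True:
        pos = (pos + 1) % n
        messages += 1
        if pos == winner:
            break                # back at the coordinator, which discards it
        participant[pos] = False

    return coordinator, messages
```

For example, `ring_election([2, 1, 3, 4], 0)` places the highest id n-1 hops from the initiator, the worst case described above, and uses 3n-1 = 11 messages; `ring_election([4, 1, 2, 3], 0)`, where the initiator itself has the highest id, uses 2n = 8.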