Need to coordinate distributed processes, support mutual exclusion. Like the critical section problem, but no shared variables. Servers can implement locking for their own resources, but locks introduce state. Stateless servers like NFS support file locking via lockd; in general, synchronization can be provided by a different service than storage service.

Distributed mutual exclusion: a single process is given the right, temporarily, to access a resource.

Election algorithms are used to choose a unique process to serve as coordinator.

Distributed mutual exclusion

ME1: (safety)
At most one process may execute in the critical section (CS) at a time.
ME2: (liveness)
A process requesting entry to the CS is eventually granted it (so long as any process executing in the CS eventually leaves it).
ME3: (ordering)
Entry to the CS should be granted in happened-before order.

The central server algorithm

A server process grants permission to enter a critical section. (Fig 10.7) (Can think of a token managed by server: holder of token can enter CS.) To enter a CS, send a message to server and wait for reply. Server sends reply immediately if CS available, else it queues the request. On leaving the CS, a process sends a message to the server, which can then allow the oldest queued process to continue (by sending it a reply).
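A minimal Python sketch of the server side described above, assuming a single CS and reliable delivery. The class name LockServer and its methods request and release are hypothetical, introduced only for illustration; messages are modeled as plain method calls rather than network traffic.

```python
from collections import deque


class LockServer:
    """Central server managing a single token (illustrative names only)."""

    def __init__(self):
        self.holder = None       # process currently in the CS, if any
        self.waiting = deque()   # queued requests, oldest first

    def request(self, pid):
        """Handle an enter request; True means the reply is sent at once."""
        if self.holder is None:
            self.holder = pid
            return True
        self.waiting.append(pid)  # CS busy: queue the request, no reply yet
        return False

    def release(self, pid):
        """Handle an exit message; return the process granted next (or None)."""
        if pid != self.holder:
            raise ValueError(f"{pid} does not hold the token")
        self.holder = self.waiting.popleft() if self.waiting else None
        return self.holder


server = LockServer()
granted = [server.request("p1"), server.request("p2"), server.request("p3")]
next_after_p1 = server.release("p1")
next_after_p2 = server.release("p2")
```

Here p1 is granted entry at once, p2 and p3 are queued, and each release hands the token to the oldest waiter.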

Server can be a bottleneck, and is a single point of failure. If it fails, a new server can be elected (e.g., from those needing the service). The new server must determine the state of all clients (who's waiting, who's in the CS). Ordering of requests in the new server may be different than old server's. Also, failure of a client must be considered.

A distributed algorithm using logical clocks (Ricart and Agrawala's algorithm)

Uses distributed agreement instead of a central server. Basic idea: a process multicasts its request to enter the CS and does not enter until it has received a reply from every other process.

Assumptions: all processes p_1...p_n know each other's addresses, all messages sent are delivered, each process p_i keeps a logical clock. Messages to request token: <T,p_i>, where T is the sender's timestamp and p_i the sender's id. Only one CS.

Each process records its state wrt the token: RELEASED, WANTED, HELD.

init:  
	state := RELEASED;

request the token:
	state := WANTED;			|
	Multicast request  to all processes;	|  defer request  processing
	T  := request's timestamp;		|
	Wait until ( number of  replies received = n-1 );
	state := HELD;

on receiving request <T_i,p_i> at p_j (j <> i):
	if (state = HELD) or (state=WANTED and (T,p_j)<(T_i,p_i))
	then queue request  from p_i without replying;
	else reply immediately to p_i;

release the token:
	state := RELEASED;
	reply to any queued requests;

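Example: (Fig 10.9) p_1 and p_2 make concurrent requests, p_3 not interested. p_1's timestamp is 41, p_2's is 34. p_3 replies to both at once. Since (34,p_2) < (41,p_1), p_1 replies to p_2 but p_2 queues p_1's request, so p_2 enters first and replies to p_1 on release.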
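The example can be replayed with a small single-threaded Python simulation of the pseudocode. This is a sketch under stated assumptions: Process, deliver_all and the net queue are hypothetical names, initial clocks of 40 and 33 are chosen only so the request timestamps come out as 41 and 34, and delivery is FIFO through one shared queue. Because nothing runs concurrently, the "defer request processing" step is implicit.

```python
from collections import deque

RELEASED, WANTED, HELD = "RELEASED", "WANTED", "HELD"


class Process:
    def __init__(self, pid, clock, net):
        self.pid, self.clock, self.net = pid, clock, net
        self.state = RELEASED
        self.request_ts = None   # (T, pid) of our own outstanding request
        self.replies = 0         # replies received for that request
        self.needed = 0          # n-1
        self.deferred = []       # senders whose requests we queued

    def request(self, peers):
        self.state = WANTED
        self.clock += 1
        self.request_ts = (self.clock, self.pid)
        self.replies, self.needed = 0, len(peers)
        for q in peers:
            self.net.append(("request", self.pid, q, self.request_ts))

    def on_request(self, sender, ts):
        self.clock = max(self.clock, ts[0]) + 1   # logical clock update
        if self.state == HELD or (self.state == WANTED and self.request_ts < ts):
            self.deferred.append(sender)          # queue without replying
        else:
            self.net.append(("reply", self.pid, sender, None))

    def on_reply(self):
        self.replies += 1
        if self.replies == self.needed:
            self.state = HELD

    def release(self):
        self.state = RELEASED
        for q in self.deferred:                   # reply to any queued requests
            self.net.append(("reply", self.pid, q, None))
        self.deferred = []


def deliver_all(procs, net):
    """Deliver queued messages in FIFO order; return how many were delivered."""
    count = 0
    while net:
        kind, src, dst, ts = net.popleft()
        if kind == "request":
            procs[dst].on_request(src, ts)
        else:
            procs[dst].on_reply()
        count += 1
    return count


net = deque()
procs = {1: Process(1, 40, net), 2: Process(2, 33, net), 3: Process(3, 0, net)}
procs[1].request([2, 3])   # request timestamp 41
procs[2].request([1, 3])   # request timestamp 34
first_round = deliver_all(procs, net)
states_before = {pid: p.state for pid, p in procs.items()}
procs[2].release()
second_round = deliver_all(procs, net)
states_after = {pid: p.state for pid, p in procs.items()}
```

After the first round p_2 holds the token and p_1 is still waiting on p_2's reply; that reply is only sent when p_2 releases.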

Obtaining the token takes 2(n-1) messages: n-1 requests and n-1 replies. With hardware multicast support, only n: one multicast request plus n-1 replies.

A ring-based algorithm

Processes are arranged in a logical ring (Fig 10.10). The token is passed in one direction around the ring: current holder allowed access, when done, passes token. If the token arrives and CS not requested, immediately pass token to neighbor. (ME3 may not hold.)
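A short Python sketch of why ME3 may not hold, assuming ring members are numbered 0..n-1, the token moves toward increasing ids, and all listed requests are already pending when the token starts moving. The function name ring_entry_order and its parameters are hypothetical.

```python
def ring_entry_order(n, token_at, request_order):
    """Circulate the token around a ring of n processes (ids 0..n-1).

    Returns the order in which requesters enter the CS, and the number of
    token-passing messages sent until the last requester has passed it on.
    """
    pending = set(request_order)
    if not pending <= set(range(n)):
        raise ValueError("requesters must be ring members")
    entered, hops, holder = [], 0, token_at
    while pending:
        if holder in pending:       # token arrived at a requester: enter, then leave
            entered.append(holder)
            pending.remove(holder)
        holder = (holder + 1) % n   # pass the token to the neighbor
        hops += 1
    return entered, hops


# Process 3 asks before process 1, but the token reaches process 1 first.
entered, hops = ring_entry_order(n=5, token_at=0, request_order=[3, 1])
```

Entry order follows ring position relative to the token, not the order in which the requests were made.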

Messages are sent even when no process requests the token. Failures require reconfiguration of the ring. If the failed process held the token, an election is held to regenerate the token. If the suspected failure didn't really happen, may wind up with 2 tokens.

Elections

Elections must result in a unique winner, even if multiple processes concurrently initiate elections.

The bully algorithm

Processes need to know the addresses and identities of all other members. The surviving member with the largest identifier is chosen as coordinator.

Three types of messages: election to start an election, answer to respond to an election message, coordinator to announce the new coordinator.

To begin an election: send election to all processes with a larger id. If no replies come back, send a coordinator message to all processes with lower ids, announcing yourself as coordinator. Otherwise (you got a reply to your election message), wait for a coordinator message. If it doesn't come, start another election.

When the coordinator message is received, record the id of the coordinator.

When an election message is received, send back an answer message and start another election (unless you already did).

When a failed process restarts, it starts an election: it will become coord if it has largest id (bully).

(Fig 10.11)

Best case: election started by process with next-to-highest id: makes itself coord and sends (n-2) coordinator messages. Worst case: lowest-id process detects failure: each higher-id process starts an election...O(n^2) messages.
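The two cases can be checked with a rough Python simulation of the message rules above. It is a sketch only: bully_election and its parameters are hypothetical names, timeouts are modeled as "no answer from a crashed process", and no process crashes or restarts during the election.

```python
def bully_election(all_ids, alive, initiator):
    """Simulate one bully election; timeouts are modeled as 'no answer'.

    Returns the new coordinator and a count of each message type.
    """
    msgs = {"election": 0, "answer": 0, "coordinator": 0}
    started = set()          # processes that have already begun an election
    pending = [initiator]    # processes about to begin one
    coordinator = None
    while pending:
        p = pending.pop(0)
        if p in started:
            continue         # "unless you already did"
        started.add(p)
        higher = [q for q in all_ids if q > p]
        responders = [q for q in higher if q in alive]
        msgs["election"] += len(higher)     # sent to every larger id
        msgs["answer"] += len(responders)   # only live processes answer
        if responders:
            pending.extend(responders)      # each responder starts its own election
        else:
            coordinator = p                 # nobody larger answered: p wins
            msgs["coordinator"] += sum(1 for q in all_ids if q < p)
    return coordinator, msgs


ids = [1, 2, 3, 4, 5]
alive = {1, 2, 3, 4}         # process 5, the old coordinator, has crashed
best = bully_election(ids, alive, initiator=4)
worst = bully_election(ids, alive, initiator=1)
```

With n = 5 the best case sends n-2 = 3 coordinator messages (the sketch also counts the one election message sent to the crashed process), while the worst case sends 4+3+2+1 = 10 election messages, which is the quadratic growth.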

A ring-based election algorithm (Chang and Roberts)

Processes in a logical ring, each communicates with only one neighbor, e.g. CCW. Identities of others not known. Process with highest id is elected as coordinator. Assume no crashes.

Init: each process marked as non-participant. To start an election: mark self as participant, put own id in an election message and send to neighbor.

When receiving an election message, compare id in message with own id. If id in message greater, forward message to neighbor. If own id greater and self is non-participant, replace id in message with own id and forward; if own id greater and self is already a participant, discard the message. In either forwarding case, mark self as participant.

If id in message matches own id, become the coordinator. Mark self as non-participant, send elected message containing own id to neighbor.

When receiving an elected message that you did not send yourself, mark self as non-participant, record the id it carries as coordinator, and forward the message.

Worst case: CW neighbor has highest id. Takes (n-1) messages for the election message to reach it, it doesn't know it has won until the message goes around again (n messages), then n elected messages are sent: total of 3n-1 messages.
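The 3n-1 count can be reproduced with a small Python simulation of the rules above. This is a sketch, assuming unique ids, no crashes, and a ring stored as a list in which position i sends to position (i+1) mod n; chang_roberts and its variable names are hypothetical.

```python
from collections import deque


def chang_roberts(ids, starters):
    """Simulate a ring election; position i sends to position (i + 1) % n.

    Returns the coordinator id recorded at each position and the total
    number of messages sent.
    """
    n = len(ids)
    participant = [False] * n
    coordinator = [None] * n
    queue = deque()   # (kind, destination position, id carried)
    sent = 0

    def send(kind, src, value):
        nonlocal sent
        sent += 1
        queue.append((kind, (src + 1) % n, value))

    for s in starters:
        participant[s] = True
        send("election", s, ids[s])

    while queue:
        kind, pos, value = queue.popleft()
        own = ids[pos]
        if kind == "election":
            if value > own:                 # larger id: forward unchanged
                participant[pos] = True
                send("election", pos, value)
            elif value < own:
                if not participant[pos]:    # substitute own id and forward
                    participant[pos] = True
                    send("election", pos, own)
                # already a participant: discard the message
            else:                           # own id came back: elected
                participant[pos] = False
                coordinator[pos] = own
                send("elected", pos, own)
        elif value != own:                  # elected message from someone else
            participant[pos] = False
            coordinator[pos] = value
            send("elected", pos, value)
        # an elected message back at its sender is dropped

    return coordinator, sent


# Worst case for n = 5: the highest id sits n-1 hops away from the starter.
ring = [3, 1, 4, 2, 9]
known, total = chang_roberts(ring, starters=[0])
known_two, _ = chang_roberts(ring, starters=[0, 2])
```

For this ring the single-starter run uses 14 = 3n-1 messages, and a run with two concurrent starters still ends with every process recording the same coordinator.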