Randomized Consensus on Unknown Domains in Finite Rounds
10 Jun 2019
The most widely known result in distributed computing is the FLP result that proves the impossibility of consensus [1]. One of the assumptions of this theorem is that processes are deterministic. When randomness is allowed, the Ben-Or algorithm gives simple, elegant way to solve binary consensus [2]. Or rather, \(\text{lim}_{t \to \infty} P(\text{consensus has been reached}) = 1\). The algorithm goes something like this.
a = input({0, 1})
loop:
send_phase1(a)
A = receive_phase1()
if ∃aʹ∈ A : |{a''∈ A : a'' = a'}| > n/2:
b = a'
else:
b = ⊥
send_phase2(b)
B = receive_phase2()
if ∃bʹ∈ B : b' != ⊥ && |{b'' ∈ B : b'' = b'}| > f:
decide(bʹ)
if ∃bʹ∈ B : b' != ⊥:
a = b'
else:
a = choose_random({0, 1})
The algorithm proceeds in asynchronous rounds, each with two phases. In each phase, each process broadcasts its value for the phase and then waits for the values of \(n - f\) other processes. The safety argument is not complicated, and the termination argument is even more straightforward. Eventually the random choices made by processes in some round will be overwhelmingly in favor of either 0 or 1, and that value will decided by every process in the next round.
This blog post is a record of a couple simple observations I made when re-reading the paper recently (in preparation for teaching it in UW’s distributed systems course this past quarter) that I couldn’t find written up anywhere else.
Unknown Domains
The first observation is that this algorithm can be modified to allow processes to have inputs values from larger domains than \(\{0, 1\}\). In fact, the input domain can be infinite. Furthermore, we don’t even need the processes to know the input domain a priori. On the very last line of the protocol, when processes make a random choice, they simply choose from all values seen so far in any message.
Since there are at most n possible input values (because there are at most n processes), the termination argument still applies. The only real difference is that termination could take longer. The original paper contained a theorem that when \(f\) is \(O(\sqrt{n})\), the expected number of rounds is constant. That theorem no longer applies when the set of possible decision values grows with \(n\).
The other interesting fact is that this protocol can be further modified to support a model where not all processes get input values. Processes without input values initialize \(a = \bot\) on the first line. And on the last line when random choices are made, processes choose from all non-\(\bot\) values seen. As long as at least \(f + 1\) processes get input values, termination is still guaranteed. Even more interestingly, the check on first-phase messages can be refined somewhat in this case. The property that must be maintained is that no second-phase messages with different non-\(\bot\) values should exist for the same round. So, assuming we get \(n - f\) messages in the second round, if the number of messages for one non-\(\bot\) value is more than \(f\) greater than for every other non-\(\bot\) value, then we can safely choose that for the second phase.
Both of these modifications are important for the way people use consensus in the real world — namely to implement state machine replication. The commands proposed to the state machine log often come from an infinite domain and are chosen by clients at runtime. The servers implementing the state machine do not know which commands will be proposed a priori. Furthermore, we would like the system to make progress even when clients don’t send their commands to all servers.
Finite Rounds
The other observation is that the original protocol had all processes taking protocol steps and sending messages forever, even after they had decided values. This is not necessary. Instead, we can have processes send messages for each phase proactively up until they’ve decided a value and the reactively thereafter. That is, once a process has reached a decision, it only sends a message for a phase after it has received at least one message for that phase (or a later phase).
Because all non-faulty processes eventually decide a value, eventually all processes stop sending messages. Note, however, that this does not mean processes can halt. They must continue listening for messages forever (or at least \(f+1\) processes must). Eventually, however, they will no longer send any new messages.
When combined the model above where not all processes get input values, we could also specify that processes which have not seen any non-\(\bot\) values remain in a reactive mode until they receive a non-\(\bot\) value.
References
- Fischer, M.J., Lynch, N.A. and Paterson, M.S. Impossibility of Distributed Consensus with One Faulty Process. J. ACM. 32(2):374–382. Apr. 1985.
- Ben-Or, M. Another Advantage of Free Choice (Extended Abstract): Completely Asynchronous Agreement Protocols. In Proceedings of the Second Annual ACM Symposium on Principles of Distributed Computing. Montreal, Quebec, Canada. 1983. 27–30.