This post is about leader-based BFT protocols such as PBFT, FaB, and similar view-based designs. The missing piece in many explanations is not just the size of the quorum overlap. It is: what property the overlap is supposed to guarantee.
The first note in this sequence showed the obstruction at
3f + 1. The second showed the clean
5f + 1 intuition for why one voting round can sometimes
be enough. This note extracts the common idea.
More precisely, this is the lens I find most useful for understanding the lower bounds in leader-based BFT. The formal 2021 theorem is stated for a broadcast abstraction and proved by indistinguishability.
The Question
Before doing any quorum arithmetic, what is the protocol actually trying to guarantee at a leader change?
That is the right starting point. If you do not define that target
first, statements like overlap = f + 1 or
overlap = 3f + 1 do not yet mean anything.
Problem Definition
For one slot, block, or decision of leader-based BFT, the protocol has to satisfy:
-
Safety: if any honest replica commits
x, then no honest replica may ever commity != x. - Liveness: after the network stabilizes and an honest leader gets a view, the honest replicas eventually commit some value.
-
View-change safety: if a previous view may
already have committed
x, then the next leader must not be able to steer honest replicas toward a conflicting valuey.
That third item is the heart of the lower-bound intuition.
Strictly speaking, view-change safety is usually left implicit
because it is already contained in safety. If a later leader causes
honest replicas to commit y after some honest replica
already committed x, then safety is broken. So
view-change safety is not a new top-level correctness property. It is the
internal obligation that makes safety survive leader changes.
What A View Change Has To Recover
As in the previous two notes, a view change is the moment when a new leader gathers a quorum of reports about earlier views and decides how to continue.
In the best case, a view commits a value. If it does not, it still has to leave behind enough evidence that the next leader can continue safely. So the problem is not only to decide a value. It is also to fail in a way that preserves enough information for the next leader to stay safe.
What Honest Overlap Must Achieve
Now we can define the symbolic object the counting argument is actually trying to capture.
Suppose:
-
C_xis an old commit quorum for valuex -
Qis the quorum the next leader collects during the view change
What do we need from Q?
We need Q to contain enough evidence that, if
C_x might have existed, the leader's safe-proposal rule
is forced to preserve x.
That can happen in two ways:
-
Qexplicitly contains a certificate proving the old commit -
even without such a certificate, the overlap between
C_xandQleaves behind enough honest evidence forxthat conflicting values cannot dominate
That second case is where the familiar counting arguments come from.
Actual protocols express this with protocol-specific objects: prepared or commit-related evidence carried in PBFT-style view-change messages, and quorum certificates, timeout certificates, and vote rules in HotStuff- and DiemBFT-style designs. The exact objects differ, but they serve the same purpose.
The toy condition H_x(Q) > F_y(Q) below is only a
clean sufficient condition for intuition. Real protocols can satisfy
view-change safety with richer rules. The 2021
5f - 1 construction is an example: it uses
4f - 1-vote quorum certificates,
4f - 1-message timeout certificates that lock blocks,
and a new-view rule that proposes from the carried-forward lock.
A Simple Symbolic Recovery Condition
For the simplest vote-count intuition, define:
The notation in this section is just a teaching device for this post.
-
H_x(Q)= the minimum guaranteed number of honest replicas inQthat came from the oldx-commit quorum -
F_y(Q)= the maximum possible support insideQfor a conflicting valuey
Then a very useful sufficient condition is:
H_x(Q) > F_y(Q)
Why? Because then the new leader cannot safely treat
y as equally plausible or stronger than x.
The old committed value still dominates in the next view.
That is the symbolic target the overlap is supposed to help with.
Apply It To 3f + 1
Here:
- old quorum size =
2f + 1 - new-view quorum size =
2f + 1 - total replicas =
3f + 1
So the overlap is at least:
(2f + 1) + (2f + 1) - (3f + 1) = f + 1
Among those f + 1 overlapping replicas, up to
f may be Byzantine. So:
H_x(Q) = 1
But a conflicting value can still get support from:
- up to
fhonest replicas outside the old quorum - up to
fByzantine replicas
So:
F_y(Q) = 2f
Thus:
H_x(Q) = 1 <= 2f = F_y(Q)
So the overlap does not guarantee view-change safety. It may leave the
next leader with too little honest evidence for x.
Apply It To 5f + 1
Here:
- old quorum size =
4f + 1 - new-view quorum size =
4f + 1 - total replicas =
5f + 1
So the overlap is at least:
(4f + 1) + (4f + 1) - (5f + 1) = 3f + 1
Remove up to f Byzantine replicas from that overlap and
you get:
H_x(Q) = 2f + 1
Conflicting support is still at most:
F_y(Q) = 2f
So now:
H_x(Q) = 2f + 1 > 2f = F_y(Q)
Now the old committed value dominates in the next view.
That is the symbolic reason the simple 5f + 1 intuition
works while the simple 3f + 1 intuition does not.
What This Note Shows
This note does not prove the full lower bound. It isolates the safety
condition the next view has to satisfy and shows what the overlap is
supposed to guarantee. That is enough to see why the simple
3f + 1 quorum arithmetic fails and why larger replica
sets can make a 2-round design plausible.
Series
Previous: Why larger replica sets can replace the third round in leader-based BFT
Start from the beginning: Why leader-based 3f+1 BFT needs a third round