Glossary term

Consensus Algorithm

Engineering definition of a consensus algorithm covering agreement, replicated logs, quorum intersection, leader terms, commit index, partition behavior and validation.

Branch: Computer Engineering
Glossary type: concept
Content: Glossary term
Updated: Jun 26, 2026
Revision: v1.0.0 · reviewed

Definition

concept

A consensus algorithm is a distributed protocol that lets multiple nodes agree on one value, command order or replicated state transition despite failures and message delays.

Consensus algorithms appear in replicated databases, metadata stores, cluster membership systems, distributed locks, configuration services, control-plane services and fault-tolerant storage. A useful design states the fault model, membership, quorum rule, leader or proposer role, term or ballot numbering, log-entry rule, commit rule, partition behavior, recovery process, operational limits and validation evidence.

A consensus algorithm is a distributed protocol that lets multiple nodes agree on one value, command order or replicated state transition despite failures and message delays. It is the mechanism behind many replicated logs, metadata stores, configuration services, lock services and control-plane systems.

Consensus is stronger than asking for a quorum count. A quorum is a voting geometry. A consensus algorithm defines the state machine, terms, log rules, commit rule and recovery behavior that make those votes mean the same decision.

Agreement Contract

For a decision instance:

k

if one correct node commits value:

v

and another correct node commits value:

w

then safety requires:

v=w

The system may delay a decision during failures, but it must not commit two different values for the same instance. Availability can degrade; safety must not.

Fault Model

Most infrastructure consensus used for replicated logs is crash-fault tolerant. If the cluster has:

N

nodes and uses majority quorums, the usual tolerated crash failure count is:

\displaystyle f=\left\lfloor\frac{N-1}{2}\right\rfloor

Equivalently, a design that must tolerate:

f

crash failures usually needs at least:

N\geq2f+1

nodes. Byzantine consensus has different assumptions and usually different quorum sizes; it should not be implied by a normal crash-fault consensus claim.

Quorum Intersection

A majority quorum is:

\displaystyle Q=\left\lfloor\frac{N}{2}\right\rfloor+1

Two quorums intersect when:

Q_a+Q_b>N

That intersection is important because some node in the new decision path has evidence from the old decision path. The protocol still has to define what evidence is valid and how stale terms are rejected.

Terms and Leaders

Many consensus protocols use a term, ballot or epoch:

T

Messages from older terms are rejected:

T_{msg}<T_{local}\Rightarrow \text{reject}

Leader election is part of many consensus designs, but it is not the whole protocol. The leader must replicate entries, obey term rules, step down when it sees newer evidence and avoid committing entries that do not satisfy the protocol’s commit rule.

Replicated Log

For replicated-log consensus, a log entry can be represented as:

L_k=(T_k,cmd_k)

where:

k

is the log index. The commit index:

C

is the highest index known to be safely committed. A common commit screen is that an entry is stored on at least a quorum:

replicas(L_k)\geq Q

with term and leader-specific rules applied by the protocol.

Latency and Availability

A simplified commit-latency screen is:

T_{commit}=T_{append}+T_{replicate}+T_{quorum\_ack}+T_{apply}

For:

T_{append}=12\ \text{ms}

T_{replicate}=35\ \text{ms}

T_{quorum\_ack}=8\ \text{ms}

and:

T_{apply}=5\ \text{ms}

the screened commit time is:

T_{commit}=12+35+8+5=60\ \text{ms}

If each node has independent availability:

A

and at least:

Q

of:

N

nodes must be available, quorum availability is:

A_Q=\sum_{i=Q}^{N}\binom{N}{i}A^i(1-A)^{N-i}

For:

N=5,\quad Q=3,\quad A=0.98

the independent quorum availability screen is:

A_Q=0.9999

Correlated failures, bad deployments, network partitions and storage latency can make the real availability much lower.

Boundary With Neighboring Patterns

Quorum defines how many votes are enough. Leader election chooses who may coordinate. Consensus defines how decisions remain safe across terms, logs and failures. A distributed lock may use consensus internally, but a lock is only a use case. Two-phase commit coordinates atomic transaction outcome across participants; it does not by itself replicate an ordered state machine. A CRDT avoids immediate agreement by designing merge rules that converge later.

These distinctions matter during incidents. Replacing consensus with a cache lock, replacing fencing with a status flag, or replacing a commit rule with a timestamp can create split-brain or lost updates.

Validation

Validation should include leader crash, follower crash, network partition, delayed messages, duplicated messages, reordered messages, disk restart, stale term recovery, log truncation, snapshot restore, membership change, clock skew where leases are used, quorum loss, minority reads and operator recovery.

Useful evidence includes term history, election trace, log index comparison, commit-index monotonicity, quorum membership, rejected stale messages, recovery replay, snapshot checksum, p95 and p99 commit latency, unavailability windows and proof that no two nodes committed different commands at the same log index.

Failure Modes

Common failure modes include treating quorum as consensus, allowing two leaders to commit in the same term, accepting stale messages, changing membership without overlap, serving strong reads from a stale follower, acknowledging before durable replication, losing log entries during snapshot restore, ignoring disk latency, assuming clocks prove order, and testing only clean leader failover.

A consensus algorithm is not just an implementation detail. It is a safety boundary. Systems that depend on it should expose enough operational evidence to prove which term, quorum, log entry and commit rule authorized each critical decision.

REF

Disciplines