Glossary term
Replicated Log
Engineering definition of a replicated log covering append order, commit index, follower lag, log compaction, snapshot recovery and validation.
Definition
A replicated log is an ordered sequence of durable entries copied across nodes so a distributed system can apply the same commands or state transitions in the same order.
Replicated logs appear in consensus systems, metadata stores, distributed databases, lock services, controllers, event-sourced services and recovery mechanisms. A useful design states the entry format, append rule, commit rule, apply rule, durability boundary, follower lag metric, snapshot rule, compaction policy, recovery behavior and validation evidence.
The log is not just a file. It is a contract about ordering, durability, commitment, replay and recovery. A system that loses, reorders or applies log entries inconsistently can violate linearizability, duplicate commands or recover into a state that never existed.
Entry Model
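A log entry at index $k$ can be represented as:
$$e_k = (\mathrm{term}_k,\ \mathrm{cmd}_k,\ \mathrm{meta}_k)$$
where $\mathrm{term}_k$ (or epoch) identifies the authority that wrote it, $\mathrm{cmd}_k$ is the command or state transition, and $\mathrm{meta}_k$ may include a checksum, client request id, timestamp, dependency metadata or schema version.
The index gives a stable order:
$$e_1 \prec e_2 \prec \cdots \prec e_k \prec \cdots$$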
The meaning of each entry depends on applying prior committed entries first.
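As a rough illustration of the entry model, the following Python sketch represents an entry as a small record and checks that a sequence of entries forms a gap-free order. The names `LogEntry`, `checksum` and `is_ordered` are hypothetical, and the rule that terms never decrease along the log is an assumption borrowed from Raft-style systems rather than something every replicated log requires.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class LogEntry:
    index: int                 # position in the log, starting at 1
    term: int                  # term or epoch of the writer
    cmd: bytes                 # command or state transition payload
    meta: dict = field(default_factory=dict)  # request id, schema version, ...

    def checksum(self) -> str:
        """SHA-256 over index, term and payload (illustrative layout)."""
        h = hashlib.sha256()
        h.update(f"{self.index}:{self.term}:".encode())
        h.update(self.cmd)
        return h.hexdigest()


def is_ordered(entries: list) -> bool:
    """True when indexes are contiguous and terms never decrease."""
    for prev, cur in zip(entries, entries[1:]):
        if cur.index != prev.index + 1 or cur.term < prev.term:
            return False
    return True


log = [
    LogEntry(1, 1, b"set x=1"),
    LogEntry(2, 1, b"set y=2"),
    LogEntry(3, 2, b"del x"),
]
print(is_ordered(log))               # contiguous prefix
print(is_ordered([log[0], log[2]]))  # entry 2 is missing
```

A real entry format would also pin down byte-level encoding and which metadata fields the checksum covers.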
Append and Commit
Appending an entry is not the same as committing it. A leader or writer may append locally before the entry is safely replicated.
Let $c$ be the commit index. Entries with index $k \le c$ are safe to apply according to the system’s commit rule. Entries with index $k > c$ may still be speculative, uncommitted or subject to truncation during recovery.
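One possible commit rule, shown here only as a sketch, is majority replication: the commit index is the highest index stored durably on more than half of the nodes. The function names and the `match_index` map are hypothetical, and real protocols add further conditions (Raft, for example, only commits entries from the leader's current term this way).

```python
def majority_commit_index(match_index: dict) -> int:
    """Highest index durably stored on a majority of nodes (leader included)."""
    ranked = sorted(match_index.values(), reverse=True)
    majority = len(ranked) // 2 + 1
    # The majority-th largest value is held by at least `majority` nodes.
    return ranked[majority - 1]


def split_by_commit(last_index: int, commit_index: int):
    """Partition indexes 1..last_index into committed and speculative ranges."""
    committed = range(1, commit_index + 1)
    speculative = range(commit_index + 1, last_index + 1)
    return committed, speculative


# Illustrative values: highest durable index reported by each node.
match_index = {"leader": 9, "follower-1": 7, "follower-2": 4}
commit = majority_commit_index(match_index)
committed, speculative = split_by_commit(last_index=9, commit_index=commit)
print(commit, list(speculative))
```

With these values the leader has appended entries 8 and 9 locally, but they remain speculative until a second node stores them.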
Apply Order
Each replica should apply committed entries in index order. If $a_i$ is the highest applied index on replica $i$, then normal application advances
$$a_i \leftarrow a_i + 1$$
only when entry $a_i + 1$ is available and committed. Skipping an entry can create a state that no valid log prefix represents.
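The apply rule can be sketched as a loop that advances the applied index one entry at a time and stops at the first gap. The `Replica` class and its key-value state machine are hypothetical and exist only to make the rule concrete.

```python
class Replica:
    def __init__(self):
        self.log = {}            # index -> (key, value) command
        self.commit_index = 0    # highest index known to be committed
        self.applied_index = 0   # highest index applied to the state machine
        self.state = {}

    def apply_ready(self) -> int:
        """Apply committed entries in index order; return how many were applied."""
        applied = 0
        while self.applied_index < self.commit_index:
            nxt = self.applied_index + 1
            if nxt not in self.log:
                break            # gap: never skip ahead, wait for the entry
            key, value = self.log[nxt]
            self.state[key] = value
            self.applied_index = nxt
            applied += 1
        return applied


r = Replica()
r.log = {1: ("x", 1), 3: ("x", 3)}   # entry 2 has not arrived yet
r.commit_index = 3
first = r.apply_ready()              # applies entry 1 only
r.log[2] = ("y", 2)
second = r.apply_ready()             # applies entries 2 and 3
print(first, second, r.state)
```

Stopping at the gap keeps the state machine equal to the result of applying some contiguous committed prefix of the log.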
Follower Lag
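Follower lag can be measured in entries:
$$\mathrm{lag}_i = c - r_i$$
where $c$ is the commit index and $r_i$ is the highest replicated index on follower $i$.
If entries arrive at rate $\lambda$ (entries per second) and a follower is behind by $\Delta t$ seconds, then:
$$\mathrm{lag}_i \approx \lambda \, \Delta t$$
The product of arrival rate and delay therefore approximates how many entries the follower is behind the commit path.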
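A short sketch of both lag measures follows. The function names and all numeric values are illustrative assumptions, not measurements from any system; the clamp to zero is a choice made here for followers that already hold entries past the commit index.

```python
def lag_entries(commit_index: int, replicated_index: int) -> int:
    """Entries between the commit index and the follower's replicated index."""
    return max(0, commit_index - replicated_index)


def approx_lag_from_delay(rate_per_s: float, delay_s: float) -> float:
    """Approximate lag in entries from arrival rate and time behind."""
    return rate_per_s * delay_s


# Illustrative values only.
exact = lag_entries(commit_index=10_500, replicated_index=10_000)
estimate = approx_lag_from_delay(rate_per_s=2_000, delay_s=0.25)
print(exact, estimate)
```

Tracking both forms is useful because an entry count shows how much must be transferred, while a time delay shows how stale the follower's data can be.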
Storage Growth and Compaction
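If each entry has average size $s$ and the append rate is $\lambda$, storage growth is:
$$g = s \, \lambda$$
If the storage budget is $B$, current log storage is $U$ and the growth rate is $g$, time to budget exhaustion is:
$$t = \frac{B - U}{g}$$
Evaluating this with measured values gives the time remaining before compaction or a snapshot must free space.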
Snapshots and compaction are therefore correctness and availability mechanisms, not only disk cleanup.
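The storage arithmetic can be checked with a few lines of Python. The function names and every number below are illustrative assumptions chosen to make the calculation easy to follow.

```python
def growth_rate(avg_entry_bytes: float, appends_per_s: float) -> float:
    """Log growth in bytes per second."""
    return avg_entry_bytes * appends_per_s


def seconds_to_exhaustion(budget_bytes: float, used_bytes: float,
                          growth_bytes_per_s: float) -> float:
    """Time until the log fills its budget at a constant growth rate."""
    if growth_bytes_per_s <= 0:
        return float("inf")      # log is not growing
    return max(0.0, (budget_bytes - used_bytes) / growth_bytes_per_s)


GIB = 2 ** 30
g = growth_rate(avg_entry_bytes=1024, appends_per_s=500)   # 512000 bytes/s
t = seconds_to_exhaustion(100 * GIB, 40 * GIB, g)
print(f"growth {g:.0f} B/s, about {t / 3600:.1f} hours of headroom")
```

Under these assumed values the log would fill its budget in roughly a day and a half, which is why the snapshot interval has to be set against growth rate rather than against disk size alone.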
Boundary With Consensus and CDC
Consensus algorithms decide which entries are committed and in what order. The replicated log stores and replays those entries. Change data capture may read a database log and publish changes downstream, but it does not necessarily provide the same command-ordering and commit-index contract as a consensus log.
A replicated log can also support linearizable reads by proving that a read observes at least a known commit index.
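One way to express the read check, sketched under assumptions, is to capture the commit index when the read is accepted and serve the read only after the replica has applied at least that index. `ReadRequest` and `try_read` are hypothetical names, and the sketch omits the step where the serving node confirms it still holds authority.

```python
from dataclasses import dataclass


@dataclass
class ReadRequest:
    key: str
    read_index: int   # commit index captured when the read was accepted


def try_read(state: dict, applied_index: int, req: ReadRequest):
    """Return (True, value) once the replica has applied up to read_index."""
    if applied_index < req.read_index:
        return (False, None)     # replica is behind: wait or redirect
    return (True, state.get(req.key))


state = {"x": 1}
req = ReadRequest(key="x", read_index=5)
print(try_read(state, applied_index=4, req=req))
print(try_read(state, applied_index=5, req=req))
```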
Recovery
Recovery should restore the last durable log prefix, verify checksums, replay committed entries, discard invalid speculative entries, install snapshots safely and resume from a known commit index. A follower that rejoins after a long outage may need snapshot transfer rather than entry-by-entry catch-up.
The recovery rule should define what happens when local entries conflict with the leader’s log, when snapshots are corrupt, when schema versions changed and when disk contains entries beyond the known commit index.
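The conflict case can be sketched as follows: keep the longest prefix that matches the leader, refuse to truncate anything at or below the commit index, and adopt the leader's suffix. The `reconcile` function is hypothetical, and it assumes the leader's log already contains every committed entry.

```python
def reconcile(local: list, leader: list, commit_index: int) -> list:
    """Repair a follower log against the leader's log.

    Entries are (index, term, cmd) tuples and position p holds index p + 1.
    """
    keep = 0
    for mine, theirs in zip(local, leader):
        if mine[:2] != theirs[:2]:   # same index but different term: conflict
            break
        keep += 1
    conflict = keep < min(len(local), len(leader))
    if conflict and keep < commit_index:
        # The first conflicting index is keep + 1, which is committed.
        raise ValueError("conflict inside the committed prefix")
    # Drop the divergent local suffix and take the leader's entries.
    return local[:keep] + leader[keep:]


local = [(1, 1, "a"), (2, 1, "b"), (3, 2, "x")]
leader = [(1, 1, "a"), (2, 1, "b"), (3, 3, "c"), (4, 3, "d")]
repaired = reconcile(local, leader, commit_index=2)
print(repaired)
```

Raising on a conflict inside the committed prefix turns a silent safety violation into a visible recovery failure that an operator or test can catch.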
Validation
Validation should include leader crash after append, crash after commit, follower restart, divergent follower logs, snapshot install, compaction during reads, checksum failure, disk-full behavior, slow follower catch-up, duplicate client command, out-of-order delivery and mixed-version replay.
Useful evidence includes append latency, commit latency, fsync time, follower lag, applied index, snapshot age, compaction duration, replay time, truncation count, checksum failures and invariant checks after recovery.
Failure Modes
Common failure modes include applying uncommitted entries, truncating committed entries, serving reads from a follower behind the required commit index, compacting entries before snapshots are durable, losing client request ids during replay, replaying non-idempotent commands twice, accepting corrupted snapshots and monitoring only leader health while followers fall behind.
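The double-replay failure mode can be guarded against by recording the result for each client request id in the state machine itself. The sketch below uses a hypothetical `Account` state machine; it assumes request ids are unique per client command and survive in the log and in snapshots.

```python
class Account:
    def __init__(self):
        self.balance = 0
        self.results = {}   # client request id -> result of first application

    def apply(self, request_id: str, amount: int) -> int:
        """Apply a deposit once; replays return the recorded result."""
        if request_id in self.results:
            return self.results[request_id]
        self.balance += amount
        self.results[request_id] = self.balance
        return self.balance


acct = Account()
first = acct.apply("req-1", 10)
replayed = acct.apply("req-1", 10)   # duplicate delivery or replay
second = acct.apply("req-2", 5)
print(first, replayed, second, acct.balance)
```

Because the deduplication table is part of the replicated state, every replica reaches the same decision about a duplicate, provided snapshots include the table.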
A replicated log is credible only when the system can prove what entry is committed, what entry each replica has applied, and how recovery preserves a valid committed prefix.