Glossary term
Conflict-Free Replicated Data Type
Engineering definition of a CRDT covering eventual convergence, state-based and operation-based replication, merge laws, metadata cost, conflict boundaries and validation.
Definition
conceptA conflict-free replicated data type, or CRDT, is a replicated data structure whose update and merge rules are designed so replicas can converge without a central coordinator.
CRDTs are used in collaborative editors, replicated databases, offline-first applications, distributed caches, edge systems and eventually consistent services when availability during partition is more important than immediate global agreement. A useful CRDT design states the data model, replica identity, state or operation semantics, merge laws, delivery assumptions, metadata bounds, garbage-collection rule, invariant limits and validation evidence.
A conflict-free replicated data type, or CRDT, is a replicated data structure whose update and merge rules are designed so replicas can converge without a central coordinator. It is useful when users, devices or services must keep accepting local updates during network partitions or disconnected operation.
CRDTs do not make every business rule conflict-free. They make a selected data structure merge safely under stated assumptions. The engineering task is to choose the right algebra, bound the metadata cost and prove that convergence still preserves the invariants that matter.
Convergence Contract
For replicas with states:
strong eventual convergence means that replicas which have incorporated the same set of updates:
eventually expose the same logical value:
for any pair:
after delivery and merge complete.
Merge Laws
State-based CRDTs normally rely on a merge operator:
that is commutative:
associative:
and idempotent:
These properties make repeated, delayed and differently ordered state exchange converge to the same result.
State-Based Replication
In a state-based CRDT, a replica sends state or a delta state to another replica. The receiver merges:
Full-state synchronization is simple but can be expensive. If each state payload is:
bytes and synchronization runs at:
exchanges per second across:
peers, the rough network rate is:
Delta-state designs reduce payload size but add more rules for causal context, retention and duplicate handling.
Operation-Based Replication
In an operation-based CRDT, replicas broadcast operations rather than full state. A delivered operation may be represented as:
where id supports duplicate suppression and deps records causal prerequisites. Concurrent operations must commute or be transformed into a form that makes their final effect independent of delivery order:
This design can reduce bandwidth, but it depends more heavily on reliable delivery, replay, causal ordering or an idempotent operation log.
Counter Example
A grow-only counter keeps one component per replica:
Replica:
increments only:
The value is:
and merge is component-wise maximum:
A positive-negative counter uses two grow-only counters:
so increments and decrements remain monotonic internally even when the visible value can move down.
Set Example
An observed-remove set stores add tags and remove evidence. An element:
is visible when at least one add tag for x has not been removed:
This rule avoids a simple “remove wins forever” mistake, but it creates tombstone and tag-retention questions. A set CRDT is not complete until it defines when remove metadata can be compacted without resurrecting deleted elements.
Boundary With Consensus
CRDTs trade immediate global agreement for local availability and later convergence. Consensus protocols choose one ordered decision; CRDTs define a data type where many orders lead to the same result. A CRDT can be a good fit for likes, counters, presence, editable text regions, shopping carts or replicated metadata. It is a poor fit for exclusive ownership, unique serial numbers, financial debit constraints, scarce inventory or actuator authority unless the invariant is enforced elsewhere.
Quorum and leader election may still exist around a CRDT system for membership, durability, garbage collection or high-risk commands. The CRDT does not remove the need to define authority boundaries.
Metadata Cost
CRDT metadata often grows with replica count, update count or delete history. If a design stores:
replica counters at:
bytes each, counter metadata is:
For:
the counter metadata is:
before payload, identifiers and serialization overhead.
If tombstones grow at:
bytes per second and the remaining metadata budget is:
the time to budget exhaustion is:
For a metadata limit of:
current metadata of:
and tombstone growth of:
the remaining time is:
Validation
Validation should exercise partitions, delayed delivery, duplicate delivery, reordered operations, replica restarts, snapshot restore, metadata compaction, old-client compatibility and mixed-version deployments. Property tests are especially useful because merge laws can be checked directly.
Useful assertions include:
The test suite should also include domain invariants. A cart CRDT may converge but still violate pricing rules. A collaborative editor may converge but still produce unacceptable text structure after concurrent edits.
Failure Modes
Common failure modes include using last-writer-wins where user intent matters, treating timestamp order as causal order, losing operation identifiers during replay, compacting tombstones before all replicas have observed them, allowing unbounded metadata growth, ignoring replica identity reuse, assuming eventual convergence protects scarce resources, and testing only two replicas when production has many versions and offline clients.
A production CRDT needs operational evidence: convergence latency, metadata size, compaction age, duplicate-operation rate, unresolved semantic conflicts, anti-entropy success, stale-client behavior and recovery tests after partitioned writes.