Glossary term

Serializability

Engineering definition of serializability covering transaction isolation, serial order, conflict graphs, write skew, strict serializability and validation.

Definition

concept

Serializability is a transaction isolation property in which the result of concurrent transactions is equivalent to some serial execution of those transactions.

Serializability appears in databases, transaction managers, distributed storage, reservation systems, financial systems, inventory services and concurrent state machines when transactions must preserve invariants despite overlap. A useful engineering claim states the transaction boundary, read and write sets, conflict rule, isolation level, retry behavior, distributed commit boundary, anomaly tests and validation evidence.

Serializability is a transaction isolation property in which the result of concurrent transactions is equivalent to some serial execution of those transactions. It lets engineers reason as if transactions ran one at a time, even when the implementation overlaps reads, writes, locks, validation and commits.

The property matters whenever invariants span more than one record, row, object, account, reservation, inventory item or configuration value. A system can avoid low-level race conditions and still violate a business invariant if its transaction isolation is too weak.

Transaction History

Let a concurrent history be:

H

over transactions:

T_1,T_2,\ldots,T_n

The history is serializable if there exists a serial history:

S

with the same transaction effects and compatible observations:

H\equiv S

The serial order does not have to match real-time order unless the system claims strict serializability.

Read and Write Sets

Each transaction has a read set:

R_i

and write set:

W_i

Conflicts occur when two transactions access the same item and at least one access is a write:

W_i\cap R_j\neq\varnothing

or:

W_i\cap W_j\neq\varnothing

These overlaps are where isolation control must preserve a valid serial order.

Conflict Graph

Conflict serializability can be checked with a precedence graph:

G_c=(V,E)

where vertices are transactions and edges represent ordered conflicts. If:

T_i\rightarrow T_j

then T_i must appear before T_j in the equivalent serial order.

A history is conflict-serializable when:

G_c

is acyclic. A cycle means the observed conflict order cannot be explained by any serial execution.

For:

n=6

transactions, the maximum undirected pair count to screen for potential pairwise conflicts is:

\displaystyle \frac{n(n-1)}{2}=\frac{6\cdot5}{2}=15

Actual conflict analysis still depends on operation order and read/write sets.

Write Skew

Snapshot isolation can prevent some lost updates but still allow write skew. Two transactions read the same invariant, update different rows, and both commit. Each transaction looked valid in its snapshot, but the combined result violates the invariant.

For example, if at least one operator must remain on duty, two transactions can both observe two operators on duty, each mark a different operator off duty, and commit. No single row was overwritten, but the invariant is broken.

Strict Serializability

Strict serializability combines serializability with real-time order. If transaction:

T_a

commits before:

T_b

starts, strict serializability requires:

T_a<T_b

in the serial order. This is closer to linearizability for transactions. Plain serializability may allow a serial order that does not preserve that real-time relation.

Boundary With Two-Phase Commit

Two-phase commit can coordinate whether participants commit or abort one distributed transaction. It does not by itself prove that concurrent transactions are serializable. Isolation still needs locks, validation, timestamp ordering, optimistic concurrency control, serializable snapshot isolation or another concurrency-control mechanism.

Atomic commit answers “all or nothing.” Serializability answers “as if one at a time.” A correct transaction system often needs both.

Retry policy is part of the contract. A serialization failure should normally be retried from the beginning with fresh reads, while an ambiguous commit outcome needs idempotency or transaction identity before retry. Replaying only the final write step can preserve neither atomicity nor serial order.

Conflict Load Screen

If transaction arrival rate is:

\lambda_t

average overlap window is:

T_o

and probability of a meaningful invariant conflict is:

p_c

then a rough expected conflict pressure is:

E[C]=\lambda_tT_op_c

For:

\lambda_t=40\ \text{transactions/s},\quad T_o=0.25\ \text{s},\quad p_c=0.03

the screen is:

E[C]=40\cdot0.25\cdot0.03=0.3

This is enough to justify conflict metrics and retry behavior in testing.

Validation

Validation should include lost-update tests, write-skew tests, predicate reads, range conflicts, phantom reads, concurrent inserts, long transactions, retries after serialization failure, distributed participants, timeout recovery and mixed isolation levels.

Useful evidence includes read/write sets, conflict graph traces, serialization failures, retry counts, lock wait, deadlock aborts, invariant checks, transaction duration, p95 and p99 commit latency and proof that accepted histories satisfy the declared isolation level.

Failure Modes

Common failure modes include assuming snapshot isolation is serializable, protecting only rows but not predicates, retrying non-idempotent transactions after ambiguous commit, mixing isolation levels without marking the weaker path, holding locks across slow external calls, and testing only single-row conflicts.

A credible serializability claim should state the transaction boundary, the anomaly set being excluded, the concurrency-control mechanism and the evidence that real workloads cannot violate the invariants that the transactions are meant to protect.

REF

See also