Glossary term
Compare-and-Swap
Engineering definition of compare-and-swap covering atomic update semantics, CAS retry cost, memory ordering, ABA risk, lock-free design and validation evidence.
Definition
methodCompare-and-swap is an atomic read-modify-write operation that updates a memory location only if its current value equals an expected value.
Compare-and-swap is used in operating systems, lock-free data structures, embedded firmware, reference counters, state machines and concurrent services when a design needs conditional update without holding a traditional mutex. A useful review states the protected invariant, expected value, desired value, retry policy, memory-ordering semantics, contention level, ABA protection, progress guarantee and validation evidence.
Compare-and-swap is an atomic read-modify-write operation that updates a memory location only if its current value equals an expected value. It is usually abbreviated as CAS.
CAS is used when several threads or cores may try to update the same state without a traditional mutex. It can protect a counter, pointer, state flag, reference count or queue link, but only if the surrounding invariant, memory ordering and retry behavior are correct.
Atomic Operation
A CAS operation can be written conceptually as:
where E is the expected value and D is the desired new value. It succeeds only when:
and then atomically writes:
If the current value is different, the operation fails and usually returns the observed value or a failure indication.
Protected Invariant
CAS does not protect arbitrary code. It protects one conditional transition. The design must state the invariant being updated, such as “advance this state from pending to running only once” or “replace this pointer only if no other thread changed it.”
If the invariant spans multiple fields, a single-word CAS may be insufficient. The design may need a lock, transaction, version tag, sequence counter, double-width CAS or single-owner queue.
Retry Cost
CAS is often used in a retry loop. If the probability that one attempt succeeds is:
then a simple expected attempt count is:
If one attempt costs:
the expected update time is:
For:
and:
the expected update time is:
If contention drops success probability to 0.25, the expected time becomes 360 ns.
Contention Limit
CAS can perform well at low contention and poorly when many workers repeatedly modify the same cache line. Failed attempts still consume memory bandwidth, coherence traffic and CPU cycles.
A first utilization screen for CAS retries is:
where f_cas is the logical update rate. This screen is optimistic if failed attempts invalidate peer caches or trigger backoff.
Memory Ordering
CAS atomicity is not the same as full memory visibility. The operation may need acquire, release or acquire-release semantics depending on what data must be visible before or after the state transition.
For a producer handoff, the intended relation may be:
For a consumer, the matching relation may be:
The language memory model, compiler, processor and device bus determine which primitive proves that relation.
ABA Risk
CAS compares values, not history. A location can change from A to B and back to A, causing a later CAS to succeed even though the object changed in between. This is the ABA problem.
Mitigations include version tags, sequence counters, hazard pointers, epoch reclamation, reference-count discipline, pointer tagging or designs that avoid reusing identifiers until observers are safe. The required mitigation depends on whether the value is a counter, pointer, descriptor, queue node or state flag.
Progress Limits
CAS-based code is often called lock-free, but a CAS loop can still retry many times under contention. Lock-free does not mean wait-free, starvation-free or deadline-safe. A real-time design must bound retry count or provide a fallback path.
For deadline work, a screen should include:
and verify:
Validation Evidence
Useful evidence includes the invariant, atomic width, memory-ordering mode, generated instruction, target architecture, cache-line placement, contention level, retry-count distribution, maximum retry count, ABA mitigation, randomized scheduling, stress test, fault injection, sanitizer output and trace evidence around failures.
The strongest test fails when CAS is replaced with a non-atomic read-modify-write, then passes with the intended CAS and memory ordering. It should also test the high-contention case, because correctness can pass while retry cost violates latency or throughput requirements.
Relationship To Neighbor Terms
Race condition is the failure mode CAS may prevent when the invariant fits an atomic conditional transition. Memory barrier defines ordering and visibility around the CAS. Lock contention is the cost CAS may avoid, while livelock and task starvation are risks when retries are uncontrolled. Sequence counters and idempotency keys can make stale observations or duplicate side effects detectable.