Glossary term

Read-Copy-Update

Engineering definition of read-copy-update covering read-side snapshots, copy-and-publish updates, grace periods, quiescent states, reclamation and validation.

Definition

Read-copy-update is a synchronization pattern in which readers access a published snapshot while writers copy, modify and publish a new version, then reclaim old versions only after a grace period.

Read-copy-update appears in operating systems, routing tables, configuration snapshots, telemetry maps, kernel data structures and read-mostly services when readers must be fast and updates can tolerate copy-and-publish cost. A useful design states the protected object, read-side critical section, publication pointer, update serialization rule, grace-period detection, quiescent-state definition, memory-ordering contract, reclamation backlog and validation evidence.

Read-copy-update is most useful for read-mostly data where reader latency matters more than immediate reclamation of old versions.

RCU-style designs also suit registry caches and control-plane data structures. The core tradeoff is that readers can be very cheap, while writers and reclamation become more complex.

Snapshot Rule

Let the currently published pointer be:

P_{cur}

and the new version prepared by a writer be:

P_{new}

The writer updates a copy, then publishes:

P_{cur}\leftarrow P_{new}

Readers that already hold the old pointer may continue using it. New readers see the new pointer after publication.
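The copy-and-publish step can be sketched in Python as a minimal model. The class name PublishedSnapshot, the routes example and the route values are hypothetical. The sketch assumes CPython, where rebinding a reference is a single atomic step and the garbage collector frees old versions, so it illustrates the snapshot rule only, not grace-period detection or memory ordering.

```python
import threading
from types import MappingProxyType


class PublishedSnapshot:
    """Holds one published, read-only version (P_cur) of a mapping."""

    def __init__(self, initial):
        self._cur = MappingProxyType(dict(initial))  # P_cur
        self._writer_lock = threading.Lock()         # serializes writers only

    def read(self):
        # Readers take no lock: they only load the current reference.
        return self._cur

    def update(self, mutate):
        with self._writer_lock:
            draft = dict(self._cur)        # copy the published version
            mutate(draft)                  # modify the private copy
            new = MappingProxyType(draft)  # P_new, fully built before publish
            self._cur = new                # publish: P_cur <- P_new
            return new


routes = PublishedSnapshot({"10.0.0.0/8": "eth0"})
old_view = routes.read()   # a reader that started before the update
routes.update(lambda d: d.update({"10.0.0.0/8": "eth1"}))
new_view = routes.read()   # a reader that started after the update
print(old_view["10.0.0.0/8"], new_view["10.0.0.0/8"])  # prints "eth0 eth1"
```

The early reader keeps a consistent old view while later readers see the new one. The writer lock excludes only other writers, never readers.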

Read-Side Critical Section

A read-side critical section should be short and should not block indefinitely. If reader duration is:

T_r

then the reclamation delay is bounded only when:

T_r\leq T_{r,max}

under the stated workload. A reader that stalls inside the read-side section can delay reclamation for every old version waiting behind it.
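One way to gather evidence for the bound is to time each read-side section against a stated budget. The following is a minimal sketch. The ReaderBudget class, its fields and the 5 ms budget are assumptions made for illustration, and the injected fake clock exists only to make the example deterministic.

```python
import time
from contextlib import contextmanager


class ReaderBudget:
    """Records read-side section durations against a budget T_r_max (seconds)."""

    def __init__(self, t_r_max, clock=time.perf_counter):
        self.t_r_max = t_r_max
        self.clock = clock
        self.longest = 0.0
        self.violations = 0

    @contextmanager
    def section(self):
        start = self.clock()
        try:
            yield
        finally:
            elapsed = self.clock() - start
            self.longest = max(self.longest, elapsed)
            if elapsed > self.t_r_max:
                self.violations += 1  # this reader exceeded T_r_max


# Fake clock: the first section lasts 1 ms, the second lasts 0.5 s.
ticks = iter([0.0, 0.001, 1.0, 1.5])
budget = ReaderBudget(t_r_max=0.005, clock=lambda: next(ticks))

with budget.section():
    pass  # short read
with budget.section():
    pass  # stalled read

print(budget.violations, budget.longest)  # prints "1 0.5"
```

In a real service the default clock would be used, and the violation count would feed whatever monitor watches for stalled readers.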

Grace Period

An old version cannot be freed until all readers that could have seen it have passed through a quiescent state. Let grace-period duration be:

T_g

The safe reclamation condition for version:

V_k

is:

\text{all pre-existing readers of }V_k\text{ have quiesced}

The engineering issue is not just correctness. Long grace periods retain memory, increase update lag and can hide stalled readers.
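The reclamation condition can be modeled with a simple epoch counter. This is a single-threaded sketch of one possible grace-period scheme, not the method of any particular kernel or library. The GraceTracker class and its method names are hypothetical, and a real implementation would need atomic operations and per-thread state.

```python
import itertools


class GraceTracker:
    """Toy epoch-based model of grace-period detection."""

    def __init__(self):
        self.epoch = 0
        self.active = {}    # reader id -> epoch observed on entry
        self.retired = []   # (epoch at retirement, version label)
        self._ids = itertools.count()

    def read_enter(self):
        rid = next(self._ids)
        self.active[rid] = self.epoch
        return rid

    def read_exit(self, rid):
        del self.active[rid]  # the reader has reached a quiescent state

    def retire(self, version):
        # Called after the replacement version has been published.
        self.retired.append((self.epoch, version))
        self.epoch += 1

    def reclaim(self):
        # A version retired at epoch e is safe once every active reader
        # entered at an epoch greater than e.
        oldest = min(self.active.values(), default=self.epoch)
        freed = [v for e, v in self.retired if e < oldest]
        self.retired = [(e, v) for e, v in self.retired if e >= oldest]
        return freed


gp = GraceTracker()
r1 = gp.read_enter()   # this reader may hold V0
gp.retire("V0")        # the writer has published V1 and retires V0
print(gp.reclaim())    # prints [] because r1 has not quiesced
gp.read_exit(r1)
print(gp.reclaim())    # prints ['V0']
```

The sketch also shows the cost noted above: one reader that never calls read_exit keeps every later retired version in the backlog.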

Reclamation Backlog

If update rate is:

\lambda_u

and each retired version consumes:

B_v

bytes until reclamation, then retained memory is approximately:

M_{ret}\approx \lambda_u T_g B_v

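This is Little's law applied to retired versions (arrival rate times residence time), so it assumes a roughly steady update rate. The term should be part of the capacity review. A design with excellent read latency can still fail by accumulating retired versions during burst updates or delayed quiescent states.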

Memory Ordering

Publication must make the initialized contents of the new version visible before readers follow the new pointer. A typical ordering requirement is:

V_{init}\rightarrow P_{publish}\rightarrow P_{load}\rightarrow V_{read}

The exact primitives depend on the language, processor and runtime. A pointer swap without the right release/acquire relationship can expose partially initialized data.

Relationship To Other Primitives

RCU is different from a read-write lock. Readers usually do not exclude writers from preparing and publishing a new version; they only delay reclamation of old versions. RCU is also different from a sequence counter: a sequence counter lets readers detect concurrent updates, while RCU lets readers safely keep using an older published object.

RCU can be used inside lock-free designs, but it is not just “lock-free”. The progress and safety claim depends on grace-period detection, memory reclamation and reader discipline.

When It Does Not Fit

RCU is a weak fit when updates must be visible immediately, when readers need to modify shared objects, when memory is too tight to retain old versions, or when the system cannot detect quiescent states reliably. It is also risky when stale snapshots can trigger unsafe decisions. In those cases, a read-write lock, monitor, sequence counter or explicit message handoff may be easier to validate.

Failure Modes

Common failure modes include freeing an old version before the grace period ends, readers blocking inside read-side sections, missing memory barriers, updating an object in place instead of copying it, unbounded retired-object backlog, stale data exceeding its allowed age, writer serialization bugs and tests that never force delayed readers.

The most dangerous mistake is to validate only functional correctness. RCU failures often appear as rare use-after-free bugs, stale-read decisions or memory growth under long-running workloads.

Worked Check

Suppose a routing snapshot is updated every:

2\ \text{ms}

so:

\lambda_u=500\ \text{updates/s}

Each retired snapshot is:

B_v=4096\ \text{bytes}

and measured grace-period duration is:

T_g=7\ \text{ms}

The approximate retained retired memory is:

M_{ret}=500(0.007)(4096)=14336\ \text{bytes}

That backlog is small (14 KiB). If a stalled reader stretches the grace period to:

T_g=0.4\ \text{s}

then:

M_{ret}=500(0.4)(4096)=819200\ \text{bytes}

That is 800 KiB, about 57 times the baseline, caused by a single stalled reader. The synchronization design has become a memory-capacity problem.
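The arithmetic above can be reproduced with a few lines of Python. The function name retained_bytes and its parameter names are illustrative, and the estimate assumes a steady update rate.

```python
def retained_bytes(update_rate_hz, grace_period_s, bytes_per_version):
    """Approximate retained memory: M_ret = lambda_u * T_g * B_v."""
    return update_rate_hz * grace_period_s * bytes_per_version


normal = retained_bytes(500, 0.007, 4096)   # T_g = 7 ms
stalled = retained_bytes(500, 0.4, 4096)    # T_g = 0.4 s

print(round(normal), round(stalled))  # prints "14336 819200"
```

A capacity review could run the same function with the worst measured grace period rather than the typical one.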

Validation Evidence

Useful evidence includes reader-duration traces, grace-period distributions, quiescent-state detection tests, memory-ordering review, retired-object backlog limits, stale-data age measurements, update serialization tests, delayed-reader fault injection, sanitizer evidence and p99 reader latency.

A strong RCU review states what data may be stale, how long stale data is acceptable, how reclamation is proven safe and what monitor detects a reader that prevents grace periods from completing.
