Exercise set
Concurrency Locking, Thread Pool, and Race Condition Exercises
Solved concurrency exercises for lock contention, thread pools, queues, semaphores, condition variables, race conditions, deadlock and release gates.
These exercises focus on shared-state concurrency: locks, semaphores, queues, thread pools, event loops, race conditions, deadlock and release gates. Operating-system scheduling and distributed-system consistency are handled in separate specialist exercise sets.
Use the numbers as engineering screens. A concurrency fix needs invariants, traces, stress tests, reproducibility evidence, rollback criteria and monitoring tied to the specific failure mode.
Release Evidence Notes
Concurrency evidence should prove the invariant under load, not only show that a test stopped failing once. Useful evidence includes contention histograms, trace spans, lock-order checks, queue depth, saturation alarms, race reproduction, mitigation proof and retest results.
Engineering Boundary Notes
These exercises assume the shared state, critical section, queue and worker pool are known. If external calls, storage transactions or distributed locks are involved, the release gate must also include network and consistency evidence.
Common Release Mistakes
- measuring average lock wait but ignoring tail contention;
- increasing thread count until CPU thrashes;
- treating a timeout as a race-condition fix;
- using a semaphore without proving downstream capacity;
- replacing a race with a global lock that causes convoying;
- validating a queue with no burst or cancellation test.
Scenario Map
| Scenario | Main calculation | Release decision |
|---|---|---|
| Lock contention | hold time, wait time and convoy risk | Split lock, reduce hold time or accept. |
| Worker pools | demand, blocked workers and saturation | Resize pool or shed load. |
| Queues | arrival, service and backlog growth | Add backpressure or capacity. |
| Race conditions | lost updates and invariant checks | Fix transition and retest. |
| Release gate | concurrency metrics and evidence | Deploy or hold release. |
Validation Package Checklist
- invariant definition and race reproduction case;
- lock map, lock order and critical-section timing;
- thread-pool size, blocked-worker count and queue depth;
- stress-test workload, seed, duration and trace sample;
- mitigation proof under old and new load profiles;
- rollback trigger and post-release concurrency alarms.
Exercise 1: Lock Hold Utilization
A mutex is acquired 900 times per second. Mean critical-section hold time is 0.42\ \text{ms}. Compute lock utilization.
Solution
Engineering Comment
At this utilization, tail wait can grow quickly under bursts. The lock should be profiled before adding more workers.
Plausibility Check
About one thousand holds per second at half a millisecond each is a large fraction of a second.
Exercise 2: Contention Wait Estimate
A single lock has utilization U_L=0.55 and mean hold time 0.30\ \text{ms}. Use the screening estimate:
Solution
Engineering Comment
This is only a queueing screen. Actual lock wait can be worse with priority inversion or long-tail critical sections.
Plausibility Check
Wait is slightly larger than hold time because utilization is above half.
Exercise 3: Lock Convoy Throughput
A global lock serializes a section that takes 2.5\ \text{ms}. What is the maximum throughput through that section?
Solution
Engineering Comment
Adding threads cannot exceed the serialized critical-section throughput.
Plausibility Check
Four operations fit in 10\ \text{ms}, so 400\ \text{per s} is consistent.
Exercise 4: Reader-Writer Lock Benefit
A workload has 80\% reads and 20\% writes. Read hold time is 0.20\ \text{ms} and write hold time is 0.70\ \text{ms}. Estimate mean exclusive-lock hold time if all operations use one mutex.
Solution
Engineering Comment
A reader-writer lock can help only if reads can truly run concurrently and writer starvation is controlled.
Plausibility Check
The mean is closer to the read time because reads dominate.
Exercise 5: Semaphore Capacity
A downstream API allows 35 concurrent requests. Current semaphore limit is 48. How many requests exceed the safe capacity?
Solution
Engineering Comment
A semaphore should represent the bottleneck capacity, not the caller’s desired parallelism.
Plausibility Check
The limit is above the API allowance, so excess is positive.
Exercise 6: Condition-Variable Wake Backlog
A consumer wakes every 25\ \text{ms} and processes 40 messages per wake. Producers add 2200 messages per second. Is backlog growing?
Solution
Consumer capacity:
Arrival exceeds service:
Backlog grows at 600\ \text{messages/s}.
Engineering Comment
The wake policy or consumer parallelism must change before release.
Plausibility Check
One wake every 25\ \text{ms} means 40 wakes per second.
Exercise 7: Spinlock CPU Burn
A spinlock wait loop consumes one core while waiting. Four threads each spin for 3\ \text{ms} every 100\ \text{ms}. Compute total core utilization burned by spinning.
Solution
Engineering Comment
Spinlocks are acceptable only for short, bounded waits. Millisecond-scale spins are usually a release smell.
Plausibility Check
Each thread burns 3\% of a core, four threads burn 12\%.
Exercise 8: Compare-And-Swap Retry Cost
A lock-free update costs 0.8\ \mu s per CAS attempt. Under contention, success probability per attempt is 0.40. Estimate expected time per successful update.
Solution
Expected attempts:
Expected time:
Engineering Comment
Lock-free code is not wait-free by default. Retry cost belongs in the throughput and latency budget.
Plausibility Check
Less than half success probability implies more than two attempts on average.
Exercise 9: Memory-Barrier Event Ordering
Producer writes data at time 10.0\ \mu s and publishes a flag at 10.2\ \mu s. Without a release barrier, a consumer may observe the flag before the data. What invariant must the barrier enforce?
Solution
The required order is:
The release/acquire pair must prevent the flag from becoming visible before the data.
Engineering Comment
This is a correctness requirement, not a performance optimization.
Plausibility Check
The flag announces the data, so data visibility must happen first.
Exercise 10: Ring-Buffer Occupancy
A ring buffer has 4096 slots. Arrival rate is 18000\ \text{items/s} and service rate is 17000\ \text{items/s} for a burst lasting 1.5\ \text{s}. Will it overflow if it starts empty?
Solution
Backlog added:
Since 1500<4096, it does not overflow during this burst.
Engineering Comment
This does not prove long-run stability because arrival exceeds service during the burst.
Plausibility Check
The burst adds about one third of the buffer.
Exercise 11: Thread-Pool Saturation
A pool has 32 workers. Average service time is 40\ \text{ms} and arrival rate is 650\ \text{requests/s}. Compute required busy workers.
Solution
The pool has 32 workers, so nominal headroom is 6 workers.
Engineering Comment
Nominal headroom can disappear if external calls block or tail service time rises.
Plausibility Check
Each request uses 0.04 worker-seconds, so hundreds per second need dozens of workers.
Exercise 12: Blocked Worker Fraction
In a 40-worker pool, telemetry shows 11 workers blocked on I/O. Compute blocked fraction.
Solution
Engineering Comment
Blocked workers reduce effective capacity. Async I/O, pool partitioning or backpressure may be needed.
Plausibility Check
About one quarter of the pool is blocked.
Exercise 13: Event-Loop Lag Budget
An event loop has target lag below 20\ \text{ms}. A synchronous handler runs for 14\ \text{ms} and a logging flush adds 9\ \text{ms}. Does the path pass?
Solution
Since 23>20\ \text{ms}, it fails.
Engineering Comment
Move blocking work off the event loop or bound it with an explicit budget.
Plausibility Check
The two synchronous parts exceed the lag target.
Exercise 14: Message-Queue Backpressure
A queue receives 1200 messages per second and drains 950 messages per second. The alert threshold is 15000 queued messages. Starting empty, estimate time to alert.
Solution
Engineering Comment
Backpressure should act before the alert threshold if the downstream service cannot catch up.
Plausibility Check
At 250 extra messages each second, 15000 takes one minute.
Exercise 15: Lost Update Race
Two threads read counter value 100. Each increments locally and writes back. What final value appears under a lost-update race, and what value is correct?
Solution
Both write:
Observed final value can be 101, while the correct serialized result is:
Engineering Comment
The invariant is not “the program did not crash”; it is that every accepted update is represented exactly once.
Plausibility Check
Two increments should add two, not one.
Exercise 16: Deadlock Timeout Screen
Two locks are acquired in opposite order by two code paths. A timeout releases one path after 2\ \text{s}. Does the timeout remove the deadlock design flaw?
Solution
No. The wait cycle still exists:
The timeout may break some incidents, but it does not prove safe ordering.
Engineering Comment
The engineering fix is a consistent lock order, lock hierarchy or nonblocking design.
Plausibility Check
Timeout changes duration, not the cyclic dependency.
Exercise 17: Livelock Retry Cap
Two workers retry a conflicting operation every 8\ \text{ms} with no backoff. A jittered backoff changes retry interval to a mean of 40\ \text{ms}. By what factor does retry pressure drop?
Solution
Retry rate is inversely proportional to interval:
Retry pressure drops by about 5\times.
Engineering Comment
Backoff helps only if the operation is idempotent or the retry state is safe.
Plausibility Check
Five times longer between retries means one fifth the pressure.
Exercise 18: Concurrency Release Gate
A concurrency fix leaves lock utilization at 37.8\%, thread-pool nominal headroom at 6 workers, event-loop lag at 23\ \text{ms} against 20\ \text{ms}, and a lost-update reproduction still fails once in 10^6 operations. Decide release status.
Solution
The event-loop gate fails:
The race-condition invariant is also not proven because the lost-update test still fails. Release should be held.
Engineering Comment
Performance headroom cannot compensate for an unproven shared-state invariant.
Plausibility Check
At least two hard gates fail, so the release decision is negative.