Exercise set

Algorithmic Performance and Data Structures Exercises

Solved algorithmic performance exercises for Big-O growth, data structures, amortization, memory footprint, latency, throughput and scaling decisions.

These exercises practise algorithmic performance and data-structure selection as engineering decisions. The goal is not to recite a growth class, but to connect operation count, memory layout, service time, queueing, benchmark evidence and release risk.

Assume simplified screening models unless an exercise states otherwise. Real systems should also validate input distributions, compiler and runtime versions, data layout, cache behavior, allocator behavior, concurrency level, instrumentation overhead and tail latency.

How to Use These Exercises

For each problem, state what the input size means, choose the relevant cost model, calculate the result, and then decide whether the result is usable evidence. A calculation that passes a mean throughput target can still fail a p99 latency, memory or worst-case requirement.

The recurring checks are:

  1. growth rate and measured constants;
  2. memory footprint and data movement;
  3. average, amortized and worst-case behavior;
  4. latency, throughput and queueing impact;
  5. release guard bands for growth, rare inputs and measurement uncertainty.

Release Evidence Notes

Algorithmic performance evidence should state the workload, input-size distribution, operation mix, data representation, runtime version, hardware class, compiler or interpreter settings, measurement method and release target. A benchmark is weak evidence if it only proves a happy-path mean time for a small synthetic input.

Data-structure release evidence should include memory footprint, locality, allocation pattern, resize behavior, worst-case or adversarial behavior, concurrency assumptions and tail-latency effect. A structure that passes average throughput may still be rejected when one resize, recursion depth, cache miss pattern or lock interaction violates the operational envelope.

Engineering Boundary Notes

These exercises use simplified growth, memory and queueing models. They do not replace production load testing, profiling, concurrency validation, cache analysis, garbage-collection review, adversarial-input testing or formal complexity proof. Big-O class, benchmark constants and queueing estimates should be interpreted inside the actual system boundary.

Common Release Mistakes

  • selecting an algorithm from asymptotic class alone while ignoring constants and data size;
  • accepting average latency while p95 or p99 latency fails;
  • ignoring resize pauses, recursion depth, allocator behavior and cache locality;
  • benchmarking sorted, tiny or synthetic data while production inputs are skewed;
  • comparing data structures without memory, update pattern and concurrency cost;
  • extrapolating beyond the measured input range without a growth guard.

Scenario Map

ScenarioExercisesPrimary checkEngineering decision
Growth and break-even analysis1, 2, 5, 8Big-O, constants, logarithms and measured scalingDecide whether an algorithm remains acceptable as data grows.
Data structures and memory footprint3, 4, 6, 7, 9, 10, 14, 15, 16Amortization, load factor, locality, graph storage, recursion and indexingChoose a representation that fits memory, latency and update constraints.
Service performance and deadlines11, 12, 13, 17Throughput, queue response, batch deadline and parallel speedupCheck whether the design has enough operational margin.
Release decision evidence18p99 latency, memory, resizing risk and benchmark limitsSelect the safer implementation even when its average looks slower.

Validation Package Checklist

  • workload, input-size range and operation mix are defined;
  • benchmark environment, runtime, build settings and instrumentation overhead are recorded;
  • mean, tail latency, throughput and memory footprint are separated;
  • growth extrapolation is checked against measured data or a defensible model;
  • resize, allocation, recursion and adversarial-input risks are bounded;
  • concurrency, cache locality and data movement assumptions are stated;
  • release decision names the accepted structure, rejected alternative and remaining monitoring trigger.

Exercise 1: Compare Growth at a Fixed Input Size

A review compares three implementations for a workload with n=1{,}000{,}000 records:

ImplementationCost model
A80n\ \text{ns}
B12n\log_2 n\ \text{ns}
C0.001n^2\ \text{ns}

Estimate the runtime of each implementation and rank them.

Solution

For n=1{,}000{,}000:

\log_2 n \approx 19.93

Implementation A:

T_A = 80 \times 1{,}000{,}000 = 80{,}000{,}000\ \text{ns}=80\ \text{ms}

Implementation B:

T_B = 12 \times 1{,}000{,}000 \times 19.93 \approx 239{,}160{,}000\ \text{ns}=239\ \text{ms}

Implementation C:

T_C = 0.001 \times (1{,}000{,}000)^2 = 1{,}000{,}000{,}000\ \text{ns}=1.0\ \text{s}

The ranking is A, then B, then C.

Engineering Comment

Big-O alone would say A is linear, B is linearithmic and C is quadratic. The constants matter at a fixed size, but they do not rescue the quadratic implementation once n grows. Implementation C may still be acceptable for small test data and still be the wrong production design.

Plausibility Check

The result is plausible because n^2 creates 10^{12} scaled operations, while n\log_2 n is only about 2.0 \times 10^7 scaled terms.

Exercise 2: Find a Break-Even Input Size

Algorithm A has cost:

T_A(n)=40n\log_2 n\ \text{ns}

Algorithm B has cost:

T_B(n)=0.004n^2\ \text{ns}

Estimate the input size where the two algorithms take the same time.

Solution

Set the two costs equal:

40n\log_2 n = 0.004n^2

For n>0, divide by n:

40\log_2 n = 0.004n

so:

\displaystyle \frac{n}{\log_2 n}=10{,}000

Try n=175{,}000:

\log_2(175{,}000)\approx 17.42
\displaystyle \frac{175{,}000}{17.42}\approx 10{,}045

The break-even point is therefore about:

n \approx 174{,}000 \text{ to } 175{,}000

Engineering Comment

Below the break-even point, the quadratic method can be competitive if its constant is very small. Above it, the linearithmic method dominates. A production decision should compare the break-even point with expected data growth, not only today’s benchmark.

Plausibility Check

At n=100{,}000, n/\log_2 n \approx 6{,}020, so the quadratic method may still be competitive. At n=200{,}000, the ratio is about 11{,}360, so the linearithmic method should be better.

Exercise 3: Amortized Cost of a Dynamic Array

A dynamic array starts with capacity 1024 and doubles whenever it fills. You append 50{,}000 elements. Count one write for each new appended element and one write for each copied element during resizing. Estimate the average number of element writes per append.

Solution

Resizes copy the old capacities:

1024+2048+4096+8192+16384+32768=64{,}512

New appended elements require:

50{,}000

Total element writes:

50{,}000+64{,}512=114{,}512

Average writes per append:

\displaystyle \frac{114{,}512}{50{,}000}=2.29

Engineering Comment

The occasional resize is expensive, but the average append cost stays constant. This is the meaning of amortized O(1) append. It is still not a real-time guarantee: one unlucky append can copy 32{,}768 elements.

Plausibility Check

The average is a little above two writes per append, which matches the usual doubling behavior. The final capacity is 65{,}536, so the copied total below one final capacity is plausible.

Exercise 4: Hash Table Load Factor and Resize Margin

A hash table has 131{,}072 slots and resizes at load factor 0.70. It currently stores 82{,}000 keys. Calculate the current load factor and how many additional keys can be inserted before resizing.

Solution

Current load factor:

\displaystyle \alpha=\frac{82{,}000}{131{,}072}\approx 0.626

Resize threshold:

0.70 \times 131{,}072 = 91{,}750.4

The largest integer count before exceeding the threshold is about 91{,}750. Remaining inserts:

91{,}750-82{,}000=9{,}750

Engineering Comment

The table is not full, but it has limited burst margin. If the workload can insert more than about 9{,}750 keys during a latency-critical window, capacity should be reserved before the burst or resizing should be moved out of the hot path.

Plausibility Check

A load factor of 0.626 is below 0.70, so the table should not resize immediately. The remaining capacity to the threshold is about 7.4\% of the table size, which is about 9700 slots.

Exercise 5: Binary Search Comparison Count

A sorted array contains 12{,}000{,}000 records. Estimate the maximum number of binary-search iterations and compare it with an average linear scan.

Solution

For binary search:

k=\left\lceil \log_2(n+1) \right\rceil

With n=12{,}000{,}000:

k=\left\lceil \log_2(12{,}000{,}001) \right\rceil = \left\lceil 23.52 \right\rceil = 24

An average successful linear scan checks about:

\displaystyle \frac{n}{2}=6{,}000{,}000

records.

Engineering Comment

Binary search is not just faster in average runtime. It also gives a predictable upper bound for this lookup model. That predictability is useful in latency budgets and embedded lookup tables.

Plausibility Check

2^{24}=16{,}777{,}216, which is enough to cover 12{,}000{,}000 records. The iteration count is therefore reasonable.

Exercise 6: Dense Records Versus Pointer-Heavy Nodes

A telemetry store needs 2{,}000{,}000 records. A dense array record uses 16 bytes. A linked-node representation stores the same payload plus an 8 byte pointer and 16 bytes of allocator metadata per node. Estimate both memory footprints.

Solution

Dense array:

M_{\text{array}}=2{,}000{,}000 \times 16=32{,}000{,}000\ \text{bytes}

Pointer-heavy node:

M_{\text{node}}=2{,}000{,}000 \times (16+8+16) =80{,}000{,}000\ \text{bytes}

Additional memory:

80{,}000{,}000-32{,}000{,}000=48{,}000{,}000\ \text{bytes}

Engineering Comment

The pointer-heavy structure uses 2.5 times the memory before considering poorer locality. It may still be justified for specific insertion patterns, but it should not be selected just because insertion into a linked list looks O(1) on paper.

Plausibility Check

The node adds 24 bytes of overhead to a 16 byte payload. The overhead is larger than the payload, so a large memory increase is expected.

Exercise 7: Sequential Scan Versus Pointer Chasing

A scan processes 10{,}000{,}000 samples of 8 bytes each. A cache line is 64 bytes. In a packed array, each cache line holds 8 samples. In a pointer-chasing layout, assume each sample effectively pulls a separate cache line. Estimate the cache-line traffic.

Solution

Packed array lines:

\displaystyle \frac{10{,}000{,}000}{8}=1{,}250{,}000\ \text{lines}

Packed traffic:

1{,}250{,}000 \times 64 = 80{,}000{,}000\ \text{bytes}

Pointer-chasing lines:

10{,}000{,}000\ \text{lines}

Pointer-chasing traffic:

10{,}000{,}000 \times 64=640{,}000{,}000\ \text{bytes}

Traffic ratio:

\displaystyle \frac{640}{80}=8

Engineering Comment

Both layouts can have the same O(n) complexity, but the memory traffic differs by a factor of eight in this simplified model. Data-structure selection must include locality and bandwidth, not only operation count.

Plausibility Check

The ratio equals the number of 8 byte samples that fit in one 64 byte cache line, so the result is internally consistent.

Exercise 8: Scaling a Sorting Benchmark

A sort implementation takes 2.4\ \text{s} for 1{,}000{,}000 records and is expected to scale as n\log_2 n. Estimate the runtime for 8{,}000{,}000 records. The release target is 20\ \text{s}.

Solution

Scale by:

\displaystyle \frac{8{,}000{,}000\log_2(8{,}000{,}000)} {1{,}000{,}000\log_2(1{,}000{,}000)}

Using:

\log_2(1{,}000{,}000)\approx 19.93

and:

\log_2(8{,}000{,}000)\approx 22.93

Ratio:

\displaystyle 8 \times \frac{22.93}{19.93}\approx 9.20

Runtime:

2.4 \times 9.20 \approx 22.1\ \text{s}

Engineering Comment

The estimate fails the 20\ \text{s} target before adding IO, allocation, cache effects or measurement uncertainty. The team should not approve the larger data size from the one-million-record benchmark alone.

Plausibility Check

The input grows by 8 times. An n\log n algorithm should grow by slightly more than 8 times, so 22.1\ \text{s} from 2.4\ \text{s} is plausible.

Exercise 9: Heap Priority Queue Operation Count

A scheduler processes 500{,}000 events. Each event requires one pop from a binary heap and two pushes. Estimate the heap-level comparison count if the heap size stays near 500{,}000 and each heap operation costs about \lceil \log_2 n \rceil comparisons.

Solution

Heap height estimate:

\left\lceil \log_2(500{,}000) \right\rceil = \left\lceil 18.93 \right\rceil =19

Operations per event:

1+2=3

Total comparisons:

500{,}000 \times 3 \times 19 = 28{,}500{,}000

If each comparison and movement step costs about 80\ \text{ns}:

28{,}500{,}000 \times 80\ \text{ns}=2.28\ \text{s}

Engineering Comment

The heap is probably acceptable for a batch scheduler, but the estimate may be too high for a hard real-time event path. The next review should check whether the event rate, deadline and jitter budget tolerate this operation count.

Plausibility Check

A heap with about half a million elements has a height near 19, so tens of millions of comparisons for 1.5 million heap operations is credible.

Exercise 10: Graph Storage Choice

A directed graph has 200{,}000 vertices and 1{,}200{,}000 edges. An adjacency-list representation stores one 4 byte destination and one 4 byte weight per edge, plus a 4 byte offset for each vertex and one final offset. Estimate the memory footprint and compare it with a bit-matrix representation.

Solution

Adjacency offsets:

(200{,}000+1)\times 4 \approx 800{,}004\ \text{bytes}

Edge records:

1{,}200{,}000 \times (4+4)=9{,}600{,}000\ \text{bytes}

Total adjacency-list memory:

10{,}400{,}004\ \text{bytes}

A bit matrix needs:

200{,}000^2 = 40{,}000{,}000{,}000\ \text{bits}

or:

5{,}000{,}000{,}000\ \text{bytes}

Engineering Comment

For a sparse graph, the adjacency list is vastly smaller. A bit matrix can give constant-time edge existence checks, but it is difficult to justify when it consumes about 5\ \text{GB} before weights and metadata.

Plausibility Check

The graph has only 1.2 million edges out of 40 billion possible directed pairs. It is sparse, so an adjacency list should be much smaller than a matrix.

Exercise 11: Throughput from Algorithmic Service Time

A service has six workers. Each request processes 50{,}000 items with measured service time:

s(n)=0.8\ \text{ms}+0.000002n\log_2 n\ \text{ms}

Estimate the maximum ideal throughput, ignoring queueing and coordination overhead.

Solution

For n=50{,}000:

\log_2 n \approx 15.61

Service time:

s=0.8+0.000002 \times 50{,}000 \times 15.61
s=0.8+1.561=2.361\ \text{ms}

A single worker can process:

\displaystyle \frac{1000}{2.361}\approx 423.5\ \text{requests/s}

Six workers can process ideally:

6 \times 423.5 \approx 2541\ \text{requests/s}

Engineering Comment

This is an upper bound, not a release claim. Queueing, memory bandwidth, locks, garbage collection, network overhead and p99 latency can reduce safe throughput. A production target should use a lower value with guard band.

Plausibility Check

A request takes a little over 2\ \text{ms} per worker, so one worker should handle a few hundred requests per second. Six workers near 2500\ \text{requests/s} is plausible.

Exercise 12: Queueing Impact of a Slower Algorithm

A single-threaded service receives \lambda=160 requests per second. Version A has mean service time 5\ \text{ms} and Version B has mean service time 3.5\ \text{ms}. Use the M/M/1 response-time estimate:

\displaystyle W=\frac{1}{\mu-\lambda}

where \mu=1/s. Compare the mean response time.

Solution

Version A:

s_A=0.005\ \text{s}
\displaystyle \mu_A=\frac{1}{0.005}=200\ \text{requests/s}
\displaystyle W_A=\frac{1}{200-160}=0.025\ \text{s}=25\ \text{ms}

Version B:

s_B=0.0035\ \text{s}
\displaystyle \mu_B=\frac{1}{0.0035}\approx 285.7\ \text{requests/s}
\displaystyle W_B=\frac{1}{285.7-160}\approx 0.00796\ \text{s}=8.0\ \text{ms}

Engineering Comment

Reducing service time by 30\% reduces mean response time much more than 30\% because Version A is close to saturation. Algorithmic improvements are especially valuable when utilization is high.

Plausibility Check

Version A has utilization 160/200=0.80, while Version B has utilization about 0.56. Lower utilization should produce much shorter queueing delay.

Exercise 13: Batch Deadline and Growth Margin

A batch job has fixed setup time 1.2\ \text{s} and per-record processing time 4.5\ \mu\text{s}. It must finish within 5.0\ \text{s}. Check whether 600{,}000 records pass, then estimate the record count where the deadline is reached.

Solution

Runtime for 600{,}000 records:

T = 1.2 + 600{,}000 \times 4.5 \times 10^{-6}
T = 1.2+2.7=3.9\ \text{s}

The margin is:

5.0-3.9=1.1\ \text{s}

Deadline count:

5.0 = 1.2+n \times 4.5 \times 10^{-6}
\displaystyle n=\frac{3.8}{4.5\times 10^{-6}}\approx 844{,}444

Engineering Comment

The current batch passes, but the margin is finite. If data growth can reach 900{,}000 records, the job will violate the deadline unless the algorithm, batching or hardware is improved.

Plausibility Check

600{,}000 records consume 2.7\ \text{s} of processing time, so it is reasonable that the 5\ \text{s} deadline is reached below 1 million records after setup is included.

Exercise 14: Membership Filter False-Positive Rate

A membership filter stores n=100{,}000 keys using m=1{,}200{,}000 bits and k=8 hash functions. Estimate the false-positive probability:

p\approx \left(1-e^{-kn/m}\right)^k

The target is p \le 1\%.

Solution

Compute:

\displaystyle \frac{kn}{m}=\frac{8\times 100{,}000}{1{,}200{,}000}=0.6667

Then:

e^{-0.6667}\approx 0.513

So:

p\approx (1-0.513)^8 = 0.487^8
p\approx 0.0032 = 0.32\%

Engineering Comment

The filter passes the false-positive target. It is still unsuitable if false positives cause unsafe actions, irreversible deletes or billing errors. A membership filter is a screening structure, not proof that a key exists.

Plausibility Check

The design uses 12 bits per key, which is a reasonable size for a sub-1% false-positive target with multiple hash functions.

Exercise 15: Recursion Stack Risk

A recursive algorithm uses 96 bytes of stack frame per call. The stack limit is 8\ \text{MiB}. Compare balanced recursion depth \lceil \log_2(100{,}000)\rceil with worst-case depth 100{,}000.

Solution

Balanced depth:

\left\lceil \log_2(100{,}000) \right\rceil = \left\lceil 16.61 \right\rceil =17

Balanced stack use:

17 \times 96 = 1632\ \text{bytes}

Worst-case stack use:

100{,}000 \times 96 = 9{,}600{,}000\ \text{bytes}

The stack limit is:

8\ \text{MiB}=8 \times 1024^2=8{,}388{,}608\ \text{bytes}

Worst-case recursion exceeds the limit.

Engineering Comment

The same algorithm may be safe on balanced data and unsafe on adversarial or already ordered data. Release evidence should include worst-case input shape or use an iterative implementation, depth guard or robust pivot strategy.

Plausibility Check

About 96 bytes times 100{,}000 calls is close to 10\ \text{MB}, so exceeding an 8\ \text{MiB} stack is expected.

Exercise 16: Index Lookup Versus Linear Scan

A table has 4{,}000{,}000 records. A linear scan costs 30\ \text{ns} per record. An index lookup costs four random reads of 80\ \mu\text{s} each plus 20\ \mu\text{s} of CPU work. The service receives 500 lookups per second. Compare the two approaches.

Solution

Linear scan:

T_{\text{scan}}=4{,}000{,}000 \times 30\ \text{ns} =120{,}000{,}000\ \text{ns}=120\ \text{ms}

Index lookup:

T_{\text{index}}=4 \times 80\ \mu\text{s}+20\ \mu\text{s} =340\ \mu\text{s}

At 500 lookups per second, scan CPU-equivalent time would be:

500 \times 120\ \text{ms}=60\ \text{s/s}

Index time would be:

500 \times 0.34\ \text{ms}=170\ \text{ms/s}

Engineering Comment

The index is clearly required for this lookup rate. The remaining engineering question is not whether to index, but how to control index update cost, cache behavior, storage reliability and tail latency.

Plausibility Check

A 120\ \text{ms} scan cannot support 500 lookups per second on one service path. A sub-millisecond index lookup is consistent with the random-read model.

Exercise 17: Parallel Speedup Limit

A data-processing algorithm has parallel fraction p=0.92. Estimate the ideal speedup on 8 workers using Amdahl’s law:

\displaystyle S=\frac{1}{(1-p)+p/N}

If the single-worker runtime is 41\ \text{s} and the release target is 7\ \text{s}, does ideal parallelism pass?

Solution

For p=0.92 and N=8:

\displaystyle S=\frac{1}{0.08+0.92/8}
\displaystyle S=\frac{1}{0.08+0.115}=5.13

Ideal runtime:

\displaystyle T_8=\frac{41}{5.13}\approx 8.0\ \text{s}

Engineering Comment

Even ideal eight-worker execution fails the 7\ \text{s} release target. More workers alone may not be enough; the serial fraction, data movement, synchronization or algorithmic structure must improve.

Plausibility Check

With 8\% serial work, the maximum speedup even with infinite workers is 1/0.08=12.5. On only 8 workers, a speedup around 5 is plausible.

Exercise 18: Release Decision Between Two Data Structures

A service can use either a hash table or a sorted vector for a lookup-heavy path. Measurements are:

CandidateMean lookupp99 lookupMemoryRisk note
Hash table2\ \text{ms}35\ \text{ms}700\ \text{MB}resize and collision bursts
Sorted vector9\ \text{ms}12\ \text{ms}360\ \text{MB}rebuild during maintenance window

The release gate is p99 lookup \le 20\ \text{ms} and memory \le 512\ \text{MB}. Choose the safer candidate.

Solution

Check the hash table:

p99=35\ \text{ms} > 20\ \text{ms}

and:

700\ \text{MB} > 512\ \text{MB}

The hash table fails both gates.

Check the sorted vector:

p99=12\ \text{ms} \le 20\ \text{ms}

and:

360\ \text{MB} \le 512\ \text{MB}

The sorted vector passes both gates.

Engineering Comment

The safer release choice is the sorted vector, even though its mean lookup is slower. Engineering performance decisions should be tied to explicit gates: p99 latency and memory capacity matter more here than average latency.

Plausibility Check

The sorted vector has a higher mean but tighter tail and lower memory footprint. That is a credible tradeoff when data is mostly read-only and rebuilds can happen outside the hot path.

REF

See also