Experiment 118: FIFO dispatch waiters with counter gate

Date: 2026-05-01

Status: In Review

Direction: stream-rerun-dispatch

Problem

Experiment 114 showed that the old shared-completer reader-pool wakeup could be
structurally wasteful: one worker-free event woke every parked dispatcher, one
caller won the slot, and the rest scanned the pool and re-parked. That result
was rejected after rebasing because the release stream workloads stopped
exercising the parked-dispatcher path once Experiment 106 elided most stream
re-runs before reader-pool admission.
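The wake pattern described above can be reproduced in a few lines. This is a minimal sketch, not the real ReaderPool: SharedCompleterPool, wakeRetries, and countBaselineRetries are hypothetical names, and the driver assumes workers free up one at a time with the event loop draining between releases.

```dart
import 'dart:async';

/// Hypothetical model of the old dispatch path: every parked dispatcher
/// awaits the same completer. Not the real ReaderPool.
class SharedCompleterPool {
  SharedCompleterPool(this.freeWorkers);

  int freeWorkers;
  int wakeRetries = 0; // dispatchers that woke, found no slot, and re-parked
  Completer<void>? _workerAvailable;

  Future<void> acquire() async {
    var wokenBefore = false;
    while (freeWorkers == 0) {
      if (wokenBefore) wakeRetries++; // lost the race after a wakeup
      final shared = _workerAvailable ??= Completer<void>();
      await shared.future; // park on the shared completer
      wokenBefore = true;
    }
    freeWorkers--; // won the slot
  }

  void release() {
    freeWorkers++;
    final shared = _workerAvailable;
    _workerAvailable = null;
    shared?.complete(); // one worker-free event wakes every parked dispatcher
  }
}

/// Hypothetical driver: starts [callers] dispatchers, then frees one worker
/// at a time and lets all woken dispatchers run before the next release.
Future<int> countBaselineRetries({
  required int poolSize,
  required int callers,
}) async {
  final pool = SharedCompleterPool(poolSize);
  for (var i = 0; i < callers; i++) {
    unawaited(pool.acquire());
  }
  final parked = callers > poolSize ? callers - poolSize : 0;
  for (var i = 0; i < parked; i++) {
    pool.release();
    await Future<void>.delayed(Duration.zero); // drain pending wakeups
  }
  return pool.wakeRetries;
}
```

Under these assumptions, countBaselineRetries(poolSize: 4, callers: 8) returns 6: the first release wakes four dispatchers and three re-park, the second wakes three and two re-park, and so on.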

Experiment 115 closed the measurement gap by adding profile-only counters for
parked dispatchers, wake retries, and max parked concurrency. That makes it
possible to evaluate the same dispatch-policy change on counters first, rather
than relying on a noisy wall-time delta.
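For orientation, profile-only counters of this kind could be shaped roughly as follows. This is a sketch under assumptions: the RESQLITE_PROFILE define comes from the profile command in this write-up, and dispatcherWakeRetryTotal and dispatcherMaxParkedConcurrent are named in it, but DispatcherCounters, dispatcherParkedTotal, the hook methods, and the use of bool.fromEnvironment as the gate are all hypothetical.

```dart
/// Assumed gate: true only when run with -DRESQLITE_PROFILE=true.
const bool kProfileEnabled = bool.fromEnvironment('RESQLITE_PROFILE');

/// Hypothetical counter holder; not the project's actual profiling API.
class DispatcherCounters {
  DispatcherCounters({this.enabled = kProfileEnabled});

  final bool enabled;
  int dispatcherParkedTotal = 0; // hypothetical name: every park, incl. re-parks
  int dispatcherWakeRetryTotal = 0; // wakeups that found no free worker
  int dispatcherMaxParkedConcurrent = 0; // deepest parked queue observed
  int _parkedNow = 0;

  /// Call when a dispatcher parks (first park or re-park).
  void onPark() {
    if (!enabled) return;
    dispatcherParkedTotal++;
    _parkedNow++;
    if (_parkedNow > dispatcherMaxParkedConcurrent) {
      dispatcherMaxParkedConcurrent = _parkedNow;
    }
  }

  /// Call when a parked dispatcher is woken.
  void onWake() {
    if (!enabled) return;
    _parkedNow--;
  }

  /// Call when a woken dispatcher finds no free worker and must re-park.
  void onWakeRetry() {
    if (!enabled) return;
    dispatcherWakeRetryTotal++;
  }
}
```

With the gate off, every hook returns immediately, so the counters cost one branch per call outside profile runs.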

Hypothesis

Replacing the single shared dispatch completer with FIFO one-shot waiters should
turn each worker-free event into one resumed dispatcher. On a workload that
exceeds the four-reader pool, dispatcherWakeRetryTotal should drop to zero
while dispatcherMaxParkedConcurrent still proves the workload reached the
parked path.

Accept for PR review if:

Approach

ReaderPool now keeps a Queue<Completer<void>> of parked dispatch waiters
instead of one shared _workerAvailable completer. Each worker-free event
resumes only the oldest waiter, which eliminates re-park wake amplification.

Counter comments and the dispatcher profile harness wording were updated so
they describe both the old shared-completer baseline and the new FIFO behavior.
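The waiter queue described in this section can be sketched as below. It is a minimal illustration, not the production diff: FifoWaiterPool and resumeOrder are hypothetical names, and the sketch assumes the freed worker is handed directly to the oldest waiter, which this write-up does not confirm.

```dart
import 'dart:async';
import 'dart:collection';

/// Hypothetical FIFO dispatch gate: one one-shot completer per parked
/// dispatcher. Not the real ReaderPool.
class FifoWaiterPool {
  FifoWaiterPool(this.freeWorkers);

  int freeWorkers;
  final Queue<Completer<void>> _waiters = Queue<Completer<void>>();

  int get parkedNow => _waiters.length;

  Future<void> acquire() async {
    if (freeWorkers > 0) {
      freeWorkers--;
      return;
    }
    final waiter = Completer<void>();
    _waiters.addLast(waiter); // park at the tail
    await waiter.future; // resumed only when a slot is handed over
  }

  void release() {
    if (_waiters.isNotEmpty) {
      // Assumed policy: hand the slot straight to the oldest waiter, so the
      // woken dispatcher never has to re-check the pool or re-park.
      _waiters.removeFirst().complete();
    } else {
      freeWorkers++;
    }
  }
}

/// Hypothetical driver: records the order in which dispatchers get a slot
/// when workers free up one at a time.
Future<List<int>> resumeOrder({
  required int poolSize,
  required int callers,
}) async {
  final pool = FifoWaiterPool(poolSize);
  final order = <int>[];
  for (var i = 0; i < callers; i++) {
    unawaited(pool.acquire().then((_) => order.add(i)));
  }
  await Future<void>.delayed(Duration.zero); // let unparked callers finish
  final parked = pool.parkedNow;
  for (var i = 0; i < parked; i++) {
    pool.release();
    await Future<void>.delayed(Duration.zero); // exactly one waiter resumes
  }
  return order;
}
```

Because each release completes exactly one completer, there is no code path on which a woken dispatcher finds the pool empty, so a retry counter would stay at zero by construction in this model.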

Results

Profile-mode command:

 /Users/dan/Coding/flutter_arm64/bin/dart run -DRESQLITE_PROFILE=true benchmark/profile/dispatcher_park_profile.dart 

Median of three full harness runs:

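| concurrency | baseline parked | FIFO parked | baseline retries | FIFO retries | baseline max parked | FIFO max parked | baseline wall | FIFO wall |
|---|---|---|---|---|---|---|---|---|
| 8 | 10 | 4 | 6 | 0 | 4 | 4 | 0.34 ms | 0.32 ms |
| 16 | 78 | 12 | 66 | 0 | 12 | 12 | 0.78 ms | 0.60 ms |
| 32 | 406 | 28 | 378 | 0 | 28 | 28 | 1.20 ms | 1.28 ms |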

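The direct signal is clean: FIFO retries are zero at every measured
concurrency, while baseline retries grow with the concurrency - pool_size
shape (6, 66, and 378 retries at parked depths of 4, 12, and 28). Max parked is
identical for baseline and FIFO at each level, so the candidate exercised the
same parked-dispatcher depth.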
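As a cross-check, the baseline counts in the profile table are consistent with a simple model. The model is an assumption, not something the harness reports: with p = concurrency - pool_size dispatchers parked and workers freeing one at a time, each worker-free event wakes all remaining parked dispatchers and all but one of them re-park.

```latex
\begin{align*}
p &= \text{concurrency} - \text{pool\_size} \\
\text{retries}(p) &= \sum_{k=1}^{p} (k - 1) = \frac{p(p-1)}{2} \\
\text{parked}(p) &= p + \text{retries}(p) = \frac{p(p+1)}{2} \\[4pt]
p = 4:&\quad \text{retries} = 6, \quad \text{parked} = 10 \\
p = 12:&\quad \text{retries} = 66, \quad \text{parked} = 78 \\
p = 28:&\quad \text{retries} = 378, \quad \text{parked} = 406
\end{align*}
```

All six values match the baseline columns, and the FIFO parked column equals p itself, which is what one park per dispatcher and no re-parks would produce.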

Release concurrent-read command:

 /Users/dan/Coding/flutter_arm64/bin/dart run benchmark/suites/concurrent_reads.dart 

Single-pass release medians for resqlite:

| concurrency | baseline wall | FIFO wall | delta |
|---|---|---|---|
| 1 | 0.301 ms | 0.347 ms | +15.3 % |
| 2 | 0.325 ms | 0.326 ms | +0.3 % |
| 4 | 0.400 ms | 0.402 ms | +0.5 % |
| 8 | 0.863 ms | 0.724 ms | -16.1 % |

The release wall-time result is supportive only at concurrency 8, and the
deltas are too small and noisy to be the primary acceptance signal. The profile
counters are the decision signal.

Decision

Keep in review. The experiment succeeds on the newly enabled counter gate:
FIFO dispatch eliminates wake amplification on a workload that demonstrably
parks past the worker count. The production diff is small and release
concurrent reads do not show an obvious targeted regression, but the PR should
still be reviewed as a behavior-preserving reader-pool scheduling change.

Future Notes

The next dispatch experiment should not try another queue policy until it has a
workload whose dispatcherWakeRetryTotal or dispatcherMaxParkedConcurrent
shows remaining headroom. If this PR merges, reader-pool wake amplification is
no longer the main dispatch signal; future work should look for admission,
completion batching, or stream-rerun sources that still produce measurable
parking.