Experiment 142: Single-row text parameter direct encoding

Date: 2026-06-08

Status: Rejected

Direction: parameter-encoding-and-binding

Benchmark Run: Tracelite A/B retest, exp-142-tracelite-single-row-text

Problem

PR #130 proposed applying the exp 125 / exp 126 direct text payload packing
pattern to the single-row allocateParams path. That path is used by
parameterized reads and single-row writes that do not go through the wide-batch
matrix encoder.

The old focused harness showed strong local wins on small string-heavy shapes,
but it was produced by the pre-Tracelite experiment loop. The reassessment
question was whether the change still looked merge-worthy when retested against
current origin/main with the integrated Tracelite baseline/candidate workflow
and peer guardrails.

Hypothesis

Removing the temporary List<Uint8List?> plus per-string Uint8List allocation
from allocateParams should improve standardized workloads that include
single-row string parameter binding, or at least stay neutral.
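As a rough illustration of the allocation pattern in question, the sketch below contrasts the two shapes in plain Dart. It is not the code from PR #130: `packWithTemporaries`, `packDirect`, `utf8Length`, and `writeUtf8` are hypothetical names, a `Uint8List` stands in for the native parameter buffer, and writing unpaired surrogates as U+FFFD is an assumption of this sketch.

```dart
import 'dart:convert';
import 'dart:typed_data';

/// Old shape: one Uint8List per string, held in a temporary list,
/// then copied into a single output buffer.
Uint8List packWithTemporaries(List<String?> params) {
  final encoded = List<Uint8List?>.filled(params.length, null);
  var total = 0;
  for (var i = 0; i < params.length; i++) {
    final p = params[i];
    if (p == null) continue;
    final bytes = const Utf8Encoder().convert(p); // allocation per string
    encoded[i] = bytes;
    total += bytes.length;
  }
  final out = Uint8List(total);
  var offset = 0;
  for (final bytes in encoded) {
    if (bytes == null) continue;
    out.setRange(offset, offset + bytes.length, bytes);
    offset += bytes.length;
  }
  return out;
}

/// UTF-8 byte count for [s]; a valid surrogate pair takes 4 bytes,
/// an unpaired surrogate is counted as U+FFFD (3 bytes).
int utf8Length(String s) {
  var n = 0;
  for (var i = 0; i < s.length; i++) {
    final u = s.codeUnitAt(i);
    if (u < 0x80) {
      n += 1;
    } else if (u < 0x800) {
      n += 2;
    } else if ((u & 0xFC00) == 0xD800 &&
        i + 1 < s.length &&
        (s.codeUnitAt(i + 1) & 0xFC00) == 0xDC00) {
      n += 4;
      i++; // skip the low surrogate
    } else {
      n += 3;
    }
  }
  return n;
}

/// Encodes [s] straight into [out] at [offset]; returns the new offset.
int writeUtf8(String s, Uint8List out, int offset) {
  for (var i = 0; i < s.length; i++) {
    var u = s.codeUnitAt(i);
    if (u < 0x80) {
      out[offset++] = u;
    } else if (u < 0x800) {
      out[offset++] = 0xC0 | (u >> 6);
      out[offset++] = 0x80 | (u & 0x3F);
    } else if ((u & 0xFC00) == 0xD800 &&
        i + 1 < s.length &&
        (s.codeUnitAt(i + 1) & 0xFC00) == 0xDC00) {
      final cp = 0x10000 + ((u & 0x3FF) << 10) + (s.codeUnitAt(++i) & 0x3FF);
      out[offset++] = 0xF0 | (cp >> 18);
      out[offset++] = 0x80 | ((cp >> 12) & 0x3F);
      out[offset++] = 0x80 | ((cp >> 6) & 0x3F);
      out[offset++] = 0x80 | (cp & 0x3F);
    } else {
      if ((u & 0xF800) == 0xD800) u = 0xFFFD; // unpaired surrogate
      out[offset++] = 0xE0 | (u >> 12);
      out[offset++] = 0x80 | ((u >> 6) & 0x3F);
      out[offset++] = 0x80 | (u & 0x3F);
    }
  }
  return offset;
}

/// New shape: measure first, allocate once, encode in place.
Uint8List packDirect(List<String?> params) {
  var total = 0;
  for (final p in params) {
    if (p != null) total += utf8Length(p);
  }
  final out = Uint8List(total);
  var offset = 0;
  for (final p in params) {
    if (p != null) offset = writeUtf8(p, out, offset);
  }
  return out;
}
```

Both functions should produce identical bytes for well-formed input. The direct version trades the temporary list and per-string buffers for a second scan of each string plus hand-written encoding logic, which is the maintenance cost the acceptance criteria weigh against any measured win.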

Accept only if Tracelite clears the primary resqlite improvement gate on the
closest current lanes and sqlite_async guardrails remain clean. Reject or
defer if the primary lanes are neutral/noisy or trend slower, because the change
adds more custom string-encoding logic to a hot binding path.

Approach

Created two resqlite worktrees from current origin/main at
1979ec3c4069ffa960df465971b23e5e53323768:

The candidate reapplied only the code and focused harness from PR #130:

Candidate patch shape:

byte lists,

native parameter buffer,

unchanged.

Before the A/B run, both retest worktrees resolved dependencies with ARM64 Dart.

The candidate then passed:

    /Users/dan/Coding/flutter_arm64/bin/cache/dart-sdk/bin/dart analyze \
      lib/src/native/resqlite_bindings.dart \
      benchmark/experiments/single_row_param_encoding.dart

    /Users/dan/Coding/flutter_arm64/bin/cache/dart-sdk/bin/dart test \
      test/database_test.dart test/transaction_test.dart

Ran the integrated Tracelite A/B workflow with pinned Tracelite
a2bf3648836fcf680d0aceccb18c2b31a2109586 and ARM64 Dart:

    /Users/dan/Coding/flutter_arm64/bin/cache/dart-sdk/bin/dart run \
      benchmark/run_tracelite_experiment.dart \
      --dart=/Users/dan/Coding/flutter_arm64/bin/cache/dart-sdk/bin/dart \
      --tracelite-root=/Users/dan/Coding/tracelite \
      --baseline-root=/Users/dan/.codex/worktrees/retest-exp142-baseline \
      --candidate-root=/Users/dan/.codex/worktrees/retest-exp142-candidate \
      --label=exp-142-tracelite-single-row-text \
      --direction=parameter-encoding-and-binding \
      --suite-scenarios=chat-sim,narrow-batch-insert \
      --policy-scenarios=chat-sim,narrow-batch-insert \
      --interfaces=sqlite_async,resqlite \
      --guardrail-peers=sqlite_async \
      --runs=2 \
      --min-repetitions=5 \
      --max-repetitions=12 \
      --out-dir=build/tracelite-experiments/exp-142-tracelite-single-row-text

Artifacts:

Results

The wrapper collected both sides and wrote the decision artifacts:

| step | status |
| --- | --- |
| baseline suite history | ok |
| candidate suite history | ok |
| decision artifact | inconclusive |

Tracelite decision policy:

| field | value |
| --- | --- |
| expectation | improvement |
| primary threshold | 38.0% |
| max guardrail regression | 28.5% |
| max CV | 28.5% |
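One plausible reading of how these three fields combine is sketched below. This is an assumption for illustration, not Tracelite's actual decision logic: `DecisionPolicy`, `primaryClears`, and `guardrailPasses` are hypothetical names, and the sketch ignores the p-values and confidence intervals that Tracelite also reports.

```dart
/// Hypothetical record mirroring the three numeric policy fields above.
class DecisionPolicy {
  const DecisionPolicy({
    required this.primaryThresholdPct,
    required this.maxGuardrailRegressionPct,
    required this.maxCvPct,
  });

  final double primaryThresholdPct;
  final double maxGuardrailRegressionPct;
  final double maxCvPct;
}

/// Assumed rule: a primary lane clears only when it is stable enough and
/// improved by at least the threshold. [changePct] is signed elapsed-time
/// change, so negative means the candidate was faster.
bool primaryClears(DecisionPolicy p, double changePct, double laneMaxCvPct) =>
    laneMaxCvPct <= p.maxCvPct && changePct <= -p.primaryThresholdPct;

/// Assumed rule: a guardrail lane passes when it did not get slower by more
/// than the allowed regression.
bool guardrailPasses(DecisionPolicy p, double changePct) =>
    changePct <= p.maxGuardrailRegressionPct;
```

Under this reading, a lane that moves slower can never clear an improvement gate regardless of its noise level, while guardrail lanes only need to stay under the regression ceiling.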

Primary comparisons:

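| role | scenario | peer | metric | baseline | candidate | change | max CV | p | status | effect |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| primary | chat-sim | resqlite | measured_elapsed_ns | 15.7 ms | 16.7 ms | +6.86% | 24.2% | 0.716 | neutral | inconclusive |
| primary | narrow-batch-insert | resqlite | measured_elapsed_ns | 11.8 ms | 13.7 ms | +16.4% | 27.5% | 0.057 | neutral | inconclusive |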

Guardrail comparisons:

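| role | scenario | peer | metric | baseline | candidate | change | max CV | p | status | effect |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| guardrail | chat-sim | sqlite_async | measured_elapsed_ns | 27.4 ms | 27.8 ms | +1.41% | 20.3% | 0.460 | neutral | pass |
| guardrail | narrow-batch-insert | sqlite_async | measured_elapsed_ns | 12.1 ms | 11.6 ms | -3.68% | 15.3% | 0.800 | neutral | pass |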

Decision insights:

| severity | finding | detail |
| --- | --- | --- |
| warning | Decision is inconclusive | Evidence is not strong enough for a production decision. |
| warning | Primary metric did not clear | chat-sim changed +6.86% with max CV 24.2%, p=0.716, 95% delta CI -1.02 ms..3.17 ms. |
| warning | Primary metric did not clear | narrow-batch-insert changed +16.4% with max CV 27.5%, p=0.057, 95% delta CI -575 us..4.44 ms. |
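The two primary warnings can be sanity-checked with a little arithmetic: both 95% delta intervals include zero, so the data is compatible with no change at all. The helpers below are a hypothetical sketch (`relativeChangePct` and `intervalSpansZero` are not Tracelite functions). Recomputing the change from the rounded table values gives roughly +6.4% and +16.1%, slightly off the reported figures, presumably because Tracelite works from unrounded samples.

```dart
/// Signed percentage change of [candidate] relative to [baseline].
/// Positive means the candidate took longer.
double relativeChangePct(double baseline, double candidate) =>
    (candidate - baseline) * 100 / baseline;

/// True when a confidence interval for the delta contains zero, meaning
/// "no difference" cannot be ruled out at that confidence level.
bool intervalSpansZero(double low, double high) => low <= 0 && high >= 0;
```

For example, `intervalSpansZero(-1.02, 3.17)` (chat-sim, in ms) and `intervalSpansZero(-0.575, 4.44)` (narrow-batch-insert, in ms) are both true, which is consistent with the inconclusive effect on both primary rows.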

The baseline and candidate explain artifacts also warned that these short
lanes are partly harness-dominated. That limits how much we can infer about the
exact micro-cost, but it does not create positive evidence for carrying more
custom binding code: the closest standardized lanes did not improve and both
resqlite primary rows moved slower.

The decision graph-data export validated and produced:

| dataset | rows |
| --- | --- |
| scenario_series | 2240 |
| peer_summary | 16 |
| decision_summary | 1 |
| decision_comparisons | 4 |

Decision

Reject. Do not carry the allocateParams direct text encoding code from
PR #130.

The old focused harness remains a useful clue that single-row text parameter
encoding can be made faster in isolation, but the current Tracelite retest did
not convert that clue into production evidence. On the closest available
standard lanes, the candidate was neutral/inconclusive and directionally slower:
+6.86% on chat-sim and +16.4% on narrow-batch-insert.

This reinforces exp 146's parameter-encoding lesson: small and narrow binding
changes should stay on the generic path unless a current workload shows
parameter encoding is material and a Tracelite A/B decision clears the primary
gate. If a future app profile shows high-frequency single-row string parameter
binding as a real bottleneck, add a Tracelite scenario or profile lane that
directly covers that shape before retrying this implementation.

Future Notes

The current Tracelite scenario set does not perfectly isolate the original
focused harness shape from PR #130. That is a reason to avoid overclaiming a
runtime regression, not a reason to merge the code. A future retry should first
add or select a scenario where single-row string parameter encoding is a
material fraction of measured time, then rerun the integrated A/B workflow.

Validation

both retest worktrees