Jun 26, 2026 · Rejected · result-transfer-shape

Experiment 200: Stable-Type selectBytes Moonshot

Date: 2026-06-26T08:58:26-04:00

Status: Rejected

Category: Moonshot

Direction: result-transfer-shape

Benchmark Run: focused benchmark/experiments/select_bytes_int_heavy.dart and benchmark/experiments/select_bytes_wide_cols.dart; no release-suite run because the runtime prototype was rejected and reverted.

Archive: archive/exp-200

Problem

Exp 190, 192, 194, 195, 198, and 199 removed most of the visible per-cell JSON

encoder overhead from write_json_to_buf: column-name tokens are cached, integer

and integer-valued REAL formatters write directly into the output buffer, and

the fixed-size row path pre-reserves capacity once per row.

One boundary remains in the inner loop: every cell still calls

sqlite3_column_type(stmt, i) before choosing the NULL / INTEGER / FLOAT / TEXT

/ BLOB writer. On fixed-shape result sets this looks redundant. The statement

and table declarations often imply stable per-column storage classes, and the

first row reveals the actual SQLite storage class the encoder will see.

Hypothesis

Assumption challenged: selectBytes() must inspect each row's SQLite storage

class cell by cell, even when a result set appears type-stable.

Prototype: record each column's storage class from row 0, then serialize later

rows using that cached class instead of calling sqlite3_column_type() again.

This intentionally allows more risk than an exploit experiment. The point is to

measure the ceiling for removing the type probe and to learn whether any hidden

default can be safe.

The kill condition is correctness, not just wall time: if ordinary SQLite

dynamic typing can make later rows carry a different storage class, the hidden

default is invalid no matter how much a fixed-type benchmark improves.

Approach

The archived prototype changed only write_json_to_buf:

allocate one per-query int[col_count] type vector,
fill it from sqlite3_column_type() while serializing row 0,
for row 1 and later, skip sqlite3_column_type() and dispatch on the cached first-row type,
keep the existing row-level reservation and fixed-cell direct-write path from exp 199.
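The control flow of the steps above can be modelled in a few lines. This is a minimal Python sketch, not the archived C patch: `storage_class`, `WRITERS`, and `serialize` are hypothetical names, and `storage_class` merely stands in for sqlite3_column_type(). It counts type probes rather than measuring time.

```python
import json

def storage_class(value):
    """Stand-in for sqlite3_column_type(): classify one cell value."""
    if value is None:
        return "null"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "real"
    if isinstance(value, str):
        return "text"
    return "blob"

# One writer per storage class, mirroring the NULL / INTEGER / FLOAT / TEXT / BLOB dispatch.
WRITERS = {
    "null": lambda v: None,
    "integer": int,
    "real": float,
    "text": str,
    "blob": lambda v: bytes(v).hex(),
}

def serialize(names, rows, cache_first_row):
    """Return (json_text, probe_count) for the given rows."""
    cached = None
    probes = 0
    out = []
    for row in rows:
        if cache_first_row and cached is not None:
            classes = cached  # later rows: reuse row 0's classes, no probe
        else:
            classes = [storage_class(v) for v in row]
            probes += len(row)
            if cache_first_row:
                cached = classes
        out.append({n: WRITERS[c](v) for n, c, v in zip(names, classes, row)})
    return json.dumps(out, separators=(",", ":")), probes

names = ["a", "b"]
rows = [(i, i * 2) for i in range(1000)]  # type-stable: every cell is an integer
baseline_json, baseline_probes = serialize(names, rows, cache_first_row=False)
cached_json, cached_probes = serialize(names, rows, cache_first_row=True)
```

On type-stable input the two outputs are identical and the cached variant probes only row 0. The model says nothing about real timing, which depends on the C encoder's allocation and branch shape.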

No production-safe fallback was added. A fallback that rechecks the type before

using the cached class would pay the exact probe this experiment is trying to

remove. A production version would need a stronger proof surface: strict tables,

declared-type inspection with query-shape constraints, an explicit opt-in, or

some other semantic contract that SQLite's dynamic storage classes cannot

silently violate.

The runtime patch was archived and reverted from the branch. No unsafe code is

kept.

Results

Fixed-shape integer harness

Focused harness: dart run benchmark/experiments/select_bytes_int_heavy.dart.

Each median comes from one baseline pass followed by one candidate pass.

Lane	Baseline	Candidate	Delta
10k rows x 8 small ints	3056 us	2820 us	-7.7%
10k rows x 20 small ints	7626 us	8415 us	+10.3%
10k rows x 20 big ints (~18 digits)	8431 us	8756 us	+3.9%
10k rows x 8 mixed (4 int + 2 text + 2 real)	9985 us	10385 us	+4.0%
1k rows x 2 ints	136 us	145 us	+6.6%
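The Delta column can be rechecked from the two median columns, assuming it is computed as (candidate - baseline) / baseline. A small Python sketch using the figures from the table above; `delta_percent` is a hypothetical helper name.

```python
def delta_percent(baseline_us, candidate_us):
    """Relative change of the candidate against the baseline, in percent."""
    return (candidate_us - baseline_us) / baseline_us * 100.0

# (baseline median, candidate median) in microseconds, copied from the table.
lanes = {
    "10k rows x 8 small ints": (3056, 2820),
    "10k rows x 20 small ints": (7626, 8415),
    "10k rows x 20 big ints": (8431, 8756),
    "10k rows x 8 mixed": (9985, 10385),
    "1k rows x 2 ints": (136, 145),
}

deltas = {name: round(delta_percent(b, c), 1) for name, (b, c) in lanes.items()}
regressions = [name for name, d in deltas.items() if d > 0]

for name, d in deltas.items():
    print(f"{name}: {d:+.1f}%")
```

A positive delta means the candidate was slower, so `regressions` lists the lanes where the prototype lost.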

This harness does not support accepting the idea. The narrow 8-column integer

lane improves, but every other lane regresses in the same pass. The per-query

type-vector allocation and altered branch shape appear to erase or exceed the

saved type call on the integer-heavy shapes that should have been the cleanest

win.

Wide-column harness

Focused harness: dart run benchmark/experiments/select_bytes_wide_cols.dart.

This produced a visible candidate-faster signal, but the confirmation baseline

also drifted much slower than the first baseline, so magnitude is not

load-bearing.

Shape	Baseline P1	Candidate P1	Delta P1	Candidate P2	Baseline P2	Delta P2
10k rows x 8 int cols	2.539 ms	2.181 ms	-14.1%	2.105 ms	3.547 ms	-40.7%
10k rows x 20 int cols	6.486 ms	5.485 ms	-15.4%	5.174 ms	7.058 ms	-26.7%
10k rows x 8 mixed cols	2.779 ms	2.635 ms	-5.2%	2.327 ms	4.022 ms	-42.1%
10k rows x 20 mixed cols	8.906 ms	6.251 ms	-29.8%	5.982 ms	8.091 ms	-26.1%
10k rows x 2 int cols	0.801 ms	0.685 ms	-14.5%	0.658 ms	0.870 ms	-24.4%

There is probably a real ceiling in the wide fixed-shape case: one SQLite type probe per cell is not free, so skipping it should save some time. But the benchmark evidence is mixed, which is not enough to overcome the semantic failure below, and the integer-specific harness does not reproduce a clean win.

Dynamic type hazard

A targeted local probe created a single untyped SQLite column, inserted an INTEGER as the first row (row 0) and TEXT as the second row (row 1), and compared selectBytes() with select():

 selectBytes prototype: [{"v":1},{"v":0}]
 select baseline:       [{"v":1},{"v":"abc"}]

The prototype cached row 0 as SQLITE_INTEGER, then serialized row 1 with

sqlite3_column_int64(). SQLite converted the text value "abc" to integer

0, producing wrong JSON.

This is not an edge condition outside SQLite's model. SQLite storage classes are

per value, not per column, unless a stronger schema contract is present and

proven usable by the encoder.
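The hazard can be reproduced outside the encoder. The sketch below uses Python's standard sqlite3 module rather than the removed local probe. The table name `t` is an assumption, and CAST(v AS INTEGER) stands in for reading a TEXT value through sqlite3_column_int64(), which applies the same text-to-integer conversion.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (v)")              # untyped column, no declared type
conn.execute("INSERT INTO t VALUES (1)")        # row 0: stored as INTEGER
conn.execute("INSERT INTO t VALUES ('abc')")    # row 1: stored as TEXT

# typeof() reports the storage class of each value, row by row.
classes = [r[0] for r in conn.execute("SELECT typeof(v) FROM t ORDER BY rowid")]

# What an encoder sees if it forces every row through the integer accessor.
forced = [r[0] for r in conn.execute("SELECT CAST(v AS INTEGER) FROM t ORDER BY rowid")]

# What a per-cell type check would have preserved.
actual = [r[0] for r in conn.execute("SELECT v FROM t ORDER BY rowid")]

print(classes, forced, actual)
```

The same column yields two storage classes, and forcing the second row through the integer path silently turns "abc" into 0 instead of raising an error.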

Validation:

 dart pub get
 dart run benchmark/experiments/select_bytes_int_heavy.dart
 dart run benchmark/experiments/select_bytes_wide_cols.dart

The dynamic-type probe was local-only and removed before publication.

Decision

Rejected as a hidden default.

The performance ceiling is plausible on fixed-shape wide rows, but the

assumption is not safe for ordinary SQLite result sets. Caching the first row's

storage class can silently corrupt selectBytes() output as soon as later rows

carry a different storage class. The only safe hidden implementation would need

to reintroduce per-cell verification or a proof layer whose complexity is not

justified by the current mixed performance evidence.

Future Notes

Do not skip sqlite3_column_type() in the general selectBytes() path based only on the first row.

Reopen this frontier only with a real proof surface: SQLite STRICT tables, declared-type plus query-shape constraints, a generated/static statement mode, or an explicit user-visible contract.

If a proof surface exists, reuse select_bytes_wide_cols.dart as the fast ceiling check, but require a dynamic-storage-class correctness guard before any production code can be accepted.
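As a rough illustration of the STRICT-table proof surface, the Python sketch below checks what a STRICT INTEGER column accepts. It assumes the linked SQLite is 3.37.0 or newer (the release that introduced STRICT) and returns None otherwise; `strict_column_classes` and the table name `t` are hypothetical.

```python
import sqlite3

def strict_column_classes():
    """Return (text_rejected, storage_classes_seen), or None if STRICT is unavailable."""
    if sqlite3.sqlite_version_info < (3, 37, 0):
        return None  # STRICT tables need SQLite 3.37.0 or newer
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v INTEGER) STRICT")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.execute("INSERT INTO t VALUES (NULL)")  # allowed: the column is not NOT NULL
    try:
        conn.execute("INSERT INTO t VALUES ('abc')")
        rejected = False
    except sqlite3.Error:
        rejected = True  # STRICT refuses a value it cannot store as INTEGER
    classes = sorted(r[0] for r in conn.execute("SELECT DISTINCT typeof(v) FROM t"))
    return rejected, classes

result = strict_column_classes()
print(result)
```

Even here the column still carries two storage classes, integer and null, unless it is also declared NOT NULL, and expressions in the select list are not bound by the table declaration. That is why the proof surface needs query-shape constraints in addition to the schema.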