Errata

Corrections applied to the published posts after simulator fixes. This page documents what changed between the original runs and the current data backing each post. The qualitative conclusions are intact; the quantitative knees moved to slightly more pessimistic values.

The corrections below were consolidated on 2026-04-17 and rolled into both posts in the 2026-06-09 correction.

Scope

Simulator fixes that changed the numbers

Fixes landed in Endive between publication and the post-fix re-runs. Each one has an identifiable, directional effect on the curves.

Per-attempt I/O cost model

The biggest source of change. The per-attempt cost grew from 3 S3 operations (buggy) to 5 S3 operations (correct, non-inlined):

| Commit | Date | Change | Net direction |
| --- | --- | --- | --- |
| 9d1d8e1 | 2026-04-08 | Remove the manifest-file PUT from the per-attempt cost; that write was part of transaction runtime, not the commit protocol. | faster commits |
| 5c8aa30 | 2026-04-08 | Add the table_metadata read+write pair to the per-attempt cost. The TM was missing from the model entirely; every FA/VO attempt was one full RTT too cheap. | slower commits (dominant effect) |
| ec383ff | 2026-04-08 | Failure-path TM read for overlap detection; free retry for cross-table/disjoint-partition conflicts (no manifest I/O); CAS size cap for inlined metadata. | neutral for single-table; lowers latency for multi-table cross-table contention |

Net change at S3 median (non-inlined, per attempt):

old (buggy):  MF_write(100K) + ML_read + ML_write           ≈ 43 + 27 + 60 = 130 ms
new:          TM_read + ML_read + ML_write + TM_write + CAS ≈ 27 + 27 + 60 + 60 + 1 = 175 ms

The published blog ran on the buggy ~130 ms-per-attempt model. Its observed throughput was artificially deflated by charging a manifest-file write that the commit protocol doesn’t require, but inflated by a larger amount through skipping the table-metadata read+write pair. The net effect of the fix is slower commits overall: the per-table ceiling drops from ~7.7 c/s to ~5.7 c/s at S3 medians.
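The arithmetic above can be reproduced with a short sketch. It is not Endive code: the dictionary keys and function names are illustrative, and the only inputs are the median latencies quoted in the breakdown. The ceiling assumes attempts on one table run strictly back to back, so it is simply the inverse of the per-attempt cost.

```python
# Median S3 latencies in milliseconds, copied from the breakdown above.
LATENCY_MS = {
    "MF_write": 43,  # manifest-file PUT (~100K), charged only by the buggy model
    "TM_read": 27,
    "ML_read": 27,
    "ML_write": 60,
    "TM_write": 60,
    "CAS": 1,
}

OLD_ATTEMPT = ["MF_write", "ML_read", "ML_write"]
NEW_ATTEMPT = ["TM_read", "ML_read", "ML_write", "TM_write", "CAS"]


def attempt_cost_ms(ops):
    """Serial cost of one commit attempt: the sum of its operation latencies."""
    return sum(LATENCY_MS[op] for op in ops)


def ceiling_commits_per_sec(ops):
    """Per-table upper bound when attempts run back to back with no idle time."""
    return 1000.0 / attempt_cost_ms(ops)


old_cost = attempt_cost_ms(OLD_ATTEMPT)
new_cost = attempt_cost_ms(NEW_ATTEMPT)
print(f"old: {old_cost} ms/attempt, ceiling {ceiling_commits_per_sec(OLD_ATTEMPT):.2f} c/s")
print(f"new: {new_cost} ms/attempt, ceiling {ceiling_commits_per_sec(NEW_ATTEMPT):.2f} c/s")
```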

Timing / information leaks

| Commit | Date | Change |
| --- | --- | --- |
| edbe6da | 2026-03-20 | CAS version check now evaluates at half-RTT (server-side), not at full RTT. Removes a physically impossible fast path where a client’s CAS could succeed against a catalog state that hadn’t yet propagated back to it. |
| a6e908e | 2026-04-09 | catalog.read() also split-yields: the snapshot returned is what the catalog held at half-RTT, not at the client’s wall-clock time. Symmetric to edbe6da. |

These two fixes are why the saturation ceiling now sits on the 1/(5L) theoretical bound (5.7 c/s observed vs. 5.71 c/s predicted) rather than above it. Before these fixes, the simulator could produce commits faster than message delays physically allow.
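The half-RTT rule can be illustrated with a toy timeline. This is a sketch under simplifying assumptions, not the simulator's implementation: every client shares one fixed RTT, the catalog is a single version counter, and the names `Catalog` and `run_cas_requests` are hypothetical. The point is that the outcome of a CAS is decided when the request reaches the server (send time plus half an RTT), while the client only learns it a further half RTT later.

```python
from dataclasses import dataclass


@dataclass
class Catalog:
    version: int = 0


def run_cas_requests(requests, rtt_ms):
    """Apply CAS requests in the order they arrive at the server.

    Each request is (client, send_time_ms, expected_version). The server
    evaluates it at send_time + rtt/2; the client sees the result at
    send_time + rtt.
    """
    catalog = Catalog()
    half = rtt_ms / 2
    outcomes = []
    for client, sent, expected in sorted(requests, key=lambda r: r[1] + half):
        evaluated_at = sent + half
        ok = catalog.version == expected
        if ok:
            catalog.version += 1  # successful CAS advances the catalog
        outcomes.append((client, evaluated_at, sent + rtt_ms, ok))
    return outcomes


# B sends before A's acknowledgement could have reached anyone, yet B must
# fail: the server already applied A's commit when B's request arrived.
outcomes = run_cas_requests([("A", 0, 0), ("B", 10, 0)], rtt_ms=100)
for client, evaluated_at, acked_at, ok in outcomes:
    print(client, evaluated_at, acked_at, ok)
```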

VO IO-convoy accounting

| Commit | Date | Change |
| --- | --- | --- |
| 47cbb11 | 2026-03-19 | Convoy reads N−1 historical manifest lists where N is the table version delta, not the global catalog sequence delta. |
| 092e489 | 2026-03-19 | Deduplicate the per-attempt ML read from the convoy’s N ML reads (was charged twice per attempt). |
| fa51753 | 2026-04-08 | Convoy decomposed per-table: Σ_table (V_table − 1) · M_table instead of a global V_global · M. |

Effect is strongest on multi-table VO tails (exp4b). On a single table (exp2 FA=0.0), the 47cbb11 N−1 correction and the 092e489 de-dup lower pure-VO P99 modestly (219→203 s at IA=200 ms); the fa51753 per-table decomposition is a no-op with only one table. The separate per-attempt TM pair raises per-attempt cost, but that effect is accounted for under Per-attempt I/O cost above, not here.
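The difference between the two accounting schemes can be sketched as follows. The function names and the example numbers are made up for illustration; M is treated as an opaque per-version read cost because this page does not define its units. Clamping V − 1 at zero is an added assumption, not something the commits state.

```python
def convoy_cost_global(global_version_delta, m):
    """Pre-fix accounting: charge the global catalog sequence delta times M."""
    return global_version_delta * m


def convoy_cost_per_table(tables):
    """Post-fix accounting: sum (V_table - 1) * M_table over the tables touched.

    `tables` maps a table name to (version_delta, m). The clamp at zero is an
    assumption so that an unchanged table never contributes a negative cost.
    """
    return sum(max(v - 1, 0) * m for v, m in tables.values())


# Hypothetical transaction touching two tables: "a" advanced 4 versions,
# "b" advanced 1, so the global sequence advanced 5.
tables = {"a": (4, 1), "b": (1, 1)}
print("global:   ", convoy_cost_global(5, 1))
print("per-table:", convoy_cost_per_table(tables))

# With a single table the decomposition has nothing to split, which matches
# the note that fa51753 is a no-op on exp2; the remaining difference against
# the global formula comes from the N-1 correction.
print("single:   ", convoy_cost_per_table({"a": (5, 1)}))
```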

Config drift (pre-run correction, 2026-04-13)

The blog’s exp1–4 configs had table_metadata_inlined = true set, which was a no-op until commit f1ad9ef (2026-03-26) made the flag actually do something. From that point forward, every re-run silently used inlined metadata (1/(3L) bound, ~11.4 c/s ceiling). Fixed pre-run by d55d3ce (2026-04-13), which flips the flag to false across exp1–4 templates. The post-fix runs use non-inlined metadata, matching the blog’s original intent.

The drift was undetected for ~18 days because expctl’s staleness check compared stored cfg.toml to the directory hash and simulator code, not to the template. Addressed by template-hash stamping (0e961fd, 5351c79).
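A minimal sketch of the idea behind template-hash stamping, assuming a stored config can carry an extra field. The field name `template_sha256` and the helper names are hypothetical and do not describe expctl's actual implementation or file layout.

```python
import hashlib


def template_hash(template_text: str) -> str:
    """Content hash of the template a config was rendered from."""
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()


def stamp(cfg: dict, template_text: str) -> dict:
    """Return a copy of cfg that records which template produced it."""
    stamped = dict(cfg)
    stamped["template_sha256"] = template_hash(template_text)
    return stamped


def is_stale(stamped_cfg: dict, current_template_text: str) -> bool:
    """A run is stale if the template changed after the config was stamped."""
    return stamped_cfg.get("template_sha256") != template_hash(current_template_text)


old_template = "table_metadata_inlined = true\n"
new_template = "table_metadata_inlined = false\n"

cfg = stamp({"experiment": "exp1"}, old_template)
print(is_stale(cfg, old_template))  # template unchanged
print(is_stale(cfg, new_template))  # template edited after the run was stamped
```

A config with no stamp at all compares as stale under this sketch, which errs on the side of re-running.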

Simulator is strictly serial

The parallel-I/O footnote in the 2026-03-09 post (“up to 4 I/O operations can run in parallel”) was inaccurate: the simulator issues all storage I/O serially. The max_parallel = 4 knob in the configs was never consumed by endive/. The footnote and surrounding sentence have been removed.

Quantitative deltas

Simulating Catalog and Table Conflicts (2026-03-09)

| Metric | Published | Corrected |
| --- | --- | --- |
| FA single-table throughput ceiling | ~7.7 c/s (19% success @ 7.8 c/s) | 5.7 c/s (14% success) |
| FA practical ceiling (99%+ success) | ~2.7 c/s | 2.0 c/s |
| FA P50 at low load | 320 ms | 410 ms |
| FA P99 at saturation | 1.89 s | 2.58 s |
| VO P99 at IA=200 ms (pure VO) | 219 s | 203 s (convoy fixes) |
| FA success @ 50 ms IA | 42% | 33% |

The “3–4 commits/sec” language in the original post has been changed to “2–3 commits/sec”. The per-attempt cost explanation was rewritten from “~300 ms retry” to “five S3 round-trips per attempt, ~175 ms at S3 median latencies”.

Simulating Multi-Table Contention (2026-03-23)

Single-table knees at >95% VO success:

| Provider | Published FA-only | Corrected FA-only | Published 90/10 | Corrected 90/10 |
| --- | --- | --- | --- | --- |
| S3 Express | 14.6 c/s | 14.6 c/s | 7.5 c/s | 7.4 c/s |
| S3 Standard | 2.4 c/s | 1.8 c/s | 1.8 c/s | 1.8 c/s |
| Azure Premium | 2.5 c/s | 2.4 c/s | 1.9 c/s | 1.8 c/s |
| Azure Standard | 2.4 c/s | 1.8 c/s | 1.5 c/s | 1.5 c/s |
| GCP | 0.7 c/s | 0.4 c/s | 0.4 c/s | 0.4 c/s |

The biggest structural change is the multi-table FA-only knee for S3 Standard and Azure Premium at 5–20 tables:

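| Provider | Tables | Published | Corrected | Δ |
| --- | --- | --- | --- | --- |
| S3 Standard | 5–20 | 7.2–7.4 c/s | 3.7 c/s | −50% |
| S3 Standard | 50 | 7.4 c/s | 7.2 c/s | −3% |
| Azure Premium | 5–20 | 7.2–7.4 c/s | 3.7 c/s | −50% |
| Azure Premium | 50 | 7.4 c/s | 3.7 c/s | −50% |
| Azure Standard | 10–50 | 3.7 c/s | 3.7 c/s | |
| GCP | 50 | 3.6 c/s | 0.7 c/s | −81% |
| S3 Express | 50 | 14.9 c/s | 36.0 c/s | +142% (free-retry for cross-table CAS failures) |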
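The Δ column is a plain relative change against the published value. The sketch below recomputes it from the table; it assumes the upper end of the published 7.2–7.4 c/s range is the reference for the 5–20 table rows, and the row labels are only illustrative.

```python
def pct_change(published, corrected):
    """Relative change of corrected vs. published, rounded to a whole percent."""
    return round(100.0 * (corrected - published) / published)


# (published c/s, corrected c/s) per row of the multi-table FA-only table.
rows = {
    "S3 Standard, 5-20 tables": (7.4, 3.7),
    "S3 Standard, 50 tables": (7.4, 7.2),
    "Azure Premium, 50 tables": (7.4, 3.7),
    "GCP, 50 tables": (3.6, 0.7),
    "S3 Express, 50 tables": (14.9, 36.0),
}

for label, (published, corrected) in rows.items():
    print(f"{label}: {pct_change(published, corrected):+d}%")
```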

For S3/Azure at 5–20 tables the published numbers were catalog-CAS-bound at ~7.4 c/s; with the non-inlined per-table ceiling now at 5.7 c/s, the per-table bound binds first and the knee flattens to 3.7 c/s. Only at 50 tables does catalog CAS again become the bottleneck (for S3 Standard FA-only; Azure Premium sits at the per-table bound through 50 tables).

S3 Express at 50 tables jumps to 36 c/s because its ~10 ms per-op latency keeps the per-table bound very high; with the ec383ff free-retry fix, cross-table CAS failures no longer charge manifest I/O, so the catalog handles more commits.

GCP’s multi-table scaling is weaker than presented: 0.4 c/s (1 table) → 0.7 c/s (50 tables) FA-only, versus the published 0.7 c/s → 3.6 c/s. The per-op latency is high enough that the per-table bound binds through 50 tables.

What survives unchanged

  • Every qualitative conclusion in both posts.
  • “Sustained commit rates above 1–2 commits/sec are unattainable” — now exactly right rather than hedged.
  • “Storage I/O is the primary bottleneck for single-table workloads.”
  • “Catalog CAS latency up to 120 ms adds only modest overhead for single-table.”
  • “IO convoys serialize VO commit attempts; P99 reaches minutes at moderate rates.”
  • “More tables move the bottleneck from per-table metadata I/O to catalog contention” (qualitatively true; the crossover is now at ~50 tables for S3 Standard instead of ~5, and Azure Premium remains per-table-bound through 50 tables).
  • “Zipf concentration limits the benefit of adding tables; rank-1 table dominates.”
  • “S3 Express is in a different class for catalog-as-file workloads.”
  • “GCS is not viable for catalog-as-file workloads.”
  • The Iceberg write-path diagram and metadata-size assumptions (~1 MiB TM, ~100 KiB manifest list).
  • The per-provider latency distributions (S3, S3 Express, Azure, Azure Premium, GCS).

Full companion reports

  • EXP1-3_REPORT.md — post-fix validation and cell-by-cell comparison for exp1, exp2, exp3a, exp3b.
  • EXP4_REPORT.md — post-fix validation for exp4a, exp4b (uniform and Zipf), and exp4c across five storage providers.