Errata

Corrections applied to the published posts after simulator fixes. This page documents what changed between the original runs and the current data backing each post. The qualitative conclusions are intact; the quantitative knees moved to slightly more pessimistic values.

The corrections below were consolidated on 2026-04-17 and rolled into both posts in the 2026-06-09 correction.

Scope

Simulating Catalog and Table Conflicts (published 2026-03-09) — exp1, exp2, exp3a, exp3b.
Simulating Multi-Table Contention in Catalog Formats (published 2026-03-23) — exp4a, exp4b (including Zipfian variants), and exp4c.

Simulator fixes that changed the numbers

Fixes landed in Endive between publication and the post-fix re-runs. Each one has an identifiable, directional effect on the curves.

Per-attempt I/O cost model

The biggest source of change. The per-attempt cost grew from 3 S3 operations (buggy) to 5 S3 operations (correct, non-inlined):

Commit	Date	Change	Net direction
`9d1d8e1`	2026-04-08	Remove the manifest-file `PUT` from the per-attempt cost — that write was part of transaction runtime, not the commit protocol.	faster commits
`5c8aa30`	2026-04-08	Add the `table_metadata` read+write pair to the per-attempt cost. The TM was missing from the model entirely; every FA/VO attempt was one full RTT too cheap.	slower commits (dominant effect)
`ec383ff`	2026-04-08	Failure-path TM read for overlap detection; free retry for cross-table/disjoint-partition conflicts (no manifest I/O); CAS size cap for inlined metadata.	neutral for single-table; lowers latency for multi-table cross-table contention

Net change at S3 median (non-inlined, per attempt):

old (buggy):  MF_write(100K) + ML_read + ML_write           ≈ 43 + 27 + 60 = 130 ms
new:          TM_read + ML_read + ML_write + TM_write + CAS ≈ 27 + 27 + 60 + 60 + 1 = 175 ms

The published blog ran on the buggy ~130 ms-per-attempt model. Its observed throughput was artificially deflated by double-counting a manifest-file write that the commit protocol doesn’t require, and artificially inflated, by a larger margin, by skipping the table-metadata pair. The net effect of the fix is slower commits overall: the per-table ceiling drops from ~7.7 c/s to ~5.7 c/s at S3 medians.
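The arithmetic above can be reproduced with a short sketch. The latencies are the S3 medians quoted in the breakdown; the names `MS`, `attempt_cost_ms`, and `ceiling_cps` are illustrative and not taken from Endive, and the ceiling assumes attempts on one table run strictly back to back.

```python
# Median per-operation latencies in milliseconds, copied from the breakdown above.
MS = {"MF_write": 43, "TM_read": 27, "ML_read": 27,
      "ML_write": 60, "TM_write": 60, "CAS": 1}

# Operations charged per commit attempt before and after the fixes.
OLD_ATTEMPT = ["MF_write", "ML_read", "ML_write"]
NEW_ATTEMPT = ["TM_read", "ML_read", "ML_write", "TM_write", "CAS"]

def attempt_cost_ms(ops):
    """Total latency of one attempt when its operations are issued serially."""
    return sum(MS[op] for op in ops)

def ceiling_cps(ops):
    """Commits/sec upper bound for one table, assuming back-to-back attempts."""
    return 1000.0 / attempt_cost_ms(ops)

for label, ops in (("old", OLD_ATTEMPT), ("new", NEW_ATTEMPT)):
    print(f"{label}: {attempt_cost_ms(ops)} ms/attempt, {ceiling_cps(ops):.2f} c/s")
```

The two ceilings this prints correspond to the ~7.7 c/s and ~5.7 c/s figures in the paragraph above.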

Timing / information leaks

Commit	Date	Change
`edbe6da`	2026-03-20	CAS version check now evaluates at half-RTT (server-side), not at full RTT. Removes a physically-impossible fast path where a client’s CAS could succeed against a catalog state that hadn’t yet propagated back to it.
`a6e908e`	2026-04-09	`catalog.read()` also split-yields: the snapshot returned is what the catalog held at half-RTT, not at the client’s wall-clock time. Symmetric to `edbe6da`.

These two fixes are why the saturation ceiling now sits on the 1/(5L) theoretical bound (5.7 c/s observed vs. 5.71 c/s predicted) rather than above it. Before these fixes, the simulator could produce commits faster than message delays physically allow.
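A minimal sketch of the corrected timing, assuming a catalog whose version history is a sorted list of `(commit_time, version)` pairs. This is not Endive's code; `version_at`, `cas_succeeds`, and `read_snapshot` are hypothetical names. The point is only that both the CAS check and the read observe the catalog at send time plus half an RTT, and the client learns the result a further half-RTT later.

```python
from bisect import bisect_right

def version_at(history, t):
    """Catalog version in effect at time t; history is sorted (commit_time, version)."""
    times = [commit_time for commit_time, _ in history]
    i = bisect_right(times, t)
    return history[i - 1][1] if i else 0  # version 0 before any commit (assumption)

def cas_succeeds(history, expected_version, t_send, rtt):
    """The version check runs when the request reaches the server: t_send + rtt/2."""
    return version_at(history, t_send + rtt / 2) == expected_version

def read_snapshot(history, t_send, rtt):
    """Return (version held by the catalog at half-RTT, time the client receives it)."""
    return version_at(history, t_send + rtt / 2), t_send + rtt

history = [(0, 1), (105, 2)]                 # version 2 lands at t=105
print(cas_succeeds(history, 1, 80, 20))      # reaches the server at t=90
print(cas_succeeds(history, 1, 100, 20))     # reaches the server at t=110
print(read_snapshot(history, 100, 20))
```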

VO IO-convoy accounting

Commit	Date	Change
`47cbb11`	2026-03-19	Convoy reads `N−1` historical manifest lists where `N` is the table version delta, not the global catalog sequence delta.
`092e489`	2026-03-19	Deduplicate the per-attempt ML read from the convoy’s N ML reads (was charged twice per attempt).
`fa51753`	2026-04-08	Convoy decomposed per-table: `Σ_table (V_table − 1) · M_table` instead of a global `V_global · M`.

Effect is strongest on multi-table VO tails (exp4b). On a single table (exp2 FA=0.0), the `47cbb11` N−1 correction and the `092e489` de-dup lower pure-VO P99 modestly (219→203 s at IA=200 ms); the `fa51753` per-table decomposition is a no-op with only one table. The separate per-attempt TM pair raises per-attempt cost, but that effect is accounted for under Per-attempt I/O cost model above, not here.
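The difference between the two convoy formulas can be sketched as follows. This assumes `M` is the cost of one historical manifest-list read in milliseconds and clamps negative terms to zero; both are assumptions made for illustration, and the function names are hypothetical rather than Endive's.

```python
def convoy_cost_global(v_global, m):
    """Pre-fix accounting: global catalog version delta times one ML read cost."""
    return v_global * m

def convoy_cost_per_table(deltas):
    """Post-fix accounting: sum over tables of (V_table - 1) * M_table.

    deltas maps table name -> (V_table, M_table). The max(..., 0) clamp is an
    assumption so that an untouched table never contributes a negative cost.
    """
    return sum(max(v - 1, 0) * m for v, m in deltas.values())

# Two tables: "orders" advanced 4 versions, "events" advanced 1 (illustrative data).
deltas = {"orders": (4, 27.0), "events": (1, 27.0)}
v_global = sum(v for v, _ in deltas.values())

print(convoy_cost_global(v_global, 27.0))   # charges every catalog commit
print(convoy_cost_per_table(deltas))        # charges only per-table history
```

With a single table the per-table sum has one term, which is why the decomposition itself changes nothing there and only the N−1 and de-dup corrections apply.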

Config drift (pre-run correction, 2026-04-13)

The blog’s exp1–4 configs had `table_metadata_inlined = true` set, which was a no-op until commit `f1ad9ef` (2026-03-26) made the flag take effect. From that point forward, every re-run silently used inlined metadata (1/(3L) bound, ~11.4 c/s ceiling). Fixed pre-run by `d55d3ce` (2026-04-13), which flips the flag to `false` across exp1–4 templates. The post-fix runs use non-inlined metadata, matching the blog’s original intent.
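To see why the flag matters, the sketch below compares the two ceilings at S3 medians. It assumes that inlining removes exactly the table-metadata read and write from each attempt, leaving the manifest-list read, manifest-list write, and CAS; that reading is inferred from the 1/(3L) bound and is not confirmed by the simulator source. `attempt_ops` and `ceiling_for_flag` are hypothetical names.

```python
# S3 median latencies in milliseconds, as used elsewhere on this page.
MEDIAN_MS = {"TM_read": 27, "ML_read": 27, "ML_write": 60, "TM_write": 60, "CAS": 1}

def attempt_ops(inlined):
    """Operations charged per attempt; the inlined list is an assumption."""
    if inlined:
        return ["ML_read", "ML_write", "CAS"]
    return ["TM_read", "ML_read", "ML_write", "TM_write", "CAS"]

def ceiling_for_flag(inlined):
    """Per-table commits/sec bound for serial attempts under the given flag value."""
    return 1000.0 / sum(MEDIAN_MS[op] for op in attempt_ops(inlined))

print(f"inlined:     {ceiling_for_flag(True):.1f} c/s")
print(f"non-inlined: {ceiling_for_flag(False):.1f} c/s")
```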

The drift was undetected for ~18 days because `expctl`’s staleness check compared stored `cfg.toml` to the directory hash and simulator code, not to the template. Addressed by template-hash stamping (`0e961fd`, `5351c79`).
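One way template-hash stamping could work is sketched below. This is not the expctl implementation: the `template_hash` key and the functions `stamp` and `is_stale` are hypothetical, and the config is modeled as a plain dict rather than a parsed `cfg.toml`.

```python
import hashlib

def template_hash(template_text: str) -> str:
    """Stable digest of the template contents."""
    return hashlib.sha256(template_text.encode("utf-8")).hexdigest()

def stamp(cfg: dict, template_text: str) -> dict:
    """Return a copy of the run config that records which template produced it."""
    return {**cfg, "template_hash": template_hash(template_text)}

def is_stale(cfg: dict, template_text: str) -> bool:
    """A stored config is stale if the template has changed since it was stamped."""
    return cfg.get("template_hash") != template_hash(template_text)

old_template = "table_metadata_inlined = true\n"
new_template = "table_metadata_inlined = false\n"

cfg = stamp({"experiment": "exp1"}, old_template)
print(is_stale(cfg, old_template))  # template unchanged
print(is_stale(cfg, new_template))  # template edited after the run was stamped
```

A check of this shape would flag a stored run as stale whenever its template changes, which is the comparison the original staleness check lacked.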

Simulator is strictly serial

The parallel-I/O footnote in the 2026-03-09 post (“up to 4 I/O operations can run in parallel”) was inaccurate: the simulator issues all storage I/O serially. The `max_parallel = 4` knob in the configs was never consumed by `endive/`. The footnote and surrounding sentence have been removed.

Quantitative deltas

Simulating Catalog and Table Conflicts (2026-03-09)

Metric	Published	Corrected
FA single-table throughput ceiling	~7.7 c/s (19% success @ 7.8 c/s)	5.7 c/s (14% success)
FA practical ceiling (99%+ success)	~2.7 c/s	2.0 c/s
FA P50 at low load	320 ms	410 ms
FA P99 at saturation	1.89 s	2.58 s
VO P99 at IA=200 ms (pure VO)	219 s	203 s (convoy fixes)
FA success @ 50 ms IA	42%	33%

The “3–4 commits/sec” language in the original post has been changed to “2–3 commits/sec”. The per-attempt cost explanation was rewritten from “~300 ms retry” to “five S3 round-trips per attempt, ~175 ms at S3 median latencies”.

Simulating Multi-Table Contention (2026-03-23)

Single-table knees at >95% VO success:

Provider	Published FA-only	Corrected FA-only	Published 90/10	Corrected 90/10
S3 Express	14.6 c/s	14.6 c/s	7.5 c/s	7.4 c/s
S3 Standard	2.4 c/s	1.8 c/s	1.8 c/s	1.8 c/s
Azure Premium	2.5 c/s	2.4 c/s	1.9 c/s	1.8 c/s
Azure Standard	2.4 c/s	1.8 c/s	1.5 c/s	1.5 c/s
GCP	0.7 c/s	0.4 c/s	0.4 c/s	0.4 c/s

The biggest structural change is the multi-table FA-only knee for S3 Standard and Azure Premium at 5–20 tables:

Provider	Tables	Published	Corrected	Δ
S3 Standard	5–20	7.2–7.4 c/s	3.7 c/s	−50%
S3 Standard	50	7.4 c/s	7.2 c/s	−3%
Azure Premium	5–20	7.2–7.4 c/s	3.7 c/s	−50%
Azure Premium	50	7.4 c/s	3.7 c/s	−50%
Azure Standard	10–50	3.7 c/s	3.7 c/s	—
GCP	50	3.6 c/s	0.7 c/s	−81%
S3 Express	50	14.9 c/s	36.0 c/s	+142% (free-retry for cross-table CAS failures)
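The Δ column can be recomputed from the Published and Corrected columns. The sketch below does this for the rows with single published values; `pct_delta` is an illustrative helper, and rounding to the nearest whole percent is an assumption about how the column was produced.

```python
def pct_delta(published, corrected):
    """Relative change from published to corrected, rounded to a whole percent."""
    return round((corrected - published) / published * 100)

# (provider, tables, published c/s, corrected c/s), copied from the table above.
rows = [
    ("S3 Standard", 50, 7.4, 7.2),
    ("Azure Premium", 50, 7.4, 3.7),
    ("GCP", 50, 3.6, 0.7),
    ("S3 Express", 50, 14.9, 36.0),
]

for provider, tables, published, corrected in rows:
    print(f"{provider} @ {tables} tables: {pct_delta(published, corrected):+d}%")
```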

For S3 Standard and Azure Premium at 5–20 tables the published numbers were catalog-CAS-bound at ~7.4 c/s; with the non-inlined per-table ceiling now at 5.7 c/s, the per-table bound binds first and the knee flattens to 3.7 c/s. Only at 50 tables does catalog CAS again become the bottleneck, and only for S3 Standard FA-only; Azure Premium sits at the per-table bound through 50 tables.

S3 Express at 50 tables jumps to 36 c/s because its ~10 ms per-op latency keeps the per-table bound very high; with the `ec383ff` free-retry fix, cross-table CAS failures no longer charge manifest I/O, so the catalog handles more commits.

GCP’s multi-table scaling is weaker than presented: 0.4 c/s (1 table) → 0.7 c/s (50 tables) FA-only, versus the published 0.7 c/s → 3.6 c/s. The per-op latency is high enough that the per-table bound binds through 50 tables.

What survives unchanged

Every qualitative conclusion in both posts.
“Sustained commit rates above 1–2 commits/sec are unattainable” — now exactly right rather than hedged.
“Storage I/O is the primary bottleneck for single-table workloads.”
“Catalog CAS latency up to 120 ms adds only modest overhead for single-table.”
“IO convoys serialize VO commit attempts; P99 reaches minutes at moderate rates.”
“More tables move the bottleneck from per-table metadata I/O to catalog contention” (qualitatively true; the crossover is now at ~50 tables for S3/Azure instead of ~5).
“Zipf concentration limits the benefit of adding tables; rank-1 table dominates.”
“S3 Express is in a different class for catalog-as-file workloads.”
“GCS is not viable for catalog-as-file workloads.”
The Iceberg write-path diagram and metadata-size assumptions (~1 MiB TM, ~100 KiB manifest list).
The per-provider latency distributions (S3, S3 Express, Azure, Azure Premium, GCS).

Full companion reports

EXP1-3_REPORT.md — post-fix validation and cell-by-cell comparison for exp1, exp2, exp3a, exp3b.
EXP4_REPORT.md — post-fix validation for exp4a, exp4b (uniform and Zipf), and exp4c across five storage providers.

Chris Douglas