docs(lore-0063): correct prices.* sizing from 64k ground-truth measurement #49

Merged
karczuRF merged 3 commits into develop from feat/0063_provision-prices-db-on-hetzner-ch-self-served on Jun 22, 2026

@karczuRF

Summary

Re-measured the prices.* ClickHouse footprint with a fresh 64,000-ledger backfill (window 62016000-62079999) and corrected the docs + cost-share, which were still citing the superseded task-0046 estimate.

Fully local / prepare-only: Docker ClickHouse 25.6, cached partition (no S3 fetch), no prod infra touched.

Key finding — sizing was wrong by ~25–50×

| Sample | Window | B/ledger |
| --- | --- | --- |
| 0060 calib (10k) | 62966000+ | 3,597 |
| 0060 full (100k) | 62882700+ | 3,677 |
| This run (64k) | 62016000+ | 1,872 |

  • Real footprint: ~1.9–3.7 KB/ledger / ~3.5–6 GB/yr (activity-dependent), vs the 0046 ~74 B/ledger / ~0.45 GB/yr estimate.
  • Driver is trading-pair diversity (thousands of unfiltered low-volume tokens), not ledger count.
  • 64k run measured 114.23 MiB total, ~18.8 min wall-clock (cached).
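The per-ledger figure and the ~25–50× correction can be re-derived from the numbers quoted above. The following is a minimal sketch for checking the arithmetic, not part of the measurement tooling; the function names are hypothetical, and it assumes the ledger window is inclusive on both ends and that MiB means 1024 × 1024 bytes.

```python
MIB = 1024 * 1024  # bytes per MiB (assumed binary unit)

def bytes_per_ledger(total_mib: float, ledgers: int) -> float:
    """Average on-disk bytes per ledger for a measured backfill."""
    return total_mib * MIB / ledgers

def underestimate_factor(measured_b: float, estimated_b: float) -> float:
    """How many times larger the measured footprint is than the estimate."""
    return measured_b / estimated_b

# Window 62016000-62079999, assumed inclusive on both ends.
ledgers = 62_079_999 - 62_016_000 + 1          # 64,000 ledgers
measured = bytes_per_ledger(114.23, ledgers)   # 64k run total size
print(round(measured))                         # 1872

# Compare each measured sample against the task-0046 estimate (~74 B/ledger).
for sample_b in (1_872, 3_597, 3_677):
    print(sample_b, round(underestimate_factor(sample_b, 74), 1))

# Wall-clock cost per ledger for the cached 64k run (~18.8 min).
ms_per_ledger = 18.8 * 60 * 1000 / ledgers
print(round(ms_per_ledger, 1))                 # 17.6
```

The ratios land at roughly 25× for the 64k run and just under 50× for the 100k run, which matches the range stated in the heading.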

Cost-share / architecture impact

  • BE cost-share moves from ~1%/$1-2 to ~10-15%/$8-11 per env/mo.
  • New shared-vs-dedicated-container table: a dedicated prices CH container (ADR 0007 Alt-3) costs ~2× and breaks BE's in-cluster price_usd_series JOIN (0199 contract) → shared prices DB stays correct; sidecar remains the task-0047-RED fallback only.
  • Architecture decision unchanged; even ~9 GB @ 10yr (with _1h/_4h retention cap) is trivial for the shared box.

Changes

  • New: lore/1-tasks/active/0063_…/notes/G-64k-sizing-remeasure.md — full measurement record, three-window comparison, corrected projection, cost table, reproduction block.
  • Docs: docs/database-schema/database-schema-overview.md §1.2, §8.3 (sizing table + "superseded" callout), §8.5 (cost lines), revision-history row.
  • Lineage: G-provisioning-plan.md forward-links the new note.

Caveats (documented in the note)

  • close_usd/USD-series enrichment not run → that column compresses to ~0 here (adds only a few % when populated; it's a column, not a new row class).
  • AMM candles ≈ 0 (in-window-registry limitation) → real production is somewhat higher.

karczuRF added 3 commits June 19, 2026 13:06
…ement

Ran a fresh 64,000-ledger backfill (62016000-62079999, cached partition,
fully local) through the real prices-clickhouse pipeline and measured
system.parts: 114 MiB / ~1,872 B/ledger. Combined with task 0060's 10k+100k
runs, the real prices.* footprint is ~1.9-3.7 KB/ledger (~3.5-6 GB/yr),
superseding the task-0046 ~74 B/ledger / ~0.45 GB/yr estimate by ~25-50×.

Driver is trading-pair diversity, not ledger count. Cost-share with BE
accordingly moves from ~1%/$1-2 to ~10-15%/$8-11 per env/mo. Adds a
shared-vs-dedicated-container cost table: a dedicated prices CH container
(ADR 0007 Alt-3) costs ~2x and breaks BE's in-cluster price_usd_series JOIN,
so the shared prices DB stays correct (sidecar = task-0047-RED fallback only).

Saves the measurement as notes/G-64k-sizing-remeasure.md and refreshes the
schema-overview §1.2/§8.3/§8.5 + revision history.

Record the local-only ground-truth test over 128,000 cached ledgers
(62848000-62975999) through the production prices-clickhouse schema:
SDEX extraction verified (16.5M trades), AMM path proven via Aquarius
(864 ticks; Phoenix/Soroswap 0 due to in-window pool-discovery limit),
oracle captured. Full-schema footprint 4.13 KiB/ledger (516.87 MiB /
128k, ~22 GiB/yr at this high-activity range); parse 18.7 ms/ledger.
Link forward from G-64k-sizing-remeasure.

The schema overview never documented the `close_usd` column or its
BE-facing read-surface views, so a reader searching the doc could not
find the historical USD asset price BE requested.

Add the `close_usd Decimal(38,14) DEFAULT 0` column to the
`price_ohlcv_*` DDL and a §3.2 subsection covering the derived VIEWs
(`price_usd_series` / `_1h`, `usd_reference` / `_1h`,
`identity_by_contract`), the read-time ok / no_asset_price /
no_reference status discriminator, caller-owned grain selection, and
the load-bearing USDC-issuer literal. Source of truth:
packages/prices-clickhouse/schema/views.sql.
@karczuRF merged commit f230cf1 into develop on Jun 22, 2026
3 checks passed
@karczuRF deleted the feat/0063_provision-prices-db-on-hetzner-ch-self-served branch on June 22, 2026 08:14