diff --git a/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/README.md b/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/README.md index 3887687..15074fd 100644 --- a/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/README.md +++ b/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/README.md @@ -42,6 +42,21 @@ history: Hard constraint (operator): **no Hetzner or AWS calls** — every remote action (clickhouse-client on the box, ansible-playbook, aws secretsmanager) is gated on explicit per-session approval. + - date: 2026-06-18 + status: active + who: oski + note: > + DDL-apply identity decided — **Option 1**: loopback `default` admin + applies all DDL; runtime certs (`prices_writer`/`prices_reader`) carry + no DDL (`write_no_ddl`/`read_only`, grants on `prices.*` only). Chosen + over the G-note's scoped-DDL writer and the hybrid migrator cert; matches + BE (removed remote-DDL users in BE 0241) and keeps the 0051 loopback + descope. Authored `notes/G-provisioning-plan.md` — the ready-to-execute + runbook with drafted BE-PR XML (services/quotas + CN-map), the CREATE + DATABASE one-shot, the cert→single-bundle-secret procedure (aligned to + 0052, reconciling the old two-secret assumption), verification, and a + gated-action inventory. All authoring only — no Hetzner/AWS/BE-repo + action taken. --- # Provision the `prices` database on Hetzner CH (self-served) @@ -97,22 +112,28 @@ under admin, then kept reproducible by 0051's migration runner. ### Step 2: Add the prices tenant to BE's RBAC (PR to soroban-block-explorer) -Mirror the existing BE tenant shape (see `G-be-prices-db-rbac-ask.md`): +Mirror the existing BE tenant shape. **Concrete drafted XML/CN-map/SQL +lives in `notes/G-provisioning-plan.md`** (Option 1). 
In summary: - `users.d/services.xml`: add `prices_writer` (profile `write_no_ddl`, - quota `high_write`) and `prices_reader` (profile `read_only`, quota - `api_throttle` or a new `prices_read` quota), both `` - with networks restricted to loopback + the compose bridge subnet. -- Reuse BE's existing `write_no_ddl` / `read_only` profiles unless - prices needs tighter caps; if so add a `prices_write` profile in - `profiles.xml`. Decide at impl time; record in a short note. -- `CLICKHOUSE_CN_USER_MAP`: add `prices-ingestion-{env}:prices_writer` - and `prices-api-{env}:prices_reader`. - -> DDL caveat: `write_no_ddl` blocks `CREATE TABLE`. 0051's schema-apply -> runs under an admin/DDL-capable identity, **not** `prices_writer`. -> Resolve which identity applies schema as part of Step 2 (either a -> short-lived admin cert for migrations, or BE applies the initial DDL). + quota `prices_write`, grant `SELECT, INSERT, OPTIMIZE ON prices.*`) and + `prices_reader` (profile `read_only`, quota `prices_read`, grant + `SELECT ON prices.*`), both `` with networks restricted + to loopback + the compose bridge subnet. +- `users.d/profiles.xml`: **no change** — reuse BE's `write_no_ddl` + (8 GiB) + `read_only` (4 GiB/30 s). No `prices_write` profile (Option 1). +- `users.d/quotas.xml`: add dedicated `prices_write` / `prices_read` + (caps copied from BE's `high_write` / `api_throttle`) so prices never + draws down BE's per-service budget. +- `CLICKHOUSE_CN_USER_MAP` (env, not a file): append + `prices-ingestion-{env}:prices_writer` and `prices-api-{env}:prices_reader`. + +> **DDL identity — DECIDED 2026-06-18 (Option 1, Design Decisions → Emerged #1):** +> DDL is the box `default` admin over loopback; `prices_writer` stays +> `write_no_ddl` with **no** `CREATE`/`DROP`/`ALTER` grant. Schema is applied by +> 0051 over loopback (Step 4 below), **not** over mTLS and **not** by +> `prices_writer`. 
No migration cert is issued (revisit only if box-access-per- +> migration becomes friction — then add an Option-3 `prices_migrator` cert). ### Step 3: Create the database + deploy the RBAC @@ -129,18 +150,25 @@ Mirror the existing BE tenant shape (see `G-be-prices-db-rbac-ask.md`): - Requires the CA private key. If admin-on-the-box does **not** include CA-key access, this sub-step stays a BE ask (BE runs the script and hands over the bundle) — flag it explicitly. -- Store cert+key (+ CA cert) bundles in AWS Secrets Manager, 2 secrets - per env, at the keys 0011's CDK + 0052's client crate expect. +- Store each cert as a **single JSON bundle** `{cert,key,ca}` in AWS + Secrets Manager (one secret per identity per env), named by the + `MTLS_SECRET_NAME` 0052's client reads — **not** the two-secret + cert/key split the 0050 G-note assumed. Reconcile 0011/0038 CDK to the + single-bundle shape if it still emits `MTLS_CERT_SECRET_NAME` / + `MTLS_KEY_SECRET_NAME` (see `notes/G-provisioning-plan.md` §5 + open + item 2; likely a follow-up task). ### Step 5: Verify tenant isolation For each env (dev → staging → prod): - Connect via Caddy:443 with the prices cert; `SELECT version()` → 200. -- As `prices_writer`: `CREATE TABLE prices.smoke (x UInt8) ENGINE=Memory` - + `INSERT` succeed; the same against `default.*` is **denied**. +- As `prices_writer`: `INSERT` into an existing `prices.*` table succeeds; + `INSERT`/`SELECT` against `default.*` is **denied**; `CREATE TABLE + prices.smoke …` is **denied** too (writer has no DDL — the Option-1 + proof). The tables themselves are created by 0051's loopback apply, not + here. - As `prices_reader`: `SELECT` from `prices.*` works; any write denied. -- Drop the smoke table. 
## Acceptance Criteria @@ -150,15 +178,48 @@ For each env (dev → staging → prod): quota scoping resource usage away from BE's `default.*` - [ ] Caddy `CLICKHOUSE_CN_USER_MAP` maps the prices CNs to the prices users; unmapped CNs 403 -- [ ] Per-env mTLS cert+key pairs issued and stored in AWS Secrets - Manager (2 secrets/env) at the keys 0011/0052 read; or, if CA-key - access is withheld, the BE-issuance hand-off is recorded done +- [ ] Per-env mTLS certs issued and stored in AWS Secrets Manager as a + single `{cert,key,ca}` JSON bundle per identity (named by + `MTLS_SECRET_NAME`, per 0052); or, if CA-key access is withheld, + the BE-issuance hand-off is recorded done - [ ] Smoke test confirms isolation: `prices.*` writable by - `prices_writer`, `default.*` denied; `prices_reader` read-only -- [ ] DDL-apply identity for 0051 decided + documented (admin cert vs - BE applies initial DDL) -- [ ] `notes/G-provisioning-record.md` captures the SQL/XML applied, - SSM/Secrets keys, CNs, and per-env completion dates + `prices_writer`, `default.*` denied, `CREATE TABLE` denied to the + writer; `prices_reader` read-only +- [x] DDL-apply identity for 0051 decided + documented — **Option 1: + loopback `default` admin applies DDL; `prices_writer` is + `write_no_ddl`** (Design Decisions → Emerged #1; + `notes/G-provisioning-plan.md`) +- [x] Provisioning runbook authored — `notes/G-provisioning-plan.md` + (drafted BE-PR XML/CN-map, SQL, cert/SM procedure, gated-action + inventory). A per-env completion record is appended as steps run. + +## Design Decisions + +### Emerged + +1. **Option 1 — loopback-admin DDL; no-DDL runtime certs** (chosen over + Option 2 "scoped-DDL `prices_writer` applies over mTLS", the literal + 0050 G-note, and Option 3 "separate short-lived `prices_migrator` + cert"). 
0063 grants prices-api box admin access, so the loopback + `default` path covers all DDL; the always-on runtime certs stay + least-privilege (`write_no_ddl` / `read_only`, grants on `prices.*` + only) — a leaked ingestion cert cannot `DROP TABLE prices.*` or touch + `default.*`. Matches BE exactly (they removed their remote-DDL + `migration_admin`/`partition_admin` users in BE 0241) and keeps the + 0051 loopback descope intact (no mTLS apply path). Trade-off: each + schema change needs box access — acceptable for a low-churn OHLCV + schema on the wholesale-idempotent apply; upgrade to Option 3's + migrator cert only if that friction is ever felt. Full runbook + + drafted artifacts in `notes/G-provisioning-plan.md`. +2. **No new `prices_write` profile; dedicated `prices_write`/`prices_read` + quotas.** Profiles reuse BE's `write_no_ddl`/`read_only` (Option 1 + needs no DDL profile); quotas are prices-owned so prices can't draw + down BE's `high_write`/`api_throttle` budget (mirrors BE's own + `dev_read`-vs-`api_throttle` isolation). Quota naming is a minor + BE-PR-time call (G-plan open item 4). +3. **Single `{cert,key,ca}` bundle secret per 0052**, not the two-secret + cert/key split the G-note/0038-PR#34 assumed — reconcile the CDK + (G-plan open item 2). 
## Blocked on diff --git a/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/notes/G-provisioning-plan.md b/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/notes/G-provisioning-plan.md new file mode 100644 index 0000000..f944bce --- /dev/null +++ b/lore/1-tasks/active/0063_FEATURE_provision-prices-db-on-hetzner-ch-self-served/notes/G-provisioning-plan.md @@ -0,0 +1,283 @@ +--- +id: "G-provisioning-plan" +title: "prices tenant provisioning runbook — Option 1 (loopback-admin DDL, no-DDL runtime certs)" +type: G +task: "0063" +status: mature +spawned_from: ["G-be-prices-db-rbac-ask"] +spawns: [] +related_notes: + - "../../../backlog/0050_FEATURE_be-side-prep-sns-mtls-prices-db-provisioning/notes/G-be-prices-db-rbac-ask.md" +links: + - "../../../../2-adrs/0007_live-data-sink-on-shared-hetzner-clickhouse.md" + - "../../../../../../soroban-block-explorer/crates/db-clickhouse/users.d/services.xml" + - "../../../../../../soroban-block-explorer/crates/db-clickhouse/users.d/profiles.xml" + - "../../../../../../soroban-block-explorer/crates/db-clickhouse/users.d/quotas.xml" + - "../../../../../../soroban-block-explorer/infra-hetzner/ca/issue-client-cert.sh" +--- + +# prices tenant provisioning runbook — Option 1 + +> **Decision (2026-06-18):** Option 1 — **DDL is the box `default` admin over +> loopback; the runtime certs (`prices_writer` / `prices_reader`) carry no DDL.** +> Chosen over the G-note's scoped-DDL writer (Option 2) and the hybrid migrator +> cert (Option 3). Rationale: 0063 grants prices-api box admin access, so the +> loopback path covers DDL; runtime certs stay least-privilege (a leaked +> ingestion cert cannot `DROP TABLE prices.*`); matches BE exactly (they +> *removed* their remote-DDL users in BE 0241); keeps the 0051 loopback descope +> intact (no mTLS apply path). Upgrade to Option 3's `prices_migrator` cert only +> if box-access-per-migration ever becomes friction. 
+ +This is the **ready-to-execute** runbook. Every step is either local authoring +or a 🔒 **gated remote action** (Hetzner / AWS / BE-repo) that needs explicit +per-session operator approval before running. + +--- + +## 0. Identity model (recap) + +| Job | Identity | Path | Powers | +|---|---|---|---| +| Install / migrate schema | `default` admin | loopback on the box (SSH) | full DDL | +| Ingest candles (0038/0039) | cert `CN=prices-ingestion-{env}` → `prices_writer` | Caddy:443 mTLS from AWS | `SELECT, INSERT, OPTIMIZE ON prices.*` | +| API reads (0040) | cert `CN=prices-api-{env}` → `prices_reader` | Caddy:443 mTLS from AWS | `SELECT ON prices.*` | + +No external CN maps to `default`; admin is reachable only from the box. + +### Single production box — what `{env}` means here + +There is **one** Hetzner CH box: BE's `production` dedicated server `ch-prod-01` +(one `CH_DOMAIN`; `infra-hetzner/README.md`: "exactly once per environment +(`production`)"). There is **no separate dev/staging Hetzner box.** So the +`{env}` placeholder throughout this runbook is the **AWS-side** environment of +the *connecting client* (the Lambda stage), not a second CH box — every cert CN +terminates at the same one box. + +> ⚠️ **Open implication — confirm with BE.** One box → one `prices` database. +> If per-env certs (`prices-ingestion-dev` vs `-production`) are all issued, they +> map to the **same** `prices_writer`/`prices_reader` users writing the **same** +> `prices.*` tables. Decide before issuing certs: (a) only `-production` CNs for +> now (recommended — least surface, matches the single box), or (b) per-env CNs +> with an agreed story for whether dev/staging clients should touch prod +> `prices.*` at all. Tracked as Open item 5. + +--- + +## 1. BE-repo PR content (author locally → 🔒 PR to soroban-block-explorer) + +All three edits are **additive** — no existing BE user/profile/quota changes. +Drafted here; do **not** push to the BE repo without approval. + +### 1a. 
`crates/db-clickhouse/users.d/services.xml` — two new users + +```xml + + + + + 127.0.0.1 + ::1 + 172.30.0.0/16 + + write_no_ddl + prices_write + + GRANT SELECT, INSERT, OPTIMIZE ON prices.* TO prices_writer + + + + + + + 127.0.0.1 + ::1 + 172.30.0.0/16 + + read_only + prices_read + + GRANT SELECT ON prices.* TO prices_reader + + +``` + +What is intentionally **absent** vs the G-note's writer grant: no `CREATE TABLE`, +`DROP TABLE`, `ALTER`, `TRUNCATE`. The writer appends + dedups (`OPTIMIZE` for +`ReplacingMergeTree`); structural change is the loopback admin's job. Even if a +grant slipped through, `write_no_ddl`'s `allow_ddl=0` blocks DDL at the profile. + +### 1b. `crates/db-clickhouse/users.d/profiles.xml` — NO change + +Option 1's core simplification: **no `prices_write` profile.** The writer reuses +`write_no_ddl` (8 GiB cap), the reader reuses `read_only` (4 GiB / 30 s). The +G-note's DDL-capable `prices_write` profile is not created. + +### 1c. `crates/db-clickhouse/users.d/quotas.xml` — two new quotas + +Dedicated quotas (not reusing BE's `high_write` / `api_throttle`) so prices can +never consume BE's per-service budget — the same isolation reasoning BE used to +split `dev_read` from `api_throttle`. Caps copied from the BE siblings: + +```xml + + + 3600 + 0 0 0 + 0 0 + 1125899906842624 + + + + + + 3600 + 10000 0 + 10000000000 + 50000000000 + 1099511627776 + 1000 + + +``` + +> Quotas are **not enforced on the Caddy-proxied path** (CH limitation noted in +> the G-note), so the real noisy-neighbour guard is the per-query *profile* cap +> (`read_only` 4 GiB/30 s). Quotas are defined for forward-compat + the host path. +> Reuse-vs-dedicated quota naming is a minor call — confirm with BE at PR time. + +### 1d. Caddy CN→user map (env, not a file) + +`clickhouse_cn_user_pairs` is derived from the `CLICKHOUSE_CN_USER_MAP` env var +(`group_vars/all.yml:75`), supplied at playbook time (operator shell / GH +Secrets). 
**Append** two pairs per env — no checked-in file to edit: + +``` +prices-ingestion-{env}:prices_writer,prices-api-{env}:prices_reader +``` + +Unmapped CN → `__unmapped__` → 403 at Caddy (fail-closed). No prices CN maps to +`default`/`dev_shared`. + +--- + +## 2. Create the database (🔒 Hetzner, loopback admin, once) + +```sql +-- as `default` admin over loopback (SSH to box): +CREATE DATABASE IF NOT EXISTS prices; +``` + +`CREATE DATABASE` is a box-admin one-shot; it is not granted to any scoped user. + +--- + +## 3. Deploy the RBAC (🔒 Hetzner, Ansible) + +After the 1a–1d PR merges in the BE repo + `CLICKHOUSE_CN_USER_MAP` is extended: + +```bash +ansible-playbook -i inventory.ini site.yml --tags app # CH picks up users.d, Caddy picks up the CN map +``` + +Prepare-only: coordinate the actual run with BE / explicit approval. + +--- + +## 4. Apply the schema (🔒 Hetzner, loopback — this is task 0051 Step 4) + +On the box (or SSH tunnel to `localhost:8123`), as `default` admin, run the +existing **plaintext** apply — no mTLS, no DDL cert: + +```bash +CLICKHOUSE_URL=http://localhost:8123 \ + cargo run -p prices-clickhouse --bin prices-clickhouse-init -- --rollups +# (or: clickhouse-client --queries-file=init.sql / seed.sql / views.sql) +``` + +Idempotent (`CREATE … IF NOT EXISTS`). Owned by 0051; listed here for the full +provisioning sequence. + +--- + +## 5. Issue + store the mTLS certs (🔒 Hetzner CA + 🔒 AWS) + +Per env, from BE's CA (needs the CA private key — **if box-admin does not include +CA-key access, this is a BE ask**: BE runs the script, hands over the bundle): + +```bash +./infra-hetzner/ca/issue-client-cert.sh prices-ingestion-{env} +./infra-hetzner/ca/issue-client-cert.sh prices-api-{env} +``` + +**Storage format — single JSON bundle, per task 0052 (NOT two secrets).** 0052's +client reads one Secrets Manager secret holding `{cert, key, ca}` JSON, named by +`MTLS_SECRET_NAME`, fetched via the Lambda Parameters & Secrets Extension. 
Store +one bundle secret per identity per env: + +```bash +# one secret per identity per env, JSON {cert,key,ca}: +aws secretsmanager put-secret-value --secret-id prices/{env}/clickhouse-mtls-ingestion \ + --secret-string "$(jq -n --arg c "$(cat prices-ingestion-{env}.crt)" \ + --arg k "$(cat prices-ingestion-{env}.key)" \ + --arg a "$(cat ca.crt)" '{cert:$c,key:$k,ca:$a}')" +``` + +> ⚠️ **Cross-task reconciliation:** 0038's earlier AWS side (PR #34) + the 0050 +> G-note assumed **two** secrets (`…-cert` / `…-key`). 0052 standardised on the +> **single-bundle** shape. Whichever 0011/0038 CDK provisions the secret + env +> vars must match 0052: `MTLS_SECRET_NAME` → one `{cert,key,ca}` JSON secret, +> plus `CH_DOMAIN`. Flag a follow-up to align the CDK if it still emits the +> two-secret `MTLS_CERT_SECRET_NAME` / `MTLS_KEY_SECRET_NAME` pair. + +Secrets-Manager bytes only; per the SSM key contract, only secret **names** ride +in env/SSM, never the cert/key material. 1-year manual rotation cadence. + +--- + +## 6. Verify tenant isolation (🔒 Hetzner, per env) + +Using the issued certs via Caddy:443: + +- `prices-ingestion-{env}` cert: `SELECT version()` → 200; `SHOW DATABASES` + includes `prices`; `INSERT` into a `prices.*` table succeeds; the same against + `default.*` → **ACCESS_DENIED**; `CREATE TABLE prices.smoke …` → **DENIED** + (writer has no DDL — the Option-1 proof). Schema itself was applied in Step 4. +- `prices-api-{env}` cert: `SELECT` on `prices.*` works; any `INSERT`/`CREATE` + → **ACCESS_DENIED**. 
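The isolation checks in §6 can be written down as an expected-outcome matrix before any cert exists, so the eventual gated run only has to compare observed results against it. A minimal sketch, assuming the grants drafted in §1a; the function name `expected_outcome` and the ALLOW/DENY vocabulary are hypothetical, and nothing here contacts Hetzner or AWS:

```bash
#!/bin/sh
# Hypothetical helper (not part of the BE repo): returns the outcome this
# runbook expects for a probe, based only on the grants drafted in section 1a.
# $1 = ClickHouse user, $2 = statement kind, $3 = target database
expected_outcome() {
  case "$1:$2:$3" in
    prices_writer:SELECT:prices | prices_writer:INSERT:prices | prices_writer:OPTIMIZE:prices)
      echo ALLOW ;;
    prices_reader:SELECT:prices)
      echo ALLOW ;;
    *)
      # Everything else, including all DDL and anything on default.*, is denied.
      echo DENY ;;
  esac
}

# Print the checklist for the probes named in section 6.
while read -r user action database; do
  printf '%-14s %-8s %-8s -> %s\n' "$user" "$action" "$database" \
    "$(expected_outcome "$user" "$action" "$database")"
done <<'EOF'
prices_writer INSERT prices
prices_writer INSERT default
prices_writer SELECT default
prices_writer CREATE prices
prices_reader SELECT prices
prices_reader INSERT prices
EOF
```

Keeping the matrix default-deny mirrors the fail-closed stance of the CN map: a probe that is not explicitly listed as allowed is expected to fail.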
+
+---
+
+## Gated-action inventory (each needs explicit approval before running)
+
+| # | Action | Target | Step |
+|---|---|---|---|
+| G1 | `clickhouse-client … SHOW DATABASES` (confirm admin) | 🔒 Hetzner | 0063 Step 1 |
+| G2 | Open PR to `soroban-block-explorer` (1a–1d) | GitHub (BE repo) | §1 |
+| G3 | `CREATE DATABASE IF NOT EXISTS prices` | 🔒 Hetzner | §2 |
+| G4 | `ansible-playbook … --tags app` | 🔒 Hetzner | §3 |
+| G5 | Apply schema (plaintext loopback) | 🔒 Hetzner | §4 / 0051 |
+| G6 | `issue-client-cert.sh` (CA key) | 🔒 Hetzner CA | §5 |
+| G7 | `aws secretsmanager put-secret-value` | 🔒 AWS | §5 |
+
+---
+
+## Open items / decisions to confirm
+
+1. **CA-key access** — does box-admin include the CA private key? If not, Step 5
+   issuance is a BE ask (BE runs `issue-client-cert.sh`, hands over the bundle).
+2. **Secret shape reconciliation** — align 0011/0038 CDK to 0052's single-bundle
+   `MTLS_SECRET_NAME` (see §5 warning). Likely a follow-up task.
+3. **Backup scope** (G-note §4) — recommend **(b)**: `prices.*` is re-derivable
+   from ledger history, so accept it is outside BE's `RESTORE DATABASE default`
+   set rather than extending BE's snapshot. Flag as a decision, not an oversight.
+4. **Quota naming** — dedicated `prices_write`/`prices_read` (recommended, §1c)
+   vs reusing BE's `high_write`/`api_throttle`. Confirm at BE PR time.
+5. **Per-env CNs vs single box** (§0) — one prod box → one `prices` DB. Confirm
+   whether to issue only `-production` CNs (recommended) or per-env CNs that all
+   target the same box/DB, before Step 5 cert issuance.
+6. **`<grants>` element validity** — §1a places `GRANT …`
+   inside the user XML. BE's live `services.xml` does **not** use inline grants
+   (its service users have broad access by design), so this needs verifying
+   against the running CH version before the PR: confirm CH applies user-XML
+   `<grants>` at startup, else move the GRANT statements into the
+   `db-clickhouse-init` `init.sql` (run once under loopback admin) instead.
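
Open item 5 decides which environments get CNs, and §1d asks for the pairs to be appended to the existing `CLICKHOUSE_CN_USER_MAP` rather than replacing it. A minimal local sketch of that append, assuming the map is a plain comma-separated list of `CN:user` pairs as shown in §1d; the function names and the sample existing entry `be-api-production:api` are hypothetical, and the script only prints a value, it does not touch the playbook or any remote host:

```bash
#!/bin/sh
# Hypothetical helpers for authoring the map value locally.

# Emit the prices CN:user pairs for each environment name given as an argument.
cn_user_pairs() {
  out=""
  for env_name in "$@"; do
    out="${out:+${out},}prices-ingestion-${env_name}:prices_writer,prices-api-${env_name}:prices_reader"
  done
  printf '%s\n' "$out"
}

# Append the prices pairs to an existing map ($1), skipping pairs already present
# so that re-running the helper does not duplicate entries.
append_cn_pairs() {
  result="$1"; shift
  new_pairs=$(cn_user_pairs "$@")
  old_ifs=$IFS; IFS=,
  for pair in $new_pairs; do
    case ",${result}," in
      *",${pair},"*) ;;                              # already mapped, keep as-is
      *) result="${result:+${result},}${pair}" ;;
    esac
  done
  IFS=$old_ifs
  printf '%s\n' "$result"
}

# Option (a) from open item 5: only the production CNs.
# "be-api-production:api" stands in for whatever BE already has in the map.
append_cn_pairs "be-api-production:api" production
```

Passing several environment names (for example `dev staging production`) yields option (b); either way every CN resolves to the same two users on the one box, which is the implication open item 5 asks BE to confirm.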