Skip to content

feat(0052): port BE's mTLS ClickHouse client#45

Merged
karczuRF merged 3 commits into
developfrom
feat/0052_port-mtls-client
Jun 18, 2026
Merged

feat(0052): port BE's mTLS ClickHouse client#45
karczuRF merged 3 commits into
developfrom
feat/0052_port-mtls-client

Conversation

@karczuRF

Copy link
Copy Markdown
Collaborator

Summary

Port BE's production mTLS ClickHouse client (soroban-block-explorer/crates/db-clickhouse/src/mtls.rs) into prices-clickhouse behind an aws-mtls feature, so both tenants share one proven transport.

  • src/mtls.rs (ported, credited):
    • client_with_mtls(domain, &bundle, db)hyper-rustls connector with with_client_auth_cert, root store = webpki-roots + bundle CA, injected via clickhouse::Client::with_http_client.
    • fetch_bundle_from_extension{cert,key,ca} JSON bundle from the AWS Parameters & Secrets Lambda Extension (localhost:2773), warm-cached.
    • client_from_lambda_env(db) — cold-start one-shot (MTLS_SECRET_NAME + CH_DOMAIN).
    • MtlsBundle has a manual Debug that redacts all PEM.
  • Cargo.tomlaws-mtls feature gating optional deps; default build stays plaintext-only.
  • lib.rs#[cfg(feature = "aws-mtls")] pub mod mtls;.

Why a port (not the closed #44 approach)

Enabling the clickhouse crate's rustls-tls-* features (PR #44, now closed) makes its default client TLS-capable but can't present a client cert. mTLS needs a custom connector via with_http_client — which exists in our clickhouse 0.13 and is generic over the connector (impl<C> HttpClient for Client<C, RequestBody>), so BE's pattern compiles with no version bump.

Verification

  • Default build lean (no rustls/hyper/reqwest); --features aws-mtls builds.
  • clippy clean; 7 unit tests pass (cargo test -p prices-clickhouse --features aws-mtls): PEM-parse shape, MtlsBundle Debug redaction, missing-env.
  • cargo tree → single rustls 0.23.40, hyper-rustls 0.27.9, reqwest 0.12, aws-lc-rs.

Deferred

Live mTLS round-trip against the Hetzner prices DB needs a real cert bundle + endpoint → 0063 (provisioning) / 0051 (live schema-apply, its first consumer).

karczuRF added 2 commits June 17, 2026 15:58
Port crates/db-clickhouse/src/mtls.rs from soroban-block-explorer into
prices-clickhouse behind an aws-mtls feature: builds a hyper-rustls
connector with a client cert (with_client_auth_cert), root store =
webpki-roots + bundle CA, injected via clickhouse::Client::with_http_client;
fetches the {cert,key,ca} bundle from the AWS Parameters & Secrets Lambda
Extension (MTLS_SECRET_NAME/CH_DOMAIN). Manual Debug redacts all PEM.

Works on our clickhouse 0.13 (with_http_client is connector-generic) so
no version bump. Default build stays plaintext-only; mTLS stack is gated.

Supersedes the closed PR #44 (enabling the clickhouse crate's rustls-tls
features could not present a client cert). aws-lc-rs provider, matching BE.

Verified: default + aws-mtls builds, clippy clean, 7 unit tests pass,
single rustls 0.23.40 in the tree. Live round-trip deferred to 0063/0051.
Close the last in-scope acceptance criterion: the prices-clickhouse
README now documents the aws-mtls feature, the MTLS_SECRET_NAME /
CH_DOMAIN / AWS_SESSION_TOKEN contract for client_from_lambda_env, the
off-Lambda client_with_mtls path, and the build-once-clone-per-invocation
pattern that amortises the cross-cloud TLS handshake.
@karczuRF

Copy link
Copy Markdown
Collaborator Author

Code review — high effort (recall-biased)

Clean, near-verbatim port with strong hygiene (PEM-redacting Debug, fail-fast env handling, feature-gated so the default build stays lean). No correctness bugs survived verification. Remaining findings are low-severity polish — neither is a merge blocker.

Findings

1. mtls.rs:205pool_idle_timeout(8s) vs the "amortise the TLS handshake across invocations" claim (low)
The warm-pool benefit only holds for invocations <8s apart. Periodic workers (task 0039) fire minutes apart, so their idle socket is closed after 8s and every scheduled invocation pays the full ~80–130 ms cross-cloud TLS handshake — the amortisation advertised in the doc comment (and the README "build once, reuse" note) never materialises for them. This is correct, safe behaviour (8s < CH's 10s server keepalive avoids half-dead sockets); the perf wording is just overstated for sparse callers. Suggest scoping the "amortised" wording to high-frequency callers, or noting sparse workers reconnect by design.

2. mtls.rs:249 — ambiguous private-key parse error (very low)
parse_private_key returns the same no private key found in PEM for an empty PEM and for a present-but-unparseable key, steering an operator toward "the field is empty" rather than "the key format/content is wrong". Narrow edge (rustls_pemfile handles PKCS#8/PKCS#1/SEC1), but a distinguishing message would save time on the live round-trip (0063).

Verified & dismissed (not reported)

  • reqwest missing a TLS backend (flagged by two angles) — refuted: the only endpoint is http://localhost:2773 (plain HTTP), and Cargo.lock confirms reqwest pulls no native-tls/hyper-tls/rustls. reqwest without a TLS backend builds and serves http:// fine; it only errors on https://. The lean config is correct — mTLS lives on the hyper-rustls ClickHouse path, not the extension fetch.
  • RootCertStore partial-state-on-error — store is a fresh local let mut per call; no shared/retry state to corrupt.
  • let _ = install_default() swallowing the error — the Err case is the already-installed idempotent path; correct.
  • CA-as-client-cert not rejected — cosmetic, matches the BE port.
  • Cross-file: confirmed clickhouse::Client::with_http_client exists on our 0.13 and the hyper-rustls 0.27 builder API matches.
  • Secrecy convention ("never leak private keys") — satisfied: MtlsBundle's manual Debug redacts all PEM and no error path prints key material.

Method: 8 finder angles (3 correctness + reuse/simplification/efficiency/altitude/conventions) → dedup → recall-biased verification. CI green (aws-mtls build + clippy + unit tests).

Two low-severity accuracy fixes from the code review:

- Qualify the build-once-reuse amortisation claim (doc comment +
  README). The ~80-130 ms handshake is only amortised while a pooled
  socket survives; pool_idle_timeout(8s) closes quiet connections, so
  invocations spaced past the idle window re-handshake. State the bound
  instead of implying zero handshakes on sparse traffic.
- Make the private-key parse error actionable: rustls_pemfile's None
  means no recognised key block, so name the supported PKCS#8/PKCS#1/SEC1
  formats and point at the bundle's key field.
@karczuRF

Copy link
Copy Markdown
Collaborator Author

All findings from the review are now addressed in ede44a2:

Finding 1 — amortisation overstatement (mtls.rs client_with_mtls doc + README "Build once, reuse"): both implied cloning gives zero handshakes across invocations. Now they state the real bound — pool_idle_timeout(8s) closes quiet sockets, so back-to-back calls reuse the warm connection but invocations spaced past the idle window re-handshake. Sharing the client stays the documented default (no per-call connector setup, pool warm under load), just not framed as a zero-handshake guarantee.

Finding 2 — ambiguous key-parse error (mtls.rs:249): rustls_pemfile::private_key returning None covers both a missing key block and an unrecognised format, so the old no private key found in PEM was ambiguous. The message now names the supported PKCS#8/PKCS#1/SEC1 headers and points at the bundle's key field.

Both were doc/message wording only — no logic change. Verified: cargo test -p prices-clickhouse --features aws-mtls (7 passed), clippy clean, fmt clean.

@karczuRF karczuRF merged commit d4a9657 into develop Jun 18, 2026
3 checks passed
@karczuRF karczuRF deleted the feat/0052_port-mtls-client branch June 18, 2026 09:47
karczuRF added a commit that referenced this pull request Jun 18, 2026
In-scope work (mTLS port + unit tests + README) merged to develop via
PR #45 (d4a9657). The two open acceptance criteria — live mTLS
round-trip and >=1 downstream consumer — both need the real cert
bundle + reachable Caddy endpoint that 0063 provisions, so 0052 waits
on 0063. No code work remains on 0052 itself.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant