Skip to content

configstore: remove dead/footgun config-store surface (singletons, warehouse + org columns)#822

Merged
benben merged 3 commits into
mainfrom
ben/configstore-prune-dead-fields
Jun 29, 2026
Merged

configstore: remove dead/footgun config-store surface (singletons, warehouse + org columns)#822
benben merged 3 commits into
mainfrom
ben/configstore-prune-dead-fields

Conversation

@benben

@benben benben commented Jun 29, 2026

Copy link
Copy Markdown
Member

Audit-driven removal of config-store schema/fields that are persisted/served but never read to drive behavior (or are outright footguns). No user-visible behavior change. Driven by a full cross-repo audit (control plane + the Crossplane compositions in posthog/charts).

Removed

Cluster-wide singleton config tablesGlobalConfig, DuckLakeConfig, RateLimitConfig, QueryLogConfig: tables, Snapshot fields, seeds, snapshot loads, admin GET/PUT /config/* API, and the /settings dashboard page. Effective config comes from CLI flags/env (server.Config) + the per-org ManagedWarehouse contract.

ManagedWarehouse dead columns (13)warehouse_database_{region,database_name,username}, metadata_store_{engine,region}, worker_identity_service_account_name (worker SA is computed, not read), warehouse_database_state, and the 6 per-component *_status_message columns (provisioner only writes the top-level status_message). Kept warehouse_database endpoint/port + the per-component *_state fields.

Runtime coordination dead columnscp_instances.{pod_uid,boot_id} (already encoded in the instance id), worker_records.pod_uid (never written), org_connection_queue.canceled_at (cancel is a DELETE).

Org dead/footgun columns (4)memory_budget, idle_timeout_s (dead: only copied to the snapshot + shown in admin status); worker_cpu_request, worker_memory_request (footgun: org_router applied the per-org value to the shared worker pool → last-org-wins pod sizing across all orgs). Legit per-org sizing stays via default_worker_* + client GUCs; the deployment-wide K8sConfig.WorkerCPURequest/MemoryRequest (charts/env) is a different field, untouched.

Migration / rollout safety

  • 000004 drops the 4 config tables + 13 warehouse columns + 4 org columns (Down recreates).
  • Runtime-schema dead columns dropped in autoMigrateRuntimeTables (DROP COLUMN IF EXISTS, idempotent, runs before first heartbeat — cp_instances.pod_uid/boot_id were NOT NULL, so a rollout that stopped supplying them would otherwise fail every upsert).

Reconciled with #821

The models-explorer (models_api.go) dropped its 4 config-singleton descriptors + harness/test references.

Verification

go build ./..., go build -tags kubernetes ./controlplane/... ./cmd/..., all test binaries compile (both tag sets), configstore + admin + control-plane unit tests pass. Postgres/e2e in CI.

🤖 Generated with Claude Code

Audit-driven cleanup of config-store surface that is persisted/served but never
read to drive runtime behavior. No user-visible behavior change.

Removed (confirmed dead by a full cross-repo audit incl. the Crossplane
compositions in posthog/charts):

- The four cluster-wide singleton config tables — GlobalConfig, DuckLakeConfig,
  RateLimitConfig, QueryLogConfig — and their Snapshot fields, FirstOrCreate
  seeds, snapshot loads, the admin GET/PUT /config/* API (handlers, routes,
  apiStore methods, gormAPIStore impls), and the /settings dashboard page.
  Nothing reads Snapshot.Global/DuckLake/RateLimit/QueryLog; effective config
  comes from CLI flags/env (server.Config) and the per-org ManagedWarehouse
  contract. (DuckLakeConfig was already marked legacy.)

- ManagedWarehouse dead columns: warehouse_database_{region,database_name,
  username}, metadata_store_{engine,region}, worker_identity_service_account_name
  (the worker SA name is computed, not read from config), warehouse_database_state
  (warehouse_database has no provisioning sub-state), and the six per-component
  *_status_message columns (the provisioner only writes the top-level
  status_message). Kept warehouse_database endpoint/port (curated provision
  response) and the per-component *_state fields that drive readiness.

- Runtime coordination dead columns: cp_instances.pod_uid/boot_id (already
  encoded into the cp instance id), worker_records.pod_uid (never written),
  org_connection_queue.canceled_at (cancel is a hard DELETE, never set) — plus
  the ControlPlaneRuntimeTracker podUID/bootID fields and constructor params.

Migration & rollout safety:
- New goose migration 000004 drops the four config tables and the 13 dead
  managed_warehouses columns (with a best-effort Down that recreates them).
- Runtime tables are AutoMigrate'd in a per-deployment schema, so the dead
  runtime columns are dropped in autoMigrateRuntimeTables (schema-qualified,
  DROP COLUMN IF EXISTS, idempotent). This runs before the first heartbeat —
  critical because cp_instances.pod_uid/boot_id were NOT NULL, so a rollout that
  stopped supplying them would otherwise fail every UpsertControlPlaneInstance.

Also updated: seed SQL (k8s + test fixtures), all affected unit/postgres tests
(including the migrations metadata-parity test, now asserting the config tables
are absent at version 4), and the README note about the former DuckLakeConfig
singleton.

Verified: `go build ./...`, `go build -tags kubernetes ./controlplane/... ./cmd/...`,
all kubernetes-tagged test binaries compile, and the configstore + admin
no-docker unit tests pass.
@github-actions

github-actions Bot commented Jun 29, 2026

Copy link
Copy Markdown

Test Impact Plan

Deterministic summary of how this PR changes tests, CI runners, and coverage-risk signals.

Summary

Area Added Changed Deleted
Test files 0 14 0
E2E/journey files 0 1 0
Workflow files 0 0 0

Signals

  • Test cases: +0 / -0
  • Assertions: +5 / -9
  • Skips or known failures added: 0
  • Workflow continue-on-error added: 0
  • Workflow path filters added: 0
  • Test commands removed from justfile: 0
  • E2E/journey retry lines added: 0

Coverage risk: needs review

Warnings

  • Assertions removed (needs review)
    • 9 removed vs 5 added
  • E2E or journey files changed (needs review)
    • tests/e2e-mw-dev/harness.sh

benben added 2 commits June 29, 2026 12:05
#821 (models explorer UI) merged to main after this branch was cut. Its
models_api.go registered the four config-singleton models this branch deletes,
so the merged tree failed to build (undefined: configstore.GlobalConfig etc).

Merge main in and reconcile:
- Remove the global/ducklake/ratelimit/querylog descriptors from
  models_api.go (and the now-unused modelGroupConfig group).
- Drop them from models_api_test.go (wantKeys + the global-config redaction
  entry) and the harness models_explorer_api sidebar key list.
- Drop the dead Settings nav link + the empty Config sidebar group from
  models.html, and the /settings row from admin/README.md (the /settings page
  and its config API were removed on this branch).

Build (plain + kubernetes), all kubernetes test binaries, and the admin
redaction test pass.
@benben benben merged commit eda4113 into main Jun 29, 2026
27 checks passed
@benben benben deleted the ben/configstore-prune-dead-fields branch June 29, 2026 10:24
@benben benben changed the title configstore: remove dead config singletons + unused model fields configstore: remove dead/footgun config-store surface (singletons, warehouse + org columns) Jun 29, 2026
benben added a commit that referenced this pull request Jun 29, 2026
Removes four duckgres_orgs columns:
- memory_budget, idle_timeout_s — dead: only copied to the snapshot OrgConfig +
  shown in admin OrgStatus; no behavioral reader.
- worker_cpu_request, worker_memory_request — footgun: org_router applied the
  per-org value to the SHARED worker pool (last-org-wins pod sizing across all
  orgs). Per-org sizing stays via default_worker_cpu/memory + client GUCs; the
  deployment-wide K8sConfig.WorkerCPURequest/MemoryRequest (charts/env) is a
  separate, kept field.

Drops the two SetWorkerResources footgun blocks from org_router; new migration
000005 drops the columns (Down re-adds); updates OrgConfig snapshot view, admin
OrgStatus, seeds, and tests.

Split out of #822 (which squash-merged before this landed): same configstore
cleanup, new migration version since 000004 is already merged/applied.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant