feat(webhook): provider-agnostic receiver (push → warm), GitHub phase 1#70

Open
russellromney wants to merge 4 commits into main from feat/webhook-receiver

@russellromney (Owner)

What

A built-in webhook receiver so a provider push auto-enqueues a sync (push → warm), reusing the existing build queue. No CI Action, no glue — the same warm-on-push the managed cloud gives you. Phase 1 = GitHub; structured so GitLab/Gitea are later trait impls.

Implements docs/WEBHOOKS.md (the design doc is the first commit on this branch).

provider push ─▶ POST /webhooks/{provider}
                   │  verify signature (HMAC over the RAW body, constant-time)
                   │  normalize payload → CanonicalEvent
                   ▼
                 enqueue_sync(state, repo, branch, cred)  ──▶ worker ──▶ clonepack
                   └─ the SAME enqueue path /sync uses

Changes

  • webhook module (rust/src/webhook/): WebhookProvider trait (verify over the raw body, parse → CanonicalEvent { kind, repo, ref_, after, default_branch, private }) + WebhookConfig (per-provider secret + optional allowlist).
  • GitHub adapter: X-Hub-Signature-256 HMAC-SHA256 over the raw body, constant-time compare via subtle; X-GitHub-Event routing; parses push / branch-delete / ping.
  • POST /webhooks/{provider} in server.rs, registered under the rate_limited layer but not auth_middleware (the HMAC is the auth). Reads raw bytes before JSON, looks up the ProviderInstance, verifies, parses, dispatches.
  • enqueue_sync(...) factored out of sync_repo_inner and called from both /sync and the webhook — no duplicated build logic. The webhook drops the handle (fire-and-forget; responds 2xx fast).
  • Branch-delete → RefStore::delete_branch (file + S3 + caching impls); never builds.
  • Config: RIPCLONE_WEBHOOK_SECRET_<provider>, StaticBroker credential for private clones (see #55, "queue: carry the per-request upstream token to the cross-process worker"), optional RIPCLONE_WEBHOOK_ALLOWLIST. No secret ⇒ 503, bad signature ⇒ 401.
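
A minimal sketch of the signature comparison step from the GitHub adapter bullet, assuming the expected HMAC-SHA256 digest over the raw body has already been computed elsewhere. The PR uses the `subtle` crate for the comparison; the hand-rolled `ct_eq` below is only a stand-in so the example needs no dependencies, and it is not hardened against compiler optimizations. The names `decode_signature_header` and `signature_matches` are hypothetical.

```rust
/// Decode a `sha256=<hex>` header value (the X-Hub-Signature-256 format)
/// into raw digest bytes. Returns None on a missing prefix, odd length,
/// or any non-hex character.
fn decode_signature_header(value: &str) -> Option<Vec<u8>> {
    let hex = value.strip_prefix("sha256=")?;
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect()
}

/// Stand-in for `subtle`'s constant-time equality (assumption: illustration only).
/// A length mismatch returns early; digest length is public, so this leaks nothing.
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b) {
        diff |= x ^ y; // accumulate differences without branching on data
    }
    diff == 0
}

/// True only when the header is present, well-formed, and equal to the
/// expected digest. A missing or malformed header fails closed.
fn signature_matches(header: Option<&str>, expected_digest: &[u8]) -> bool {
    match header.and_then(decode_signature_header) {
        Some(provided) => ct_eq(&provided, expected_digest),
        None => false,
    }
}
```

Decoding the header to bytes before comparing keeps the comparison independent of hex casing, and the valid-hex-but-wrong-length case simply returns false.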

Resolved open questions (recommended defaults)

  • Allowlist: allow-all + a loud startup log; restrict with RIPCLONE_WEBHOOK_ALLOWLIST.
  • Non-default branches: always warm the default branch; warm others only if already tracked.
  • Multi-instance routing: {provider} is the ProviderInstance id; the secret is keyed per instance id.
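
The first two defaults above can be sketched as a pair of predicates. This is an illustration under assumptions, not the PR's code: `repo_allowed` and `should_warm` are hypothetical names, and treating an empty allowlist as "allow all" is an assumed encoding of the unset `RIPCLONE_WEBHOOK_ALLOWLIST`.

```rust
use std::collections::HashSet;

/// Allow-all when no allowlist is configured (assumption: an empty slice
/// stands for "unset"); otherwise the repo must be listed.
fn repo_allowed(repo: &str, allowlist: &[String]) -> bool {
    allowlist.is_empty() || allowlist.iter().any(|r| r.as_str() == repo)
}

/// Always warm the default branch; warm any other branch only if it is
/// already tracked. An unknown default branch never matches.
fn should_warm(branch: &str, default_branch: Option<&str>, tracked: &HashSet<String>) -> bool {
    default_branch == Some(branch) || tracked.contains(branch)
}
```

With `tracked = {"release"}` and a default of `main`, pushes to `main` and `release` warm, while a push to `feature/x` is ignored.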

Tests

Signature verify (valid / invalid / missing), GitHub parse, enqueue invoked on push, branch-delete cleanup, allowlist gating, no-secret ⇒ 503, tracked vs untracked non-default branch. cargo fmt --check + cargo clippy --all-targets -- -D warnings + the full release test suite (cargo test --release --all-targets --locked) are green.

Follow-ups

GitLab (X-Gitlab-Token) and Gitea/Forgejo (X-Gitea-Signature) adapters — each is one WebhookProvider impl plus a match arm in webhook::provider_for.
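
To illustrate the "one impl plus a match arm" shape, here is a hedged sketch of what a GitLab adapter could look like. The trait signature, the `GitLabSketch` type, and the string-keyed `provider_for` are assumptions made for this example; the real `WebhookProvider` trait in this PR may differ. GitLab sends the shared secret itself in `X-Gitlab-Token`, so there is no HMAC over the body to check.

```rust
/// Assumed, simplified shape of the provider trait (not the PR's definition).
trait WebhookProvider {
    /// Header that carries the signature or token for this provider.
    fn signature_header(&self) -> &'static str;
    /// Check the header value against the secret and the raw body.
    fn verify(&self, secret: &[u8], header_value: Option<&str>, raw_body: &[u8]) -> bool;
}

/// Dependency-free stand-in for a constant-time compare (the PR uses `subtle`).
fn ct_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}

/// Hypothetical GitLab adapter.
struct GitLabSketch;

impl WebhookProvider for GitLabSketch {
    fn signature_header(&self) -> &'static str {
        "X-Gitlab-Token"
    }

    fn verify(&self, secret: &[u8], header_value: Option<&str>, _raw_body: &[u8]) -> bool {
        // Fail closed on a missing header or an empty secret.
        match header_value {
            Some(v) if !secret.is_empty() => ct_eq(v.as_bytes(), secret),
            _ => false,
        }
    }
}

/// Assumed dispatch: one match arm per supported provider kind.
fn provider_for(kind: &str) -> Option<Box<dyn WebhookProvider>> {
    match kind {
        "gitlab" => Some(Box::new(GitLabSketch)),
        _ => None,
    }
}
```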

🤖 Generated with Claude Code

russellromney and others added 4 commits June 26, 2026 15:54
Self-hosters have no automatic warming today — they must call /sync or write a
CI Action. Design a built-in webhook receiver on ripclone-server: verify the
provider signature, normalize the payload, and enqueue a sync on the existing
build queue. Provider-agnostic via a WebhookProvider trait (GitHub first, then
GitLab/Gitea). Documents how this converges with the managed cloud at the build
queue + per-job credential (#55), so self-host gets identical warm-on-push.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Add a built-in webhook endpoint so a provider push auto-enqueues a sync,
reusing the existing build queue. No CI Action, no glue — the same
warm-on-push the managed cloud gives you.

- `webhook` module: `WebhookProvider` trait (`verify` over the RAW body,
  `parse` → `CanonicalEvent`) + `WebhookConfig` (per-provider secret +
  optional allowlist). Structured so GitLab/Gitea are later trait impls.
- GitHub adapter: `X-Hub-Signature-256` HMAC-SHA256 over the raw body
  (constant-time compare via `subtle`), `X-GitHub-Event` routing; parses
  push / branch-delete / ping.
- `POST /webhooks/{provider}` in server.rs, under `rate_limited` but NOT
  `auth_middleware` (the HMAC is the auth). Reads raw bytes before JSON,
  looks up the ProviderInstance, verifies, parses, dispatches.
- Factor `enqueue_sync(state, repo, branch, rev, cred)` out of
  `sync_repo_inner`; both `/sync` and the webhook call it — no duplicated
  build logic. The webhook drops the handle (fire-and-forget warming).
- Branch-delete → `RefStore::delete_branch` (file + S3 + caching impls),
  never builds.
- Config: `RIPCLONE_WEBHOOK_SECRET_<provider>`, StaticBroker credential for
  private clones, optional `RIPCLONE_WEBHOOK_ALLOWLIST`. No secret ⇒ 503.
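
The push / branch-delete / ping routing above can be sketched as a small classifier. This is a sketch under assumptions: the `EventKind` variants and the `classify` function are hypothetical, and the inputs stand for the `X-GitHub-Event` header plus the `ref` and `deleted` fields of a GitHub push payload.

```rust
/// Hypothetical normalized event kinds for this example.
#[derive(Debug, PartialEq)]
enum EventKind {
    Ping,
    Push { branch: String },
    BranchDelete { branch: String },
    Ignored,
}

/// Route on the event header first, then on the ref. Only `refs/heads/*`
/// refs count as branches, so tag pushes and tag deletes are ignored.
fn classify(event_header: &str, git_ref: Option<&str>, deleted: bool) -> EventKind {
    match event_header {
        "ping" => EventKind::Ping,
        "push" => match git_ref.and_then(|r| r.strip_prefix("refs/heads/")) {
            Some(branch) if !branch.is_empty() => {
                let branch = branch.to_string();
                if deleted {
                    EventKind::BranchDelete { branch }
                } else {
                    EventKind::Push { branch }
                }
            }
            _ => EventKind::Ignored,
        },
        _ => EventKind::Ignored,
    }
}
```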

Resolved the doc's open questions with the recommended defaults: allow-all
allowlist + loud startup log; always warm the default branch, other branches
only if already tracked.

Tests: signature verify (valid/invalid/missing), GitHub parse, enqueue
invoked on push, branch-delete cleanup, allowlist gating, no-secret ⇒ 503,
tracked/untracked non-default branch. `fmt` + `clippy -D warnings` + full
release test suite green.

GitLab (`X-Gitlab-Token`) and Gitea (`X-Gitea-Signature`) are follow-ups
behind the same trait.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

From two independent adversarial reviews of the receiver:

Security/correctness:
- Cap webhook body at 25 MiB (MAX_WEBHOOK_BODY_BYTES) instead of the global
  256 MiB. The HMAC can only be checked after the whole body is buffered, so
  an unauthenticated caller must not be able to make the server hold 256 MiB
  before the 401. Oversized → 413.
- Validate the payload-derived branch (validate_git_rev) in both the push and
  delete handlers before it reaches the queue/git. This was already contained
  (storage keys are slugged, git re-validates), but the check makes the trust
  boundary explicit and skips a doomed enqueue.
- Document that branch-delete cleanup intentionally skips the push allowlist
  (a non-allowlisted repo was never warmed, so delete is a safe no-op).
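
A minimal sketch of the body cap, assuming a plain `std::io::Read` source rather than the server's actual request body type. `MAX_WEBHOOK_BODY_BYTES` and the status codes come from this PR; the `Reject` enum and `read_capped` helper are hypothetical names for illustration.

```rust
use std::io::{self, Read};

/// 25 MiB cap on the unauthenticated webhook body.
const MAX_WEBHOOK_BODY_BYTES: usize = 25 * 1024 * 1024;

/// Hypothetical rejection reasons and the status each maps to.
#[derive(Debug, PartialEq)]
enum Reject {
    NoSecret,
    BodyTooLarge,
    BadSignature,
}

impl Reject {
    fn status(&self) -> u16 {
        match self {
            Reject::NoSecret => 503,
            Reject::BodyTooLarge => 413,
            Reject::BadSignature => 401,
        }
    }
}

/// Buffer at most `max` bytes. Reading one extra byte distinguishes
/// "exactly at the cap" from "over the cap" without holding the whole
/// oversized body in memory.
fn read_capped<R: Read>(body: R, max: usize) -> io::Result<Result<Vec<u8>, Reject>> {
    let mut buf = Vec::new();
    let mut limited = body.take(max as u64 + 1);
    limited.read_to_end(&mut buf)?;
    if buf.len() > max {
        Ok(Err(Reject::BodyTooLarge))
    } else {
        Ok(Ok(buf))
    }
}
```

The cap matters because the HMAC can only be checked once the body is fully buffered, so the size limit is the only defense that applies before authentication.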

Testability + coverage:
- Extract `parse_secret` / `parse_allowlist` from `from_env` and unit-test
  them: empty secret ⇒ no secret (fail closed, no empty HMAC key), allowlist
  trimming/empty-dropping.
- GitHub verify: valid-hex-but-wrong-length signature (the exact ct_eq
  length-mismatch branch the comment claims is safe) + correct-length wrong
  bytes.
- Handler tamper test: sign body A with the right secret, deliver body B ⇒ 401
  (proves verification is over the raw received bytes).
- Coalescing: two identical signed pushes ⇒ exactly one queued build.
- Tag delete ignored; hostile branch name rejected.
- CachingRefStore::delete_branch evicts the cache (not just the file).
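
The `parse_secret` / `parse_allowlist` behavior described in the list above could look roughly like this. The function names come from this commit; the signatures, the comma separator, and the choice not to trim the secret are assumptions for illustration.

```rust
/// Empty or missing secret means "no secret configured", so the handler
/// fails closed instead of keying an HMAC with an empty key.
fn parse_secret(raw: Option<&str>) -> Option<Vec<u8>> {
    match raw {
        Some(s) if !s.is_empty() => Some(s.as_bytes().to_vec()),
        _ => None,
    }
}

/// Split on commas (assumed separator), trim each entry, drop empties.
fn parse_allowlist(raw: Option<&str>) -> Vec<String> {
    raw.unwrap_or("")
        .split(',')
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(String::from)
        .collect()
}
```

Keeping these as pure functions over `Option<&str>` is what makes them unit-testable without touching the process environment.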

fmt + clippy -D warnings + full release suite green (lib 202 passed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…mirror

Address the two items previously left as low-severity notes:

- Branch-delete now applies the same repo allowlist as push, so the receiver
  acts only on in-scope repos symmetrically (an out-of-scope delete is ignored
  rather than silently mutating refs).
- Default-branch policy no longer depends solely on the payload: when a
  provider omits `repository.default_branch`, fall back to the local mirror's
  HEAD (populated by any prior sync), exactly as sync_repo_inner resolves HEAD.
  GitHub always sends it; this keeps the policy correct for future
  GitLab/Gitea adapters. A brand-new repo with neither stays untracked until
  first warmed (fail-safe).
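
The fallback order described above can be sketched as a single function. `resolve_default_branch` is a hypothetical name, and the sketch assumes both inputs are already short branch names (for example `main`, not `refs/heads/main`).

```rust
/// Prefer the payload's default branch; fall back to the local mirror's
/// HEAD; return None when neither is known, so the repo stays untracked
/// until it is first warmed.
fn resolve_default_branch(
    payload_default: Option<&str>,
    mirror_head: Option<&str>,
) -> Option<String> {
    payload_default
        .filter(|b| !b.is_empty())
        .or(mirror_head.filter(|b| !b.is_empty()))
        .map(String::from)
}
```

An empty string from either source is treated as missing, so a blank payload field still falls through to the mirror.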

Tests: default branch resolved from the mirror when the payload omits it;
no-default + no-mirror + untracked stays ignored; delete outside the allowlist
leaves refs untouched. fmt + clippy -D warnings + full release suite green
(lib 205 passed, 51 test binaries, 0 failures).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>