feat: add fleet e2e orchestrator and rename test workflows#208

Merged: joshua-temple merged 2 commits into main from feat/fleet-e2e-orchestrator on Jun 17, 2026

joshua-temple (Collaborator) commented Jun 17, 2026

What

Adds a maintainer CI harness, fleet-e2e.yaml ("Fleet E2E (live GitHub)"),
that revalidates the eight downstream cascade-example-* repos on real GitHub
as the release-candidate gate. Each suite runs in its own repo context (own
token, own main, own manifest); a green Fleet run means this cascade version
was validated across the fleet.

How the E2E gate works

  • Triggers on workflow_run completion of "Integration (act + gitea)" plus
    manual workflow_dispatch. The workflow_run path makes the Integration
    dependency native: Fleet fans out only after Integration is green, with no
    runner held open to poll.
  • A job-level guard lets the fleet fan out only for a manual dispatch, or for
    a green Integration run that was triggered by an rc-tag push
    (startsWith(head_branch, 'v') and contains(head_branch, '-rc.')). This
    filters out merge_group and non-rc completions.
  • Sequencing: primary -> dependents (artifact-a, artifact-b; needs: primary)
    -> independents (4env, 3env, 2env, single-env, release-only; run in
    parallel). An aggregate job is the rc fleet gate: it renders a verdict only
    when the fleet actually fanned out (needs.resolve.result == 'success'), so
    filtered-out completions skip it as a clean no-op instead of failing it.
    When the fleet does fan out, the job fails if any stage failed and writes a
    per-repo pass/fail table to the step summary.
  • Cross-repo dispatch-recover-watch is factored into a dispatch-suite composite
    action (matches the setup-cli pattern): dispatch via gh workflow run,
    recover the run id by listing the target's runs created at/after a captured
    timestamp, then gh run watch --exit-status. Auth is CASCADE_STATE_TOKEN
    (GITHUB_TOKEN cannot dispatch cross-repo). Least-privilege permissions: are
    set per job.
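
The dispatch-recover-watch sequence described above can be sketched roughly as follows. This is an illustrative sketch, not the contents of the dispatch-suite action: the function names, the `main` ref, the retry budget, and the use of `GH_TOKEN` to carry the CASCADE_STATE_TOKEN secret are all assumptions. It also assumes the installed `gh` supports `gh run list --created` with a `>=` date qualifier.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of dispatch -> recover -> watch for one downstream suite.
# Assumes GH_TOKEN is exported from a token that may dispatch cross-repo.
set -eu

# Print the current UTC time in ISO 8601 form, used as the "created at/after" bound.
capture_timestamp() {
  date -u +%Y-%m-%dT%H:%M:%SZ
}

# dispatch_suite REPO WORKFLOW
#   REPO     target repository as OWNER/NAME (hypothetical value supplied by the caller)
#   WORKFLOW workflow file name in the target repository (hypothetical)
dispatch_suite() {
  local repo="$1" workflow="$2" since run_id=""

  # Capture the timestamp BEFORE dispatching so the new run cannot predate it.
  since="$(capture_timestamp)"

  # 1. Dispatch the suite in the target repository.
  gh workflow run "$workflow" --repo "$repo" --ref main

  # 2. Recover the run id by listing runs created at/after the timestamp.
  #    The run may take a few seconds to appear, so retry a bounded number of times.
  for _ in 1 2 3 4 5 6; do
    run_id="$(gh run list --repo "$repo" --workflow "$workflow" \
      --event workflow_dispatch --created ">=${since}" \
      --limit 1 --json databaseId --jq '.[0].databaseId // empty')"
    if [ -n "$run_id" ]; then
      break
    fi
    sleep 5
  done

  if [ -z "$run_id" ]; then
    echo "no run found in ${repo} created at/after ${since}" >&2
    return 1
  fi

  # 3. Block until the run finishes; --exit-status makes a failed run fail this step.
  gh run watch "$run_id" --repo "$repo" --exit-status
}
```

Sourcing the block only defines the functions. A call would look like `dispatch_suite OWNER/cascade-example-primary suite.yaml`, where both arguments are placeholders. Capturing the timestamp before the dispatch is what keeps the recovery step from picking up an older run of the same workflow.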

Renames + badges + cron

  • validate.yaml: rename to "Tests & Lint"; add push: tags + workflow_dispatch
    (keeps workflow_call for pr.yaml). Gives it standalone runs so its badge
    renders.
  • e2e.yaml: rename to "Integration (act + gitea)"; remove the schedule cron
    (keeps push: tags, merge_group, workflow_dispatch). Fleet's workflow_run
    references this name.
  • README: third badge row grouping the three test workflows under their new names.

Caveats (honest)

  • The version under test is computed and logged in the step summary, but it
    is currently inert: the suites do not yet accept a cascade_version input,
    so the orchestrator dispatches them without it (an extra input would
    error). Passing it through follows once the suites accept the input.
  • The fleet is currently paused (suites unpushed, token not provisioned), so
    there is no live run yet. This PR is build + lint only.
  • The rc gate is a documented signal: rc -> release promotion should consume the
    latest fleet conclusion for the tag before promoting.
  • The head_branch rc-tag guard is the primary path. As a documented fallback,
    if head_branch is ever empty for a tag-triggered source run, the
    version-compute step resolves the rc tag from head_sha, selecting the
    highest rc by version sort so that resolution is deterministic.
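
As a rough illustration of the guard and its fallback, the sketch below mirrors the rc-tag predicate in shell and picks the highest rc tag by version sort. The function names are hypothetical, `sort -V` is assumed to be available (GNU coreutils or a recent BSD sort), and the real workflow expresses the guard as a GitHub Actions expression, whose startsWith and contains compare case-insensitively, unlike this case-sensitive sketch.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the rc-tag guard and the head_sha fallback.
set -eu

# is_rc_ref REF: succeed when REF starts with "v" and contains "-rc.",
# mirroring startsWith(head_branch, 'v') && contains(head_branch, '-rc.').
is_rc_ref() {
  case "$1" in
    v*-rc.*) return 0 ;;
    *) return 1 ;;
  esac
}

# highest_rc: read tag names on stdin, keep rc tags, print the highest by version sort.
# Version sort orders rc.10 after rc.9, which a plain lexical sort would not.
highest_rc() {
  grep -E '^v.*-rc\.' | sort -V | tail -n 1
}

# resolve_rc_tag HEAD_BRANCH HEAD_SHA
# Primary path: use HEAD_BRANCH when it is a non-empty rc tag name.
# Fallback: when HEAD_BRANCH is empty, look up the tags pointing at HEAD_SHA.
resolve_rc_tag() {
  local head_branch="$1" head_sha="$2"
  if [ -n "$head_branch" ]; then
    if is_rc_ref "$head_branch"; then
      printf '%s\n' "$head_branch"
    fi
    return 0
  fi
  git tag --points-at "$head_sha" | highest_rc
}
```

Because a single commit can carry several rc tags, sorting before taking the last entry is what makes the fallback deterministic: the same set of tags always resolves to the same one, regardless of the order git lists them in.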

Verification

  • actionlint (1.7.12) clean on all new/changed workflows; every third-party
    action SHA-pinned (matches existing repo style); composite action bash is
    shellcheck-clean.
  • go build ./... green. go test ./... green except the pre-existing,
    environment-dependent TestNormalizeWorkflowPath_ActionlintClean failure
    (fails identically on clean main; unrelated to this YAML-only change).

Add fleet-e2e.yaml, a maintainer CI harness that fans out to the eight
cascade-example repos on live GitHub as the release-candidate gate. It
triggers on workflow_run completion of the Integration workflow (native
e2e dependency) plus manual workflow_dispatch, guards on a green rc-tag
push, and sequences primary -> dependents -> independents with an
aggregate fan-in gate.

Factor the cross-repo dispatch-recover-watch logic into a
dispatch-suite composite action, matching the setup-cli pattern.

Rename Validate to Tests & Lint and add push:tags + workflow_dispatch so
it has standalone runs to badge while keeping workflow_call for pr.yaml.
Rename E2E to Integration (act + gitea) and drop the schedule cron. Add a
third README badge row grouping the three test workflows.

Signed-off-by: Joshua Temple <joshua.temple@stablekernel.com>

joshua-temple force-pushed the feat/fleet-e2e-orchestrator branch from 80b6cbd to 5978aa9 on June 17, 2026 17:16

Gate the aggregate job on resolve succeeding so merge_group and non-rc tag completions skip the verdict instead of failing it. Make the head_sha rc-tag fallback deterministic with a version sort.

Signed-off-by: Joshua Temple <joshua.temple@stablekernel.com>

joshua-temple merged commit 48927c8 into main on Jun 17, 2026
7 checks passed
joshua-temple deleted the feat/fleet-e2e-orchestrator branch on June 17, 2026 18:36