fix: scope external-update concurrency per component #210

Merged: joshua-temple merged 1 commit into main from fix/external-update-per-component-concurrency on Jun 17, 2026
@joshua-temple

Collaborator

Problem

The generated external-update workflow shared orchestrate's ref-scoped concurrency group (orchestrate-${{ github.ref }}). When two distinct upstream components notify the same downstream repo on the same ref, both runs landed in one group. GitHub keeps only the latest pending run per concurrency group and cancels older pending ones even with cancel-in-progress: false, so one component's state write was silently dropped.
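The cancellation described above can be reproduced with a small model. This is a minimal sketch, not code from this repository: it assumes only GitHub's documented rule that each concurrency group holds at most one running and one pending run, and that a newly queued run replaces (cancels) the existing pending one. The `run`, `slot`, `enqueue`, and `simulate` names and the component names `api`, `web`, and `worker` are hypothetical.

```go
package main

import "fmt"

// run is a hypothetical stand-in for one workflow run.
type run struct {
	component string
	group     string
}

// slot models one concurrency group: at most one running and one pending run.
type slot struct {
	running *run
	pending *run
}

// enqueue queues r with cancel-in-progress: false semantics and returns the
// pending run it displaced, or nil if nothing was cancelled.
func enqueue(groups map[string]*slot, r *run) *run {
	s, ok := groups[r.group]
	if !ok {
		s = &slot{}
		groups[r.group] = s
	}
	if s.running == nil {
		s.running = r // group is idle, so the run starts immediately
		return nil
	}
	cancelled := s.pending // an older pending run loses its place
	s.pending = r
	return cancelled
}

// simulate queues one run per component back to back (no run finishes in
// between) and returns the components whose runs were cancelled.
func simulate(groupFor func(component, ref string) string, components []string, ref string) []string {
	groups := map[string]*slot{}
	var cancelled []string
	for _, c := range components {
		r := &run{component: c, group: groupFor(c, ref)}
		if victim := enqueue(groups, r); victim != nil {
			cancelled = append(cancelled, victim.component)
		}
	}
	return cancelled
}

// sharedGroup mirrors the old ref-scoped key; perComponentGroup mirrors the new one.
func sharedGroup(_, ref string) string { return "orchestrate-" + ref }

func perComponentGroup(component, ref string) string {
	return "cascade-external-" + component + "-" + ref
}

func main() {
	burst := []string{"api", "web", "worker"} // hypothetical upstream components
	ref := "refs/heads/main"
	fmt.Println("shared group cancels:", simulate(sharedGroup, burst, ref))
	fmt.Println("per-component cancels:", simulate(perComponentGroup, burst, ref))
}
```

With the shared key, the second run is displaced from the pending slot by the third; with the per-component key, every run lands in its own group and nothing is cancelled.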

Live evidence (cascade-example-primary): a 4-run burst showed one CANCELLED run, which is exactly this shared-group cancellation. The two FAILED runs in the same window are unrelated to this bug: they failed earlier on bad dispatch inputs (external deploy 'shared'/'widget-b' not found in config), which is a data mismatch in the example repo.

Fix

Scope the default external-update concurrency group per component and per ref: cascade-external-${{ inputs.deploy_name }}-${{ github.ref }}.

  • Distinct components resolve to distinct groups, so both run to completion.
  • A component still serializes against itself (latest pending wins for that slot).
  • Cross-component and external-vs-orchestrate contention on the shared manifest file is resolved by commitWithApplicationRetry (5 attempts, fetch/reset/re-apply on non-fast-forward), which is its purpose.
  • An explicit concurrency.group override is still forwarded verbatim.
  • inputs.deploy_name is a workflow expression in the key, not shell text, so it carries no injection risk.
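The default-versus-override rule in the list above could be sketched as follows. This is an illustration under assumptions, not the generator's actual code: the function names, the `override` parameter, and the rendered `cancel-in-progress: false` line are hypothetical, and YAML quoting of an override value is ignored here.

```go
package main

import "fmt"

// defaultExternalUpdateGroup is the per-component, per-ref key from this PR.
// The ${{ ... }} parts are left for GitHub Actions to evaluate at run time;
// the generator never substitutes them itself.
const defaultExternalUpdateGroup = "cascade-external-${{ inputs.deploy_name }}-${{ github.ref }}"

// externalUpdateGroup returns the concurrency group to emit. A non-empty
// override (hypothetical parameter) is forwarded verbatim.
func externalUpdateGroup(override string) string {
	if override != "" {
		return override
	}
	return defaultExternalUpdateGroup
}

// renderConcurrency renders a concurrency block for the generated workflow.
// The cancel-in-progress line is an assumption about the emitted YAML.
func renderConcurrency(override string) string {
	return fmt.Sprintf("concurrency:\n  group: %s\n  cancel-in-progress: false\n",
		externalUpdateGroup(override))
}

func main() {
	fmt.Print(renderConcurrency(""))                         // default, per-component key
	fmt.Print(renderConcurrency("custom-${{ github.ref }}")) // explicit override, unchanged
}
```

Because the component name enters only as a workflow expression inside the key, the generated group string is identical for every component and is resolved per run by GitHub.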

The state-token checkout fix and the stderr-capture behavior are unchanged.
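For readers unfamiliar with the retry that now absorbs cross-writer contention, the loop described in the Fix list (5 attempts, fetch/reset/re-apply on non-fast-forward) has roughly the following shape. This is a hedged sketch only: `commitWithRetrySketch`, the `repo` interface, `errNonFastForward`, and `fakeRepo` are hypothetical names invented for illustration, and the real `commitWithApplicationRetry` may differ in signature and detail.

```go
package main

import (
	"errors"
	"fmt"
)

// errNonFastForward is a hypothetical sentinel for a rejected push.
var errNonFastForward = errors.New("non-fast-forward")

// repo is a hypothetical abstraction over the git operations involved.
type repo interface {
	Apply() error         // apply this writer's manifest change and commit it
	Push() error          // returns errNonFastForward if the remote moved
	FetchAndReset() error // fetch the remote tip and reset the local branch to it
}

const maxAttempts = 5

// commitWithRetrySketch applies and pushes, and on a non-fast-forward
// rejection fetches, resets, and re-applies, up to maxAttempts times.
func commitWithRetrySketch(r repo) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		if err := r.Apply(); err != nil {
			return fmt.Errorf("apply (attempt %d): %w", attempt, err)
		}
		err := r.Push()
		if err == nil {
			return nil
		}
		if !errors.Is(err, errNonFastForward) {
			return fmt.Errorf("push (attempt %d): %w", attempt, err)
		}
		if attempt < maxAttempts {
			// Another writer won the race: start again from the new tip.
			if err := r.FetchAndReset(); err != nil {
				return fmt.Errorf("fetch/reset (attempt %d): %w", attempt, err)
			}
		}
	}
	return fmt.Errorf("gave up after %d attempts: %w", maxAttempts, errNonFastForward)
}

// fakeRepo rejects the first `rejections` pushes, simulating concurrent writers.
type fakeRepo struct {
	rejections int
	pushes     int
	resets     int
}

func (f *fakeRepo) Apply() error         { return nil }
func (f *fakeRepo) FetchAndReset() error { f.resets++; return nil }
func (f *fakeRepo) Push() error {
	f.pushes++
	if f.rejections > 0 {
		f.rejections--
		return errNonFastForward
	}
	return nil
}

func main() {
	contended := &fakeRepo{rejections: 2}
	fmt.Println(commitWithRetrySketch(contended), contended.pushes)

	saturated := &fakeRepo{rejections: 10}
	fmt.Println(commitWithRetrySketch(saturated), saturated.pushes)
}
```

The design point is that each attempt re-applies the change on top of the freshly fetched tip, so two components writing the shared manifest both land their updates instead of one overwriting the other.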

Verification

  • TDD: updated the two tests that previously asserted the shared group; added a test that distinct components yield a per-component (non-shared) group. Red before, green after.
  • go build ./... && go test ./...: 1415 passed across 23 packages.
  • golangci-lint run ./...: no issues.
  • e2e module builds and vets clean.
  • The Gitea e2e harness cannot reproduce GitHub's pending-run cancellation, so verification relies on the generator assertion plus a live fleet re-run.

The external-update workflow shared orchestrate's ref-scoped concurrency
group (orchestrate-${{ github.ref }}). When two upstream components notify
the same downstream repo on the same ref, they landed in one group. GitHub
keeps only the latest pending run per group and cancels the older pending
ones even with cancel-in-progress: false, so one component's state write was
silently dropped.

Scope the default group per component and per ref
(cascade-external-${{ inputs.deploy_name }}-${{ github.ref }}). Distinct
components now run concurrently; a component still serializes against itself.
Cross-writer manifest contention is resolved by commitWithApplicationRetry's
fetch/reset/re-apply on a non-fast-forward push. Explicit concurrency.group
overrides are still forwarded verbatim, and the state-token checkout and
stderr-capture behavior are unchanged.

Signed-off-by: Joshua Temple <joshua.temple@stablekernel.com>
@joshua-temple merged commit 0b20ee0 into main on Jun 17, 2026
7 checks passed
@joshua-temple deleted the fix/external-update-per-component-concurrency branch on June 17, 2026 18:27