Skip to content

Contract-test MockGhClient against the GithubkitClient sweep surface #423

@hadamrd

Description

@hadamrd

Problem

The branch, worktree, and epic convergence sweeps (the operational-convergence GC that reclaims operational exhaust) assert their behavior exclusively against hand-rolled fakes and MockGhClient (src/forge_loop/gh_client.py:1348), never against the real GithubkitClient (src/forge_loop/gh_client.py:427). Nothing pins the fake's method surface to the real client's.

Concrete failure: rename GithubkitClient.delete_branch(owner, repo, branch) to delete_branch(owner, repo, ref), or drop a method, and MockGhClient keeps the stale signature. Every sweep test in tests/test_branch_sweep.py, tests/test_worktree_sweep.py, and tests/test_epic_sweep.py stays green against the stale fake while production GC raises TypeError/AttributeError at runtime — a wrong-but-green failure on the exact path that reclaims operational exhaust, and the path an autonomous patch is least likely to exercise live.

The fix is a single contract test that uses inspect.signature to assert MockGhClient exposes each sweep-relied method with the same name and parameter list as GithubkitClient. No production code change.

The sweep-relied surface (verified present on both classes today):
get_issue_state, delete_branch, issues_by_label, sub_issues, close_issue, add_comment.

Acceptance criteria

  • A new contract test asserts that for each method in the sweep surface (get_issue_state, delete_branch, issues_by_label, sub_issues, close_issue, add_comment), MockGhClient exposes an attribute of the same name that is callable.
  • For each such method, the test compares inspect.signature(GithubkitClient.<m>) against inspect.signature(MockGhClient.<m>) and FAILS on any divergence in parameter names, order, kind (positional/keyword-only), or defaults. Return-annotation differences need not fail (Mock may omit them).
  • The test is data-driven over the method-name list (parametrized or looped), so adding a name to the list is the only edit needed to extend coverage — no copy-pasted assertion blocks.
  • The sweep-surface list is defined once as a module-level constant in the test (a single source of truth a reviewer can read in five seconds), not inlined per assertion.
  • The test PASSES against the current aligned GithubkitClient/MockGhClient pair with no production code change.
  • An adversarial proof: a test (or documented xfail/monkeypatched case) demonstrates the contract test FAILS when a method is renamed/removed or its parameter list is mutated on one side — proving the assertion actually bites and isn't vacuously true.
  • self is excluded from the compared signature (compare bound-method-equivalent parameter lists), so an instance-vs-class lookup mismatch doesn't produce false failures.

Test matrix

Unit (primary deliverable):

  • Parametrized contract test over the 6 sweep-surface method names: name exists on MockGhClient AND signature matches GithubkitClient. (one PASS case per method)
  • Missing-method case: assert the contract logic raises/fails when a name in the list is absent from MockGhClient (simulate via a throwaway stub class or monkeypatch.delattr on a copy — do NOT mutate the real MockGhClient).
  • Divergent-signature case (adversarial / sad path): build a stub whose method has a renamed/extra/removed param and assert the comparison helper reports a mismatch. This guards against the contract test passing vacuously.

Integration:

  • None required — this is a pure introspection test. Do not stand up GitHub clients or network.

E2E:

  • None. Explicitly out of scope.

Out of scope

  • Do NOT change any production signature on GithubkitClient or MockGhClient. If a real divergence is discovered, STOP and report it in the issue/PR rather than silently "fixing" the fake — the operator needs to see the drift.
  • Do NOT expand the checked surface beyond the six listed methods. Other client methods (PR ops, merge ops, token sourcing) are not what the sweeps depend on; widening scope here is bloat.
  • Do NOT add a runtime/CI guard, decorator, metaclass, or Protocol conformance machinery to production code. A test is sufficient.
  • Do NOT refactor MockGhClient or GithubkitClient, reorder methods, or touch the GhClient Protocol (line 168).
  • Do NOT add return-type-annotation equality to the failing conditions (the mock legitimately may omit annotations).

File pointers

  • src/forge_loop/gh_client.pyread only. GhClient(Protocol) at line 168, GithubkitClient at line 427, MockGhClient at line 1348. Source of the signatures under contract.
  • tests/test_gh_client.pypreferred home for the new contract test (co-locate with existing client tests). Alternatively a new tests/test_gh_client_contract.py if you prefer isolation — either is acceptable.
  • tests/conftest.py — check for existing fixtures/import conventions before adding new ones.

Original report

The branch, worktree, and epic sweeps (the operational-convergence GC) all assert their behavior against hand-rolled fakes and MockGhClient, never against the real GithubkitClient. If a real-client method signature drifts (renamed param, dropped method) without the fake being updated, every sweep test stays green against a stale fake while production GC silently breaks — a wrong-but-green failure on the exact path that reclaims operational exhaust.

Mechanism (one): add a single contract test that, via inspect.signature, asserts MockGhClient exposes each method the sweeps depend on — get_issue_state, delete_branch, issues_by_label, sub_issues, close_issue, add_comment — with the same name and parameter list as GithubkitClient. No production code change.

Falsifiable AC: the test FAILS if any of those sweep-relied methods present on GithubkitClient is missing from MockGhClient or has a divergent parameter list, and PASSES for the current aligned pair.

Estimate: ~80 net LOC.

Customer story

As a maintainer deciding whether an autonomous patch is semantically trustworthy, I need the fake GhClient the convergence sweeps are tested against to stay aligned with the real client, so a signature drift can't let a broken branch/worktree/epic sweep pass CI against a stale adapter.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions