Collect a worker's changed-file set from its worktree git diff

Parent: #440 · Epic: "Enforce leased filesystem write-roots against the worker's actual diff"

## Problem

The lease audit needs the worker's **actual** changed-file set — ground truth of what hit disk — but today there is no collector that reads it from git. The downstream check (`write_root_violations`, a sibling issue under #440) has nothing to feed it, so we'd be forced to infer touched files from the tool log, which an untrusted worker can lie about or bypass (e.g. `Bash(*)` writes that never appear as Write/Edit tool calls).

Concrete example: a worker leased `write_roots=("/tmp/forge-x/wt-loop-200",)` runs `git mv` plus a stray `echo > /etc/escape` via Bash. The tool log shows only the `git mv`. We need a collector that asks git directly: "what changed in this worktree?" — tracked modifications **and** untracked files — and returns them as absolute normalized paths the audit can compare against the lease roots.

This issue delivers ONLY that collector. It is a thin, pure function with an injected git runner (same pattern as `worktree_sweep.sweep`'s `remove` callable — no global `subprocess` import), so it is fully testable with a fake runner and never touches the real filesystem in unit tests.

## Acceptance criteria

- [ ] A function `worker_changed_paths(run_git, worktree_path, *, base_ref) -> tuple[str, ...]` exists in the sandbox package.
- [ ] It invokes the **injected** `run_git` callable twice: once for tracked modifications (`git diff --name-only <base_ref>`) and once for untracked-but-not-ignored files (`git ls-files --others --exclude-standard`). It does NOT import or call `subprocess` directly.
- [ ] Each returned name is resolved against `worktree_path` and normalized to an **absolute** path (resolve `..`/symlink-style normalization via `os.path.normpath`/`abspath`, consistent with `worktree_sweep._norm`).
- [ ] The result is **deduplicated** (a file that is both diffed and listed appears once) and **order-stable** (deterministic ordering — e.g. first-seen insertion order — so tests can assert exact equality).
- [ ] Primary criterion: given a fake `run_git` returning a known set of modified + untracked names, the collector returns each name resolved to an absolute normalized path under `worktree_path`, deduplicated and order-stable.
- [ ] Best-effort / non-crashing: if `run_git` raises (git missing, bad ref, non-zero exit surfaced as an exception), the collector swallows it and returns `()` — it MUST NOT propagate an exception that could abort the tick (mirror the `# noqa: BLE001 — best-effort, never crash the tick` convention from `worktree_sweep.sweep`).
- [ ] Empty git output (no changes) yields `()`, not a tuple containing the empty string.
- [ ] New code passes all four CI gates (`task ci`): ruff format, ruff check, mypy (clean for new code), pytest with coverage floor.

## Test matrix

Unit (in `tests/test_sandbox_changed_paths.py`, fake `run_git` — no real git, no real FS):
- **Happy path / primary AC**: fake returns `{"src/a.py", "src/b.py"}` from diff and `{"new.txt"}` from ls-files → result is the three names each resolved absolute+normalized under `worktree_path`, in stable order.
- **Dedup**: a name appears in BOTH the diff and the ls-files output → appears exactly once in the result.
- **Normalization**: a returned name like `./src/../src/a.py` or a nested `sub/./x` resolves to the same canonical absolute path; trailing-slash and `.`/`..` segments collapse.
- **Empty output**: fake returns empty/whitespace for both commands → result is `()` (no `""` element).
- **Adversarial / sad path (required)**: fake `run_git` raises `RuntimeError`/`CalledProcessError` on the diff call → collector returns `()` and does NOT raise. Add a variant where the SECOND call (ls-files) raises after the first succeeded — still no exception escapes.
- **Argument assertion**: capture the argv passed to the fake runner and assert `base_ref` is threaded into the `git diff --name-only <base_ref>` invocation and that `--others --exclude-standard` is present on the ls-files invocation.

Integration / e2e: out of scope for this issue (the collector is pure with an injected runner). Wiring it into a real tick against a real worktree, and the `write_root_violations` comparison, are separate issues under #440.

## Out of scope

- Do NOT implement `write_root_violations` or any lease-comparison/enforcement logic — this issue only **collects** paths.
- Do NOT wire the collector into the runner tick, `dispatch.py`, `tick_checks.py`, or any lease watchdog. No call sites yet.
- Do NOT import or call `subprocess` in the collector. The runner is injected, exactly like `sweep`'s `remove`.
- Do NOT add config, settings, or events for this. No `Settings` fields, no `@register_event`.
- Do NOT handle staged-vs-unstaged distinctions, rename detection (`--find-renames`), or `.gitignore` policy tuning beyond the two documented commands. Keep it ~90 net LOC incl. tests.
- Do NOT change the real git adapter / `Container.git` surface. If a concrete production caller eventually needs a real runner, that adapter wiring is a follow-up.

## File pointers

- **New module**: `src/forge_loop/sandbox/changed_paths.py` (candidate path — the `sandbox/` package already houses `policy.py` with `FilesystemScope.write_roots`, the natural home for the leased-write-root machinery this feeds). If you prefer to co-locate with the future consumer, a single `sandbox/write_roots.py` is acceptable — confirm with the epic's other open issues under #440 first.
- **New test**: `tests/test_sandbox_changed_paths.py`.
- **Pattern to copy**: `src/forge_loop/worktree_sweep.py` — `sweep(remove, ...)` for the injected-callable shape, `_norm()` for path normalization, and the `# noqa: BLE001 — best-effort GC, never crash the tick` swallow-and-continue style. Tests in `tests/test_worktree_sweep.py` show the fake-callable testing convention to mirror (`lambda ...: ...` injected, argv captured in a list).
- **Downstream contract**: `src/forge_loop/sandbox/policy.py` (`FilesystemScope.write_roots: tuple[str, ...]`) — your output feeds a comparison against these roots, so returning absolute normalized paths keeps the units consistent.

## Worker note

The `run_git` signature is yours to define (likely `Callable[[list[str]], str]` returning stdout, raising on failure) — pick the shape that mirrors how a thin git adapter would expose it, and document it in the docstring. Don't over-engineer; the fake runner in tests defines the contract.

## Original report

ONE mechanism: a thin collector `worker_changed_paths(run_git, worktree_path, *, base_ref) -> tuple[str, ...]` that returns the absolute, normalized set of paths the worker touched in its worktree — tracked modifications plus untracked files — by invoking an INJECTED git runner (no global subprocess, same injection pattern as worktree_sweep.sweep's `remove` callable). Combines `git diff --name-only <base_ref>` with `git ls-files --others --exclude-standard`, dedupes, and resolves each name against worktree_path so the output is feedable directly into write_root_violations.

Primary acceptance criterion (one sentence, test-provable): given a fake git runner returning a known set of modified + untracked names, the collector returns each name resolved to an absolute normalized path under worktree_path, deduplicated and order-stable. Best-effort and non-crashing: a git failure surfaces as an empty/explicit result, never an exception that aborts the tick. ~90 net LOC incl. tests with a fake runner.

### Customer story

An operator dispatching workers on a real repository without trusting their context or code needs the worker's ACTUAL changed-file set captured from git — not inferred from the tool log — so the lease audit runs against ground truth of what hit disk.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Collect a worker's changed-file set from its worktree git diff #442

Problem

Acceptance criteria

Test matrix

Out of scope

File pointers

Worker note

Original report

Customer story

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Collect a worker's changed-file set from its worktree git diff #442

Description

Problem

Acceptance criteria

Test matrix

Out of scope

File pointers

Worker note

Original report

Customer story

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions