Skip to content

refactor(agent,runtime): unify PATH/env handling via command prelude#63

Open
zpzjzj wants to merge 8 commits into
mainfrom
fix/env_set
Open

refactor(agent,runtime): unify PATH/env handling via command prelude#63
zpzjzj wants to merge 8 commits into
mainfrom
fix/env_set

Conversation

@zpzjzj
Copy link
Copy Markdown
Collaborator

@zpzjzj zpzjzj commented May 25, 2026

Why

The agent layer was injecting a shell-expansion string —
$HOME/.local/bin:$HOME/.nvm/current/bin:$PATH — as the literal value of
the PATH key in map[string]string env tables. Process-level env vars
are not shell-expressions, so every runtime had to compensate:

  • NoneRuntime ran user-provided values through os.Expand against
    the host environment to coincidentally produce the right PATH.
  • DockerRuntime and OpenSandboxRuntime each detected PATH
    values containing $, stripped them from the literal env list, and
    prepended an export PATH="…" line to the command so the inner
    shell would expand them. Docker additionally rejected $(…) to
    guard against command substitution firing inside that auto-generated
    export.

Three structurally identical patches lived in three files (~23 lines of
defensive commentary in docker.go alone), plus NoneRuntime behaved
differently from the other two for user-supplied environment.env
values.

What

Move PATH from the env-map layer to the command layer, where shell
expansion lives.

  1. Agent layer — new withAgentExecPath(cmd) helper prepends
    export PATH="$HOME/.local/bin:$HOME/.nvm/current/bin:$PATH" to the
    command itself. mergeExecOptionsEnv no longer injects PATH into
    the env map. Applied to all 8 agent Install / Run / Check call sites
    (claude-code, codex, qodercli, generic CLI).
  2. Runtimes — docker and opensandbox now pass every env var
    literally; no key gets special handling. The pathPrefix branch,
    the $(…) rejection, shellDoubleQuote, literalRemoteEnv,
    withRemoteEnvExpansion, and the matching cfg-time PATH skip are
    all deleted.
  3. NoneRuntimemergeEnv no longer routes values through
    expandEnvMap / os.Expand. Values forward literally, matching
    docker and opensandbox. This is the user-visible breaking change
    (see CHANGELOG).

Commits are atomic, in the order an audit will read them:

9d12711 refactor(agent): move PATH injection from env map to command prelude
479ff32 refactor(runtime): drop PATH special-case in docker/opensandbox Exec
bece2ef refactor(runtime): drop NoneRuntime env expansion for consistency
226344a docs(e2e): update mockEngineHome comment to reflect prelude mechanism

Safety / attack surface

The deleted $(…) rejection in docker.go guarded an attack vector
that only existed because the runtime was constructing
export PATH="<user-value>" from caller-supplied env. With env passed
via --env KEY=VALUE / sandbox Envs (process-level env-var setting,
no shell parsing), that vector no longer exists. The agent layer's
withAgentExecPath uses a compile-time constant (agentExecutablePath),
never user input.

Breaking change

NoneRuntime no longer expands \$VAR / \${VAR} in
environment.env or opts.Env values. Configs that depended on
host-side expansion (e.g. MY_VAR: \"\$HOME/foo\") should switch to
literal values or move the export into a setup_steps shell snippet.
Documented in CHANGELOG.

Test plan

  • go build ./...
  • go vet ./...
  • go test ./... (all packages green)
  • go test -tags=e2e ./e2e/... -run TestPipeline — end-to-end mock-engine path, which exercises the agent → runtime PATH prelude chain
  • Existing PATH-injection assertions in agent_test.go / claude_code_test.go / codex_test.go / qodercli_test.go flipped to negative: env map must NOT carry PATH, command must start with the export prelude
  • New TestDockerRuntime_ExecPassesEnvLiterally confirms PATH and \$(…)-containing values forward literally
  • New TestNoneRuntime_ForwardsEnvLiterally confirms no host-side expansion
  • CI to confirm on Linux runners

🤖 Generated with Claude Code

@zpzjzj zpzjzj requested a review from hittyt as a code owner May 25, 2026 07:18
@zpzjzj zpzjzj requested review from jwx0925, lbfsc and lijunfeng722 May 25, 2026 07:53
@zpzjzj zpzjzj self-assigned this May 25, 2026
@zpzjzj zpzjzj added the enhancement New feature or request label May 25, 2026
@zpzjzj
Copy link
Copy Markdown
Collaborator Author

zpzjzj commented May 25, 2026

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 226344a804

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread internal/runtime/docker.go
@zpzjzj zpzjzj force-pushed the fix/env_set branch 4 times, most recently from 8cae05a to 4ee2bda Compare May 27, 2026 01:14
zpzjzj and others added 8 commits May 27, 2026 16:23
…updates

Lets setup-time orchestrators (notably each agent's Install) seed the
runtime's persistent env baseline with values resolved through the
target shell. Subsequent Exec calls inherit these vars unless overridden
by opts.Env. The intended use case is one-time setup before the runtime
is exercised in parallel — concurrent MergeEnv vs. Exec is not safe and
documented as such.

The three real runtimes (NoneRuntime, DockerRuntime, OpenSandboxRuntime)
implement MergeEnv as a simple cfg.Env overlay. For DockerRuntime, the
container's entrypoint (sleep infinity) is started with the env present
at Create time, so post-Create MergeEnv only affects subsequent
`docker exec` calls — the entrypoint runs unaffected.

Six in-tree test mocks (mockRuntime, scriptJudgeRuntime, mockJudgeTestRuntime,
claudeCodeTestRuntime, codexTestRuntime, qoderTestRuntime) gain trivial
no-op stubs to keep implementing the interface; tests that need to
observe MergeEnv behaviour will get real recording in later commits.

No caller changes — this commit only widens the interface and adds the
plumbing. Tests for the new behaviour live next to each real runtime.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a MergeEnv

Move PATH provisioning from "agent injects $-string into env map and hopes
the runtime expands it" to "agent asks the runtime's own shell to expand
the string, then writes the literal result to the runtime's env baseline".

Concretely:

- agent.go: `mergeExecOptionsEnv` no longer injects PATH; it now only
  merges credentials, observability, and caller env. The
  `agentExecutablePath` constant is gone — it was the shell-expansion
  string that all the runtime-side compensation code existed to handle.
- agent.go: new private helper `probeAndMergePATH(ctx, rt, probeCmd)`
  runs the supplied probe (a `printf '%s' "$HOME/..."`-style command)
  through the runtime, then calls `rt.MergeEnv({"PATH": <result>})`.
  Probe failure logs a warning and leaves the runtime's default PATH
  in effect.
- Each agent's `Install` calls the helper at the top with its own
  per-agent probe constant:
    - claude-code & codex: include $HOME/.local/bin AND
      $HOME/.nvm/current/bin (their CLIs are node scripts whose shebang
      needs node, which lives under the nvm symlink).
    - qodercli: $HOME/.local/bin only (qodercli is a self-contained
      binary placed by its official installer; no nvm needed).
  The probe constants are deliberately not shared even where claude
  and codex have identical values — a future divergence shouldn't
  silently cross-contaminate.

Type=none semantic change: the evaluator skips `ag.Install` when
environment.type=none, so the probe doesn't run there either. The
host shell's PATH is now the contract for type=none, matching the
Unix expectation that tools inherit your shell env. This drops the
previously-forced `$HOME/.local/bin:$HOME/.nvm/current/bin:$PATH`
for type=none users; documented in a follow-up commit's CHANGELOG.

Tests: agent_test.go gains three TestProbeAndMergePATH_* cases for
happy/failure/empty-stdout paths. The three per-agent install tests
have their `lastExecEnv["PATH"] == agentExecutablePath` assertion
replaced with the new shape: PATH absent from opts.Env (agent doesn't
inject) AND `mergedEnv["PATH"]` populated (probe→MergeEnv ran). The
three fake runtimes (claudeCodeTestRuntime, codexTestRuntime,
qoderTestRuntime) gain a probeResponseStdout field, a recording
MergeEnv impl, and a guard that detects the probe command shape so
it isn't recorded as the agent's "real" command. e2e/pipeline_test.go
comment updated to describe the new mechanism (and to note why the
mock-engine path is unaffected: it uses type=none with explicit
mockEngineEnv-supplied PATH).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…x Exec

Both runtimes carried matching hacks to detect PATH values containing
\$, strip them from the literal env list, and prepend `export PATH=...`
to the command so the inner shell would expand them. Docker also
rejected \$(...) to guard against command substitution firing inside
that auto-generated export.

The agent layer no longer pushes \$-bearing PATH (see prior commit:
each agent's Install probes the runtime shell for the literal PATH and
hands it to runtime.MergeEnv). With no caller pushing such values, the
runtime can pass every env key literally — no key special-cased, no
\$(...) rejection needed because the runtime never constructs a shell
command from a caller-supplied env value.

Removes:
- docker.go: PATH-with-\$ skip in buildCreateArgs and the pathPrefix
  block in Exec (including the \$(...) rejection and the comment that
  explained why it was needed).
- opensandbox.go: literalRemoteEnv, withRemoteEnvExpansion, and the
  shellDoubleQuote helper they shared. opensandbox Exec now just
  passes Command + Envs straight through to the SDK.

Tests updated to assert literal pass-through (including \$-bearing PATH
and \$(...) forms) rather than the deleted special-case behaviour.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mergeEnv previously routed values through expandEnvMap, which used
os.Expand to substitute \$VAR / \${VAR} against the host env before
handing the result to cmd.Env. Docker and opensandbox have always
forwarded values literally, so the same eval.yaml could behave
differently under different runtimes.

Drop expandEnvMap and forward values literally in NoneRuntime as well.
Callers that need shell-side expansion should resolve the value first
(see internal/agent.probeAndMergePATH) or prepend `export X=...` to
their command. CHANGELOG documents the migration.

Tests: TestNoneRuntime_ExecExpandsPathFromRuntimeEnv replaced with
TestNoneRuntime_ForwardsEnvLiterally, which asserts the inverse
invariant — \$-bearing values reach the child process as the literal
string, with no host-side substitution.

CHANGELOG: [Unreleased] section now lists the three concrete changes
shipped by this PR — `runtime.MergeEnv` added, agent layer no longer
injects PATH, NoneRuntime stops expanding `\$VAR`. Includes the
type=none migration note about the host shell's PATH now being the
contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…est probe match

Two follow-ups from PR review:

1. CHANGELOG was silent on the docker/opensandbox side of the "values
   forward literally" rule. The doc only called out NoneRuntime
   dropping host-side $VAR expansion, but the same PR also stops
   silently dropping `environment.env.PATH` values containing `$` in
   docker (previously buildCreateArgs skipped such keys to keep the
   container's entrypoint executable). Repo grep finds zero users
   relying on this — phantom scenario, deliberately accepted in PR
   discussion — but the breakage mode (container fails to start) is
   loud enough that users hitting it deserve to see the migration
   note alongside the NoneRuntime one.

2. The three agent test fakes used `strings.HasPrefix(command,
   "printf '%s' \"$HOME/")` to detect probe Exec calls. Loose prefix
   match would silently swallow any future test that legitimately
   issued such a command for non-probe reasons, returning the canned
   fake PATH instead of recording the real exec. Tighten to exact
   match against each agent's known probe constant
   (claudeCodeExecPathProbeCmd / codexExecPathProbeCmd /
   qoderExecPathProbeCmd). Production code never had this issue —
   real runtimes don't dispatch on command shape — so this is purely
   a test-isolation hardening.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…eanup

The earlier "drop PATH-with-$ special-case in docker/opensandbox Exec"
commit deleted literalRemoteEnv / withRemoteEnvExpansion / shellDoubleQuote
from opensandbox.go but didn't notice that PR #71 (merged into main while
this branch was in flight) had just added a TestOpenSandboxCommandHelpers
test specifically exercising those three helpers. After rebasing onto the
new main the symbols are referenced from a test but defined nowhere —
compile fails on every CI job that hits the runtime package.

The helpers were intentionally removed (no caller now constructs a shell
command from a $-bearing env value, so neither the auto-quote nor the
$(...) rejection has anything to guard). Drop the orphaned test.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rgeEnv impls

The three runtimes' MergeEnv methods were byte-identical:

  if len(env) == 0 { return }
  if r.cfg.Env == nil { r.cfg.Env = make(map[string]string, len(env)) }
  maps.Copy(r.cfg.Env, env)

Hoist the body into a package-level `mergeIntoEnvBaseline` helper in
runtime.go and shrink each runtime's MergeEnv to a one-line delegation.
The methods stay (the Runtime interface still demands them, and the
docstrings document each runtime's specific caveat — e.g. docker's
note that post-Create MergeEnv only affects subsequent `docker exec`
calls, not the long-running container entrypoint), but the behaviour
lives in exactly one place.

Reasoning for delegation over embedding: Go embedding would also work
(`type envBaseline struct{...}; MergeEnv(...)` embedded into each
concrete runtime), but that would require moving the env state out of
cfg.Env into a new field, threading initialization through every
construction site (including struct-literal `&NoneRuntime{cfg:...}`
patterns in tests), and renaming `r.cfg.Env` reads in every Exec
path. The free-helper form gives the same "one implementation"
property with zero changes to existing constructors or readers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…hods

CI lint surfaced two classes of nits the local incremental lint missed:

1. gofumpt formatting on the trailing close-brace of claude_code_test.go,
   codex_test.go, and qodercli_test.go. Applied verbatim.

2. dupl flagged the three agent Install methods as near-duplicates of
   each other (49-68 lines pair-wise). They are intentionally similar:
   each agent's Install shares the same probe→merge→exec lifecycle and
   only varies in the per-agent probe constant and default install
   command builder, which are already pulled out into per-agent
   constants and functions. Extracting the orchestration into a
   BaseAgent helper would add an indirection layer that obscures the
   per-agent control flow without removing real complexity. Match the
   existing pattern (codex.go:232 / qodercli.go:71 on Run) and pin
   `//nolint:dupl` with the rationale on each Install.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Comment thread CHANGELOG.md
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants