Skip to content

fix: per-workspace broker ignores CODEX_HOME — make it account-aware so multi-account fallback works#369

Open
CarlosZiegler wants to merge 5 commits into
openai:mainfrom
CarlosZiegler:fix/broker-account-aware-codex-home
Open

fix: per-workspace broker ignores CODEX_HOME — make it account-aware so multi-account fallback works#369
CarlosZiegler wants to merge 5 commits into
openai:mainfrom
CarlosZiegler:fix/broker-account-aware-codex-home

Conversation

@CarlosZiegler

@CarlosZiegler CarlosZiegler commented Jun 12, 2026

Copy link
Copy Markdown

Problem

The companion keeps a persistent broker (codex app-server) per workspace. The broker inherits CODEX_HOME once, at spawn time (spawnBrokerProcessenv: options.env ?? process.env), and ensureBrokerSession reuses any live broker for that cwd unconditionally:

const existing = loadBrokerSession(cwd);
if (existing && (await isBrokerEndpointReady(existing.endpoint))) {
  return existing;   // caller's current CODEX_HOME is never considered
}

So whatever account was active on the very first invocation in a workspace is the account used forever (until the broker dies). Any later call with a different CODEX_HOME silently runs on the wrong account.

This bites hardest in the exact scenario where switching accounts matters: rate-limit / out-of-credits fallback. Real-world case: a Business workspace where the developer's seat hit its 5h usage window while other seats (non-developer teammates) sat idle at ~100% capacity. An orchestrator retried a failed task three times, rotating CODEX_HOME between three logged-in homes — all three attempts ran on the exhausted default account (verified: all three session rollout files landed under ~/.codex/sessions/, and direct codex exec with the same CODEX_HOME values succeeded).

Reproduce

# two logged-in homes: ~/.codex and ~/.codex-alice
mkdir -p /tmp/repro && cd /tmp/repro && git init -q
node "$PLUGIN_ROOT/scripts/codex-companion.mjs" task "reply OK" --wait                                # broker born on default account
CODEX_HOME="$HOME/.codex-alice" node "$PLUGIN_ROOT/scripts/codex-companion.mjs" task "reply OK" --wait  # expected: alice — actual: default
find ~/.codex-alice/sessions -name '*.jsonl' -mmin -2   # empty

Affected: v1.0.1 through v1.0.4 (broker-lifecycle.mjs unchanged between those tags).

Fix

Make the broker session account-aware:

  • store the spawning CODEX_HOME in broker.json (codexHome, "" = default account);
  • in ensureBrokerSession, reuse the live broker only when the caller's CODEX_HOME matches;
  • on mismatch, gracefully stop the old broker via the existing broker/shutdown RPC (sendBrokerShutdown) — not a stale-teardown, since the old broker is healthy and may be mid-turn — then spawn a fresh one with the caller's env.

Legacy broker.json files without the field are treated as default-account (?? ""), so same-account callers keep reusing warm brokers with no restart churn.

Tests

New tests/broker-lifecycle.test.mjs with a minimal fake-broker-fixture.mjs (listens on the unix endpoint, records the CODEX_HOME it was spawned with, exits on broker/shutdown):

  • same account → live broker reused (same pid/sessionDir);
  • switched account → old broker receives shutdown and exits, fresh broker spawns with the new env (asserted via the recorded CODEX_HOME), persisted state updated;
  • legacy session without codexHome → reused by default-account callers.

node --test tests/broker-lifecycle.test.mjs → 3/3 pass. Full suite: the 3 pre-existing failures on main (runtime.test.mjs status/result rendering) are unchanged; no new failures.

Docs

Adds a "Multiple Accounts (CODEX_HOME)" README section: per-account codex login into separate homes, shell aliases for interactive use, and the two footguns that make this bug easy to misdiagnose — aliases not existing in non-interactive shells, and env-assignment placement in compound commands (CODEX_HOME=... cd dir && codex ... applies the variable only to cd).

Related

Updates from review

  • Busy-aware rotation (review feedback, both bot findings addressed): account rotation never interrupts in-flight work. ensureBrokerSession probes the new broker/status RPC first, and the shutdown itself is now conditional — broker/shutdown { ifIdle: true } refuses atomically (same event-loop handling as the busy state) if a turn started after the probe. On any busy signal the call returns null and falls back to a directly spawned app server with the caller's env; rotation happens on the next idle call. Legacy brokers degrade gracefully (probe interpretable via their busy gate; ifIdle ignored — pre-existing behavior, no worse).
  • Docs: the README section now includes the full setup walkthrough (aliases → source → per-account codex login → per-account usage) and documents the busy-aware rotation semantics.
  • Tests: 5 lifecycle tests including "never kills a busy broker" and the probe→shutdown race simulation.

…ake effect

The broker (and the codex app-server it manages) inherits CODEX_HOME once at
spawn time, and ensureBrokerSession reused any live broker for the workspace
unconditionally — so a caller switching CODEX_HOME (e.g. account fallback
after a rate limit) silently kept running on the original account.

Store the spawning CODEX_HOME in broker.json and, when a caller arrives with
a different one, gracefully stop the old broker via the existing
broker/shutdown RPC and spawn a fresh one with the caller's env. Same-account
callers keep reusing the warm broker; legacy sessions without the field are
treated as default-account. Adds unit tests (fake broker fixture) and a README
section documenting multi-account usage and its shell gotchas.
@CarlosZiegler CarlosZiegler requested a review from a team June 12, 2026 10:43

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 912bd188bd

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


if (existing) {
if (existingReady && (existing.codexHome ?? "") !== desiredCodexHome) {
await sendBrokerShutdown(existing.endpoint);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Avoid shutting down a busy broker on account changes

When a command runs with a different CODEX_HOME while the existing per-workspace broker is serving a background task or review, this unconditional broker/shutdown terminates that in-flight work. I checked plugins/codex/scripts/app-server-broker.mjs: broker/shutdown is handled before the busy check and shutdown() ends all connected sockets and closes the app client, so an active streaming turn is dropped instead of letting the new request fall back to direct execution or wait. This can make multi-account fallback kill unrelated active jobs in the same workspace.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in bc1258f. ensureBrokerSession now probes the new broker/status RPC (answered before the busy gate) before rotating. If the broker is busy, the call returns null and CodexAppServerClient.connect falls back to SpawnedCodexAppServerClient, which spawns a direct app server with the caller's env — so the new-account turn still runs on the right account and the in-flight work is untouched; rotation happens on the next call that finds the broker idle. Timeouts count as busy (an unresponsive broker is never killed mid-turn). Legacy brokers without broker/status degrade gracefully: their busy gate rejects the probe with BROKER_BUSY_RPC_CODE (read as busy), and when idle they forward it to an unknown-method error (read as idle). Covered by the new "account switch never kills a busy broker" test.

…i-account setup

Addresses review: an account switch could broker/shutdown a broker that was
mid-turn for another job in the same workspace. ensureBrokerSession now probes
the new broker/status RPC (answered before the busy gate; older brokers degrade
gracefully: their busy gate rejects the probe with BROKER_BUSY_RPC_CODE when
busy, and forward it to an unknown-method error when idle). If the broker is
busy, the call returns null and falls back to a directly spawned app server
with the caller's env — in-flight work is never interrupted; rotation happens
on the next idle call. Timeouts count as busy.

README: full alias setup example (add -> source -> per-account login -> use)
and the busy-aware rotation semantics.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bc1258f9da

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

// account rotation happens on the next call that finds the broker idle.
return null;
}
await sendBrokerShutdown(existing.endpoint);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Make broker shutdown conditional on remaining idle

When an account-B call rotates the broker, there is still a race between the idle probe and this shutdown: an account-A command can connect and start a turn after isBrokerBusy() returns false but before broker/shutdown is sent, and app-server-broker.mjs still handles broker/shutdown before its busy gate, so that new in-flight turn is closed. Fresh evidence is that the new code uses a separate non-atomic status probe followed by unconditional shutdown, rather than making the shutdown RPC refuse when the broker became busy.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also a fair point — fixed in bcc82b8 by making the shutdown itself conditional: broker/shutdown now accepts { ifIdle: true } and refuses with { shutdown: false, busy: true } when a request/stream is in flight. The busy check and the shutdown decision happen in the same event-loop message handling in app-server-broker.mjs, so a turn that started after the caller's idle probe atomically flips the outcome to a refusal instead of being dropped. ensureBrokerSession treats the refusal exactly like a busy probe: return null → direct-spawn fallback with the caller's env. The fake-broker fixture can now simulate becoming busy between broker/status and the shutdown, and the new "turn starting between the idle probe and the shutdown" test covers the race. Known limitation, noted in the code: legacy brokers ignore the param, so for them the window narrows to the probe but doesn't fully close — that only applies while a pre-update broker is still running.

… (ifIdle)

Closes the probe->shutdown race: broker/shutdown now accepts { ifIdle: true }
and refuses (shutdown: false, busy: true) when a request/stream is in flight —
the check and the shutdown happen in the same event-loop message handling, so
a turn that started after the caller's idle probe is never dropped. Account
rotation sends ifIdle and treats a refusal like the busy-probe result: return
null and fall back to a directly spawned app server with the caller's env.
Legacy brokers ignore the param (pre-existing behavior, no worse). Fixture can
simulate becoming busy between status and shutdown; new test covers the race.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bcc82b8da6

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

if (await isBrokerBusy(existing.endpoint)) {
return null;
}
const didShutdown = await sendBrokerShutdown(existing.endpoint, { ifIdle: true });

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Avoid ifIdle shutdowns against legacy brokers

When the saved session belongs to a pre-fix broker (no codexHome in broker.json), this sends the new ifIdle parameter to a process that does not implement it; I checked the parent app-server-broker.mjs, where broker/shutdown is handled unconditionally before the busy gate. The fresh evidence is the legacy reuse path here (existing.codexHome ?? "") plus the new ifIdle call: if an account-A command starts after the status probe but before this shutdown during an upgrade, the old broker ignores ifIdle and the account-B rotation still kills that in-flight turn.

Useful? React with 👍 / 👎.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cd7fb09aa1

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +202 to 203
if (existing && existingReady && (existing.codexHome ?? "") === desiredCodexHome) {
return existing;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Apply account checks to reused broker status calls

This account match only applies to callers that go through ensureBrokerSession; callers using reuseExistingBroker still read loadBrokerSession(cwd)?.endpoint directly in CodexAppServerClient.connect. In practice, after account A has started a shared broker, running /codex:setup or another auth/status path with CODEX_HOME set to account B will connect to A's broker and report A's account/read/config/read results, so setup can say the wrong account is logged in even though task/review calls would rotate or fall back for B.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant