Skip to content

feat(agents): opencode host runtime — serve lifecycle + drive_turn (#588 phase 2)#592

Merged
jaylfc merged 3 commits into
devfrom
feat/opencode-runtime
Jun 5, 2026
Merged

feat(agents): opencode host runtime — serve lifecycle + drive_turn (#588 phase 2)#592
jaylfc merged 3 commits into
devfrom
feat/opencode-runtime

Conversation

@jaylfc
Copy link
Copy Markdown
Owner

@jaylfc jaylfc commented Jun 5, 2026

Phase 2 of the opencode taOS agent (#588), on top of the merged adapter (#591).

What

tinyagentos/opencode_runtime.py:

  • OpenCodeServer manages a host opencode serve process: write_config (the LiteLLM provider config pointing at the proxy with the agent's own scoped key), ensure_running (idempotent — spawns the server, polls GET /doc for health up to a deadline), stop, is_running, base_url.
  • drive_turn runs one turn through OpenCodeAdapter, wiring its reply dicts to the sink. Mirrors openclaw_acp_runtime: lifecycle (server) is separate from turn execution (drive_turn takes a base_url), and it never raises — failures degrade to a single error reply (no raw-exception leak, matching the openclaw pattern).

Tests

tests/test_opencode_runtime.py (14): write_config JSON shape, ensure_running idempotency (spawns once, no re-spawn when healthy) + health-timeout, and drive_turn wiring the sink + the error-on-failure path (one error reply, no raise). Full file set 38 green (with the adapter tests).

Next (#588): the taOS agent as an agent record with its own LiteLLM key, configurable in the Agents app, then rewire routes/taos_agent.py to drive this runtime + rename "taOS Assistant" to "taOS agent".

Summary by CodeRabbit

  • New Features

    • Add OpenCode runtime to spawn and manage a local agent server with health checks, optional auth, and graceful shutdown.
    • Add a turn driver to stream agent responses to callers with guaranteed cleanup and single-error reporting.
  • Tests

    • New test suite covering config file creation (structure, permissions, overwrite), server start/respawn/timeout behaviors, streaming replies, argument propagation, and error isolation.

)

OpenCodeServer manages a host 'opencode serve' process (writes the LiteLLM
provider config pointing at the proxy with the agent's own key, spawns the
server, waits on /doc health) and drive_turn runs one turn through
OpenCodeAdapter, wiring its reply dicts to the sink. Mirrors openclaw_acp_runtime;
never raises out of drive_turn (degrades to an error reply).
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 5, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: bf99c4c0-8053-4765-9ce5-03111c44f790

📥 Commits

Reviewing files that changed from the base of the PR and between f0ea8f8 and b97d67f.

📒 Files selected for processing (2)
  • tests/test_opencode_runtime.py
  • tinyagentos/opencode_runtime.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tinyagentos/opencode_runtime.py

📝 Walkthrough

Walkthrough

Adds tinyagentos/opencode_runtime.py, a new host-side module for managing an opencode serve subprocess and driving agent turns via streaming to a sink. Introduces OpenCodeServerConfig for subprocess parameters, OpenCodeServer for idempotent startup/shutdown and health polling, and drive_turn() for running turns with guaranteed error isolation. Includes comprehensive test suite validating configuration serialization, subprocess lifecycle, health timeouts, and turn execution with sink wiring and error degradation.

Changes

OpenCode Runtime Integration

Layer / File(s) Summary
Server configuration and config file writing
tinyagentos/opencode_runtime.py, tests/test_opencode_runtime.py
OpenCodeServerConfig dataclass parameterizes subprocess, LiteLLM proxy, and auth settings. write_config() serializes to $HOME/.config/opencode/opencode.json with provider/model mapping. Tests verify directory creation, JSON structure, single-model serialization, and idempotent overwrites.
Server lifecycle and health polling
tinyagentos/opencode_runtime.py, tests/test_opencode_runtime.py
OpenCodeServer manages subprocess: base_url property, is_running() state check, ensure_running() idempotent startup with /doc health polling until ready or TimeoutError, stop() graceful termination with kill-wait, and _health_check() async HTTP health checks with optional Basic auth. Tests verify idempotency, respawn on death, and timeout behavior.
Turn driver with error isolation
tinyagentos/opencode_runtime.py, tests/test_opencode_runtime.py
drive_turn() creates and configures OpenCodeAdapter, ensures session, sends prompt, streams adapter replies (delta, final) to sink. Catches all exceptions and guarantees exactly one error dict sent to sink (awaiting if coroutine); always closes adapter in finally. Tests verify sink wiring, config propagation, and error isolation across adapter and session failures.

Sequence Diagram

sequenceDiagram
  participant Caller
  participant drive_turn
  participant OpenCodeAdapter
  participant Sink
  Caller->>drive_turn: text, trace_id, sink, base_url, model_id, ...
  drive_turn->>OpenCodeAdapter: create with config
  drive_turn->>OpenCodeAdapter: ensure_session()
  drive_turn->>OpenCodeAdapter: prompt(text)
  alt Success
    OpenCodeAdapter->>Sink: delta dict with trace_id
    OpenCodeAdapter->>Sink: final dict with trace_id
  else Adapter Exception
    drive_turn->>Sink: error dict with trace_id
  else ensure_session Exception
    drive_turn->>Sink: error dict with trace_id
  end
  drive_turn->>OpenCodeAdapter: close()
  drive_turn->>Caller: return (no raise)
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I stitched a config in moonlit code,
I watched the server hum and bode,
When prompts stream delta, final, bright,
I tuck errors safe and end the night,
Hopping home while logs glow low.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main changes: introducing OpenCodeServer for subprocess lifecycle management and drive_turn function for turn execution, with reference to phase 2.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/opencode-runtime

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🧹 Nitpick comments (1)
tests/test_opencode_runtime.py (1)

124-215: ⚡ Quick win

Add a regression for the live-but-unhealthy startup path.

The suite covers healthy and dead processes, but not the case where _proc is still alive while _health_check() stays false for a few polls. That branch is where ensure_running() can currently double-spawn on the same port and miss its deadline budget, so it needs a targeted regression test.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_opencode_runtime.py` around lines 124 - 215, Add a regression test
that exercises the “live-but-unhealthy” startup path in
OpenCodeServer.ensure_running: simulate asyncio.create_subprocess_exec returning
a live _FakeProcess (with returncode None) while OpenCodeServer._health_check
initially returns False for several polls then True (or stays False to trigger
timeout), and assert that ensure_running does not double-spawn on the same port
(i.e., spawn_count increments only once) and respects the deadline behavior;
reference the existing test helpers/_symbols: OpenCodeServer.ensure_running,
OpenCodeServer._health_check, asyncio.create_subprocess_exec, _FakeProcess, and
server._proc.returncode to implement the scenario and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tinyagentos/opencode_runtime.py`:
- Around line 81-99: The config file is written with default permissions,
leaking the LiteLLM key; ensure the directory is created with owner-only
permissions and the file is created with owner-read/write only. Change the
config_dir.mkdir call to set mode=0o700 (and keep parents=True, exist_ok=True)
and, after writing the payload (where config_path.write_text is used), call
config_path.chmod(0o600) (or atomically open/write the file with owner-only
mode) so the directory and opencode.json are owner-only; use os.umask if needed
to enforce modes on creation.
- Around line 145-146: The subprocess for the long-lived "opencode serve" child
is created with stdout=asyncio.subprocess.PIPE and
stderr=asyncio.subprocess.PIPE but those pipes are never drained; change the
creation in opencode_runtime.py so the child won't block: either set stdout and
stderr to asyncio.subprocess.DEVNULL (or None to inherit parent fds) when
calling create_subprocess_exec/create_subprocess_shell, or keep the PIPEs and
spawn async reader tasks that continuously read from process.stdout and
process.stderr (ensure you reference the same process variable used when
awaiting the child, e.g., the proc/process returned by create_subprocess_*).
Make one of these fixes so the OS pipe buffers cannot fill and stall the
runtime.
- Around line 127-148: The code currently may spawn a new opencode process while
self._proc is still alive; update the startup logic in the method that uses
self._proc/is_running() and _health_check() so it first polls the existing child
while it is alive and only spawns a new process if the child has exited.
Implement a real deadline using asyncio.get_running_loop().time() (compute
deadline = now + deadline_s) and loop: while loop.time() < deadline and
self._proc is not None and self.is_running(): await _health_check() with short
sleeps between checks; if the child exits during that loop fail fast and do not
spawn; only call asyncio.create_subprocess_exec(self._cfg.binary, "serve", ...)
when there is no live child. Apply the same change to the second similar block
(the one around the 154-166 region) so both startup paths use the same
live-child polling and time-based deadline logic.

---

Nitpick comments:
In `@tests/test_opencode_runtime.py`:
- Around line 124-215: Add a regression test that exercises the
“live-but-unhealthy” startup path in OpenCodeServer.ensure_running: simulate
asyncio.create_subprocess_exec returning a live _FakeProcess (with returncode
None) while OpenCodeServer._health_check initially returns False for several
polls then True (or stays False to trigger timeout), and assert that
ensure_running does not double-spawn on the same port (i.e., spawn_count
increments only once) and respects the deadline behavior; reference the existing
test helpers/_symbols: OpenCodeServer.ensure_running,
OpenCodeServer._health_check, asyncio.create_subprocess_exec, _FakeProcess, and
server._proc.returncode to implement the scenario and assertions.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: b7fc76b8-8d47-4400-8cd3-4527aefd252f

📥 Commits

Reviewing files that changed from the base of the PR and between df2d7a3 and f0ea8f8.

📒 Files selected for processing (2)
  • tests/test_opencode_runtime.py
  • tinyagentos/opencode_runtime.py

Comment on lines +81 to +99
config_dir = Path(self._cfg.home) / ".config" / "opencode"
config_dir.mkdir(parents=True, exist_ok=True)
models = {mid: {} for mid in self._cfg.model_ids}
payload = {
"$schema": "https://opencode.ai/config.json",
"provider": {
"litellm": {
"npm": "@ai-sdk/openai-compatible",
"name": "LiteLLM",
"options": {
"baseURL": self._cfg.litellm_base_url,
"apiKey": self._cfg.litellm_key,
},
"models": models,
}
},
}
config_path = config_dir / "opencode.json"
config_path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Restrict the on-disk config permissions.

This file persists the agent-scoped LiteLLM key in plaintext, but the directory and file permissions are whatever the current umask allows. On a shared host, that can leave opencode.json readable by other local users; create the config dir/file with owner-only permissions before writing the key.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tinyagentos/opencode_runtime.py` around lines 81 - 99, The config file is
written with default permissions, leaking the LiteLLM key; ensure the directory
is created with owner-only permissions and the file is created with
owner-read/write only. Change the config_dir.mkdir call to set mode=0o700 (and
keep parents=True, exist_ok=True) and, after writing the payload (where
config_path.write_text is used), call config_path.chmod(0o600) (or atomically
open/write the file with owner-only mode) so the directory and opencode.json are
owner-only; use os.umask if needed to enforce modes on creation.

Comment thread tinyagentos/opencode_runtime.py
Comment thread tinyagentos/opencode_runtime.py Outdated
Comment thread tinyagentos/opencode_runtime.py Outdated
"serve",
"--port", str(self._cfg.port),
"--hostname", "127.0.0.1",
stdout=asyncio.subprocess.PIPE,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CRITICAL: Subprocess output pipes not consumed

The subprocess is created with stdout=asyncio.subprocess.PIPE and stderr=asyncio.subprocess.PIPE, but these pipes are never read. If the subprocess produces enough output to fill the OS pipe buffers, it will block waiting for the pipes to be read.

This could cause the opencode serve process to hang indefinitely.

Fix: Either redirect output to devnull (stdout=asyncio.subprocess.DEVNULL, stderr=asyncio.subprocess.DEVNULL) or create a task to read and log the output.

TimeoutError: if the server does not become healthy within *deadline_s*.
"""
# Fast path: already running and healthy.
if self.is_running() and await self._health_check():
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Potential for multiple opencode serve processes on same port

If an existing opencode serve process is unhealthy (not responding to /doc) but still running, the fast path check will fail, causing write_config() to be called and a new process to be spawned without stopping the existing unhealthy process.

This could result in two processes trying to bind to the same port, likely causing the new process to fail to start.

Fix: Before spawning a new process, stop any existing unhealthy process by calling await self.stop() if self.is_running() is True but the health check failed.

if await self._health_check():
logger.info("opencode_runtime: server healthy after %.1fs", elapsed)
return
await asyncio.sleep(poll_s)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WARNING: Health check polling loop may miss server becoming healthy

The polling loop sleeps for a fixed poll_s interval and then increments elapsed by poll_s. If the server becomes healthy just after a health check and just before the sleep, we will miss it and potentially wait nearly an entire poll_s interval longer than necessary, or even exceed the deadline and raise a false TimeoutError.

Fix: Calculate the remaining time before each sleep and sleep for the minimum of poll_s and the remaining time.

jaylfc added 2 commits June 5, 2026 10:34
…respawn

Addresses two CodeRabbit Majors on #592:
- The opencode.json config embeds the agent's LiteLLM key in plaintext; chmod it
  0600 (mirrors install_hermes.sh on ~/.hermes/.env), with a test asserting the mode.
- ensure_running reaps an existing live-but-unhealthy process before respawning
  (was orphaning it + colliding on the port), and polls against a wall-clock
  deadline so a slow health check can't overrun deadline_s.
… PIPEs

Addresses the kilo CRITICAL / CodeRabbit Major on #592: spawning the long-lived
server with stdout/stderr=PIPE deadlocks it once the OS pipe buffer fills (no one
drains it). Redirect both to a 0600 serve.log under the config dir (no unread
pipe, and the host taOS agent's serve logs are kept for diagnosis); close the
handle on stop().
@jaylfc jaylfc merged commit ed38a9e into dev Jun 5, 2026
6 checks passed
@jaylfc jaylfc deleted the feat/opencode-runtime branch June 5, 2026 10:10
@github-project-automation github-project-automation Bot moved this from Todo to Done in TinyAgentOS Roadmap Jun 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Development

Successfully merging this pull request may close these issues.

1 participant