2 changes: 2 additions & 0 deletions .gitignore
@@ -16,4 +16,6 @@ mess/
.coverage
.ralphify/
scripts/tui_dev/output/
.cheese/
.serena/

13 changes: 13 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,19 @@

All notable changes to ralphify are documented here.

## Unreleased

### Added

- **First-class opencode adapter** — agents that take the prompt as a positional argument instead of stdin are now first-class. Set `agent: opencode run` and ralphify adds `--format json`, appends your prompt as a safe positional argument (no `bash -c` wrapper, no quoting hazards), and parses opencode's JSON event stream for live tool-use tracking. See [Using with Different Agents](https://ralphify.co/docs/agents/#opencode) for the permission setup opencode needs to run autonomously.
- **`max_turns` enforcement** — the `max_turns` frontmatter field now actively caps tool-use events per iteration. Streaming adapters that count tool uses (Claude, Codex, opencode) are SIGTERM'd at the limit and the iteration is recorded as completed-at-cap. Blocking adapters (Copilot) cannot be preempted mid-run, so their tool uses are counted after the process exits and the iteration is marked completed-at-cap if the count reached the limit. Adapters that emit no countable events (Crush) treat the field as a no-op.
- **Agent lifecycle hooks** — new `hooks` frontmatter field plus an `AgentHook` Protocol (`ShellAgentHook`, `CombinedAgentHook`, `NoOpAgentHook`). Register shell commands to observe iteration start/end, prompt assembly, tool use, turn-approaching-limit, turn cap, and completion-signal events. Hooks are observers — a failing hook is logged but never aborts the run. See the new [Hooks](https://ralphify.co/docs/hooks/) page.
- **Per-CLI soft wind-down** — when `max_turns_grace` is set, Claude's `PreToolUse` and Codex's `PostToolUse` hooks are installed into a per-iteration tempdir (`CLAUDE_CONFIG_DIR` / `CODEX_HOME` overrides, so the user's real config stays untouched) to nudge the agent toward wrapping up before the hard cap. Adapters without a hook system downgrade to hard-cap-only.

### Changed

- **Prompt delivery is now an adapter concern** — the agent execution layer asks each adapter where the prompt goes (stdin vs. a positional argument) via a new `deliver_prompt` step. Existing stdin agents (Claude, Codex, Copilot, generic, and `bash -c` wrappers) are unaffected; arg-delivery agents spawn with `stdin=DEVNULL` and no stdin writer thread.

## 0.4.0b3 — 2026-04-12

### Improved
7 changes: 7 additions & 0 deletions README.md
@@ -63,6 +63,7 @@ Ralph loops give you:
| **grow-coverage** | Write tests for untested modules, one per iteration, until coverage hits the target |
| **security-audit** | Hunt for vulnerabilities — scan, find, fix, verify, repeat |
| **clear-backlog** | Work through a TODO list or issue tracker, one task per loop |
| **promise-completion** | Work until a target is done, then emit a promise tag so the loop stops early |
| **write-docs** | Generate documentation for undocumented modules, one at a time |
| **improve-codebase** | Find and fix code smells, refactor patterns, modernize APIs |
| **migrate** | Incrementally migrate files from one framework or pattern to another |
@@ -95,11 +96,17 @@ Scaffold a ralph and start experimenting:
ralph scaffold my-ralph
```

The scaffolded `RALPH.md` includes the normal command/arg template plus a commented promise-completion path you can enable if the agent should stop early by emitting a matching `<promise>...</promise>` tag.

For a committed example, see [`examples/promise-completion/RALPH.md`](examples/promise-completion/RALPH.md) — it shows a loop that exits early once the requested target is complete.

Edit `my-ralph/RALPH.md`, then run it:

```bash
ralph run my-ralph # loops until Ctrl+C
ralph run my-ralph -n 5 # run 5 iterations then stop
# from a repo checkout:
ralph run examples/promise-completion -n 10 --target "stabilize the failing auth tests"
```

### What `ralph run` does
85 changes: 80 additions & 5 deletions docs/agents.md
@@ -15,11 +15,13 @@ This page shows how to configure the [`agent` frontmatter field](quick-reference

## Agent comparison

-| Agent | Stdin support | Streaming | Wrapper needed |
+| Agent | Prompt delivery | Streaming | Wrapper needed |
 |---|---|---|---|
-| [Claude Code](#claude-code) | Native (`-p`) | Yes — real-time activity tracking | No |
+| [Claude Code](#claude-code) | Stdin (`-p`) | Yes — real-time activity tracking | No |
+| [opencode](#opencode) | Positional arg (`run "<prompt>"`) | Yes — tool-use tracking | No |
 | [Aider](#aider) | Via bash wrapper | No | Yes (`bash -c`) |
-| [Codex CLI](#codex-cli) | Native (`exec`) | No | No |
+| [Codex CLI](#codex-cli) | Stdin (`exec`) | No | No |
+| [Crush](#crush) | Stdin (`run`) | No | No |
 | [Custom](#custom-wrapper-script) | You implement it | No | Yes (script) |

If you're not sure which to pick: **start with Claude Code.** It has the deepest integration, the best autonomous coding capabilities, and is the default.
@@ -36,9 +38,34 @@ Your agent must:

1. **Read a prompt from stdin** — the full assembled prompt is piped in
2. **Do work in the current directory** — edit files, run commands, make commits
-3. **Exit when done** — exit code 0 means success, non-zero means failure
+3. **Exit cleanly** — exit code `0` means the agent process succeeded; non-zero means failure
4. **Optionally emit a completion signal** — set `completion_signal` in frontmatter (default inner text: `RALPH_PROMISE_COMPLETE`) if you want the agent to print an explicit `<promise>...</promise>` marker

-That's it. No special protocol, no API — just stdin in, work done, process exits.
Normal exit codes still indicate process success or failure. They do **not** trigger promise completion by themselves.

Ralphify only stops early on promise completion when both of these are true:

- `stop_on_completion_signal: true`
- the matching `<promise>...</promise>` tag is detected in agent output or captured result text

`completion_signal` is the inner promise text. For example, `completion_signal: COMPLETE` means the agent must output `<promise>COMPLETE</promise>`.
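
The two conditions above can be sketched in a few lines of Python. This is only an illustration of the rule as described here, not ralphify's actual matcher: the function names `promise_tag_found` and `should_stop_early` are hypothetical, and the exact-match behaviour (no whitespace or case normalization) is an assumption.

```python
import re


def promise_tag_found(output: str, completion_signal: str = "RALPH_PROMISE_COMPLETE") -> bool:
    """Return True if output contains <promise>{completion_signal}</promise>.

    Assumption: the inner text must match exactly; the real matcher may differ.
    """
    pattern = "<promise>" + re.escape(completion_signal) + "</promise>"
    return re.search(pattern, output) is not None


def should_stop_early(output: str, completion_signal: str, stop_on_completion_signal: bool) -> bool:
    """Stop only when the flag is enabled AND the matching tag is present."""
    return stop_on_completion_signal and promise_tag_found(output, completion_signal)
```

Under these assumptions, bare `COMPLETE` in the output does not trigger a stop, and neither does a matching tag when `stop_on_completion_signal` is `false`.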

Ralphify still keeps its own command/prompt loop architecture. Only the promise tag format and matching align with Ralph-Wiggum.

Minimal example:

```markdown
---
agent: claude -p --dangerously-skip-permissions
completion_signal: COMPLETE
stop_on_completion_signal: true
---

Implement the next todo. When the work is fully complete, print
<promise>COMPLETE</promise> and exit.
```

That's it. No API required — just stdin in, output out, process exits.

## Claude Code

@@ -74,6 +101,32 @@ This enables ralphify to:
- Track agent activity in real time
- Extract the final result text from the agent's response

## opencode

[opencode](https://opencode.ai) takes the prompt as a **positional argument** to its `run` subcommand rather than on stdin. Ralphify has a first-class adapter for it — no `bash -c` wrapper needed.

```markdown
---
agent: opencode run --agent build
---
```

| Flag | Purpose |
|---|---|
| `run` | Non-interactive mode — runs one prompt and exits |
| `--agent build` | Selects an agent profile permissive enough to edit files autonomously (see the caveat below) |

When ralphify detects that the agent command's binary is `opencode`, it automatically:

- Adds `--format json` so opencode emits a parseable event stream.
- Appends the assembled prompt as the final positional argument (no stdin, no shell — quotes, `$(...)`, and newlines in the prompt are passed through safely as a single argument).
- Parses the JSON stream to track tool use in real time.
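
A minimal sketch of why positional delivery is safe without a shell, assuming the adapter builds an argument list along these lines. `build_opencode_argv` is a hypothetical helper written for this illustration; it is not the adapter's real function, and the real adapter may place `--format json` differently.

```python
import shlex


def build_opencode_argv(agent: str, prompt: str) -> list[str]:
    """Split the configured agent command and append the prompt as one argument."""
    argv = shlex.split(agent)  # e.g. ["opencode", "run", "--agent", "build"]
    if "--format" not in argv:
        # Assumption: the flag is only added when the user has not set it already.
        argv += ["--format", "json"]
    # The prompt is a single list element, so quotes, $(...), and newlines
    # are never interpreted by a shell.
    argv.append(prompt)
    return argv
```

A list built this way could be passed to `subprocess.Popen(argv, stdin=subprocess.DEVNULL)`; because no shell is involved, the prompt reaches the process byte-for-byte as one argument.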

!!! warning "opencode refuses writes by default"
    opencode's built-in agents start with restrictive `ask`/`deny` permission presets ([anomalyco/opencode #10411](https://github.com/anomalyco/opencode/issues/10411), [#13851](https://github.com/anomalyco/opencode/issues/13851)). An unconfigured `opencode run` will stall waiting for approval or refuse to edit files — there is no one to approve in an autonomous loop.

    This is opencode-side configuration, not something ralphify can override. Before looping, set up an agent profile (or permission config) that allows the edits and commands your prompt needs — the opencode analogue of Claude Code's `--dangerously-skip-permissions`. See [opencode's permissions docs](https://opencode.ai/docs/permissions/) for the `--agent` profile and permission settings.

## Aider

[Aider](https://aider.chat) is an AI pair-programming tool that works with multiple LLM providers.
@@ -117,6 +170,28 @@ agent: codex exec --sandbox danger-full-access -
| `--sandbox danger-full-access` | Full filesystem access for autonomous operation |
| `-` | Read prompt from stdin |

## Crush

[Charm Crush](https://github.com/charmbracelet/crush) is TUI-first but supports non-interactive use via its `run` subcommand, which reads the prompt from stdin. Ralphify has a first-class adapter for it — no `bash -c` wrapper needed.

```markdown
---
agent: crush run
---
```

| Flag | Purpose |
|---|---|
| `run` | Non-interactive mode — runs one prompt from stdin and exits |

When ralphify detects that the agent command's binary is `crush`, it automatically adds `--quiet` to hide the progress spinner. `crush run` auto-approves every permission request for the duration of the invocation, so no `--yolo`-style flag is needed to run autonomously.

!!! info "Configure a provider first"
    `crush run` exits with "no providers configured" if no model provider is set up. Configure one non-interactively before looping — e.g. export `ANTHROPIC_API_KEY` (or another provider's key) or commit a `crush.json`. Run `crush` once interactively if you prefer the guided setup.

!!! warning "No structured output — turn capping unavailable"
    Crush emits plain text only (no JSON/streaming-event mode), so ralphify runs it in [blocking mode](#blocking-mode-all-other-agents) and cannot count tool calls or enforce `max_turns` for it. Completion still works via the [`<promise>` tag](#what-ralphify-needs-from-an-agent) scanned from stdout. Use [`--timeout`](cli.md#ralph-run) as the safety net instead of a turn cap.

## Custom wrapper script

For full control, write a wrapper script that reads stdin and calls your agent however it needs to be called.
9 changes: 8 additions & 1 deletion docs/api.md
@@ -74,7 +74,8 @@ config = RunConfig(
|---|---|---|---|
| `agent` | `str` | -- | Full agent command string |
| `ralph_dir` | `Path` | -- | Path to the ralph directory |
-| `ralph_file` | `Path` | -- | Path to the RALPH.md file |
+| `ralph_file` | `Path | None` | `None` | Path to the RALPH.md file. Supply exactly one of `ralph_file` or `prompt`. |
+| `prompt` | `str | None` | `None` | In-memory prompt body (no frontmatter). Supply exactly one of `ralph_file` or `prompt`. See [Embedding](embedding.md#running-a-prompt-from-memory). |
| `commands` | `list[Command]` | `[]` | Commands to run each iteration |
| `args` | `dict[str, str]` | `{}` | User argument values |
| `max_iterations` | `int | None` | `None` | Max iterations (`None` = unlimited) |
@@ -387,6 +388,12 @@ When extra listeners are registered, events are broadcast to both the built-in q
| `resume_run(run_id)` | Resume a paused run. |
| `list_runs()` | Return a snapshot of all registered runs. |
| `get_run(run_id)` | Look up a run by ID. |
| `wait_for_any(run_ids, timeout=None)` | Block until at least one of `run_ids` reaches a terminal status; returns the finished IDs (`[]` on timeout). |
| `wait_for_all(run_ids, timeout=None)` | Block until all `run_ids` finish or `timeout` elapses; returns `True` iff all finished. |
| `get_result(run_id)` | Snapshot the run's status and counts as a frozen `RunResult`. Raises `KeyError` if unknown. |
| `shutdown(timeout=None)` | Request stop on every run and join their threads; returns `True` iff all joined in time. |

For the create → start → wait → result → shutdown lifecycle and thread-safety notes, see [Embedding](embedding.md).
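
The timeout contract of `wait_for_all` in the table above (return `True` only if every run finished before the deadline) can be illustrated with a small stand-in. This sketch is not ralphify's manager: `TinyRunRegistry`, `register`, and `mark_finished` are hypothetical names invented here, and sharing one deadline across all runs is an assumption about how the timeout is applied.

```python
import threading
import time


class TinyRunRegistry:
    """Hypothetical stand-in that tracks one completion Event per run ID."""

    def __init__(self) -> None:
        self._done: dict[str, threading.Event] = {}

    def register(self, run_id: str) -> None:
        self._done[run_id] = threading.Event()

    def mark_finished(self, run_id: str) -> None:
        self._done[run_id].set()

    def wait_for_all(self, run_ids, timeout=None) -> bool:
        # One shared deadline, so the total wait never exceeds `timeout`.
        deadline = None if timeout is None else time.monotonic() + timeout
        for run_id in run_ids:
            remaining = None if deadline is None else max(0.0, deadline - time.monotonic())
            if not self._done[run_id].wait(remaining):
                return False  # this run was still going when time ran out
        return True
```

Waiting on each event in turn against a single deadline keeps the logic simple; a run that finishes early costs no extra waiting because its event is already set.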

---

7 changes: 6 additions & 1 deletion docs/cli.md
@@ -113,6 +113,7 @@ The loop also stops automatically when:

- All `-n` iterations have completed
- `--stop-on-error` is set and the agent exits non-zero or times out
- `stop_on_completion_signal: true` is set in frontmatter and the matching `<promise>...</promise>` tag is detected in agent output or captured result text

### Peeking at live agent output

@@ -152,7 +153,7 @@ ralph scaffold # Creates RALPH.md in the current directory
|---|---|---|
| `[NAME]` | none | Directory name. If omitted, creates RALPH.md in the current directory |

-The generated template includes an example command (`git-log`), an example arg (`focus`), and a prompt body with placeholders for both. Edit it, then run [`ralph run`](#ralph-run). See [Getting Started](getting-started.md) for a full walkthrough.
+The generated template includes an example command (`git-log`), an example arg (`focus`), a prompt body with placeholders for both, and commented `completion_signal` / `stop_on_completion_signal` lines showing the promise-completion path. Uncomment them if you want the agent to stop early by emitting a matching `<promise>...</promise>` tag. Then run [`ralph run`](#ralph-run). See [Getting Started](getting-started.md) for a full walkthrough.

Errors if `RALPH.md` already exists at the target location.

@@ -190,6 +191,10 @@ Your instructions here. Reference args with {{ args.dir }}.
| `commands` | list | no | Commands to run each iteration (each has `name` and `run`) |
| `args` | list of strings | no | Declared argument names for user arguments. Letters, digits, hyphens, and underscores only. |
| `credit` | bool | no | Append co-author trailer instruction to prompt (default: `true`) |
| `completion_signal` | string | no | Inner text for the completion promise tag. `COMPLETE` means the agent must emit `<promise>COMPLETE</promise>` (default inner text: `RALPH_PROMISE_COMPLETE`) |
| `stop_on_completion_signal` | bool | no | Stop the loop early when the matching `<promise>...</promise>` tag is detected (default: `false`) |

Exit code `0` still only means the agent process succeeded. Ralphify keeps its own loop architecture; only the promise tag format and matching align with Ralph-Wiggum.

### Commands

12 changes: 11 additions & 1 deletion docs/contributing/codebase-map.md
@@ -31,7 +31,17 @@ src/ralphify/ # All source code
├── _events.py # Event types, emitter protocol, and BoundEmitter convenience wrapper
├── _keypress.py # Cross-platform single-keypress listener (powers the `p` peek toggle)
├── _output.py # ProcessResult base class, subprocess constants (SESSION_KWARGS, SUBPROCESS_TEXT_KWARGS), format durations
-└── _brand.py # Brand color constants shared across CLI and console rendering
+├── _brand.py # Brand color constants shared across CLI and console rendering
+├── hooks.py # Agent lifecycle hooks — AgentHook Protocol, ShellAgentHook, CombinedAgentHook
+├── _wind_down_shim.py # Module invoked by per-CLI hooks to nudge agents toward max_turns wind-down
+└── adapters/ # Pluggable CLI adapter layer — one module per agent CLI
+    ├── _protocol.py # CLIAdapter Protocol, AdapterEvent, Invocation, stdin_invocation, ADAPTERS registry
+    ├── claude.py # Claude Code adapter (stdin, stream-json, PreToolUse wind-down)
+    ├── codex.py # Codex adapter (stdin, --json, PostToolUse wind-down)
+    ├── copilot.py # GitHub Copilot adapter (stdin, blocking)
+    ├── crush.py # Crush adapter (stdin, blocking, plain text — no tool-use counting)
+    ├── opencode.py # opencode adapter (prompt as positional arg, --format json)
+    └── _generic.py # Fallback adapter for unknown CLIs (stdin, no parsing)

tests/ # Pytest tests — one test file per module
docs/ # MkDocs site (Material theme) — user-facing documentation
40 changes: 37 additions & 3 deletions docs/cookbook.md
@@ -1,13 +1,13 @@
---
title: Ralph Loop Recipes
-description: Copy-pasteable ralph loop setups for autonomous ML research, test coverage, code migration, security scanning, deep research, documentation, bug fixing, and codebase improvement.
-keywords: ralphify cookbook, autonomous coding recipes, RALPH.md examples, documentation loop, bug fixing loop, codebase improvement, deep research agent, code migration loop, security scanning agent, test coverage automation, autoresearch, autonomous ML research
+description: Copy-pasteable ralph loop setups for autonomous ML research, promise-based early exit, test coverage, code migration, security scanning, deep research, documentation, bug fixing, and codebase improvement.
+keywords: ralphify cookbook, autonomous coding recipes, RALPH.md examples, promise completion, early exit agent loop, documentation loop, bug fixing loop, codebase improvement, deep research agent, code migration loop, security scanning agent, test coverage automation, autoresearch, autonomous ML research
---

# Cookbook

!!! tldr "TL;DR"
-    8 copy-pasteable ralph loops: [autoresearch](#autoresearch), [codebase improvement](#codebase-improvement), [documentation](#documentation-loop), [bug hunting](#bug-hunter), [deep research](#deep-research), [code migration](#code-migration), [security scanning](#security-scan), and [test coverage](#test-coverage). Each is a real, runnable example from the `examples/` directory.
+    9 copy-pasteable ralph loops: [autoresearch](#autoresearch), [codebase improvement](#codebase-improvement), [documentation](#documentation-loop), [bug hunting](#bug-hunter), [deep research](#deep-research), [code migration](#code-migration), [security scanning](#security-scan), [test coverage](#test-coverage), and [promise completion](#promise-completion). Each is a real, runnable example from the `examples/` directory.

Copy-pasteable setups for common autonomous workflows. Each recipe is a real, runnable ralph from the [`examples/`](https://github.com/computerlovetech/ralphify/tree/main/examples) directory.

@@ -269,6 +269,40 @@ The coverage report gives the agent a clear metric to improve and shows exactly

---

## Stop the loop when the task is complete {: #promise-completion }

A loop that uses ralphify's promise-completion path to stop before the iteration budget. The agent keeps working until the requested target is done, then emits `<promise>COMPLETE</promise>` so the run ends immediately instead of burning the remaining iterations.

**`promise-completion/RALPH.md`**

```markdown
--8<-- "examples/promise-completion/RALPH.md"
```

```bash
ralph run promise-completion -n 10 --target "stabilize the failing auth tests"
```

```text
▶ Running: promise-completion
3 commands · max 10 iterations

── Iteration 1 ──
Commands: 3 ran
✓ Iteration 1 completed (51.4s)

── Iteration 2 ──
Commands: 3 ran
✓ Iteration 2 completed via promise tag <promise>COMPLETE</promise> (43.2s)

──────────────────────
Done: 2 iterations — 2 succeeded
```

Set `completion_signal` to the inner promise text and `stop_on_completion_signal: true` to enable early exit. The agent must emit the matching `<promise>...</promise>` tag — bare text does not count.

---

## Next steps

- [CLI Reference](cli.md) — all `ralph run` options (`--timeout`, `--stop-on-error`, `--delay`, user args)