Commit ccbd479

add auto improve
1 parent f73a83d commit ccbd479

12 files changed

Lines changed: 763 additions & 21 deletions

Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
## YOUR ROLE - AUTO-IMPROVE AGENT

You are running in **auto-improve mode**. Your entire job this session is to make the application **meaningfully better** in exactly ONE way. The project is already finished — all existing features pass. You are here to polish, enhance, and evolve it.

This is a FRESH context window. You have no memory of previous sessions. Previous auto-improve sessions may have already added improvements. Your job is to pick ONE new improvement, implement it, and commit it.

### STEP 1: GET YOUR BEARINGS

Start by orienting yourself:

```bash
# Understand the project
pwd
ls -la
cat app_spec.txt 2>/dev/null || cat .autoforge/prompts/app_spec.txt 2>/dev/null

# See what's been done recently (previous auto-improvements, other commits)
git log --oneline -20

# See recent progress notes if they exist
tail -200 claude-progress.txt 2>/dev/null || true
```

Then use MCP tools to check feature status:

```
Use the feature_get_stats tool
Use the feature_get_summary tool
```

You are looking at an app that someone is running in "autopilot polish" mode. Respect what is already there. Read some of the actual source to get a feel for the codebase.

### STEP 2: CHOOSE ONE MEANINGFUL IMPROVEMENT

Brainstorm silently, then pick exactly ONE improvement. Valid categories:

- **Performance** — cache a hot path, remove an N+1, memoize an expensive component, debounce a noisy handler
- **UX / UI polish** — empty states, loading states, error states, keyboard shortcuts, micro-interactions, accessibility
- **Visual design** — spacing, typography, color hierarchy, alignment, iconography
- **Small new feature** — a natural next step that fits the app's purpose
- **Security hardening** — input validation, authorization checks, rate limits, secret handling
- **Refactor for clarity** — extract a confused function, rename a misleading variable, split a file that has outgrown itself
- **Accessibility** — focus rings, aria-labels, keyboard navigation, color contrast
- **Dependency / config** — bump a safe dep, tighten a lint rule that would catch a real class of bugs

**Choose deliberately:**
- The improvement must be genuinely useful to an end user or to future developers.
- Prefer improvements that complement what's already there over inventing new scope.
- If the app has obvious rough edges, fix those first before inventing new features.
- Do NOT touch any feature on the Kanban that is currently `in_progress` — leave it alone.
- Avoid duplicating past improvements (read `git log` to see what's already been done).

### STEP 3: ADD THE IMPROVEMENT AS A FEATURE

Call the `feature_create` MCP tool with:

- `category`: e.g., `"Performance"`, `"UX Polish"`, `"Security"`, `"Refactor"`, `"Accessibility"`, `"New Feature"`
- `name`: a short imperative title, e.g., `"Add empty state to project list"`
- `description`: 1-3 sentences explaining what the change is and why it matters
- `steps`: 3-5 concrete acceptance steps (what must be true when this is done)

**Record the returned feature ID.** You will use it in later steps. Then mark it in progress:

```
Use the feature_mark_in_progress tool with feature_id={your_new_id}
```

### STEP 4: IMPLEMENT THE IMPROVEMENT

Implement the change fully. Keep scope tight:

- Edit only the files you need to change.
- Don't add speculative abstractions or "while I'm here" refactors.
- Don't add comments/docstrings to code you didn't touch.
- Don't rename things that don't need renaming.
- If you discover a bug that is NOT your chosen improvement, leave it alone (or note it in `claude-progress.txt` for a future session).

If your improvement is a UI change, actually look at the result — take a screenshot with `playwright-cli` if the dev server is running, or at minimum open the relevant component and verify your edit makes sense.

### STEP 5: VERIFY WITH LINT / TYPECHECK / BUILD

**Mandatory.** Before committing, confirm the code still compiles cleanly. Pick the right commands based on the project type (check `package.json`, `pyproject.toml`, `Cargo.toml`, etc.).

Typical command sets:

- **Node / TypeScript / Vite / Next**: `npm run lint && npm run build`
  (or `npm run typecheck` if it exists as a separate script)
- **Python**: `ruff check . && mypy .` (or whatever is configured in `pyproject.toml`)
- **Rust**: `cargo check && cargo clippy`
- **Go**: `go vet ./... && go build ./...`

**Resolve any issues your change introduced.** If lint/typecheck/build was already failing before your change (unrelated breakage), do NOT "fix" the unrelated failures — that's scope creep. Revert your change and pick a different improvement if the codebase is in a broken baseline state.

### STEP 6: MARK THE FEATURE PASSING

Call the feature MCP tool:

```
Use the feature_mark_passing tool with feature_id={your_new_id}
```

### STEP 7: CREATE A COMMIT

Stage your changes and commit with a **short, concise, TLDR-style message**. One line for the subject, optionally one or two more for the "why". No verbose bullet lists, no trailing summaries.

```bash
git status
git add <specific files you changed>
git commit -m "Add empty state to project list when no projects exist"
```

Good commit message examples:
- `"Cache project stats query to cut dashboard load time"`
- `"Add keyboard shortcut (Cmd+K) to open command palette"`
- `"Harden upload endpoint against oversized files"`
- `"Extract confused session handling into its own module"`

Bad commit message examples:
- `"Various improvements"` (too vague)
- `"Made the app better by implementing several changes to improve UX including..."` (too long)

### STEP 8: EXIT THIS SESSION

When the commit is created successfully, your work for this session is done. Do NOT try to find a second improvement — one per session is the rule. Stop and let the next scheduled tick handle the next improvement.

---

## GUARDRAILS (READ CAREFULLY)

1. **One improvement per session.** If you finish early, don't start another. Exit cleanly.
2. **Never skip lint / typecheck / build.** If they fail, fix or revert.
3. **Never commit broken code.** A commit with failing lint/build is worse than no commit.
4. **Don't touch features other agents are working on** (anything with `in_progress=True`).
5. **Don't bypass the feature MCP tools.** Create a real Kanban feature for your change so it shows up in the UI.
6. **Keep commit messages under 72 characters for the subject line.**
7. **Don't add dependencies you don't need.** If the improvement needs a new package, be sure it's justified.
8. **Respect the existing architecture.** Don't rewrite patterns the project has already committed to.

---

## BROWSER AUTOMATION (OPTIONAL)

If your improvement is visual and the dev server is running, you may use `playwright-cli` to verify it renders correctly:

- Open: `playwright-cli open http://localhost:PORT`
- Screenshot: `playwright-cli screenshot`
- Read the screenshot file to verify visual appearance
- Close: `playwright-cli close`

Browser verification is **optional** in auto-improve mode. Lint + typecheck + build is mandatory; visual verification is a bonus when relevant.

---

## SUCCESS CRITERIA

A successful auto-improve session ends with:
1. One new feature on the Kanban, marked passing.
2. A clean git commit with a short TLDR message.
3. No lint / typecheck / build errors introduced.
4. The agent exits cleanly without starting a second improvement.
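
Step 5 of the prompt asks the agent to pick verification commands from the project type. As a rough illustration of that decision (not part of this commit), the sketch below maps marker files to the command sets listed in the prompt. The names `VERIFY_COMMANDS` and `pick_verify_commands` are hypothetical, and `go.mod` is an assumption: the prompt lists Go commands but does not name a marker file for Go.

```python
from pathlib import Path

# Marker file -> verification commands, mirroring the "typical command sets"
# from Step 5. Hypothetical mapping; "go.mod" is an assumed marker for Go.
VERIFY_COMMANDS: dict[str, list[str]] = {
    "package.json": ["npm run lint", "npm run build"],
    "pyproject.toml": ["ruff check .", "mypy ."],
    "Cargo.toml": ["cargo check", "cargo clippy"],
    "go.mod": ["go vet ./...", "go build ./..."],
}


def pick_verify_commands(project_dir: Path) -> list[str]:
    """Return the commands for the first marker file present in project_dir.

    Returns an empty list when no known marker file is found.
    """
    for marker, commands in VERIFY_COMMANDS.items():
        if (project_dir / marker).is_file():
            return commands
    return []
```

If a project contains several marker files, this sketch returns the first match in dictionary order; a real agent would more likely read the configured scripts, as the prompt suggests.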

agent.py

Lines changed: 14 additions & 2 deletions
@@ -31,6 +31,7 @@
 )
 from prompts import (
     copy_spec_to_project,
+    get_auto_improve_prompt,
     get_batch_feature_prompt,
     get_coding_prompt,
     get_initializer_prompt,
@@ -163,6 +164,7 @@ async def run_autonomous_agent(
     agent_type: Optional[str] = None,
     testing_feature_id: Optional[int] = None,
     testing_feature_ids: Optional[list[int]] = None,
+    auto_improve: bool = False,
 ) -> None:
     """
     Run the autonomous agent loop.
@@ -177,6 +179,9 @@ async def run_autonomous_agent(
         agent_type: Type of agent: "initializer", "coding", "testing", or None (auto-detect)
         testing_feature_id: For testing agents, the pre-claimed feature ID to test (legacy single mode)
         testing_feature_ids: For testing agents, list of feature IDs to batch test
+        auto_improve: If True, run in auto-improve mode (agent creates one
+            improvement feature, implements it, commits, and exits). Takes
+            precedence over other prompt selection branches.
     """
     print("\n" + "=" * 70)
     print(" AUTONOMOUS CODING AGENT")
@@ -185,6 +190,8 @@ async def run_autonomous_agent(
     print(f"Model: {model}")
     if agent_type:
         print(f"Agent type: {agent_type}")
+    if auto_improve:
+        print("Mode: AUTO-IMPROVE (one improvement + commit per session)")
     if yolo_mode:
         print("Mode: YOLO (testing agents disabled)")
     if feature_ids and len(feature_ids) > 1:
@@ -240,7 +247,8 @@ async def run_autonomous_agent(
 
         # Check if all features are already complete (before starting a new session)
         # Skip this check if running as initializer (needs to create features first)
-        if not is_initializer and iteration == 1:
+        # or auto-improve mode (intentionally runs against finished projects)
+        if not is_initializer and not auto_improve and iteration == 1:
             passing, in_progress, total, _nhi = count_passing_tests(project_dir)
             if total > 0 and passing == total:
                 print("\n" + "=" * 70)
@@ -262,7 +270,11 @@ async def run_autonomous_agent(
         client = create_client(project_dir, model, yolo_mode=yolo_mode, agent_type=agent_type)
 
         # Choose prompt based on agent type
-        if agent_type == "initializer":
+        # auto_improve takes precedence over other branches — it's a distinct
+        # mode where the agent creates its own feature before implementing it.
+        if auto_improve:
+            prompt = get_auto_improve_prompt(project_dir, yolo_mode=yolo_mode)
+        elif agent_type == "initializer":
             prompt = get_initializer_prompt(project_dir)
         elif agent_type == "testing":
             prompt = get_testing_prompt(project_dir, testing_feature_id, testing_feature_ids)
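
The last hunk above puts `auto_improve` at the head of the prompt-selection chain, so it wins even when `agent_type` is set. A minimal stand-alone sketch of that precedence follows. `choose_prompt_kind` is a hypothetical helper, not a function in `agent.py`, and the final `"coding"` fallback is an assumption because the rest of the chain is not visible in this diff.

```python
from typing import Optional


def choose_prompt_kind(agent_type: Optional[str], auto_improve: bool = False) -> str:
    """Return a label for the prompt that would be selected.

    Hypothetical stand-in for the if/elif chain in run_autonomous_agent.
    """
    if auto_improve:
        # Checked first, so it overrides any agent_type value.
        return "auto_improve"
    if agent_type == "initializer":
        return "initializer"
    if agent_type == "testing":
        return "testing"
    # Assumed fallback; the remaining branches are not shown in the diff.
    return "coding"


# The CLI entry point passes agent_type="coding" together with auto_improve=True.
print(choose_prompt_kind("coding", auto_improve=True))  # auto_improve
```

Putting the flag first keeps the existing branches untouched, at the cost of silently ignoring `agent_type` whenever `auto_improve` is true.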

autonomous_agent_demo.py

Lines changed: 27 additions & 1 deletion
@@ -186,6 +186,17 @@ def parse_args() -> argparse.Namespace:
         help="Max features per coding agent batch (1-15, default: 3)",
     )
 
+    parser.add_argument(
+        "--auto-improve",
+        action="store_true",
+        default=False,
+        help=(
+            "Run in auto-improve mode: a single agent session that analyses "
+            "the codebase, creates one improvement feature, implements it, "
+            "verifies with lint/typecheck/build, commits, and exits."
+        ),
+    )
+
     return parser.parse_args()
 
 
@@ -262,7 +273,22 @@ def main() -> None:
         return
 
     try:
-        if args.agent_type:
+        if args.auto_improve:
+            # Auto-improve mode: single agent session, one improvement per run.
+            # Bypasses the parallel orchestrator entirely — auto-improve is
+            # always single-agent, single-feature, and exits after one commit.
+            print("[AUTO-IMPROVE] Starting single-session improvement run...", flush=True)
+            asyncio.run(
+                run_autonomous_agent(
+                    project_dir=project_dir,
+                    model=args.model,
+                    max_iterations=1,
+                    yolo_mode=args.yolo,
+                    agent_type="coding",
+                    auto_improve=True,
+                )
+            )
+        elif args.agent_type:
             # Subprocess mode - spawned by orchestrator for a specific role
             asyncio.run(
                 run_autonomous_agent(
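
To see how the new flag interacts with the existing dispatch in `main()`, here is a small self-contained sketch. It is only an illustration under assumptions: `build_parser` and `dispatch_mode` are hypothetical names, the `--agent-type` spelling is inferred from `args.agent_type`, and the `"orchestrator"` fallthrough is inferred from the comment about bypassing the parallel orchestrator.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a reduced parser with only the two flags relevant here."""
    parser = argparse.ArgumentParser(description="Illustrative subset of the CLI")
    parser.add_argument("--auto-improve", action="store_true", default=False)
    # Assumed flag name, inferred from the args.agent_type attribute.
    parser.add_argument("--agent-type", default=None)
    return parser


def dispatch_mode(args: argparse.Namespace) -> str:
    """Mirror the branch order in main(): auto-improve is checked first."""
    if args.auto_improve:
        return "auto-improve"
    if args.agent_type:
        return "subprocess"
    # Assumed default path when neither flag is given.
    return "orchestrator"


args = build_parser().parse_args(["--auto-improve", "--agent-type", "testing"])
print(dispatch_mode(args))  # auto-improve
```

Because `store_true` flags default to `False`, existing invocations that omit `--auto-improve` follow the same branches as before the change.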

prompts.py

Lines changed: 24 additions & 0 deletions
@@ -151,6 +151,30 @@ def get_coding_prompt(project_dir: Path | None = None, yolo_mode: bool = False)
     return prompt
 
 
+def get_auto_improve_prompt(project_dir: Path | None = None, yolo_mode: bool = False) -> str:
+    """Load the auto-improve agent prompt (project-specific if available).
+
+    The auto-improve prompt instructs the agent to analyze an already-finished
+    project, pick ONE meaningful improvement, create a feature on the Kanban,
+    implement it, verify with lint/typecheck/build, mark passing, and commit.
+
+    Args:
+        project_dir: Optional project directory for project-specific prompts
+        yolo_mode: If True, strip browser automation sections for YOLO-mode
+            token savings. Browser verification is already optional in
+            auto-improve mode, so this is a small adjustment.
+
+    Returns:
+        The auto-improve prompt, optionally stripped of browser testing.
+    """
+    prompt = load_prompt("auto_improve_prompt", project_dir)
+
+    if yolo_mode:
+        prompt = _strip_browser_testing_sections(prompt)
+
+    return prompt
+
+
 def get_testing_prompt(
     project_dir: Path | None = None,
     testing_feature_id: int | None = None,
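
The diff calls `_strip_browser_testing_sections` but does not show its body. Purely as an illustration of what stripping a section from a Markdown prompt could look like, the sketch below removes one level-2 section by heading. `strip_level2_section` is a hypothetical helper and should not be read as the project's actual implementation.

```python
import re


def strip_level2_section(prompt: str, heading: str) -> str:
    """Remove the '## <heading>...' section, up to the next '## ' heading or end of text.

    Hypothetical sketch; the real helper in prompts.py may work differently.
    """
    pattern = re.compile(
        rf"^## {re.escape(heading)}[^\n]*\n.*?(?=^## |\Z)",
        flags=re.MULTILINE | re.DOTALL,
    )
    return pattern.sub("", prompt)


sample = (
    "## GUARDRAILS\n\nOne improvement per session.\n\n"
    "## BROWSER AUTOMATION (OPTIONAL)\n\nUse playwright-cli.\n\n"
    "## SUCCESS CRITERIA\n\nOne new feature.\n"
)
stripped = strip_level2_section(sample, "BROWSER AUTOMATION")
print("playwright-cli" in stripped)  # False
```

Applied to the prompt added in this commit, a helper like this would leave the `---` separator that follows the removed section in place, so a production version would probably need to tidy those as well.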
