Merged
190 commits
f22c9d6
Add project path and scope vocabulary types
brendanddev Apr 28, 2026
5448b53
Add runtime path resolver primitives
brendanddev Apr 28, 2026
cc97ec9
Insert runtime path resolution boundary before tool dispatch
brendanddev Apr 28, 2026
f63ad33
Migrate read-only tools to runtime-resolved inputs
brendanddev Apr 28, 2026
43dbb9c
Migrate write/edit tools to resolved inputs with approval path valida…
brendanddev Apr 28, 2026
30ba52f
Remove legacy tool context after git migration
brendanddev Apr 29, 2026
7c5ed97
Expose mutation tools in surface hint and fix post-read answer phase …
brendanddev Apr 29, 2026
053bac8
Stabilize mutation failure and answer-phase finalization
brendanddev Apr 29, 2026
f053856
Enforce deterministic runtime control flow after tool execution
brendanddev Apr 29, 2026
781b86d
Enforce project-relative path output via regression tests
brendanddev Apr 29, 2026
b23c973
Bind restored sessions to canonical project roots
brendanddev Apr 29, 2026
2087cc9
Add bounded project structure snapshot builder
brendanddev Apr 29, 2026
a27acda
Cache project structure snapshots in runtime
brendanddev Apr 29, 2026
35fe054
Inject bounded project snapshot as ephemeral context
brendanddev Apr 29, 2026
a77186c
Move benchmark runs into dedicated folder
brendanddev Apr 29, 2026
48de5c8
Organize runtime project module
brendanddev Apr 29, 2026
4e253fe
Reorganize runtime protocol modules
brendanddev Apr 29, 2026
aafc804
Organize runtime investigation modules
brendanddev Apr 29, 2026
6bc6366
Organize runtime orchestration modules
brendanddev Apr 29, 2026
0b2c1fb
Add phase 16 pre baseline
brendanddev Apr 29, 2026
deb3179
Enforce candidate-only reads after search
brendanddev Apr 29, 2026
38c6e6b
Enforce candidate-only reads across retrieval turns
brendanddev Apr 29, 2026
8e2ebe8
Guide non-candidate read recovery with mode-aware candidate selection
brendanddev Apr 30, 2026
cfe76cd
Enforce grounded answers with read-set and usage evidence guards
brendanddev Apr 30, 2026
d0cd336
Improve source candidate selection for direct path and general lookups
brendanddev Apr 30, 2026
cec1f7d
Improve direct reads and runtime failure escalation
brendanddev Apr 30, 2026
10b418c
Update search functionality with rg
brendanddev Apr 30, 2026
ec73364
Add model activity progress events
brendanddev Apr 30, 2026
d9215c8
Harden provider contract and remove stringly timing
brendanddev Apr 30, 2026
633701d
Decouple config root from project root
brendanddev May 1, 2026
9a460c5
Bound list_dir output
brendanddev May 1, 2026
701c7bf
Centralize noisy directory exclusions
brendanddev May 1, 2026
beeae39
Add external repo validation tests and Phase 18 benchmark baseline
brendanddev May 1, 2026
6bc321b
Add runtime recovery observability traces
brendanddev May 2, 2026
27a5dd4
Add runtime-owned non-candidate read redirection
brendanddev May 2, 2026
9be16ac
Update docs to align with current project state
brendanddev May 2, 2026
7b7707d
Add pre-execution intercept for non-candidate reads
brendanddev May 4, 2026
8fb0eb6
Implement bounded Answer-Only retry after answer_guard rejection
brendanddev May 5, 2026
ce883eb
Remove AnswerGuardRetry state and simplify retry to phase-based recovery
brendanddev May 5, 2026
84cb584
Add read classification tracking for direct vs candidate reads
brendanddev May 5, 2026
4bd98b8
Add answer-guard dispatch + fix runtime-dispatched call blocking
brendanddev May 5, 2026
a889cff
Document Phase 18.4 benchmark baseline
brendanddev May 5, 2026
2a2ea48
Convert recovery corrections to RuntimeDispatch
brendanddev May 6, 2026
af3d1cd
Update cap/dispatch tests to match RuntimeDispatch behavior
brendanddev May 7, 2026
8c33a21
Broaden simple edit seeding to cover "and change" phrasings
brendanddev May 7, 2026
16e4680
Add bounded answer-only retry after answer-guard rejection
brendanddev May 7, 2026
9e1a064
Add bare "change" variant to simple edit grammar
brendanddev May 7, 2026
b9d3429
Cover "edit the file" prefix in bare change grammar and reject defini…
brendanddev May 7, 2026
183fd62
Seed filename search hint for bare-filename explanation queries
brendanddev May 7, 2026
577e8eb
Resolve bare-filename explain queries as direct reads and fix symlink
brendanddev May 7, 2026
b109300
Add token count and context window usage to perf logs
brendanddev May 7, 2026
136d57e
Emit token counts from llama.cpp backend
brendanddev May 7, 2026
c2fc591
Move provider config validation to startup
brendanddev May 7, 2026
d8c90fb
Add Phase 19.4 baseline benchmark and update README
brendanddev May 7, 2026
1e07940
Extend Gate 6a to General mode and add natural language follow-up phr…
brendanddev May 7, 2026
ca1fa19
Refactor TurnPerformance into telemetry module
brendanddev May 7, 2026
1eaefaa
Extract anchor resolution into dedicated module
brendanddev May 7, 2026
21e8ec8
Extract ContextPolicy into dedicated module and split tool_codec into…
brendanddev May 8, 2026
8bb974d
Add CallSiteLookup investigation mode
brendanddev May 8, 2026
5a5ff25
Add Phase 20.4 baseline benchmark and bump version
brendanddev May 8, 2026
6347391
Restore most recent session for current project and persist and resto…
brendanddev May 8, 2026
5b9eb2a
Add project-scoped session management commands
brendanddev May 8, 2026
ac29e48
Expand restore window with structured session summaries
brendanddev May 8, 2026
3aa9f06
Add approval-gated shell tool
brendanddev May 21, 2026
82287bd
Improve model routing for shell tool invocation
brendanddev May 21, 2026
bea304e
Finish runtime-seed shell commands, bypass model for tool selection
brendanddev May 21, 2026
c46f8f4
Add allowlist validation, reject non-cargo commands and inject synthe…
brendanddev May 21, 2026
ef9df62
Add persistent LlamaContext across turns, eliminate per-turn ctx_crea…
brendanddev May 21, 2026
4fd54b4
Evict generated token positions from KV cache, fix incremental prefil…
brendanddev May 21, 2026
c4ba0f8
Add post-edit test validation loop with configurable test command and…
brendanddev May 21, 2026
fe10347
Add prompt inspection hotkey Ctrl+P that dumps to temp file and add ev…
brendanddev May 22, 2026
2e27d6f
Add mutation undo/rollback with /undo command
brendanddev May 22, 2026
d3e07b7
Allow load .env from project root on startup
brendanddev May 22, 2026
d81d82f
Add provider switching commands /providers list and /providers use
brendanddev May 22, 2026
2e5a00b
Add ollama as a provider for the systems LLM
brendanddev May 22, 2026
8d4d09d
Resolve streaming, system message merging, and timeout issues with Ol…
brendanddev May 22, 2026
5bac04f
Add OpenRouter as a provider
brendanddev May 22, 2026
5f09469
Fix issue with project snapshot being added on correction retry round…
brendanddev May 22, 2026
7256e25
Attempt to fix project snapshot on correction retries and relocate sh…
brendanddev May 22, 2026
ec447f8
Fix runtime by filtering mutation tools from system prompt on retriev…
brendanddev May 22, 2026
77a0247
Fix runtime issue by seeding read_file directly on prose-after-search…
brendanddev May 22, 2026
572c130
Add Phase 25 baseline results
brendanddev May 23, 2026
2fc094c
Add Groq as an LLM provider
brendanddev May 23, 2026
cc3dd4e
Block shell seeding on GitReadOnly surface
brendanddev May 23, 2026
dda916f
Fix prompt analysis logic by extending direct read detection to 'find…
brendanddev May 23, 2026
7940128
Move tool-call heuristic into tool_codec, extract TUI file write, fix…
brendanddev May 24, 2026
67582ba
Refactor runtime engine by splitting engine.rs into focused modules (…
brendanddev May 24, 2026
c81fcbd
Extract investigation test block into tests/investigation_inline.rs
brendanddev May 24, 2026
1522128
Introduce TurnContext, TurnState, and TurnSignal to decompose run_tur…
brendanddev May 24, 2026
c8513c2
Decompose run_loop_body into check_tool_call_gates, handle_no_tool_ca…
brendanddev May 24, 2026
fe0eb09
Introduce core/ as shared type layer for AppError, Result, and Config…
brendanddev May 24, 2026
8e6e347
Add Windows compatibility by stripping UNC prefix, gating llama-cpp …
brendanddev May 24, 2026
a8053f4
Add phase 26 baseline
brendanddev May 25, 2026
aa1bffd
Add dispatch to definition site candidate after usage exhausted on Us…
brendanddev May 25, 2026
db97761
Fix definition site candidate dispatch on UsageLookup
brendanddev May 25, 2026
a48180f
Fix by bypassing Gate 1 for runtime-dispatched definition site reads …
brendanddev May 25, 2026
e4997f5
Add scope guard path normalization + answer guard dispatch regardless…
brendanddev May 25, 2026
8d75c62
Add definition site bypass must not consume a candidate read slot and…
brendanddev May 25, 2026
2a0f5ca
Fix issue with scroll by clamping scroll_offset to max_scroll to prev…
brendanddev May 25, 2026
b609de0
Add expand toggle for file content truncation
brendanddev May 25, 2026
5272019
Fix duplicate summary message and expand toggle rendering
brendanddev May 25, 2026
64d203f
Fix duplicate content issue by ensuring the toggle hides assistant fi…
brendanddev May 25, 2026
214ee39
Strip role prefix from expanded file content message, add diff render…
brendanddev May 25, 2026
b1ea9d0
Fix windows compatibility issue, strip UNC prefix from AppPaths::root…
brendanddev May 25, 2026
3ee5af5
Fix windows compatibility issue by stripping UNC prefix from canonica…
brendanddev May 26, 2026
9b6b674
Add phase 27 benchmark run and update README
brendanddev May 26, 2026
43bbae4
Add DirectReadCompleted runtime event to signal end of direct read tu…
brendanddev May 26, 2026
4bf6da7
Normalize backslash path separators in rg output and add git_branch t…
brendanddev May 26, 2026
14020e9
Fix issue with detecting git branch and how it renders in tui
brendanddev May 26, 2026
e7e1806
Reformat help command display and add additional slash commands for s…
brendanddev May 26, 2026
d414163
Fix issue with ls slash command rendering
brendanddev May 26, 2026
beabdd6
Add ai workflows to git tracking
brendanddev May 26, 2026
dce964f
Add /sync-claude command and architect agent
brendanddev May 26, 2026
1c90a62
Fix Windows scope prefix and path normalization
brendanddev May 26, 2026
c6dda4c
Add InvestigationGraph with import-aware candidate promotion
brendanddev May 27, 2026
edbd1cf
Add dynamic useful_candidate_reads_target scoring to replace fixed ta…
brendanddev May 27, 2026
2319d1d
Add LspManager persistent session infrastructure
brendanddev May 27, 2026
aa2e61c
chore: Run cargo fmt
brendanddev May 27, 2026
a6c915a
Add lsp_definition tool wiring
brendanddev May 27, 2026
f62a5fb
Add lsp_definition to format_instructions system prompt
brendanddev May 27, 2026
e5e562b
Add missing phase 28 baseline benchmark doc
brendanddev May 27, 2026
84185c7
Fix resolver logic by falling back to parent directory when scope pat…
brendanddev May 27, 2026
7421009
Create new claude files and update existing
brendanddev May 27, 2026
33c11d1
Fix lsp tool, add runtime-seeded lsp_definition dispatch on Definitio…
brendanddev May 27, 2026
0baf43b
Fix LSP logic, prefer declaration-site coordinates when seeding lsp_d…
brendanddev May 27, 2026
2636d46
Add refactor agent + command
brendanddev May 27, 2026
e72f0d0
chore: Add debug-investigation and debug-runtime skills, convert dev …
brendanddev May 27, 2026
f56cf56
Add runtime integration test suite
brendanddev May 27, 2026
fd8621a
Inject hover context after successful lsp_definition
brendanddev May 28, 2026
9f897d0
Inject diagnostics after approved mutations
brendanddev May 28, 2026
c6b51f9
Add /lsp status slash command
brendanddev May 28, 2026
505090d
Resolve all LSP warnings, wire unused values and remove dead re-exports
brendanddev May 28, 2026
52efef5
Add investigation planner skill
brendanddev May 28, 2026
325a51b
Fix issue with LSP, strip project root prefix from lsp_definition out…
brendanddev May 28, 2026
5cf3bdb
Add Phase 29 benchmark run doc
brendanddev May 28, 2026
5f6be11
Fix runtime loop using LSP on non rust files by skipping lsp_definiti…
brendanddev May 28, 2026
059083b
Fix investigation, cap useful_candidate_reads_target at 1 for Definit…
brendanddev May 28, 2026
6e404aa
Add DefinitionLookup query refinement for truncated search tail
brendanddev May 28, 2026
737c74c
Add Phase 29 regression benchmark runs
brendanddev May 28, 2026
52c6e41
Setup project indexing, add pure symbol extractor pipeline
brendanddev May 29, 2026
a158be8
Add SQLite schema and symbol store
brendanddev May 29, 2026
adc5ed5
Add mtime invalidation, on-demand build trigger, and /index commands
brendanddev May 29, 2026
e636c62
Attempt at fixing issue with recovery loop regression, gate premature…
brendanddev May 29, 2026
0c6c3e2
Fix recovery loop regression, advance to next unread candidate in pre…
brendanddev May 29, 2026
cc42a73
Wire symbol index into DefinitionLookup candidate promotion
brendanddev May 29, 2026
822f493
Add pub(crate) and pub(super) prefixes to symbol extractor, and pre s…
brendanddev May 29, 2026
ee4f79f
Add Phase 30 baseline benchmark run doc
brendanddev May 29, 2026
d15b2c8
Add context usage indicator with token estimation
brendanddev May 29, 2026
0686d61
Add presentation-only pruning of stale tool results
brendanddev May 29, 2026
8c9db21
Add /context stats and /compact slash commands
brendanddev May 29, 2026
3ac2ce7
Add auto-warning at 75% and auto-prune at 90% context usage
brendanddev May 29, 2026
a667f98
chore: Update docs
brendanddev May 29, 2026
fa8bd06
Replace full-repaint renderer with diff-based cell renderer and dirty…
brendanddev May 29, 2026
3e40692
Add worker thread for non-blocking rendering
brendanddev May 30, 2026
50bd95b
Add multi-line input, cursor line navigation, Ctrl+W, and paste norma…
brendanddev May 30, 2026
db8d50c
Add input history (Alt+Up/Down) and Ctrl+R reverse search
brendanddev May 30, 2026
4969183
Add collapsible transcript blocks
brendanddev May 30, 2026
886b77f
Fix issues with tui unfocused collapsible indent, update collapsible …
brendanddev May 31, 2026
b063bc6
Fix issue with clear scroll signal on Ctrl+O to prevent focus override
brendanddev Jun 1, 2026
c9d1570
Add inline approval widget and cursor shape
brendanddev Jun 1, 2026
58b1d90
Fix inline approval widget and missing diff preview
brendanddev Jun 1, 2026
508958a
Extract format/events modules, fix max_scroll bug
brendanddev Jun 1, 2026
8bdf1e7
Split app.rs into focused modules
brendanddev Jun 1, 2026
2043a50
Add tab autocomplete for slash commands in the tui
brendanddev Jun 1, 2026
373aa43
Fix overlay row offset, bottom item was painting over status bar
brendanddev Jun 1, 2026
e72eba8
Add command launcher overlay
brendanddev Jun 1, 2026
05bf91a
Fix command launcher not scrolling viewport when selection moves
brendanddev Jun 1, 2026
66e3334
Update tui, add spinner, activity indicator and visual polish
brendanddev Jun 1, 2026
527dec8
Add slower spinner and update launcher by centering viewport
brendanddev Jun 1, 2026
8fd519e
Fix missing cursor, never visible after startup
brendanddev Jun 1, 2026
02f1967
Fix approval widget, transcript placeholder, unicode wrapping, and co…
brendanddev Jun 1, 2026
2902728
Add usage analyzer skill
brendanddev Jun 1, 2026
7a8c4cd
Add transcript role badges with │ gutter, multi-span line rendering, …
brendanddev Jun 1, 2026
29c41d8
Fix generation cursor attaching to wrong assistant message by guardin…
brendanddev Jun 1, 2026
e180ffd
Add header runtime state label, fix prompt color signal, stable launc…
brendanddev Jun 1, 2026
94ac285
Add collapsible auto-classification, semantic summaries, and viewport…
brendanddev Jun 1, 2026
3163b24
Polish approval widget with kind labels, evidence gutter, and preview…
brendanddev Jun 1, 2026
42b78f0
chore: Update docs
brendanddev Jun 1, 2026
a7f9af8
Refactor TUI imports, extract transcript builder, relocate misplaced …
brendanddev Jun 1, 2026
2bcb842
Add PromptPhysicsConfig, primacy anchor, THUNK.md bootstrap
brendanddev Jun 2, 2026
32e8c2a
Implement periodic refresh injection in run_generate_turn
brendanddev Jun 2, 2026
8f0c6bd
Implement dynamic recency field injection before GenerateRequest
brendanddev Jun 2, 2026
8cdaf9d
Wire config default, /prompt-physics toggle, session-scoped enable/di…
brendanddev Jun 2, 2026
f72b482
Add PendingApprovalStage enum and LSP pre-edit safety check
brendanddev Jun 2, 2026
10c248d
Add write-then-verify loop and cargo check after approved mutation
brendanddev Jun 2, 2026
a3b5b8f
Add iterative self-correction gate and cargo check retry loop
brendanddev Jun 2, 2026
a19f666
Fix issue with mutation logic only targeting .rs files, add language-…
brendanddev Jun 2, 2026
76a3043
Add language guard on pre-check and post-execute diagnostics
brendanddev Jun 2, 2026
3da3bb0
Add multi-edit transactions, atomic execution, and rollback on failure
brendanddev Jun 2, 2026
dc2e375
chore: Add Phase 34 baseline benchmark run doc
brendanddev Jun 2, 2026
51 changes: 51 additions & 0 deletions .claude/agents/architect.md
@@ -0,0 +1,51 @@
---
name: architect
description: Audits code against thunk's architectural principles. Use when reviewing a completed slice, a new file, or any change that touches layer boundaries, state management, or control flow. Invoke with a specific file or directory to review.
---

You are a strict architectural reviewer for the `thunk` codebase. Your job is to identify violations of the core design principles — not style issues, not performance, not missing features. Only structural and architectural problems that will compound over time.

## What you enforce

**Layer boundaries**
- `tui/` contains no business logic — only rendering and event dispatch via RuntimeEvent/RuntimeRequest
- `tools/` are pure execution units — no orchestration, no control flow decisions
- `runtime/` owns all control flow — no model involvement in structural decisions
- `core/` has no outward dependencies (known exception: ToolError import in error.rs — do not flag this)
- Lower layers never import from higher layers
- Always import AppError/Config from `crate::core`, never `crate::app`

**Control flow**
- Runtime is the single source of correctness — flag any path where the model makes a structural decision
- No text-as-API between subsystems — flag any string parsing outside `tool_codec/`
- No correction logic outside `runtime/` and `tool_codec/` boundaries

**State management**
- New state fields in `InvestigationState` must reset in `new()`
- Gate corrections use the `_correction_issued` bool pattern — fire exactly once per turn
- `evidence_ready()` is the single source of truth for evidence state — no bypasses
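
The fire-once rule above can be pictured with a minimal sketch. The struct, field, and method names below are hypothetical and are not taken from `InvestigationState`; the sketch only assumes that a gate tracks a single `_correction_issued` style bool that starts as `false` in `new()`.

```rust
/// Hypothetical per-turn gate state. Names are illustrative only and do not
/// come from thunk's `InvestigationState`.
#[derive(Debug, Default)]
struct GateState {
    read_gate_correction_issued: bool,
}

impl GateState {
    /// Every field is reset here, mirroring the "must reset in `new()`" rule.
    fn new() -> Self {
        Self {
            read_gate_correction_issued: false,
        }
    }

    /// Returns `Some(correction)` the first time the gate trips in a turn
    /// and `None` on every later trip, so the correction fires exactly once.
    fn correct_once(&mut self, violated: bool) -> Option<&'static str> {
        if violated && !self.read_gate_correction_issued {
            self.read_gate_correction_issued = true;
            return Some("read a candidate file before answering");
        }
        None
    }
}
```

A reviewer checking this principle would look for both halves: the flag set at the moment the correction is emitted, and the flag cleared wherever per-turn state is constructed.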

**Mutation safety**
- All mutating tools must return `ToolRunResult::Approval(PendingAction)` — never `Immediate`
- No new paths to `execute_approved()` outside `ToolRegistry`
- Mutation tools never appear in system prompt — only in ephemeral per-turn hint
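
As a rough illustration of the proposal/approval split, the sketch below uses simplified stand-ins. Only the names `ToolRunResult`, `Approval`, `Immediate`, `PendingAction`, and `execute_approved` appear in the rules above; the fields, the free functions, and the in-memory `HashMap` store are assumptions made for the example.

```rust
use std::collections::HashMap;

/// Hypothetical shape: the real `PendingAction` likely carries more data.
#[derive(Debug, PartialEq)]
struct PendingAction {
    tool: String,
    target: String,
    new_content: String,
}

/// Simplified stand-in for the two result kinds named in the rules.
#[derive(Debug, PartialEq)]
enum ToolRunResult {
    Immediate(String),
    Approval(PendingAction),
}

/// Proposal phase of a mutating tool: describe the write, perform nothing.
fn propose_write(target: &str, new_content: &str) -> ToolRunResult {
    ToolRunResult::Approval(PendingAction {
        tool: "write_file".to_string(),
        target: target.to_string(),
        new_content: new_content.to_string(),
    })
}

/// Read-only tools may finish immediately.
fn run_read(contents: &str) -> ToolRunResult {
    ToolRunResult::Immediate(contents.to_string())
}

/// Stand-in for the single approved-execution path. In this sketch it is the
/// only function that changes the store.
fn execute_approved(action: PendingAction, store: &mut HashMap<String, String>) -> String {
    let summary = format!("{} applied to {}", action.tool, action.target);
    store.insert(action.target, action.new_content);
    summary
}
```

Keeping the side effect in one function is what makes "no new paths to `execute_approved()`" checkable: any other code that writes to the store is a violation by inspection.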

**Coupling**
- No tight coupling between orchestration layers — changes to one file should not require cascading changes across 5+ files
- No duplicated sources of truth for tool behavior
- No god files — flag any file exceeding 600 lines that is growing

## How to review

1. Read the files specified
2. Check each principle above systematically
3. Report only real violations — not stylistic preferences
4. For each violation: state the file and line, the principle violated, and the minimal fix
5. If nothing violates the principles, say so explicitly — do not invent issues

## What you do not flag
- Code style or formatting
- Performance (unless it involves architectural coupling)
- Missing features or incomplete implementations
- Things that are ugly but architecturally sound
- The known core/error.rs → tools/ ToolError import
55 changes: 55 additions & 0 deletions .claude/agents/refactor.md
@@ -0,0 +1,55 @@
---
name: refactor
description: Analyzes files and modules for size, mixed responsibilities, and separation of concerns violations. Use when a file feels too large, a function is doing too much, or a module owns more than one distinct concern. Invoke with a specific file, directory, or line threshold.
---

You are a refactor reviewer for the `thunk` codebase. Your job is to identify files and functions that should be split — not for line count alone, but because they own more than one distinct responsibility or mix concerns that belong in separate layers.

## What you analyze

**File size**
- Any `.rs` file over 1000 lines is a candidate for review
- Flag files that are growing across phases — size trend matters more than absolute count
- `src/runtime/orchestration/tool_round.rs` and `src/runtime/orchestration/engine.rs` are known large files — analyze carefully before flagging

**Function size**
- Any function over 100 lines likely owns more than one responsibility
- Flag functions that mix policy decisions with execution, or parsing with dispatch

**Separation of concerns**
- Policy mixed with execution in the same function
- Parsing logic outside `tool_codec/`
- Orchestration logic inside `tools/`
- Multiple unrelated responsibilities in the same module

**Layering violations**
- Read `.claude/dev/module-map.md` before analyzing — ownership boundaries are defined there
- Flag any split that would require a lower layer to import from a higher layer
- Flag any proposed split that creates circular dependencies

## How to review

1. Read `.claude/rules/invariants.md` and `.claude/dev/module-map.md` first
2. If a specific file was given, analyze that file only
3. Otherwise run: `find src -name "*.rs" | xargs wc -l | sort -rn | head -20`
4. For each candidate file:
- List the distinct responsibilities it owns
- Identify functions over 100 lines
- Flag mixed concerns
5. For each proposed split:
- Name the new module and what moves there
- Identify all cross-module import changes required
- Estimate risk: low / medium / high
- Flag if the split touches public APIs
6. Prioritize by risk — highest impact splits first

## What you do not flag
- Line count alone without mixed responsibilities
- Style or formatting issues
- Performance concerns
- Incomplete implementations
- Known architectural exceptions documented in `.claude/rules/invariants.md`
- The known `core/error.rs` → `tools/` ToolError import

## Output format
For each file: state the file, its line count, the distinct responsibilities it owns, and whether a split is warranted. For each proposed split: state what moves where, the risk level, and what changes are required. If nothing warrants splitting, say so explicitly.
33 changes: 33 additions & 0 deletions .claude/commands/refactor.md
@@ -0,0 +1,33 @@
# /refactor

Analyze the codebase for files and functions that should be split for
modularity, separation of concerns, and maintainability.

## Usage
- `/refactor` — scan all source files, report anything over threshold
- `/refactor src/runtime/orchestration/tool_round.rs` — analyze specific file
- `/refactor 300` — use custom line threshold instead of default 500

## Steps

1. Read `.claude/rules/invariants.md` and `.claude/dev/module-map.md` first
2. If a specific file was given, analyze that file only
3. Otherwise, list the largest `.rs` files and keep those over the line threshold:
`find src -name "*.rs" | xargs wc -l | sort -rn | head -20`
4. For each file over threshold:
- List distinct responsibilities it owns
- Identify functions over 100 lines
- Flag any separation of concerns violations
- Flag any layering violations per module-map.md
5. For each candidate split:
- Propose new module name and what moves there
- Estimate risk: low / medium / high
- Note any cross-module import changes required
6. Output a prioritized list — highest risk files first

## Constraints
- Never suggest splitting for line count alone — only when distinct
responsibilities exist
- Never propose changes that violate `.claude/rules/invariants.md`
- Flag any split that touches public APIs or cross-module imports
- Do not modify any files — analysis only unless explicitly asked
45 changes: 45 additions & 0 deletions .claude/commands/sync-claude.md
@@ -0,0 +1,45 @@
# /sync-claude

Audit the current state of `.claude/` and `CLAUDE.md` against the actual codebase and update anything stale. This command keeps the AI development environment in sync with reality.

## What to check and update

**1. Test baseline in `CLAUDE.md`**
Run `cargo test --no-default-features 2>&1 | grep "^test result"` and update the test count in CLAUDE.md if it has changed.

**2. Invariant locations in `.claude/rules/invariants.md`**
Verify these line number references are still accurate:
- `is_permitted_shell_command()` in `src/runtime/investigation/prompt_analysis.rs`
- `execute_approved()` in `src/tools/registry.rs`
- `evidence_ready()` in `src/runtime/investigation/investigation.rs`
- `tool_allowed_for_surface()` in `src/runtime/investigation/tool_surface.rs`
Update any stale line references.

**3. Layer boundaries in `.claude/rules/architecture.md`**
Check if the known `core/ → tools/` violation still exists:
`grep -n "ToolError" src/core/error.rs`
If it's been fixed, remove the "Known Exception" section. If new violations exist, document them.

**4. Test command accuracy**
Verify `just verify` still runs `cargo test --no-default-features`:
`grep "test" justfile`
Update CLAUDE.md or slice-discipline.md if the command has changed.

**5. New tools or surfaces**
Check if new tools have been added since last sync:
`ls src/tools/`
If new tools exist that aren't documented in `rules/invariants.md` (under Surface Enforcement), add them.

**6. Key files table in `CLAUDE.md`**
Verify all referenced files still exist at the listed paths:
`find src -name "*.rs" | grep -E "registry|prompt_analysis|tool_surface|investigation|prompt|engine|tool_round"`
Update any moved or renamed files.

**7. Phase references**
Check the current phase from recent git log:
`git log --oneline -5`
If CLAUDE.md or any rules file references a stale phase number, update it.

## After auditing
Report what was checked, what was stale, and what was updated. Do not touch any Rust source files. Apart from the test-count check in step 1, do not run `cargo test`; use the grep/find commands above for verification only.

30 changes: 30 additions & 0 deletions .claude/dev/core-loop.md
@@ -0,0 +1,30 @@
# Core Loop

## System Mental Model

- The runtime is the state machine. It owns request handling, turn classification, tool dispatch, approval suspension, answer admission, deterministic terminal answers, anchor state, project snapshot caching, and conversation trimming. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/conversation.rs`, `src/runtime/types.rs`.
- The backend does not execute tools or decide whether a response is valid. `ModelBackend::generate()` only receives a `GenerateRequest` and emits `BackendEvent`s; the runtime parses the returned text, discards invalid protocol, and decides whether to keep or replace the assistant output. Code: `src/llm/backend.rs`, `src/runtime/orchestration/generation.rs`, `src/runtime/protocol/tool_codec/`, `src/runtime/orchestration/engine.rs`.
- The runtime injects turn-local policy before every generation. `run_generate_turn()` appends a system message naming the active `ToolSurface`, and may append a bounded project snapshot hint. These hints are request-local and are not persisted in `Conversation`. Code: `src/runtime/orchestration/generation.rs`, `src/runtime/investigation/tool_surface.rs`, `src/runtime/project/project_snapshot.rs`, `src/runtime/protocol/prompt.rs`.
- The runtime, not the backend, chooses when tools are available. `select_tool_surface()` selects one of `RetrievalFirst`, `GitReadOnly`, `AnswerOnly`, or `MutationEnabled`. `tool_allowed_for_surface()` enforces surface membership before dispatch. Code: `src/runtime/investigation/tool_surface.rs`.
- The runtime guarantees project confinement. All tool inputs are converted from raw `ToolInput` into `ResolvedToolInput` before dispatch; read, list, and search scopes must stay inside `ProjectRoot`; mutation targets also reject symlink parents and symlink targets. On Windows, `ProjectRoot::new()` strips the `\\?\` UNC prefix after `fs::canonicalize`. Code: `src/runtime/project/resolved_input.rs`, `src/runtime/project/resolver.rs`, `src/runtime/project/project_root.rs`.
- The runtime guarantees that mutations do not execute during the proposal phase. `edit_file`, `write_file`, and `shell` return `ToolRunResult::Approval(PendingAction)` from `run()`, and only `execute_approved()` performs the actual action. Code: `src/tools/mod.rs`, `src/tools/types.rs`, `src/tools/edit_file.rs`, `src/tools/write_file.rs`, `src/tools/shell.rs`.
- The runtime guarantees that investigation answers are grounded in read evidence, not search text alone. Search-only answers, unread file citations, out-of-scope citations, repeated tool drift after evidence, and repeated malformed protocol all terminate through runtime-owned branches. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/tool_round.rs`, `src/runtime/investigation/investigation.rs`, `src/runtime/protocol/response_text.rs`.
- The runtime guarantees bounded context growth. Tool results are capped through `cap_tool_result_blocks()` (driven by `ContextPolicy` derived from `BackendCapabilities.context_window_tokens`), old tool exchanges are live-trimmed without removing conversational messages, context usage is estimated, `/context stats` reports live usage, `/compact` prunes stale tool results, a warning fires at 75%, and auto-prune runs at 90%. Summarization is deferred. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/context_cap.rs`, `src/runtime/orchestration/context_policy.rs`, `src/runtime/orchestration/command_handlers.rs`, `src/runtime/conversation.rs`.
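
The 75% warning and 90% auto-prune thresholds in the last point can be sketched as a small classifier. This is an illustration under stated assumptions, not the code in `context_policy.rs`: the enum, the function name, the inclusive comparisons, and the handling of a zero-sized window are all hypothetical.

```rust
/// Hypothetical action names; only the 75% and 90% thresholds come from the text.
#[derive(Debug, PartialEq, Eq)]
enum ContextAction {
    Proceed,
    Warn,
    AutoPrune,
}

const WARN_PERCENT: u64 = 75;
const AUTO_PRUNE_PERCENT: u64 = 90;

/// Classifies estimated token usage against the context window.
/// Assumption: thresholds are inclusive, and a zero-sized window yields
/// `Proceed` because no percentage can be computed.
fn context_action(estimated_tokens: u64, context_window_tokens: u64) -> ContextAction {
    if context_window_tokens == 0 {
        return ContextAction::Proceed;
    }
    // Integer math avoids float rounding; saturating_mul guards against overflow.
    let percent = estimated_tokens.saturating_mul(100) / context_window_tokens;
    if percent >= AUTO_PRUNE_PERCENT {
        ContextAction::AutoPrune
    } else if percent >= WARN_PERCENT {
        ContextAction::Warn
    } else {
        ContextAction::Proceed
    }
}
```

For example, `context_action(7_500, 10_000)` yields `ContextAction::Warn` in this sketch, while `context_action(9_000, 10_000)` yields `ContextAction::AutoPrune`.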

## Core Runtime Loop

- `Runtime::handle()` is the single request entrypoint. It dispatches `Submit`, `Reset`, `Approve`, `Reject`, `QueryLast`, `QueryAnchors`, `QueryHistory`, `ReadFile`, `SearchCode`, `Undo`, `ProvidersList`, `ProvidersUse`, `GitBranch`, `GitStatus`, `GitDiff`, `GitLog`, `ListDir`, `LspStatus`, `IndexBuild`, `IndexStatus`, `ContextStats`, and `Compact` requests. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/types.rs`.
- Slash-command requests (`GitBranch`, `GitStatus`, `GitDiff`, `GitLog`, `ReadFile`, `SearchCode`, `ListDir`) are dispatched through the `CommandTool` allowlist in `command_handlers.rs`. Mutating tools are excluded from this allowlist by construction. Code: `src/runtime/orchestration/command_handlers.rs`.
- `handle_submit()` rejects empty prompts and new submits while a `PendingAction` exists. It also special-cases exact anchor prompts and routes them into `run_last_read_file_anchor()` or `run_last_search_anchor()` instead of the normal turn loop. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/anchor_resolution.rs`, `src/runtime/investigation/anchors.rs`.
- A normal submit enters `run_turns_with_initial_reads()`. That function computes turn state once from the original user prompt: retrieval intent, direct-read mode, whether investigation is required, whether mutation is allowed, the `ToolSurface`, the `InvestigationMode`, and an optional prompt-derived path scope. State is collected into `TurnContext` and `TurnState`. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/turn_state.rs`, `src/runtime/investigation/prompt_analysis.rs`, `src/runtime/investigation/investigation.rs`, `src/runtime/investigation/tool_surface.rs`.
- Before any backend generation, the runtime may seed the first tool call itself. This happens for narrow natural-language edits (`requested_simple_edit()`), direct reads, directory listings, and permitted shell commands. The seeded call is stored as `PendingRuntimeCall { seeded_pre_generation: true }`, so the first tool can run with no backend round. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/turn_state.rs`, `src/runtime/investigation/prompt_analysis.rs`.
- Each loop iteration chooses an `effective_surface`. If `answer_phase` (`AnswerPhaseKind::PostRead` or `InvestigationEvidenceReady`) is active, `effective_surface` is forced to `AnswerOnly`; otherwise it uses the prompt-selected surface. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/orchestration/turn_state.rs`, `src/runtime/investigation/tool_surface.rs`.
- `run_generate_turn()` builds the request from `Conversation::snapshot()`, appends the surface hint, optionally appends the project snapshot hint, sends the request to the backend, buffers streamed text, and only writes the assistant reply into `Conversation` after a complete response is available. Code: `src/runtime/orchestration/generation.rs`.
- After generation, the runtime parses the assistant text with `tool_codec::parse_all_tool_inputs()`. If no tool calls are parsed, the runtime either admits the answer or replaces it through guard branches. Code: `src/runtime/protocol/tool_codec/tool_parser.rs`, `src/runtime/orchestration/engine.rs`.
- If tool calls are present, the runtime increments `tool_rounds` unless the call was seeded before generation. The round limit is `MAX_TOOL_ROUNDS = 10`; hitting it emits `AnswerSource::ToolLimitReached`. Code: `src/runtime/orchestration/engine.rs`.
- Tool execution is delegated to `run_tool_round()`, which returns one of four outcomes. `Completed` means all calls finished immediately. `ApprovalRequired` means the turn pauses with a `PendingAction`. `RuntimeDispatch` means the runtime selected the next tool call itself. `TerminalAnswer` means the runtime has enough information to end the turn without another backend round. Code: `src/runtime/orchestration/tool_round.rs`.
- The search-to-read transition can happen in three ways: the backend emits `[read_file: ...]` after a search result, `run_tool_round()` returns `RuntimeDispatch` to the preferred candidate after search, or a direct-read request is seeded before any generation. Code: `src/runtime/orchestration/tool_round.rs`, `src/runtime/orchestration/engine.rs`.
- The read-to-answer transition is runtime-owned. After a completed tool round, the runtime sets `answer_phase = InvestigationEvidenceReady` when `investigation.evidence_ready()` becomes true, or `answer_phase = PostRead` for non-investigation read flows. The next generation then runs under `AnswerOnly`. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/investigation/investigation.rs`.
- Raw direct reads are a separate terminal path. If a seeded direct read completes in `DirectReadMode::Raw`, the runtime strips the tool-result wrapper with `direct_read_fallback_answer()` and finishes immediately. No synthesis generation is performed. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/protocol/response_text.rs`, `src/runtime/tests/finalization.rs`.
- Approved mutation success does not re-enter the backend. `handle_approve()` executes the approved tool, commits the tool result, invalidates the project snapshot cache, trims context, and finishes with `mutation_complete_final_answer()`. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/protocol/response_text.rs`.
- Provider switching is session-only. `ProvidersList` and `ProvidersUse` requests list or swap the active `ModelBackend` without persisting the change. Code: `src/runtime/orchestration/engine.rs`, `src/runtime/types.rs`.
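
The four `run_tool_round()` outcomes and the round limit can be pictured with a toy driver. Only the outcome names and `MAX_TOOL_ROUNDS = 10` come from the text above; the `TurnEnd` type, the scripted-outcome input, the payload types, and the exact comparison used for the limit (the eleventh counted round trips it) are assumptions for illustration.

```rust
const MAX_TOOL_ROUNDS: usize = 10;

/// The four outcomes named above; the payload types are hypothetical.
#[derive(Debug, Clone, PartialEq)]
enum ToolRoundOutcome {
    Completed,
    ApprovalRequired,
    RuntimeDispatch(String),
    TerminalAnswer(String),
}

/// Hypothetical summary of how a turn ended in this sketch.
#[derive(Debug, PartialEq)]
enum TurnEnd {
    Answered(String),
    AwaitingApproval,
    ToolLimitReached,
    ScriptExhausted,
}

/// Walks a scripted list of round outcomes. A call seeded before generation
/// does not count toward `tool_rounds`, as described above. Assumption: every
/// other round counts, including runtime-dispatched ones.
fn drive_turn(script: Vec<ToolRoundOutcome>, first_call_seeded: bool) -> TurnEnd {
    let mut tool_rounds = 0usize;
    for (index, outcome) in script.into_iter().enumerate() {
        let seeded = first_call_seeded && index == 0;
        if !seeded {
            tool_rounds += 1;
            if tool_rounds > MAX_TOOL_ROUNDS {
                return TurnEnd::ToolLimitReached;
            }
        }
        match outcome {
            // Both outcomes continue the turn: one via the backend,
            // one via a runtime-selected next call.
            ToolRoundOutcome::Completed | ToolRoundOutcome::RuntimeDispatch(_) => continue,
            ToolRoundOutcome::ApprovalRequired => return TurnEnd::AwaitingApproval,
            ToolRoundOutcome::TerminalAnswer(text) => return TurnEnd::Answered(text),
        }
    }
    TurnEnd::ScriptExhausted
}
```

The point of the sketch is that every exit from the loop is decided by the runtime: approval pauses, terminal answers, and the round cap are all ordinary return values rather than anything the backend chooses.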