Skip to content

Fix shell tool output replay consistency#44

Merged
ScriptSmith merged 2 commits into
mainfrom
shell-output-ids
Jun 2, 2026
Merged

Fix shell tool output replay consistency#44
ScriptSmith merged 2 commits into
mainfrom
shell-output-ids

Conversation

@ScriptSmith
Copy link
Copy Markdown
Owner

No description provided.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR fixes two interrelated bugs in shell tool output replay: shell_call and shell_call_output items that share the same id were being collapsed (one silently dropped from persistence) and ordered incorrectly (output before call), causing previous_response_id replay to break on Anthropic-family models.

  • output_pos / item_index key fix (runner.rs): Both maps now key on a (type, id) qualified key for shell items, so the paired items each get their own persistence slot and live-stream output_index. Delta events still key on bare id since they never carry a type field.
  • Ordering fix (shell_tool.rs, responses_pipeline.rs): New order_shell_outputs_after_calls function stably reorders the collected output array to call-before-output; both the streaming (StreamRewriter::ordered_output_items) and buffered (collect_streaming_response_to_json) paths call it, keeping the wire order intact while persisting a replayable transcript.

Confidence Score: 5/5

Safe to merge — both the streaming and buffered paths are fixed consistently and covered by focused regression tests.

The changes are tightly scoped to the two broken maps (output_pos, item_index) and the missing reorder step. The qualified-key logic for shell items is isolated to index_for and record_output_item, with a clear fallback for all other item types. order_shell_outputs_after_calls is deterministic, handles all edge cases (already-ordered input, orphaned outputs, multi-call interleaving), and is exercised by both new and existing tests. The previous reviewer concern about item_index being unpatched is fully addressed here.

No files require special attention.

Important Files Changed

Filename Overview
src/services/server_tools/runner.rs Fixes output_pos and item_index to key on qualified (type, id) for shell items; adds ordered_output_items() to the terminal-event path. New unit tests cover both the distinct-slot and correct-persistence invariants.
src/services/shell_tool.rs Adds order_shell_outputs_after_calls to stably reorder output-before-call pairs; handles already-ordered input, orphaned outputs, and multi-call interleaving. Well-tested.
src/services/responses_pipeline.rs Applies order_shell_outputs_after_calls in the buffered (non-streaming) collection path so the persisted transcript matches the streaming path.

Sequence Diagram

sequenceDiagram
    participant U as Upstream
    participant SR as StreamRewriter
    participant DB as Persisted transcript

    U->>SR: "output_item.added shell_call id=toolu_1 -> slot 0"
    U->>SR: "output_item.added shell_call_output id=toolu_1 -> slot 1"
    Note over SR: index_for uses qualified key per type
    U->>SR: "output_item.done shell_call_output -> output_items[0]"
    U->>SR: "output_item.done shell_call -> output_items[1]"
    Note over SR: output_pos uses (type,id) - no collision
    U->>SR: response.completed
    SR->>SR: ordered_output_items() reorders to call-before-output
    SR->>DB: "output[0]=shell_call output[1]=shell_call_output"
Loading

Reviews (3): Last reviewed commit: "Review fixes" | Re-trigger Greptile

@ScriptSmith
Copy link
Copy Markdown
Owner Author

@greptile-apps

@ScriptSmith ScriptSmith merged commit 5d7df3a into main Jun 2, 2026
20 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant