Fix shell tool output replay consistency by ScriptSmith · Pull Request #44 · ScriptSmith/hadrian

ScriptSmith · 2026-06-02T09:08:42Z

No description provided.

greptile-apps · 2026-06-02T09:13:51Z

Greptile Summary

This PR fixes two interrelated bugs in shell tool output replay: shell_call and shell_call_output items that share the same id were being collapsed (one silently dropped from persistence) and ordered incorrectly (output before call), causing previous_response_id replay to break on Anthropic-family models.

output_pos / item_index key fix (runner.rs): Both maps now key on a (type, id) qualified key for shell items, so the paired items each get their own persistence slot and live-stream output_index. Delta events still key on bare id since they never carry a type field.
Ordering fix (shell_tool.rs, responses_pipeline.rs): New order_shell_outputs_after_calls function stably reorders the collected output array to call-before-output; both the streaming (StreamRewriter::ordered_output_items) and buffered (collect_streaming_response_to_json) paths call it, keeping the wire order intact while persisting a replayable transcript.

Confidence Score: 5/5

Safe to merge — both the streaming and buffered paths are fixed consistently and covered by focused regression tests.

The changes are tightly scoped to the two broken maps (output_pos, item_index) and the missing reorder step. The qualified-key logic for shell items is isolated to index_for and record_output_item, with a clear fallback for all other item types. order_shell_outputs_after_calls is deterministic, handles all edge cases (already-ordered input, orphaned outputs, multi-call interleaving), and is exercised by both new and existing tests. The previous reviewer concern about item_index being unpatched is fully addressed here.

No files require special attention.

Important Files Changed

Filename	Overview
src/services/server_tools/runner.rs	Fixes output_pos and item_index to key on qualified (type, id) for shell items; adds ordered_output_items() to the terminal-event path. New unit tests cover both the distinct-slot and correct-persistence invariants.
src/services/shell_tool.rs	Adds order_shell_outputs_after_calls to stably reorder output-before-call pairs; handles already-ordered input, orphaned outputs, and multi-call interleaving. Well-tested.
src/services/responses_pipeline.rs	Applies order_shell_outputs_after_calls in the buffered (non-streaming) collection path so the persisted transcript matches the streaming path.

Sequence Diagram

sequenceDiagram
    participant U as Upstream
    participant SR as StreamRewriter
    participant DB as Persisted transcript

    U->>SR: "output_item.added shell_call id=toolu_1 -> slot 0"
    U->>SR: "output_item.added shell_call_output id=toolu_1 -> slot 1"
    Note over SR: index_for uses qualified key per type
    U->>SR: "output_item.done shell_call_output -> output_items[0]"
    U->>SR: "output_item.done shell_call -> output_items[1]"
    Note over SR: output_pos uses (type,id) - no collision
    U->>SR: response.completed
    SR->>SR: ordered_output_items() reorders to call-before-output
    SR->>DB: "output[0]=shell_call output[1]=shell_call_output"

_{Reviews (3): Last reviewed commit: "Review fixes" | Re-trigger Greptile}

ScriptSmith · 2026-06-02T09:25:18Z

@greptile-apps

Fix shell tool output replay consistency

5feb2c6

Review fixes

545dd22

ScriptSmith merged commit 5d7df3a into main Jun 2, 2026
20 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix shell tool output replay consistency#44

Fix shell tool output replay consistency#44
ScriptSmith merged 2 commits into
mainfrom
shell-output-ids

ScriptSmith commented Jun 2, 2026

Uh oh!

greptile-apps Bot commented Jun 2, 2026 •

edited

Loading

Uh oh!

ScriptSmith commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

ScriptSmith commented Jun 2, 2026

Uh oh!

greptile-apps Bot commented Jun 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Sequence Diagram

Uh oh!

ScriptSmith commented Jun 2, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented Jun 2, 2026 •

edited

Loading