v1.8.47: Fix tool autonomy (preamble rewrite), post-gen diagnostics, code block UI jump

Brendan Gray · commit 51b3e82b4de3 · 2026-03-21T07:36:02.000-04:00
Fix I: Rewrite preamble 'When to Use Tools' sections in both DEFAULT_SYSTEM_PREAMBLE
and DEFAULT_COMPACT_PREAMBLE to use category-based positive guidance instead of negation.
Remove the 'Do NOT use tools for general knowledge' rule, which suppressed web_search and write_todos calls.
Update web_search and write_todos tool descriptions to remove negation language.

Fix J: Add post-generation diagnostic log line for ALL generation completions (not just
maxTokens). Logs stopReason, responseChars, tokensUsed, maxTokens, llamaStopReason.

Fix K: Add startExpanded prop to CodeBlock to prevent height jump when streaming
finalizes. Last assistant message keeps code blocks expanded after finalization,
preventing content repositioning while showing Apply/Save buttons.
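
As a rough illustration of the Fix J log line, the sketch below rebuilds the same string outside of LLMEngine. The helper name formatPostGenLog and the sample values are hypothetical and not part of this commit; only the field names and fallbacks (0 for a missing token index, 'unknown' for a missing llama stop reason) are taken from the change to main/llmEngine.js.

```javascript
// Hypothetical helper (not in the commit): mirrors the Post-gen log line
// added to main/llmEngine.js so the format and fallbacks can be checked in isolation.
function formatPostGenLog({ finalStopReason, fullResponse, sequence, merged, result }) {
  // Same fallback as the commit: a missing sequence or token index logs as 0.
  const tokensUsed = sequence?.nextTokenIndex || 0;
  // Same fallback as the commit: a missing llama stop reason logs as 'unknown'.
  const llamaStopReason = result?.metadata?.stopReason || 'unknown';
  return `[LLM] Post-gen: stopReason=${finalStopReason}, ` +
    `responseChars=${fullResponse.length}, tokensUsed=${tokensUsed}, ` +
    `maxTokens=${merged.maxTokens}, llamaStopReason=${llamaStopReason}`;
}

// Sample values are made up for illustration.
console.log(formatPostGenLog({
  finalStopReason: 'maxTokens',
  fullResponse: 'hello',
  sequence: { nextTokenIndex: 12 },
  merged: { maxTokens: 2048 },
  result: undefined,
}));
// [LLM] Post-gen: stopReason=maxTokens, responseChars=5, tokensUsed=12, maxTokens=2048, llamaStopReason=unknown
```

Comparing tokensUsed against maxTokens in this line is presumably what makes it useful for every completion, not only the maxTokens case.
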
diff --git a/main/constants.js b/main/constants.js
@@ -18,9 +18,11 @@ const _shellDesc = process.platform === 'win32'
 const DEFAULT_SYSTEM_PREAMBLE = `You are an AI coding assistant integrated into a local IDE. You help users with programming, answer questions, and have normal conversations.
 
 ## When to Use Tools
-- Use tools when the user asks you to DO something: create files, edit code, run commands, search the web, browse pages
-- Do NOT use tools for conversation, greetings, questions, opinions, explanations, or general knowledge
-- If the user hasn't asked for any action, respond conversationally — no tools needed
+- For greetings, opinions, and casual conversation: respond naturally without tools
+- For anything requiring current/live information (prices, news, weather, scores, events): use web_search
+- For creating or modifying files: use write_file, edit_file, or append_to_file — do NOT output entire files as code blocks in chat
+- For multi-step tasks (building features, refactoring, planning): call write_todos first to create a checklist, then work through each step
+- For running commands, browsing, or any other action: use the appropriate tool
 - When you have completed what the user asked for, stop and provide your response
 
 ## Continuation
@@ -60,17 +62,18 @@ If your output is cut off mid-generation, the system will automatically continue
 - Browser workflow: browser_navigate, then browser_snapshot, then interact using refs
 - If a tool fails, analyze the error and retry once with corrected parameters
 - When asked for creative writing (stories, poems, essays), respond directly unless the user asks for a file
-- Use web_search only when the answer requires current/live information
+- Use web_search when the answer requires current, live, or time-sensitive information
 - If the user asks for multiple files, create ALL of them — do not stop after the first
 - Always use the exact filename the user specifies`;
 
 const DEFAULT_COMPACT_PREAMBLE = `You are a helpful AI assistant integrated into a local IDE. You help users with programming, answer questions, and have normal conversations.
 
 ## When to Use Tools
-- Use tools ONLY when the user asks you to DO something: create files, edit code, run commands, search the web, plan multi-step tasks
-- Do NOT use tools for conversation, greetings, questions, opinions, explanations, or general knowledge
-- "hi", "what?", "my name is X", "how are you" — these are conversation. Just respond naturally, no tools
-- If the user hasn't asked for any file/code/command action, respond conversationally — no tools needed
+- For greetings, opinions, casual conversation: respond naturally without tools
+- For current/live information (prices, news, weather, scores, events, documentation): use web_search — you have real-time internet access
+- For creating or modifying files: use write_file, edit_file, or append_to_file — do NOT output entire files as code blocks in chat
+- For multi-step tasks: call write_todos first to create a checklist, then work through each step
+- For running commands, browsing, or any other action: use the appropriate tool
 - When you have completed what the user asked for, STOP and provide your response. Do not keep going
 
 ## File Operations
@@ -83,7 +86,7 @@ When creating or editing files, use tool calls (write_file, edit_file, append_to
 - Only claim you did something if you called the tool that did it
 - Before diagnosing a bug, read_file the relevant file first
 - For general knowledge, conversation, creative writing: answer directly — no tools needed
-- This assistant has real-time web access via web_search and fetch_webpage. Use these tools when the user asks about anything requiring live, current, or recently updated information — never refuse by saying you cannot access the internet.
+- You have real-time web access via web_search and fetch_webpage. Use web_search when the user asks about anything current, live, or time-sensitive — prices, weather, news, scores, events, real-time data. Never refuse by saying you cannot access the internet
 - After web_search or fetch_webpage, present findings clearly — cite specific data and source URLs from the results. Do not make up information that was not in the tool results
 - For multi-step tasks (building an app, implementing a feature with multiple files, a plan with several stages): call write_todos first to list your steps, then work through them one by one
 - run_command is available and uses ${_shellDesc} — always use the correct shell syntax for this environment
diff --git a/main/llmEngine.js b/main/llmEngine.js
@@ -853,6 +853,8 @@ class LLMEngine extends EventEmitter {
         finalStopReason = 'maxTokens';
         console.log(`[LLM] Generation stopped at maxTokens (${fullResponse.length} chars)`);
       }
+      const tokensUsed = this.sequence?.nextTokenIndex || 0;
+      console.log(`[LLM] Post-gen: stopReason=${finalStopReason}, responseChars=${fullResponse.length}, tokensUsed=${tokensUsed}, maxTokens=${merged.maxTokens}, llamaStopReason=${result?.metadata?.stopReason || 'unknown'}`);
       return {
         text: sanitized,
         rawText: fullResponse,
diff --git a/main/mcpToolServer.js b/main/mcpToolServer.js
@@ -269,7 +269,7 @@ class MCPToolServer {
     this._toolDefsCache = [
       {
         name: 'web_search',
-        description: 'Search the web for information using DuckDuckGo. Use when the answer may have changed since your training — anything that varies over time. Also use for documentation and error solutions when the current version matters.',
+        description: 'Search the web for current information using DuckDuckGo. Use for anything current, live, or time-sensitive: prices, weather, news, headlines, scores, events, stock data, documentation lookups.',
         parameters: {
           query: { type: 'string', description: 'Search query', required: true },
           maxResults: { type: 'number', description: 'Max results (default 5)', required: false },
@@ -759,7 +759,7 @@ class MCPToolServer {
       // ── Planning / TODO Tools ──
       {
         name: 'write_todos',
-        description: 'Create a checklist for complex multi-step tasks (e.g. building an app, refactoring code, planning a project). Do NOT use for simple questions, greetings, or conversation.',
+        description: 'Create a checklist to plan and track multi-step tasks. Use when a task involves multiple steps — building an app, refactoring code, implementing a feature, planning a project.',
         parameters: {
           items: { type: 'array', description: 'Array of todo strings or {text,status} objects', required: true },
         },
diff --git a/src/components/Chat/ChatPanel.tsx b/src/components/Chat/ChatPanel.tsx
@@ -1618,7 +1618,7 @@ ${e.message}`,
 
   // Render content with tool terminal and code block detection
   // Merges tool calls with their results into single collapsible blocks
-  const renderContentParts = (content: string, suppressTools = false) => {
+  const renderContentParts = (content: string, suppressTools = false, expandBlocks = false) => {
     // Pre-extract tool results for merging
     const toolResultMap = extractToolResults(content);
 
@@ -1733,7 +1733,7 @@ ${e.message}`,
           );
         } else {
           pendingWriteFP = null;
-          elements.push(<CodeBlock key={i} code={code} language={lang} onApply={() => onApplyCode(currentFile, code)} onSaveAsFile={handleSaveAsFile} />);
+          elements.push(<CodeBlock key={i} code={code} language={lang} onApply={() => onApplyCode(currentFile, code)} onSaveAsFile={handleSaveAsFile} startExpanded={expandBlocks} />);
         }
         continue;
       }
@@ -1858,11 +1858,11 @@ ${e.message}`,
     return elements;
   };
 
-  const renderMessage = (msg: ChatMessage): React.ReactNode[] => {
+  const renderMessage = (msg: ChatMessage, expandBlocks = false): React.ReactNode[] => {
     const hasMsgTools = !!(msg.toolsUsed && msg.toolsUsed.length > 0);
     // suppressTools=true: content rendering skips all tool blocks to prevent duplicate write_file
     // bubbles and wrong failure counts. msg.toolsUsed is the authoritative source for tool UI.
-    const parts = renderContentParts(msg.content, hasMsgTools);
+    const parts = renderContentParts(msg.content, hasMsgTools, expandBlocks);
     if (hasMsgTools) {
       const WRITE_TOOLS_MSG = ['write_file', 'create_file', 'edit_file', 'append_to_file'];
         const MSG_LANG_MAP: Record<string, string> = { ts: 'typescript', tsx: 'tsx', js: 'javascript', jsx: 'jsx', py: 'python', rs: 'rust', go: 'go', java: 'java', cs: 'csharp', cpp: 'cpp', c: 'c', html: 'html', css: 'css', json: 'json', yaml: 'yaml', yml: 'yaml', md: 'markdown', sh: 'bash', bat: 'batch', txt: 'text', xml: 'xml', sql: 'sql' };
@@ -3193,7 +3193,7 @@ ${e.message}`,
                         </div>
                       </div>
                     ) : (
-                    <div className="space-y-1">{renderMessage(msg)}</div>
+                    <div className="space-y-1">{renderMessage(msg, index === messages.length - 1)}</div>
                     )
                   ) : (
                     <>
diff --git a/src/components/Chat/ChatWidgets.tsx b/src/components/Chat/ChatWidgets.tsx
@@ -457,13 +457,14 @@ export const ToolCallGroup: React.FC<{ children: React.ReactNode; count: number
 // ── Code Block with Copy/Apply ──
 const COLLAPSE_LINE_THRESHOLD = 6; // Collapse code blocks taller than this many lines
 
-export const CodeBlock: React.FC<{ code: string; language: string; onApply: () => void; isToolCall?: boolean; isStreaming?: boolean; isAlreadyWritten?: boolean; onSaveAsFile?: (code: string, language: string) => void; defaultCollapsed?: boolean }> = ({ code, language, onApply, isToolCall, isStreaming, isAlreadyWritten, onSaveAsFile, defaultCollapsed }) => {
+export const CodeBlock: React.FC<{ code: string; language: string; onApply: () => void; isToolCall?: boolean; isStreaming?: boolean; isAlreadyWritten?: boolean; onSaveAsFile?: (code: string, language: string) => void; defaultCollapsed?: boolean; startExpanded?: boolean }> = ({ code, language, onApply, isToolCall, isStreaming, isAlreadyWritten, onSaveAsFile, defaultCollapsed, startExpanded }) => {
   const [copied, setCopied] = useState(false);
   const lineCount = code.split('\n').length;
   const isLong = lineCount > COLLAPSE_LINE_THRESHOLD;
   // Default to collapsed for long blocks; keep streaming blocks expanded so users can watch generation
   // defaultCollapsed forces collapsed on first render (used for live tool call generation bubbles)
-  const [expanded, setExpanded] = useState(defaultCollapsed ? false : !!isStreaming || !isLong);
+  // startExpanded overrides collapse for just-finalized messages to prevent height jump on finalization
+  const [expanded, setExpanded] = useState(defaultCollapsed ? false : !!isStreaming || !!startExpanded || !isLong);
 
   const handleCopy = () => {
     navigator.clipboard.writeText(code);
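
The initial-state expression that Fix K changes in CodeBlock can be read as a small decision function. The sketch below extracts it into a plain helper; the name initialExpanded is hypothetical and not part of the commit, while the threshold and the boolean expression are copied from the diff above.

```javascript
const COLLAPSE_LINE_THRESHOLD = 6; // same value as in ChatWidgets.tsx

// Hypothetical pure helper (not in the commit): returns the value that
// CodeBlock passes to useState for its `expanded` state.
function initialExpanded({ code, defaultCollapsed, isStreaming, startExpanded }) {
  const isLong = code.split('\n').length > COLLAPSE_LINE_THRESHOLD;
  // defaultCollapsed wins; otherwise any of streaming, startExpanded,
  // or a short block keeps the block expanded.
  return defaultCollapsed ? false : !!isStreaming || !!startExpanded || !isLong;
}

// A 10-line block, i.e. longer than the collapse threshold.
const longCode = Array.from({ length: 10 }, (_, i) => `line ${i}`).join('\n');

console.log(initialExpanded({ code: longCode }));                      // false
console.log(initialExpanded({ code: longCode, startExpanded: true })); // true
console.log(initialExpanded({ code: longCode, startExpanded: true, defaultCollapsed: true })); // false
```

Because React uses the useState argument only on the first render, this value matters when the block mounts. In ChatPanel.tsx only the message at index messages.length - 1 passes expandBlocks as true, so older messages keep the previous collapse behavior.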
