# Agentic Tool Calling — Smart AI Context Pipeline

- Implemented agentic tool calling for Groq cloud models (two-pass generation loop)
- Model dynamically decides which tools to call (Search, Weather, HN, GitHub, Slack) based on query intent
- Added `buildToolDefinitions()` to register enabled connectors as OpenAI-format tools
- Added `executeToolCall()` and `handleToolCalls()` for parallel tool execution with Pass 2 synthesis
- Groq worker updated: non-streaming tool detection, `tool_calls` message type, `rawMessages` for Pass 2
- `sendToAi()` extended with `tools` and `rawMessages` parameters (7 args)
- Added `extractLocationFromQuery()` for smart city extraction from natural-language queries
- Fixed geocoding: the full sentence ("what is the temp on new delhi") was being sent to the API; now extracts "new delhi"
- Added `queryNeedsConnectors()` relevance filter — general queries skip connector injection entirely
- Fixed: "what is algebra?" with connectors ON → no longer answers "I only have Tokyo data"
- Fixed: grounding header changed from "do not rely on training knowledge" to allowing general knowledge when data is irrelevant
- Softened grounding instruction to "answer from data when relevant, from knowledge when not"
- Exposed `getConfig`, `getToken`, `fetchWeatherDirect`, `fetchHNDirect`, etc. on the `M.connectors` namespace
- Added model-aware context budgets: 4K chars for local WebGPU, 30K for non-tool-calling cloud models
- Fixed duplicate typing indicator between Pass 1 and Pass 2
- Improved WebGPU buffer-overflow handling: user-friendly error message for GPU memory errors
- Added model-size-aware context limits in the local worker (0.8B→4K, 2B→8K, 4B+→32K)

---

## Summary

Transitioned the AI assistant from a "firehose" architecture (inject all connector data for every query) to an **agentic tool-calling** system: cloud models (Groq) dynamically decide which data sources to query, while local models gain a **query-relevance filter** that keeps irrelevant connector data from polluting general-knowledge answers.

---

## 1. Agentic Tool Calling (Cloud — Groq)
**Files:** `js/ai-chat.js`, `public/ai-worker-groq.js`, `js/ai-assistant.js`
**What:** Cloud models now receive tool definitions (OpenAI format) and decide whether to call them. Tools execute in parallel on the main thread, and the results are fed back for a second generation pass.
**Impact:** "What is algebra?" on Groq → zero unnecessary API calls. "Weather in Paris?" → the model calls only `get_weather`. Reduces latency, saves API tokens, eliminates noise.
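The two-pass flow above can be sketched as follows. The names `buildToolDefinitions` and `handleToolCalls` come from this changelog, but the bodies are a minimal illustration (only the weather tool is spelled out), not the repo's implementation; `executeToolCall` is passed in as a dependency here so the execution strategy stays pluggable.

```javascript
// Pass 1: the model receives these OpenAI-format tool definitions and may
// request calls. Pass 2: tool results are fed back as `tool` role messages.
function buildToolDefinitions(enabledConnectors) {
  const defs = {
    weather: {
      type: "function",
      function: {
        name: "get_weather",
        description: "Current weather for a city",
        parameters: {
          type: "object",
          properties: { city: { type: "string" } },
          required: ["city"],
        },
      },
    },
    // ...hn, github, slack, search registered the same way
  };
  return enabledConnectors.filter((c) => defs[c]).map((c) => defs[c]);
}

async function handleToolCalls(toolCalls, executeToolCall) {
  // Run every requested tool in parallel, then shape each result as a
  // `tool` message (keyed by tool_call_id) for the second generation pass.
  const results = await Promise.all(
    toolCalls.map((tc) =>
      executeToolCall(tc.function.name, JSON.parse(tc.function.arguments))
    )
  );
  return toolCalls.map((tc, i) => ({
    role: "tool",
    tool_call_id: tc.id,
    content: JSON.stringify(results[i]),
  }));
}
```

Connectors that are disabled (or unknown to the registry) simply produce no tool definition, so the model never sees them.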

## 2. Query-Relevance Filter (Local Models)
**Files:** `js/ai-chat.js`
**What:** Added `queryNeedsConnectors()`, which checks for weather/news/GitHub/Slack keywords before injecting connector data. General-knowledge queries ("what is algebra?", "write a poem") skip the connector fetch entirely.
**Impact:** Local models no longer refuse general questions when connectors are enabled. Previously, the model would answer "I only have Tokyo data" for any non-weather query.
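A minimal sketch of such a relevance gate, assuming a keyword-per-connector approach; the keyword lists here are illustrative, not the ones in `js/ai-chat.js`:

```javascript
// Map each connector to a regex of trigger keywords. A query only pulls in
// a connector's data when it matches that connector's pattern.
const CONNECTOR_KEYWORDS = {
  weather: /\b(weather|temp|temperature|forecast|rain|snow)\b/i,
  hn: /\b(hacker news|hn|top stories)\b/i,
  github: /\b(github|repo|pull request|pr|issue)\b/i,
  slack: /\b(slack|channel|unread)\b/i,
};

// Returns the connectors the query appears to need; an empty array means
// "general knowledge question" and connector injection is skipped entirely.
function queryNeedsConnectors(query) {
  return Object.keys(CONNECTOR_KEYWORDS).filter((name) =>
    CONNECTOR_KEYWORDS[name].test(query)
  );
}
```

Returning the matching connector names (rather than a boolean) also lets the caller fetch only the relevant sources instead of all of them.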

## 3. Smart Location Extraction
**Files:** `js/connectors.js`
**What:** Added `extractLocationFromQuery()` with five extraction strategies (among them preposition patterns, weather patterns, capitalized-word detection, and filler filtering) to pull city names out of natural-language queries before geocoding.
**Impact:** "what is the temp in new delhi" → extracts "new delhi" → geocodes correctly. Previously the full sentence was sent to the geocoding API, which failed.
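An illustrative subset of those strategies: a preposition pattern first, then filler-word stripping as a fallback. The filler list and regex are assumptions for the sketch, not the five strategies verbatim:

```javascript
// Words that carry no location information in a weather-style query.
const FILLERS = new Set([
  "what", "is", "the", "temp", "temperature", "weather", "in", "on", "at",
  "for", "like", "today", "now", "tell", "me", "whats", "current",
]);

function extractLocationFromQuery(query) {
  const q = query.toLowerCase().replace(/[?!.]/g, "");
  // Strategy 1: text after a location preposition ("in/at/for/on <city>")
  const prep = q.match(/\b(?:in|at|for|on)\s+([a-z][a-z\s]+)$/);
  if (prep) return prep[1].trim();
  // Fallback: drop filler words and keep whatever remains as the city
  const rest = q.split(/\s+/).filter((w) => !FILLERS.has(w));
  return rest.join(" ") || null;
}
```

Only the surviving fragment ("new delhi", "paris") reaches the geocoding API, which is what fixed the full-sentence failure described above.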

## 4. Softened Grounding Header
**Files:** `js/ai-chat.js`
**What:** Changed from "Do not rely on your training knowledge" to "If the user asks about something unrelated to this data, answer from your general knowledge."
**Impact:** Models can now answer general questions even when connector data is present in context, while still prioritizing live data for relevant queries.
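Assembled as a system-prompt preamble, the softened header might look like this; the exact wording in the repo may differ, and `buildGroundingHeader` is a hypothetical helper name:

```javascript
// Wrap live connector data in a preamble that prioritizes it for relevant
// questions but explicitly permits falling back to general knowledge.
function buildGroundingHeader(connectorData) {
  return [
    "You have access to the following live data:",
    connectorData,
    "Answer from this data when the question is about it.",
    "If the user asks about something unrelated to this data,",
    "answer from your general knowledge.",
  ].join("\n");
}
```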

## 5. Connector API Exposure
**Files:** `js/connectors.js`
**What:** Exposed `getConfig`, `getToken`, `fetchWeatherDirect`, `fetchHNDirect`, `fetchGitHubDirect`, `fetchSlackDirect` on the `M.connectors` namespace for on-demand tool execution.
**Impact:** Tool calling system can invoke individual connectors directly without going through the firehose pipeline.
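The exposure pattern, sketched with stub bodies (the function names match the changelog; everything inside them is a placeholder for the real fetch logic):

```javascript
// Shared app namespace; in the browser this is typically a window global.
const M = globalThis.M || (globalThis.M = {});

function getConfig(connector) {
  // Real version reads persisted connector settings; stubbed here.
  return { connector, enabled: true };
}

async function fetchWeatherDirect(city) {
  // Real version geocodes `city` and queries the weather API; stubbed here.
  return { city, tempC: null };
}

// Attach the per-connector entry points so the tool-calling layer can hit
// one connector directly, bypassing the firehose pipeline.
M.connectors = Object.assign(M.connectors || {}, {
  getConfig,
  fetchWeatherDirect,
  // ...getToken, fetchHNDirect, fetchGitHubDirect, fetchSlackDirect likewise
});
```

`Object.assign` onto any existing `M.connectors` keeps previously registered members intact rather than replacing the namespace wholesale.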

## 6. WebGPU Error Handling
**Files:** `public/ai-worker.js`
**What:** Translates GPU buffer errors (`mapAsync`, `Invalid Buffer`, `GPUBuffer`) into user-friendly messages. Added model-size-aware context limits (0.8B→4K, 2B→8K, 4B+→32K chars).
**Impact:** Users see actionable error messages instead of cryptic GPU errors, and smaller models no longer crash on large contexts.
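Both guards can be sketched in a few lines; the tier boundaries follow the 0.8B/2B/4B+ figures above, while the error patterns and message wording are illustrative:

```javascript
// Character budget for the prompt context, keyed on model size in billions
// of parameters (0.8B -> 4K, 2B -> 8K, 4B+ -> 32K).
function contextCharLimit(modelParamsB) {
  if (modelParamsB <= 0.8) return 4000;
  if (modelParamsB <= 2) return 8000;
  return 32000;
}

// Translate cryptic GPU buffer failures into an actionable message;
// anything unrecognized passes through unchanged.
function friendlyGpuError(err) {
  const msg = String((err && err.message) || err);
  if (/mapAsync|Invalid Buffer|GPUBuffer/i.test(msg)) {
    return "The model ran out of GPU memory. Try a shorter conversation or a smaller model.";
  }
  return msg;
}
```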

---

## Files Changed (5 total)

| File | Lines Changed | Type |
|------|:---:|------|
| `js/ai-chat.js` | +329 −3 | Tool definitions, execution, relevance filter, grounding fix |
| `js/connectors.js` | +180 −8 | Location extraction, geocoding fix, API exposure |
| `public/ai-worker-groq.js` | +60 −3 | Tool calling support, non-streaming path, Pass 2 |
| `js/ai-assistant.js` | +26 −2 | sendToAi 7 params, tool_calls handler |
| `public/ai-worker.js` | +17 −1 | GPU error messages, model-aware context limits |