feat: add OpenAI-compatible provider for local LLM support #1

Open
marcs7 wants to merge 1 commit into wolfsoftwaresystemsltd:master from marcs7:feat/openai-provider

Conversation

marcs7 commented Mar 19, 2026

Summary

Adds a third AI provider option ("OpenAI Compatible") that works with any OpenAI-compatible API endpoint, enabling local LLM usage via tools like
LiteLLM, Ollama, vLLM, llama.cpp, etc.

This allows WolfStack users to run AI features entirely self-hosted without requiring external API keys from Anthropic or Google.

Changes

Backend (src/ai/mod.rs)

  • Added openai_api_key and openai_base_url fields to AiConfig
  • New call_openai() function implementing OpenAI /v1/chat/completions API
  • Provider routing for "openai" in chat(), health_check(), and analyze_issue()
  • Model listing via /v1/models endpoint
  • API key is optional (supports unauthenticated local endpoints)
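
A minimal sketch of how a configurable base URL and an optional API key could be turned into a chat-completions request. `OpenAiSettings`, `ChatRequest`, and `build_chat_request` are hypothetical names for illustration, not the PR's actual `AiConfig` or `call_openai()` code. The sketch also assumes the base URL is stored without the `/v1` suffix. It uses only the standard library and stops at describing the request rather than sending it.

```rust
/// Hypothetical settings; the PR keeps comparable fields on `AiConfig`.
pub struct OpenAiSettings {
    pub base_url: String,
    pub api_key: Option<String>,
}

/// Plain description of the HTTP request an HTTP client would send.
pub struct ChatRequest {
    pub url: String,
    pub headers: Vec<(String, String)>,
    pub body: String,
}

/// Escape a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

pub fn build_chat_request(cfg: &OpenAiSettings, model: &str, user_msg: &str) -> ChatRequest {
    // Assumption: base_url has no trailing "/v1"; tolerate a trailing slash.
    let url = format!("{}/v1/chat/completions", cfg.base_url.trim_end_matches('/'));

    let mut headers = vec![("Content-Type".to_string(), "application/json".to_string())];
    // The key is optional: unauthenticated local endpoints get no Authorization header.
    if let Some(key) = cfg.api_key.as_deref().filter(|k| !k.trim().is_empty()) {
        headers.push(("Authorization".to_string(), format!("Bearer {}", key)));
    }

    let body = format!(
        "{{\"model\":\"{}\",\"messages\":[{{\"role\":\"user\",\"content\":\"{}\"}}]}}",
        json_escape(model),
        json_escape(user_msg)
    );
    ChatRequest { url, headers, body }
}
```

Treating a blank key the same as a missing one means an empty API Key field in the UI would still work against a local endpoint.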

Backend (src/api/mod.rs)

  • ai_save_config handles new openai_api_key and openai_base_url fields
  • Config syncs across cluster nodes

Frontend (web/index.html)

  • "OpenAI Compatible (Local LLM)" option in provider dropdown
  • Base URL and API Key fields (shown only when OpenAI provider selected)
  • Text input for model name (replaces dropdown when using OpenAI, since model names are user-defined)

Frontend (web/js/app.js)

  • Load/save/test functions handle new fields
  • Dynamic show/hide of OpenAI fields and model input based on provider selection

Testing

Tested with:

  • LiteLLM proxy → llama.cpp (Qwen3-Coder-30B, local Vulkan inference)
  • Chat, health checks, and connection test all working
  • Config persists across restarts and syncs to cluster nodes

Use Case

Many self-hosted users run local LLMs for privacy, cost, or latency reasons. This PR enables WolfStack's AI features (chat assistant, health monitoring,
issue analysis) to work with any OpenAI-compatible endpoint — no cloud API keys required.

wolfsoftwaresystemsltd pushed a commit that referenced this pull request Apr 16, 2026
Folded WolfSecure's scanner logic into the main wolfstack binary as a new src/security.rs module — no more separate plugin, no per-node install dance, no architecture-mismatch pain. Findings flow into the existing System Check UI under a 'Security' category, into the periodic alerting pipeline with per-finding cooldown, and into the AI health-check prompt so the assistant treats active threats as top-priority.

Posture checks:
  - Risky services listening on 0.0.0.0 (Docker 2375, Redis, MongoDB, Postgres, MySQL, Elasticsearch, memcached, Kibana)
  - /etc/wolfstack/ config files that are world- or group-readable
  - Cluster secret too short or low-entropy
  - sshd_config with PermitRootLogin yes / PasswordAuthentication yes
  - SSH exposed without fail2ban or sshguard running

Active-attack detection:
  - SSH brute-force in progress (>=10 failed auths from one IP within the last 5 minutes, via journalctl with auth.log fallback that filters by leading syslog timestamp)
  - Known crypto-miner processes (xmrig, minerd, cpuminer, ethminer, t-rex, nbminer, lolminer, cgminer, and friends) — the #1 post-compromise payload
  - Recent executable files in /tmp or /dev/shm (<24h, classic malware staging)
  - Established outbound connections to RAT / C2 / mining-pool ports (4444, 6667, 9001, 14444, etc) excluding RFC1918 peers
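
The SSH brute-force rule above (at least 10 failed auths from one IP within 5 minutes) can be sketched as a windowed count. This is an illustration only: `brute_force_sources` and its `(ip, unix_timestamp)` input shape are hypothetical, and the sketch assumes log lines have already been parsed into such pairs. It is not the code in `src/security.rs`.

```rust
use std::collections::HashMap;

/// Window and threshold taken from the commit message.
pub const WINDOW_SECS: u64 = 5 * 60;
pub const THRESHOLD: usize = 10;

/// `events` holds (source IP, unix timestamp in seconds) for each failed auth.
/// Returns the IPs that reached the threshold inside the window, busiest first.
pub fn brute_force_sources(events: &[(&str, u64)], now: u64) -> Vec<(String, usize)> {
    let cutoff = now.saturating_sub(WINDOW_SECS);
    let mut counts: HashMap<&str, usize> = HashMap::new();

    for &(ip, ts) in events {
        // Ignore anything older than the window (or stamped in the future).
        if ts >= cutoff && ts <= now {
            *counts.entry(ip).or_insert(0) += 1;
        }
    }

    let mut hits: Vec<(String, usize)> = counts
        .into_iter()
        .filter(|&(_, n)| n >= THRESHOLD)
        .map(|(ip, n)| (ip.to_string(), n))
        .collect();
    // Deterministic order: highest count first, then by IP string.
    hits.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    hits
}
```

Filtering by timestamp before counting is what keeps old entries from a long log out of the result.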

Integration:
  - GET /api/system-check now runs dependency + security checks in parallel via tokio::join!, appending findings to the existing checks array — frontend renders 'Security' as a new category automatically
  - Background tokio task runs the security scan every 5 min, fires alerts via alerting::send_alert() on critical findings (Discord/Slack/Telegram/email), with a per-check-name 1h cooldown so a prolonged attack doesn't spam channels
  - AI health_check() loads security findings and threads them into the LLM prompt so alerts + proposed [ACTION] fixes account for active threats, not just CPU/RAM/disk

Bugs caught in pre-commit review (all fixed):
  - ssh_is_exposed was reading the wrong field from ss -tlnp output (that form drops the Netid column, which shifts the field indexes); now uses ss -tulnp with an explicit tcp filter
  - scan_outbound_suspicious assumed 5 fields, but ss suppresses the State column when filtered; fixed to use fields[3]
  - auth.log fallback used to read the entire log; now tails 2000 lines AND filters by leading syslog timestamp to enforce the 5-minute window
  - Alert cooldown key stripped of dynamic counts so 'SSH brute-force (3 IPs, 47 attempts)' doesn't re-alert every scan
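
A rough sketch of the cooldown behaviour described above: the key drops the trailing parenthesised counts, and each key may alert at most once per period. `cooldown_key` and `AlertCooldown` are hypothetical names. The sketch assumes the dynamic counts always sit in a trailing "(...)" group, which may not match how the real key is built.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Strip a trailing "(...)" group so per-scan counts do not change the key.
pub fn cooldown_key(title: &str) -> String {
    let t = title.trim();
    if t.ends_with(')') {
        if let Some(open) = t.rfind('(') {
            return t[..open].trim_end().to_string();
        }
    }
    t.to_string()
}

/// Remembers when each key last alerted.
pub struct AlertCooldown {
    period: Duration,
    last_sent: HashMap<String, Instant>,
}

impl AlertCooldown {
    pub fn new(period: Duration) -> Self {
        Self { period, last_sent: HashMap::new() }
    }

    /// Returns true (and records `now`) only if the key is outside its cooldown.
    pub fn should_send(&mut self, title: &str, now: Instant) -> bool {
        let key = cooldown_key(title);
        let in_cooldown = self
            .last_sent
            .get(&key)
            .map_or(false, |&prev| now.saturating_duration_since(prev) < self.period);
        if in_cooldown {
            return false;
        }
        self.last_sent.insert(key, now);
        true
    }
}
```

With a one-hour period, "SSH brute-force (3 IPs, 47 attempts)" and "SSH brute-force (5 IPs, 90 attempts)" share one key, so the second finding is suppressed until the hour has passed.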

wolfsoftwaresystemsltd pushed a commit that referenced this pull request May 1, 2026
Three urgent fixes targeting the Discord reports about local AI not
working in WolfStack despite working via curl, plus Gary's restart
bubble bug.

1. Local-model tool calling — recover from content-string emissions
   Reproduced live on this box against Ollama's qwen2.5-coder:32b: the
   model emits tool calls as JSON *inside* `message.content` instead
   of in the structured `tool_calls` array. WolfStack's previous
   parser only read `tool_calls`, so users got raw JSON as the "AI's
   reply" and tool dispatch never fired. Common with smaller / not-
   tool-fine-tuned local models (qwen2.5 below 7B, llama3.2 variants,
   gemma tunes, FunctionGemma).

   New `extract_tool_calls_from_content` in `src/ai/mod.rs` recognises
   six common content-side wire formats and translates them into the
   same bracket-tag pipeline the rest of the code uses:
     • bare `{"name": "fn", "arguments": {…}}` object
     • bare array `[{"name": "fn", "arguments": {…}}, …]`
     • OpenAI-shape `{"function": {…}}` inside content
     • Fenced ```json blocks
     • `<tool_call>…</tool_call>` and `<function_call>…</function_call>`
       XML wrappers (FunctionGemma + qwen tool-call dialect)
     • Mistral's `[TOOL_CALLS]` prefix

   **Critical safety property**: takes an `allowed_tool_names` param
   and drops any extracted call whose name isn't on the caller's
   allowlist. Without that, prose like `{"name": "nginx", "status":
   "stopped"}` (a model explaining service state) would synthesise a
   phantom tool call — on a sysadmin platform that's potential
   command-injection-via-AI-response. The allowlist closes that.
   `MAIN_AI_TOOLS` constant gates the chat path; WolfAgents passes
   each agent's `allowed_tools` set.

   `<tool_call>` XML stripper anchored at position 0 of the trimmed
   string so a mid-prose echo can't trigger extraction. Fenced-block
   close detected with proper `rfind("```")` instead of trim-end-all-
   backticks.

   14 unit tests across `content_tool_call_tests` covering every
   wire format, the rejection cases, and two regression guards
   pinned by name (`unknown_tool_name_is_dropped`,
   `mid_prose_xml_wrapper_does_not_match`).

2. WolfAgents: same recovery path
   `src/wolfagents/agent_loop.rs::openai_tool_loop` had the same
   blind spot — model's content-side tool calls returned to the
   user as raw JSON. Recovery synthesises into the existing
   `tool_calls_json` so the rest of the multi-round loop works
   unchanged. Tool-call IDs use `call_{:016x}` (timestamp + counter
   + agent-id-length mix) so a future swap to OpenAI proper, which
   validates the ID shape, doesn't reject the recovered turn's
   history.

3. AI chat bubble visibility on restart (Gary KO4BSR)
   `web/index.html`'s page-load visibility check only inspected
   `has_claude_key || has_gemini_key` — Local / OpenAI / OpenRouter
   users lost the red AI bubble every wolfstack restart and had to
   re-save Settings → AI Agent to bring it back (the Save handler
   force-shows it). Now checks all five provider fields.
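
The content-side recovery and allowlist from item 1 can be illustrated with a deliberately small sketch. It is not `extract_tool_calls_from_content`: the function names are hypothetical, `restart_service` in the usage note is a made-up tool name, only the `<tool_call>` wrapper and the fenced-block form are handled, and a naive string scan for the `"name"` field stands in for a real JSON parser. It does show the two guards the commit calls out: wrappers match only at position 0 of the trimmed string, and names outside the allowlist are dropped.

```rust
/// Three backticks, written with escapes so this listing stays fence-safe.
const FENCE: &str = "\x60\x60\x60";

/// Unwrap <tool_call>...</tool_call>, but only when the wrapper opens the trimmed text.
fn strip_xml_wrapper(content: &str) -> Option<&str> {
    let inner = content.trim().strip_prefix("<tool_call>")?;
    let close = inner.rfind("</tool_call>")?;
    Some(inner[..close].trim())
}

/// Unwrap a fenced block, locating the closing fence with rfind.
fn strip_json_fence(content: &str) -> Option<&str> {
    let rest = content.trim().strip_prefix(FENCE)?;
    let rest = rest.strip_prefix("json").unwrap_or(rest);
    let close = rest.rfind(FENCE)?;
    Some(rest[..close].trim())
}

/// Naive stand-in for JSON parsing: reads the first `"name": "..."` value.
/// Does not handle escaped quotes or nested objects.
fn naive_name_field(json: &str) -> Option<&str> {
    let key = "\"name\"";
    let after_key = &json[json.find(key)? + key.len()..];
    let after_colon = after_key.trim_start().strip_prefix(':')?.trim_start();
    let value = after_colon.strip_prefix('"')?;
    let end = value.find('"')?;
    Some(&value[..end])
}

/// Returns the tool name only if the content is a recognised tool-call shape
/// AND the name is on the caller's allowlist.
pub fn recover_tool_name(content: &str, allowed_tool_names: &[&str]) -> Option<String> {
    let payload = strip_xml_wrapper(content)
        .or_else(|| strip_json_fence(content))
        .unwrap_or_else(|| content.trim());
    // A tool call must be the whole payload, not JSON buried in prose.
    if !payload.starts_with('{') {
        return None;
    }
    let name = naive_name_field(payload)?;
    if allowed_tool_names.contains(&name) {
        Some(name.to_string())
    } else {
        None
    }
}
```

With an allowlist of `["restart_service"]`, the prose-like `{"name": "nginx", "status": "stopped"}` yields `None`, and so does a `<tool_call>` wrapper that appears after leading prose.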

Diagnostic logging
- `call_local`: previously-debug-only logs escalated to info/warn
  so the next "local AI not responding" report has actionable
  signal at default log levels.
- Empty-response error message now includes finish_reason +
  body_size so context-overflow on small models is identifiable
  from the UI.
- Body-size measurement at debug level (RUST_LOG=wolfstack::ai=debug
  to enable) — surfaces the #1 small-model failure mode.

Independent code-reviewer pass: two BLOCKERs (mid-prose XML match,
unrestricted tool-name acceptance) + four MAJORs (synthetic ID
format, fenced-block stripping, info-log flooding) all addressed
before commit. All findings have negative-case regression tests
pinning the fix.

wolfsoftwaresystemsltd pushed a commit that referenced this pull request May 28, 2026
Three Sponsor-reported fixes ship together.

* KO4BSR 2026-05-28: 1300+ defunct dnsmasq processes parented to
  wolfstack on his node. Root cause: `Command::new("dnsmasq").spawn()`
  in both `ensure_lxcbr0_services` (containers) and the per-TAP
  WolfNet DHCP setup (vms) launched dnsmasq and dropped the Child
  handle without ever calling `.wait()`. dnsmasq daemonizes by double-
  fork, so its initial process exits the moment the daemon is forked
  — and that initial process becomes a zombie under wolfstack
  because nothing reaps it. Switched both call sites to `.status()`
  so the parent dnsmasq is reaped synchronously after its quick
  daemonize fork. Restarting wolfstack clears the existing pile.

* Klas (Sponsor) 2026-05-28 #1: editing the listen port on one node
  wiped `/etc/wolfnet/config.toml` and wolfnet then exited on every
  start. WolfStack-side `save_wolfnet_config` and the four other
  paths that rewrote the file now route through one helper that
  refuses empty/malformed payloads, snapshots the existing file to
  `config.toml.bak`, and atomic-renames a `.tmp` into place. wolfnet
  0.5.25 (shipped separately) adds the matching self-heal on the
  load side: if config.toml is missing or empty but .bak exists,
  it gets restored before parsing.

* Klas (Sponsor) 2026-05-28 #2: "trying to change a network setting
  on a vm and an unable to click save button; a little later it
  says settings saved but settings have not been changed" on a
  Proxmox node. Two issues:
    - The Save/Cancel footer in the VM-settings modal scrolled with
      the body and could end up below the viewport on small screens.
      Made it position:sticky to the bottom of the modal-body so it
      is always reachable.
    - On the Proxmox update path, `qm set --net0`, `qm set --net1`,
      and the WolfNet bridge reconcile all logged a warning and
      returned Ok(()) on failure — so the handler answered 200 to
      the API and the UI showed a success toast while PVE had
      rejected the change. All three now propagate the error so the
      frontend surfaces what PVE actually said. `qm set --delete
      net1` keeps treating "net1 not in config" as a no-op since
      that's the desired state.
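
The config-write helper described in the first Klas item (refuse empty or malformed payloads, snapshot to `.bak`, atomic-rename a `.tmp` into place) might look roughly like the sketch below. `write_config_safely` and `sibling` are hypothetical names. The `validate` callback is an assumed stand-in for whatever TOML check the real helper performs. Atomicity relies on `.tmp` sitting in the same directory as the target.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Build "config.toml.bak" / "config.toml.tmp" next to "config.toml".
fn sibling(path: &Path, suffix: &str) -> PathBuf {
    let mut name = path.file_name().map(|n| n.to_os_string()).unwrap_or_default();
    name.push(suffix);
    path.with_file_name(name)
}

pub fn write_config_safely(
    path: &Path,
    payload: &str,
    validate: impl Fn(&str) -> bool, // assumption: caller supplies the format check
) -> io::Result<()> {
    // 1. Never replace a working config with nothing or with garbage.
    if payload.trim().is_empty() || !validate(payload) {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            "refusing to write empty or malformed config",
        ));
    }

    // 2. Snapshot the current file so the loader can fall back to it.
    if path.exists() {
        fs::copy(path, sibling(path, ".bak"))?;
    }

    // 3. Write the new content to a temp file and flush it to disk.
    let tmp = sibling(path, ".tmp");
    {
        let mut f = fs::File::create(&tmp)?;
        f.write_all(payload.as_bytes())?;
        f.sync_all()?;
    }

    // 4. Rename over the target; readers see the old or the new file, never a partial one.
    fs::rename(&tmp, path)
}
```

Routing every writer through one function like this means a rejected payload leaves both `config.toml` and its `.bak` untouched.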