yabby

Your AI agent. Off the grid. Under your roof.

CI License: MIT Status: Beta Platform: Apple Silicon Shell: bash 5+

yabby turns an Apple Silicon Mac into a hardened, fully local AI agent stack. LLM inference runs natively on-device via two dedicated mlx-lm server instances — an 8B tool model (Qwen3-8B, 86% tool-call accuracy) and a 9B reasoning model (Qwen3.5-9B, long-form generation) — both with 3-bit KV cache quantization, ~10 GB total weights, running simultaneously on a 24 GB M4 Pro. Email, calendar, search, DNS, backups, and remote access all run on your hardware. No cloud LLM providers. No hardcoded domain, IP, or identity anywhere. A guided wizard asks for everything.

The install runs in a single script on the Mac Mini itself — keyboard attached, ~20 minutes plus model download. When the wizard finishes it emails you device-agnostic connection instructions so you can unplug the keyboard and display and access everything from any device (phone, tablet, laptop, whatever you have). Full Quick Start below.

Status: beta (v0.8.x). Two full dress-rehearsal rebuilds completed clean end-to-end. Every component has an automated test and the blog tunnel roundtrip is verified. Expect to file an issue or two — that's the whole point. See the changelog.


Why this exists

You want a personal AI agent that reads your email, manages your calendar, summarises your news feeds, searches the web on your behalf, posts to your blog, and runs scheduled jobs. You don't want a second subscription fee, a second privacy policy, a second "we may use your content to improve our models" clause, or a second party between you and the hardware you already own.

You have an Apple Silicon Mac gathering dust — or one you can buy used for about the price of two years of a chat-product subscription — and it can run a dual-model stack (8B tool + 9B reasoning) at roughly the same speed as the hosted alternatives. The models, the inference runtime, the container stack, the DNS layer, the backup tool, and the mesh VPN are all already open source. The missing piece is the twenty-minute install from "freshly booted Mac" to "Telegram bot that reads your inbox." That's yabby.

It is not the easiest way to have an AI agent. It is, as far as I can tell, the easiest way to have one that nobody else can read, train on, throttle, deprecate, or turn off.


What you get

| Component | What it does |
|---|---|
| OpenClaw (local agent) | AI assistant accessible via Telegram; reads email, checks calendar, searches the web, runs scheduled jobs, posts to your blog |
| mlx-lm (dual model) | Two native Apple Silicon inference servers: Qwen3-8B (4.3 GB, port 11435) for tool calling, Qwen3.5-9B (5.6 GB, port 11436) for long-form generation. ~10 GB total weights, 3-bit KV cache quantization, ~10 GB left for KV caches. No API keys, no cloud |
| Resend (email API) | Outbound + inbound mail via HTTPS. No local MTA, no port 25 exposed, no Spamhaus PBL nightmare |
| Radicale (CalDAV) | Local CalDAV server; plain .ics files on disk that restic can trivially back up |
| Blocky (DNS firewall) | Threat/ad/crypto denylists, strict per-container egress allowlist |
| SearXNG (private search) | Self-hosted metasearch (DuckDuckGo + Brave + Startpage + Mojeek; Google/Bing off by default) |
| Tailscale | Reach the agent from anywhere; optional exit node through your home network |
| restic | Hourly encrypted backups to an external drive |
| Prompt Shield | Distroless container running a ProtectAI DeBERTa-v3 classifier in front of mlx-lm; transparent intercept enforced by the egress firewall so the gateway physically cannot bypass it (see docs/prompt-shield-architecture.md) |
| Specialist nodes (optional) | Satellite agents on Raspberry Pi 5/4 that delegate work back to the primary over Tailscale |

Everything runs inside a locked-down Docker Compose stack: read_only rootfs, no-new-privileges, cap_drop: ALL, memory limits, two network segments (an internal bridge with no internet + an egress bridge with filtered access), per-container egress rules enforced by an iptables sidecar on the DOCKER-USER chain, SHA256-pinned images re-verified on every scripts/update.sh.
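As a sketch of what those flags look like in practice, here is a representative Compose stanza — the service name, digest placeholder, and limits are illustrative, not the project's actual compose file:

```yaml
services:
  searxng:
    image: searxng/searxng@sha256:<pinned-digest>  # SHA256-pinned, never a mutable tag
    read_only: true                                # immutable root filesystem
    security_opt:
      - no-new-privileges:true                     # no setuid escalation
    cap_drop:
      - ALL                                        # drop every Linux capability
    mem_limit: 512m
    tmpfs:
      - /tmp                                       # the only writable scratch space
    networks:
      - openclaw-egress                            # filtered internet only

networks:
  openclaw-internal:
    internal: true                                 # bridge with no internet access
  openclaw-egress:
    driver: bridge                                 # egress filtered by the iptables sidecar
```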


Requirements

Hardware

  • Apple Silicon Mac (M2 Pro or later recommended, 24 GB+ unified memory — both models loaded simultaneously consume ~10 GB of weights; the 16 GB tier can't hold both models plus KV caches at our prompt sizes)
  • macOS 14 (Sonoma) or later
  • FileVault enabled (the installer hard-requires it — all secrets at rest rely on it)
  • External drive for backups (optional but strongly recommended)
  • A keyboard and display to run the installer (you unplug them when setup is done — a cheap dummy HDMI plug keeps GPU clocks up when running headless)

Accounts & credentials (the shopping list)

The wizard will prompt for each of these and open the right browser tab when needed. Create them as the wizard asks — don't try to do it all in advance; you'll forget which is which.

| What | Cost | Where |
|---|---|---|
| A domain name you control | ~$10/year | any registrar |
| Cloudflare account (recommended) | free | cloudflare.com — point the domain's nameservers here |
| Resend account | free tier: 3k emails/month (see note below) | resend.com |
| Telegram bot token | free | @BotFather → /newbot |
| Your Telegram user ID | free | @userinfobot |
| Tailscale account | free for up to 3 users / 100 devices | tailscale.com |
| iCloud / Google / M365 iCal URLs (optional) | free | read-only calendar sharing — wizard has per-provider instructions |

Resend quota note: the free tier's 3000/month is shared between sent and received messages. Typical personal volume (50 out + 50-100 in per day) fits comfortably. Forwarding a high-volume mailing list to your domain will eat it alive — watch the x-resend-monthly-quota response header and upgrade to the $20/month tier if you cross it. Not our margin, but your problem.
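If you want a cron job to watch that header for you, something like the sketch below works — it assumes the header carries used and limit values separated by a semicolon, which you should verify against Resend's actual response format before relying on it:

```shell
# Sketch: warn when the Resend monthly quota passes 90%.
# ASSUMPTION: header format "x-resend-monthly-quota: <used>;<limit>".
quota_check() {
  local header="$1"            # e.g. "x-resend-monthly-quota: 2750;3000"
  local used limit
  used=${header##*: }          # strip everything up to ": "  -> "2750;3000"
  limit=${used##*;}            # keep the part after ";"      -> "3000"
  used=${used%%;*}             # keep the part before ";"     -> "2750"
  if (( used * 10 >= limit * 9 )); then
    echo "WARNING: ${used}/${limit} emails used this month"
  else
    echo "OK: ${used}/${limit}"
  fi
}
```

Feed it the header line from any authenticated API response, e.g. `quota_check "$(curl -sD - -o /dev/null https://api.resend.com/domains -H "Authorization: Bearer $RESEND_API_KEY" | grep -i x-resend-monthly-quota)"`.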


Quick Start

Everything runs in a single script on the Mac Mini. A fully non-interactive mode driven by YABBY_* environment variables is available for automated / CI installs — see docs/install-env-vars.md for the manifest. The walkthrough below uses the friendlier interactive path.
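The non-interactive shape is roughly this — the variable names below are illustrative placeholders, not the real manifest; docs/install-env-vars.md is authoritative:

```shell
# Illustrative placeholders only -- see docs/install-env-vars.md for real names.
export YABBY_DOMAIN="example.com"
export YABBY_TELEGRAM_BOT_TOKEN="123456:ABC..."
export YABBY_TELEGRAM_USER_ID="987654321"
export YABBY_RESEND_API_KEY="re_..."
curl -fsSL https://getyabby.com/install.sh | bash
```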

Boot the Mini with keyboard and monitor attached. Sign in as the user who will own the install. Paste this into Terminal:

curl -fsSL https://getyabby.com/install.sh | bash

In roughly 20 minutes it:

  • Flips macOS into server mode: power settings, firewall, telemetry off, App Nap disabled, fast key repeat, FileVault enforced
  • Installs Xcode Command Line Tools, Homebrew, and Tailscale
  • Joins the Mini to your tailnet (you'll click a browser link)
  • Enables SSH and VNC (Screen Sharing) — both Tailscale-protected
  • Installs mlx-lm and applies the KV cache patch
  • Launches the interactive setup wizard, which handles:
    • Domain + DNS (auto-configures via the Cloudflare API if you're on it)
    • Resend domain verification (DKIM/SPF/DMARC records)
    • Telegram bot registration + user allowlist
    • MLX model downloads (tool + reasoning, ~10 GB from Hugging Face)
    • mlx_lm.server launchd supervision (dual model, 3-bit KV cache)
    • OpenClaw config generation
    • Docker stack deployment
    • Tailscale exit-node setup (optional)
    • restic backup repo initialisation
    • launchd + openclaw-cron jobs for all scheduled work

When the wizard finishes it emails you connection instructions tailored to your actual settings — Tailscale download links, VNC address, SSH command, Telegram bot handle — for any device (iPhone, Android, Windows laptop, Chromebook, macOS, Linux). Then you unplug the keyboard and display and put the Mini wherever it fits.

Dummy HDMI tip: plug a cheap dummy HDMI adapter (~US$10, search "dummy HDMI plug 4K") into the Mini's HDMI port before unplugging your display. Without one, macOS reduces GPU clock speeds in headless mode, which slows down your AI models noticeably.


Architecture

Host (macOS, Apple Silicon)
├── mlx_lm.server (tool)    ── Qwen3-8B on 127.0.0.1:11435, 3-bit KV cache.
│                               Agent cron turns, tool calling (86% accuracy).
│                               Supervised by launchd.
├── mlx_lm.server (reason)  ── Qwen3.5-9B on 127.0.0.1:11436, 3-bit KV cache.
│                               Blog, digest, wiki, research generation.
│                               Supervised by launchd.
├── Tailscale ───────────────── WireGuard mesh VPN, optional exit node
├── restic ──────────────────── Hourly encrypted backups → external drive
│                                (or a local path for dogfood / testing installs)
│
└── Docker (OrbStack)
    ├── openclaw-gateway ── AI agent (Telegram, email, calendar, search)
    ├── radicale ─────────── CalDAV server (127.0.0.1:5232, htpasswd/bcrypt)
    ├── blocky ───────────── DNS firewall (127.0.0.1:53)
    ├── searxng ──────────── Private metasearch (127.0.0.1:8888)
    ├── egress-firewall ──── iptables per-container egress rules
    └── dozzle ───────────── Log viewer (127.0.0.1:9999, reachable from
                             other devices via `tailscale serve`)

Networks:
  openclaw-internal (172.27.0.0/24)  — no internet access
  openclaw-egress   (172.28.0.0/24)  — filtered internet via blocky DNS

Email flow:

Outbound:  agent  → Resend HTTPS API → recipient   (DKIM signed)
Inbound:   sender → Resend inbound   → poller      → ~/.yabby/inbox/new/
           (your iCloud/Gmail → auto-forward → you@yourdomain.com)
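The `new/` directory suggests a Maildir-style layout: write to `tmp/`, then atomically rename into `new/` so readers never see a half-written message. A minimal sketch of that delivery step — not the project's actual poller code, and the filename scheme is assumed:

```shell
# Sketch: drop one fetched message into a Maildir-style inbox.
deliver_message() {
  local inbox="${YABBY_INBOX:-$HOME/.yabby/inbox}"
  local body="$1"
  mkdir -p "$inbox/tmp" "$inbox/new"
  # Classic Maildir uniqueness: epoch.pid.host
  local name
  name="$(date +%s).$$.$(hostname)"
  printf '%s\n' "$body" > "$inbox/tmp/$name"   # write safely to tmp/ first...
  mv "$inbox/tmp/$name" "$inbox/new/$name"     # ...then rename atomically into new/
}
```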

Calendar flow:

iCloud / Google / M365 iCal URL → vdirsyncer (read-only pull)
                                        ↓
                                   Radicale CalDAV
                                        ↑
                                     agent
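The wizard generates the vdirsyncer configuration for you; a hand-written equivalent of the read-only pull would look roughly like this (URLs, collection paths, and credentials are placeholders):

```ini
# ~/.config/vdirsyncer/config -- one-way pull from a published iCal URL
# into the local Radicale collection. All values below are placeholders.
[general]
status_path = "~/.vdirsyncer/status/"

[pair personal]
a = "ical_remote"
b = "radicale_local"
collections = null

[storage ical_remote]
type = "http"    # read-only .ics fetch
url = "https://example.com/shared-calendar.ics"

[storage radicale_local]
type = "caldav"
url = "http://127.0.0.1:5232/user/personal/"
username = "user"
password = "..."
```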

Blog flow (optional):

public internet → Cloudflare edge → Cloudflare tunnel → Ghost CMS
                                                        (internal-only)

The Ghost container has no internet access — the only thing it talks to is cloudflared in the same internal network. cloudflared dials Cloudflare outbound and accepts zero inbound connections.


Tailscale Client Setup

The setup wizard emails you personalised connection instructions when it finishes. The short version: install Tailscale on each of your devices from tailscale.com/download, sign in with the same account used during setup, and your Mini appears in the device list immediately.

| Platform | Where to get it |
|---|---|
| iPhone / iPad | App Store → Tailscale |
| Android | Play Store → Tailscale |
| macOS | App Store or `brew install --cask tailscale` |
| Windows | tailscale.com/download/windows |
| Linux | tailscale.com/download/linux |
| ChromeOS | Play Store → Tailscale |

iOS SSID auto-toggle

Automatically disable the exit node when you're home (your home router is probably already filtering):

  1. Shortcuts app → Automations → + → Personal Automation
  2. Trigger: Wi-Fi → connect to [home SSID]
  3. Action: Set Exit Node (Tailscale) → off
  4. Create the reverse automation: on disconnect → Set Exit Node → on

Router port forward (for best performance)

If your router supports it (Eero, UniFi, OPNsense, etc.), forward UDP port 41641 to the Mac Mini. Tailscale falls back to DERP relays without this, but direct connections are noticeably faster.


Security

Full details in SECURITY.md, but the headlines:

  • SHA256-pinned Docker images, re-verified on every scripts/update.sh.
  • Both mlx_lm.server instances bound to 127.0.0.1 only (ports 11435 and 11436) — never the Tailscale IP unless specialist nodes are enabled. The gateway container reaches them via host.docker.internal:{11435,11436} through egress-firewall allow rules.
  • Every container read_only, no-new-privileges, cap_drop: ALL, memory-limited.
  • Blocky DNS firewall with threat/crypto/ad denylists and a strict egress allowlist per container.
  • No ClawHub skills — deny-by-default tool policy after the 2026 ClawHavoc supply chain incident. Not negotiable.
  • Prompt-injection defences in openclaw/SOUL.md: trust hierarchy, boundary markers on external content, confirmation gates on destructive actions, plus a ProtectAI DeBERTa-v3 classifier pre-filter on everything the agent reads from the outside world.
  • Secrets in ~/.yabby/secrets.env (mode 0600, FileVault encrypted at rest). No Keychain, no environment variables in logs.
  • Pre-commit secret-scanning (TruffleHog + custom patterns for Cloudflare / Resend / Tailscale / Telegram / HuggingFace / GitHub / AWS / OpenAI / Slack / Stripe / PEM). Both pre-commit and pre-push.
  • Minimum safe OpenClaw version: v2026.2.26 (CVE-2026-32922, CVSS 9.9).
  • Port 25 is not exposed to the internet. Inbound mail lands via Resend's MX, not via your home IP.
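You can spot-check the loopback-only binding yourself with `lsof -nP -iTCP:11435 -sTCP:LISTEN`. A small sketch that flags any non-loopback listener line — lsof's column layout varies slightly between versions, so treat this as illustrative:

```shell
# Sketch: pass one lsof LISTEN line; fails (exit 1) on any non-loopback bind.
loopback_only() {
  local line="$1"
  case "$line" in
    *'127.0.0.1:'*) echo "ok: loopback" ;;
    *)              echo "EXPOSED: $line"; return 1 ;;
  esac
}
```

Pipe each data line of the lsof output through it; any `EXPOSED:` line means a server is listening beyond localhost.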

Common Commands

scripts/start.sh           # Start mlx-lm servers + the Docker stack
scripts/stop.sh            # Graceful shutdown
scripts/health-check.sh    # Full health verification (all services)
scripts/backup.sh          # Run a restic backup now
scripts/restore.sh         # Interactive restic restore
scripts/update.sh          # Weekly update (git pull + image re-pin + rolling restart)
scripts/reset-local-state.sh --dry-run   # Simulate a clean wipe
scripts/reset-local-state.sh --execute --i-know-what-im-doing   # Actually wipe

scripts/agent-exec.sh "summarise today's news"    # Ask the agent something
                                                  # (fresh session per call; pass
                                                  # --reuse-session to continue the
                                                  # last conversation)

scripts/blog-admin-show.sh                        # Telegram me the Ghost admin creds

Most of these are also run on a schedule (launchd for the no-LLM jobs, native openclaw cron for LLM-invoking jobs). See scripts/install-launchd.sh and scripts/wizard.py::step_register_cron_jobs for the full list.
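For reference, a launchd job for the hourly backup might look like the plist below — the label and paths are illustrative; scripts/install-launchd.sh writes the real ones:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <!-- Illustrative label and paths only -->
  <key>Label</key>
  <string>com.example.yabby.backup</string>
  <key>ProgramArguments</key>
  <array>
    <string>/bin/bash</string>
    <string>/Users/you/yabby/scripts/backup.sh</string>
  </array>
  <key>StartInterval</key>
  <integer>3600</integer> <!-- hourly -->
  <key>StandardErrorPath</key>
  <string>/tmp/yabby-backup.err</string>
</dict>
</plist>
```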


Specialist Nodes

yabby supports a satellite-node architecture: a Raspberry Pi 5 (8 GB, running Pi OS) or Pi 4 acts as a secondary OpenClaw instance with its own small Ollama sidecar (llama3.2:1b). It handles routine tasks locally and escalates anything heavy back to the primary Mac Mini via Tailscale.

The specialist does not share the Mac Mini's inference backend — it runs its own Ollama because it's on different silicon. It also doesn't run Radicale, Blocky, SearXNG, or the egress firewall. It has one job: respond to sessions spawned by the primary.

Provision an SD card from your MacBook:

scripts/provision-specialist.sh --name research --target /Volumes/bootfs

Then put the SD card in the Pi, power it on, and the firstrun script will join Tailscale, install OpenClaw, and register itself. Full spec: docs/specialist-nodes.md.


Testing

# Unit tests (run in CI)
uv run pytest tests/ -v

# Offline integration smoke tests (no live credentials required)
scripts/test-guard.sh               # Prompt injection classifier
scripts/test-wiki.sh                # wiki-compile.sh end to end
scripts/test-stack.sh --teardown    # Full Docker stack up + probes + tear down

# Online tests (require live credentials for the service being tested)
CF_TOKEN=... CF_DOMAIN=... scripts/test-cf-tunnel.sh       # CF REST API
CF_TOKEN=... CF_DOMAIN=... scripts/test-cf-tunnel-live.sh  # Full tunnel e2e
RESEND_API_KEY=... RESEND_DOMAIN=... RESEND_FROM=... \
  scripts/test-resend-e2e.sh                               # Full mail loop

See CONTRIBUTING.md for the full test matrix and what's expected before opening a PR.


Contributing

PRs welcome — see CONTRIBUTING.md for the workflow (feature branches → main, auto-released on merge). Short version:

  • Read SECURITY.md before touching networking, credentials, or the prompt-injection defences
  • Run pre-commit install once after cloning so every commit is secret-scanned locally
  • No ClawHub skill PRs. No cloud LLM providers. No features that require opening new inbound ports
  • Conventional commits (feat:, fix:, refactor: …), terse and LLM-friendly

All contributions are governed by the Code of Conduct.

Personalise your feed list

configs/feeds.opml is gitignored — start from the template in configs/feeds.example.opml, curate freely, and never leak your reading list to a public repo. The hardcoded defaults in scripts/news-digest.sh and scripts/check-advisories.sh are generic security feeds (CISA, Schneier, Troy Hunt, etc.) — edit them if that's not your thing.
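If you're writing the OPML by hand, the minimal shape is this (the feed URL is a placeholder):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<opml version="2.0">
  <head><title>yabby feeds</title></head>
  <body>
    <outline text="Security">
      <!-- one outline per feed; xmlUrl is a placeholder -->
      <outline type="rss" text="Example advisory feed"
               xmlUrl="https://example.com/feed.xml"/>
    </outline>
  </body>
</opml>
```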


Thanks

yabby stands on a mountain of open-source work by other people. See THANKS.md for the full attribution — mlx-lm, Radicale, SearXNG, Blocky, Ghost, Tailscale, restic, uv, Cloudflare's tunnel, Dozzle, and dozens of smaller libraries that do most of the actual work.

The one thing yabby does is glue them all together with a wizard.


Models and licensing

Both default models — Qwen3-8B and Qwen3.5-9B from the Qwen team (Alibaba) — are released under Apache-2.0. Loading them does not require a click-through licence acceptance.

docs/MODELS.md documents the dual-model architecture, licensing, hardware requirements, and swap instructions. If you want to run different models, read it before running the wizard.


Changelog

Full history in CHANGELOG.md. Current and previous releases (with notes) at github.com/gofastercloud/yabby/releases.

License

MIT. yabby's own code is MIT-licensed. It installs and configures several upstream services under AGPL/GPL/etc licenses which run as independent network services in separate containers — see THANKS.md for the full breakdown of why this is compatible.

