Atomic Chat

Local AI app and inference engine for agents. Run open-weight LLMs locally — private, on your machine.

Getting Started · Discord · X / Twitter · Bug Reports

Download

Desktop

Mobile

or grab any build from atomic.chat · GitHub Releases — latest: v1.1.95

Contributors

Atomic Chat is built by a small team and a handful of community contributors. Pull requests welcome — see CONTRIBUTING.md for how to get started.

Features

Local models

Run open-weight LLMs locally from HuggingFace — Llama, Gemma, Qwen, Mistral, Phi, and others
Multi-Token Prediction (MTP) speculative decoding — 30–70% throughput boost on supported models, up to 3× on Gemma 4
DFlash block-diffusion decoding — up to 6× faster on Qwen 3.6, Gemma 4, Kimi K2.5
Flash Attention toggle (on / off / auto)
Automatic reasoning-context tracking for chain-of-thought models
Auto context-window expansion with overflow notifications
EAGLE-3 speculative decoding for Gemma 4 on Apple Silicon (MLX)
MTP on MLX for Qwen 3.5 / 3.6 and DeepSeek V4
TurboQuant KV cache on MLX-VLM — smaller memory footprint via RHT-correct fast paths

Cloud models

Built-in providers: OpenAI, Anthropic, Mistral, Groq, MiniMax, Qwen, Moonshot
Bring your own key, switch model per chat, mix local and cloud freely

Tools & integrations

One-click agent launch — launch OpenCode and GitHub Copilot CLI agents in one click from the Integrations tab
Artifacts — live preview panel for HTML/CSS/JS code with copy, download and print
Connect multiple MCP servers — bring your own tools, file access, web search
In-app log viewer for MCP tool calls
Custom assistants with per-assistant system prompts
Projects with conversation tree view in the sidebar

Local API

OpenAI-compatible server at http://localhost:1337/v1 — drop-in replacement for the OpenAI SDK
Works with any agent, CLI, or IDE plugin that speaks the OpenAI API
Bound to 127.0.0.1 by default; set host: 0.0.0.0 to expose on LAN

Privacy

Everything runs locally when you want it to — local server is loopback-only by default
Your conversations and keys stay on your machine

Inference Engines

Three engines under the hood, all exposed through one OpenAI-compatible API at http://localhost:1337/v1:

atomic-llama-cpp-turboquant — our llama.cpp fork with TurboQuant optimizations for faster quantized inference. Cross-platform (macOS, Windows, Linux), CPU and GPU.
Upstream llama.cpp — official ggml-org build, used on Windows by default for the widest hardware coverage and MTP support.
MLX-VLM — Apple Silicon-native engine for vision-language models, running on the Neural Engine and unified memory. Faster than llama.cpp on M-series chips for supported models.

Speculative-decoding features available across backends:

MTP (Multi-Token Prediction) — a draft model predicts ahead, the full model verifies in one pass. Available on macOS and Windows.
DFlash — block-diffusion speculative decoding for Qwen 3.6, Gemma 4, Kimi K2.5 and others. Apple Silicon only; can't be enabled together with MTP.
Flash Attention — Settings → on / off / auto.

Tools talking to http://localhost:1337/v1 don't need to know which backend is running underneath — switch engines without reconfiguring clients.

Launch With

Atomic Chat runs an OpenAI-compatible server at http://localhost:1337/v1, so any agent, CLI, IDE plugin, or app that speaks the OpenAI API can run on top of your local models — no extra glue needed. Just point its base URL at Atomic Chat and you're done.

A few projects already ship first-class support with their own setup docs:

Tool	What it is	Setup
OpenCode	Open-source TUI coding agent. Add Atomic Chat as a local provider in `opencode.json`.	Setup guide →
OpenClaude	Open-source coding-agent CLI for cloud and local models. Lists Atomic Chat as a supported provider.	Providers list →
Goose	Open-source extensible AI agent (CLI, desktop, API).	Setup guide →
Hermes Desktop	Native desktop companion for Hermes Agent. Includes an Atomic Chat local preset at `http://localhost:1337/v1`.	Repo →
Hermes Workspace	Local-first agent workspace built on Nous Research's Hermes. Uses Atomic Chat as its inference backend.	Repo →
nanobot	Ultra-lightweight personal AI agent with chat channels, MCP, and WebUI.	Repo →
nanoclaw	Containerized agent runtime that calls Atomic Chat as an MCP tool.	Skill guide →

Built something that runs on Atomic Chat? Open a PR and we'll add it here.

Build from Source

Prerequisites

Node.js ≥ 20.0.0
Yarn ≥ 4.5.3
Make ≥ 3.81
Rust (for Tauri)
(Apple Silicon) MetalToolchain xcodebuild -downloadComponent MetalToolchain

Run with Make

git clone https://github.com/AtomicBot-ai/Atomic-Chat
cd Atomic-Chat
make dev

This handles everything: installs dependencies, builds core components, and launches the app.

Available make targets:

make dev — full development setup and launch
make build — production build
make test — run tests and linting
make clean — delete everything and start fresh

Manual Commands

yarn install
yarn build:tauri:plugin:api
yarn build:core
yarn build:extensions
yarn dev

System Requirements

macOS: 13.6+ (8GB RAM for 3B models, 16GB for 7B, 32GB for 13B)
Windows: 10/11 x64 (same RAM recommendations as macOS)
Linux: x86_64, glibc ≥ 2.35 (Ubuntu 22.04+, Debian 12+, Fedora 40+, Arch, Mint, Pop!_OS — same RAM recommendations as macOS). Optional: a Vulkan loader (vulkan-1 package, or mesa-vulkan-drivers / proprietary NVIDIA driver) for GPU acceleration.
iOS: 17+ (download from App Store)
Android: download from Google Play

Running on Linux

Atomic Chat ships as a single self-contained .AppImage — no installer, no root:

chmod +x Atomic.Chat_*_amd64.AppImage
./Atomic.Chat_*_amd64.AppImage

If prompted about FUSE on first launch: sudo apt install fuse libfuse2 (Debian/Ubuntu) or sudo dnf install fuse fuse-libs (Fedora). GPU acceleration (Vulkan) is auto-detected on first launch; only GGUF models run on Linux.

Troubleshooting

If something isn't working:

Copy your error logs and system specs
Open an issue on GitHub
Or ask for help in our Discord

Star History

License

Apache 2.0 — see LICENSE for details.

Acknowledgements

Built on the shoulders of giants:

Name		Name	Last commit message	Last commit date
Latest commit History 8,224 Commits
.cargo		.cargo
.devcontainer		.devcontainer
.github		.github
.husky		.husky
.probe		.probe
assets		assets
autoqa		autoqa
core		core
docs		docs
downloads		downloads
extensions		extensions
foundation-models-server		foundation-models-server
mlx-server		mlx-server
scripts		scripts
src-tauri		src-tauri
tests		tests
web-app		web-app
.gitattributes		.gitattributes
.gitignore		.gitignore
.prettierignore		.prettierignore
.prettierrc		.prettierrc
.yarnrc.yml		.yarnrc.yml
AGENTS.md		AGENTS.md
CONTRIBUTING.md		CONTRIBUTING.md
DEVELOP.md		DEVELOP.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
demo.gif		demo.gif
package.json		package.json
vitest.config.ts		vitest.config.ts
yarn.lock		yarn.lock

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Atomic Chat

Download

Contributors

Features

Inference Engines

Launch With

Build from Source

Prerequisites

Run with Make

Manual Commands

System Requirements

Running on Linux

Troubleshooting

Star History

License

Acknowledgements

About

Uh oh!

Releases 27

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Atomic Chat

Download

Contributors

Features

Inference Engines

Launch With

Build from Source

Prerequisites

Run with Make

Manual Commands

System Requirements

Running on Linux

Troubleshooting

Star History

License

Acknowledgements

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 27

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages