otari gateway

OpenAI-compatible LLM gateway with API key management, budget enforcement, and usage tracking.

Why gateway?

The gateway sits between your applications and LLM providers so you can control access, cost, and observability in one place.

OpenAI-compatible endpoints (/v1/chat/completions, /v1/embeddings, /v1/models)
Virtual API key management (/v1/keys) for safe client access
User and budget controls (/v1/users, /v1/budgets)
Usage and pricing tracking (/v1/messages, /v1/pricing)
Health and metrics endpoints (/health, optional /metrics)
Built-in tools that the gateway runs itself: otari_code_execution (sandboxed Python REPL) and otari_web_search. See Built-in tools.

Quickstart

1) Install

uv venv
source .venv/bin/activate
uv sync --dev

2) Configure

cp config.example.yml config.yml

Edit config.yml and set at least:

master_key
one provider credential in providers (for example openai.api_key)

3) Run

uv run gateway serve --config config.yml

Open API docs at http://localhost:8000/docs.

Start in platform mode

Platform mode is enabled automatically when OTARI_AI_TOKEN is set.

Export platform env vars:

export OTARI_AI_TOKEN=gw_xxx

Start the gateway:

uv run gateway serve --config config.yml

Notes:

GATEWAY_MODE is optional; effective mode is derived from OTARI_AI_TOKEN.
If you explicitly set GATEWAY_MODE=platform, startup fails unless OTARI_AI_TOKEN is also set.
In platform mode, local providers configuration is not used.
The gateway/platform wire contract (resolve and usage endpoints, request/response shapes, retry semantics) is documented in docs/platform-protocol.md.
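The mode rules in the notes above can be sketched in a few lines. This is an illustration of the described behaviour, not the gateway's actual startup code: the function name resolve_mode and the non-platform mode name "standalone" are assumptions made for the example.

```python
from typing import Mapping


def resolve_mode(env: Mapping[str, str]) -> str:
    """Return the effective mode for a given environment (illustrative only)."""
    token = env.get("OTARI_AI_TOKEN", "").strip()
    requested = env.get("GATEWAY_MODE", "").strip().lower()

    # Explicitly asking for platform mode without a token is a startup error.
    if requested == "platform" and not token:
        raise RuntimeError("GATEWAY_MODE=platform requires OTARI_AI_TOKEN to be set")

    # Otherwise the token alone decides. "standalone" is a hypothetical label
    # for the non-platform mode; the real name may differ.
    return "platform" if token else "standalone"
```

Calling resolve_mode(os.environ) before starting the server is one way to surface a misconfigured environment early.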

First request (OpenAI SDK)

On startup, the gateway can bootstrap an API key and print it in the logs. Export it as GATEWAY_API_KEY, then call the gateway as you would any OpenAI-compatible server:

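import os

from openai import OpenAI

# Point the official OpenAI client at the gateway instead of api.openai.com.
client = OpenAI(
    api_key=os.environ["GATEWAY_API_KEY"],  # key bootstrapped by the gateway
    base_url="http://localhost:8000/v1",
)

# The model name carries a provider prefix, here openai.
response = client.chat.completions.create(
    model="openai:gpt-4o",
    messages=[{"role": "user", "content": "Hello from gateway"}],
)

print(response.choices[0].message.content)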

Local development

Run with hot reload and .env:

cp .env.example .env
make dev

Tests and checks

make test
make lint
make typecheck

Run a single test file:

uv run pytest tests/unit/test_gateway_cli.py -v

Docker

The gateway image is published on Docker Hub.

Run with docker compose (gateway + PostgreSQL)

cp config.example.yml config.yml
docker compose up -d

Run with docker only

docker run --rm \
  -p 8000:8000 \
  -v "$(pwd)/config.yml:/app/config.yml:ro" \
  mzdotai/otari:latest \
  gateway serve --config /app/config.yml

The gateway will be available at http://localhost:8000.

Built-in tools

The gateway can run two tools itself so that any model, including open-weight ones, gets parity with what frontier APIs expose as managed tools. Both are opt-in through the request's tools array, and their backing services run under docker compose profiles, so operators who don't use them don't pull extra images.

These use dedicated otari_* tool types. The tool type decides who runs the tool: an otari_* type means the gateway runs it. Every other tool type is passed through to the upstream provider untouched, so the provider runs it in its own native sandbox or search. This covers the legacy gateway short forms (code_execution, web_search) and the provider-native keywords (code_interpreter, code_execution_<date>, web_search_<date>). In particular, the bare code_execution and web_search short forms no longer trigger the gateway sandbox; use the otari_* types for that. Either way, the gateway still handles routing, observability, and billing.
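The routing rule above comes down to a prefix check on each tool's type. The sketch below only illustrates that rule; the helper names are hypothetical and it is not taken from the gateway's source.

```python
from typing import Any, Dict, Iterable, List, Tuple

GATEWAY_TOOL_PREFIX = "otari_"


def runs_on_gateway(tool_type: str) -> bool:
    """True when the tool type is one the gateway executes itself."""
    return tool_type.startswith(GATEWAY_TOOL_PREFIX)


def split_tools(
    tools: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Partition a request's tools array into (gateway-run, passed-through)."""
    gateway: List[Dict[str, Any]] = []
    passthrough: List[Dict[str, Any]] = []
    for tool in tools:
        target = gateway if runs_on_gateway(tool.get("type", "")) else passthrough
        target.append(tool)
    return gateway, passthrough
```

With this rule, {"type": "otari_code_execution"} lands in the first list, while {"type": "code_execution"} and {"type": "code_interpreter"} land in the second and would be forwarded to the provider unchanged.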

`otari_code_execution` — sandboxed Python REPL

{
  "model": "anthropic:claude-sonnet-4-6",
  "messages": [{"role": "user", "content": "Compute 23 factorial."}],
  "tools": [{"type": "otari_code_execution"}]
}

Bring up with docker compose --profile code-exec up. See demo/code-exec/ for a runnable walkthrough of both the gateway-managed and native-passthrough flows.

`otari_web_search` — current-information search

{
  "model": "anthropic:claude-sonnet-4-6",
  "messages": [{"role": "user", "content": "What's the latest stable Python release?"}],
  "tools": [{"type": "otari_web_search"}]
}

Bring up with docker compose --profile web-search up. See demo/web-search/ for a runnable walkthrough.

The bundled backend is a SearXNG metasearch container restricted to engines that don't forbid automated querying (duckduckgo, mojeek, qwant, wikipedia) — see scripts/searxng/settings.yml. Top results are fetched and content is extracted via trafilatura in-process so the model sees LLM-ready Markdown, not raw SERP snippets.

The free SearXNG engines rate-limit/CAPTCHA automated queries by IP, so they can be flaky for sustained use. For commercial or production use, swap the SearXNG container for a backend that uses a licensed API (Tavily, Brave Search API, Exa, Linkup, Serper). WebSearchBackend is configured purely by URL (GATEWAY_WEB_SEARCH_URL), so any HTTP service that exposes a SearXNG-compatible /search?format=json endpoint is a drop-in replacement — including thin adapters in front of commercial APIs. Adapters that already extract content can pass it through on the optional extracted_content result field to bypass the gateway-side extraction.
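A thin adapter of the kind described above might look like the following sketch, written with only the Python standard library. Several things here are assumptions for illustration: fetch_hits is a hypothetical stand-in for a licensed search API call, its field names (link, name, snippet, markdown) are invented, and the exact set of result fields the gateway reads is not confirmed beyond the SearXNG-style url, title, and content plus the optional extracted_content.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Dict, List
from urllib.parse import parse_qs, urlparse


def fetch_hits(query: str) -> List[Dict[str, Any]]:
    """Hypothetical placeholder: replace with a call to a licensed search API."""
    return [{"link": "https://example.com", "name": "Example",
             "snippet": f"Result for {query}", "markdown": None}]


def to_searxng_response(query: str, hits: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Map provider hits onto a SearXNG-style JSON body."""
    results = []
    for hit in hits:
        item = {"url": hit["link"], "title": hit["name"],
                "content": hit.get("snippet", "")}
        # Pass pre-extracted text through so the gateway can skip its own extraction.
        if hit.get("markdown"):
            item["extracted_content"] = hit["markdown"]
        results.append(item)
    return {"query": query, "results": results}


class SearchHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        parsed = urlparse(self.path)
        params = parse_qs(parsed.query)
        # Only /search?format=json is served, mirroring the SearXNG endpoint.
        if parsed.path != "/search" or params.get("format") != ["json"]:
            self.send_error(404)
            return
        query = params.get("q", [""])[0]
        payload = to_searxng_response(query, fetch_hits(query))
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the adapter until interrupted."""
    HTTPServer(("0.0.0.0", port), SearchHandler).serve_forever()
```

Calling serve() and pointing GATEWAY_WEB_SEARCH_URL at the adapter's address would be the intended wiring under these assumptions.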

A ready-to-run Brave Search adapter ships in scripts/web-search-brave-adapter/: set BRAVE_API_KEY and GATEWAY_WEB_SEARCH_URL=http://brave-adapter:8080, then docker compose --profile web-search-brave up -d --build brave-adapter gateway. See that folder's README for details and how to adapt it to another provider.

Per-tool overrides (max_results, allowed_domains, blocked_domains, purpose_hint) live on the tool entry; operator-level env knobs (GATEWAY_WEB_SEARCH_ENGINES, GATEWAY_WEB_SEARCH_MAX_RESULTS, GATEWAY_WEB_SEARCH_EXTRACT, GATEWAY_WEB_SEARCH_PURPOSE_HINT) live alongside GATEWAY_WEB_SEARCH_URL.
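As a sketch of how per-tool overrides could be attached, the helper below builds an otari_web_search tool entry and a chat request body around it. The override names come from the paragraph above, but their placement as top-level keys on the tool entry and their value types (an integer, lists of domain strings, a free-text hint) are assumptions; the helper names are hypothetical.

```python
from typing import Any, Dict, List, Optional


def web_search_tool(
    max_results: Optional[int] = None,
    allowed_domains: Optional[List[str]] = None,
    blocked_domains: Optional[List[str]] = None,
    purpose_hint: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a tool entry, including only the overrides that were provided."""
    overrides = {
        "max_results": max_results,
        "allowed_domains": allowed_domains,
        "blocked_domains": blocked_domains,
        "purpose_hint": purpose_hint,
    }
    tool: Dict[str, Any] = {"type": "otari_web_search"}
    tool.update({key: value for key, value in overrides.items() if value is not None})
    return tool


def chat_request(model: str, prompt: str, tool: Dict[str, Any]) -> Dict[str, Any]:
    """Assemble a /v1/chat/completions body that opts in to the tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [tool],
    }
```

For example, web_search_tool(max_results=3, allowed_domains=["python.org"]) yields an entry that narrows the search for a single request while leaving the operator-level defaults untouched.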

API surface

GET /health
POST /v1/chat/completions
POST /v1/embeddings
POST /v1/moderations
GET /v1/models
POST/GET /v1/keys
POST/GET /v1/users
POST/GET /v1/budgets
GET /v1/messages
GET /v1/pricing

Full schema: docs/public/openapi.json
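For quick checks against the endpoints listed above without installing an SDK, a standard-library sketch along these lines may help. It assumes the gateway accepts the key as an Authorization: Bearer header (which is what the OpenAI SDK sends) and that responses are JSON; BASE_URL and the helper names are illustrative.

```python
import json
import urllib.request
from typing import Any, Dict, Optional

BASE_URL = "http://localhost:8000"  # default address used in the Quickstart


def build_request(
    method: str,
    path: str,
    api_key: Optional[str] = None,
    body: Optional[Dict[str, Any]] = None,
) -> urllib.request.Request:
    """Prepare a request for one of the gateway endpoints."""
    headers = {"Accept": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(BASE_URL + path, data=data, headers=headers, method=method)


def call(request: urllib.request.Request) -> Any:
    """Send the request and decode the JSON response (assumed format)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the gateway running, call(build_request("GET", "/health")) checks liveness, and call(build_request("GET", "/v1/models", api_key=...)) lists the models visible to that key.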

Useful CLI commands

uv run gateway init-db --config config.yml
uv run gateway migrate --config config.yml
uv run gateway migrate --config config.yml --revision <rev>
uv run python scripts/generate_openapi.py --check

License

Apache 2.0. See LICENSE.
