Skip to content
Open
89 changes: 87 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,8 @@ A wire format and a verify function. See [POSITIONING.md](POSITIONING.md).

[![CI](https://github.com/Aliipou/authgate-kernel/actions/workflows/ci.yml/badge.svg)](https://github.com/Aliipou/authgate-kernel/actions)
[![Rust](https://img.shields.io/badge/kernel-Rust-orange.svg)](authgate-kernel/)
[![Tests](https://img.shields.io/badge/tests-1155%20passing-brightgreen.svg)](tests/)
[![Tests](https://img.shields.io/badge/tests-1297%20passing-brightgreen.svg)](tests/)
[![Runtime red team](https://img.shields.io/badge/runtime%20red--team-1000%20cases%2C%200%20escapes-brightgreen.svg)](redteam/)
[![Kani](https://img.shields.io/badge/Kani-24%20harnesses-green.svg)](formal/)
[![Lean4](https://img.shields.io/badge/Lean4-16%20theorems-blue.svg)](formal/lean4/)
[![License: PolyForm Noncommercial 1.0.0](https://img.shields.io/badge/License-PolyForm--Noncommercial--1.0.0-orange.svg)](LICENSE)
Expand Down Expand Up @@ -67,6 +68,88 @@ This is the same principle as capability-based OS security (seL4, CHERI), applie

---

## Agent runtime (operational layer)

`src/authgate/runtime/` is a minimal, working agent runtime that puts the gate on
the real path — **not** part of the TCB:

```
intent → Planner → [PlanStep] → runtime loop → verify(action) → sandboxed tool → RunLog + AuditLog
│ deny → stop, nothing further runs
```

Single agent, 3 tools (calculator, file_read, web_search), deterministic planner.
A denied step halts the plan. Two things make it more than a demo:

**1. It can run on the *verified* kernel (opt-in).** `build_runtime(backend="rust")`
routes each permit/deny decision into the Kani/Lean-verified Rust engine
(`engine::verify`) over a **JSON** boundary (`verify_json`), returning an
ed25519-signed verdict — no Rust objects enter Python. When that backend is
selected, the decision is made by the verified code instead of the Python
reimplementation. **The default is `backend="python"`** (the pure-Python
reference verifier), and that is the path CI runs — the Rust backend requires the
compiled extension (below) and is **not** exercised by CI. `"auto"` picks Rust
only when the extension is importable. Epoch-revocation semantics (absent from the
wire format) are preserved by serializing only currently-valid claims.

**2. Tools can run in a real sandbox (opt-in).** Pass a `SandboxPolicy` and each
tool executes in an isolated subprocess under a wall-clock deadline and output cap
(every platform), plus opt-in POSIX rlimits (CPU/memory/file-size). A tool that
hangs, crashes, or runs away is *killed and reaped* → clean denial, rather than an
in-process check. It bounds time/memory/output and isolates crashes; it is **not**
a network or syscall jail, and it is **off by default** (`sandbox_policy=None`).

### Adversarial verification — 1000 engineers

`redteam/runtime_redteam.py` synthesizes 1000 deterministic adversarial
"engineers" across 10 attack classes (sandbox escape, capability forgery,
calculator injection, plan injection, revocation/epoch bypass, denial probing,
audit tampering, replay-after-deny, argument fuzzing, identity spoofing). The bar
is **zero escapes** — any one is a hard failure.

| Backend | Result |
|---|---|
| Pure-Python verifier | 1000/1000 blocked, 0 escapes |
| **Verified Rust engine** (via JSON wire) | 1000/1000 blocked, 0 escapes |

Two real vulnerabilities were found and fixed by this harness: an unbounded-`**`
calculator DoS (a `2**2**2**2**2**2` that hangs the process) and a Windows
reserved-device-name sandbox bypass (`CON` blocks forever; `NUL`/`COMn` open
devices). Both are now regression-covered.

**Scope, honestly:** these are *self-authored* attack classes against the 3 MVP
tools. A green result is evidence that the *known* vectors are closed on both
backends — it is **not** a proof of security and **not** a substitute for
independent external review (still an open milestone).

### Benchmark — cost of a verified decision

`verify()` latency per gated tool call (`redteam/bench_verify.py`):

| Path | Latency | Throughput |
|---|---|---|
| Pure-Python `FreedomVerifier` | ~38 µs/decision | ~26,000 /s |
| Verified Rust engine (JSON wire + ed25519 sign) | ~544 µs/decision | ~1,800 /s |

Routing through the verified engine costs ~14× (JSON marshalling + per-call
signing) but stays sub-millisecond. Figures from one machine (Windows, CPython
3.13); treat as order-of-magnitude, not a spec.

### Building the verified extension

The Rust kernel exposes a PyO3 module (`authgate_kernel`). Build it with:

```bash
# ASCII build dir (the C toolchain mangles non-ASCII paths); GNU toolchain.
CARGO_TARGET_DIR=/tmp/akbuild cargo build --lib --release
# copy the cdylib next to the package as authgate_kernel.{pyd,so}
```

The runtime falls back to the pure-Python verifier wherever the extension is not
built (CI included), so the runtime tests are green with or without it.

---

## What it does NOT do

| Not this | Why |
Expand Down Expand Up @@ -351,8 +434,10 @@ The gap between `Permit/Deny` and actual constrained execution:

| Gap | Status | What closes it |
|---|---|---|
| **Runtime decides on the verified TCB** | Opt-in (`backend="rust"`); needs the built extension; **not run by CI** (CI uses Python) | A CI job that builds the extension and runs the runtime suite on the Rust backend |
| **Real tool sandbox** (process isolation) | Opt-in (`SandboxPolicy`, off by default); bounds time/memory/output + isolates crashes; **not** a syscall/network jail | seccomp / namespaces / WASM for syscall + network confinement |
| **WASM sandbox** (`cargo build --features sandbox`) | Blocked: Windows SDK kernel32.lib missing | Install Windows SDK 10.0.22621 or build on Linux |
| **OS-level confinement** (seccomp-bpf) | Not implemented | Wrap tool subprocess with seccomp filter |
| **OS-level confinement** (seccomp-bpf, network jail) | Partial: process isolation + rlimits done; no syscall/network jail | seccomp filter / namespaces / WASM around the tool subprocess |
| **End-to-end integration test** | **Done** (`tests/test_integration_e2e.py`) | 18 assertions: tool call → gate → audit chain |
| **TLC model checker** | Java not installed | `java -jar tla2tools.jar -tool MC_AuthGateV3` |
| **CLI** | Exists; not packaged | `pip install authgate-kernel` |
Expand Down
2 changes: 2 additions & 0 deletions examples/sandbox/notes.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,2 @@
hello from sandbox
revenue Q1: 1200000
66 changes: 66 additions & 0 deletions redteam/bench_verify.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,66 @@
"""
Benchmark: cost of an authorization decision, pure-Python verifier vs the
verified Rust engine reached over the JSON wire (RustBackedVerifier).

This is the decision-relevant number for the runtime: every gated tool call pays
one verify(). It answers "what does routing through the verified TCB cost per
call?" Run: AUTHGATE_BACKEND=python python redteam/bench_verify.py
"""
from __future__ import annotations

import os
import sys
import time
from pathlib import Path

os.environ.setdefault("AUTHGATE_BACKEND", "python")
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))

from authgate.kernel.entities import ( # noqa: E402
AgentType,
Entity,
Resource,
ResourceType,
RightsClaim,
)
from authgate.kernel.registry import OwnershipRegistry # noqa: E402
from authgate.kernel.verifier import Action, FreedomVerifier # noqa: E402
from authgate.runtime.rust_backend import RustBackedVerifier, rust_backend_available # noqa: E402


def _scenario():
owner = Entity("operator", AgentType.HUMAN)
agent = Entity("agent-1", AgentType.MACHINE)
resource = Resource("compute", ResourceType.COMPUTE_SLOT)
registry = OwnershipRegistry()
registry.register_machine(agent, owner)
registry.add_claim(RightsClaim(owner, resource, can_read=True, can_delegate=True))
registry.delegate(RightsClaim(agent, resource, can_read=True), delegated_by=owner)
return registry, Action("t1", agent, resources_read=[resource])


def _bench(label: str, verify, action, n: int) -> None:
verify(action) # warm up
t0 = time.perf_counter()
for _ in range(n):
verify(action)
elapsed = time.perf_counter() - t0
per = elapsed / n * 1e6 # microseconds per decision
print(f" {label:32} {per:9.2f} us/decision {n/elapsed:12,.0f} decisions/s")


def main() -> int:
registry, action = _scenario()
n = 20000
print(f"verify() latency over {n:,} permitted decisions (AUTHGATE_BACKEND="
f"{os.environ.get('AUTHGATE_BACKEND')}):")
_bench("pure-Python FreedomVerifier", FreedomVerifier(registry).verify, action, n)
if rust_backend_available():
_bench("verified Rust engine (JSON wire)", RustBackedVerifier(registry).verify, action, n)
else:
print(" verified Rust engine: SKIPPED (authgate_kernel extension not built)")
return 0


if __name__ == "__main__":
sys.exit(main())
Loading
Loading