Aliipou · Aliipou · Jun 1, 2026 · Jun 1, 2026 · Jun 10, 2026 · Jun 10, 2026
@@ -10,7 +10,8 @@ A wire format and a verify function. See [POSITIONING.md](POSITIONING.md).
 
 [![CI](https://github.com/Aliipou/authgate-kernel/actions/workflows/ci.yml/badge.svg)](https://github.com/Aliipou/authgate-kernel/actions)
 [![Rust](https://img.shields.io/badge/kernel-Rust-orange.svg)](authgate-kernel/)
-[![Tests](https://img.shields.io/badge/tests-1155%20passing-brightgreen.svg)](tests/)
+[![Tests](https://img.shields.io/badge/tests-1297%20passing-brightgreen.svg)](tests/)
+[![Runtime red team](https://img.shields.io/badge/runtime%20red--team-1000%20cases%2C%200%20escapes-brightgreen.svg)](redteam/)
 [![Kani](https://img.shields.io/badge/Kani-24%20harnesses-green.svg)](formal/)
 [![Lean4](https://img.shields.io/badge/Lean4-16%20theorems-blue.svg)](formal/lean4/)
 [![License: PolyForm Noncommercial 1.0.0](https://img.shields.io/badge/License-PolyForm--Noncommercial--1.0.0-orange.svg)](LICENSE)
@@ -67,6 +68,88 @@ This is the same principle as capability-based OS security (seL4, CHERI), applie
 
 ---
 
+## Agent runtime (operational layer)
+
+`src/authgate/runtime/` is a minimal, working agent runtime that puts the gate on
+the real path — **not** part of the TCB:
+
+```
+intent → Planner → [PlanStep] → runtime loop → verify(action) → sandboxed tool → RunLog + AuditLog
+                                                    │ deny → stop, nothing further runs
+```
+
+Single agent, 3 tools (calculator, file_read, web_search), deterministic planner.
+A denied step halts the plan. Two things make it more than a demo:
+
+**1. It can run on the *verified* kernel (opt-in).** `build_runtime(backend="rust")`
+routes each permit/deny decision into the Kani/Lean-verified Rust engine
+(`engine::verify`) over a **JSON** boundary (`verify_json`), returning an
+ed25519-signed verdict — no Rust objects enter Python. When that backend is
+selected, the decision is made by the verified code instead of the Python
+reimplementation. **The default is `backend="python"`** (the pure-Python
+reference verifier), and that is the path CI runs — the Rust backend requires the
+compiled extension (below) and is **not** exercised by CI. `"auto"` picks Rust
+only when the extension is importable. Epoch-revocation semantics (absent from the
+wire format) are preserved by serializing only currently-valid claims.
+
+**2. Tools can run in a real sandbox (opt-in).** Pass a `SandboxPolicy` and each
+tool executes in an isolated subprocess under a wall-clock deadline and output cap
+(every platform), plus opt-in POSIX rlimits (CPU/memory/file-size). A tool that
+hangs, crashes, or runs away is *killed and reaped* → clean denial, rather than an
+in-process check. It bounds time/memory/output and isolates crashes; it is **not**
+a network or syscall jail, and it is **off by default** (`sandbox_policy=None`).
+
+### Adversarial verification — 1000 engineers
+
+`redteam/runtime_redteam.py` synthesizes 1000 deterministic adversarial
+"engineers" across 10 attack classes (sandbox escape, capability forgery,
+calculator injection, plan injection, revocation/epoch bypass, denial probing,
+audit tampering, replay-after-deny, argument fuzzing, identity spoofing). The bar
+is **zero escapes** — any one is a hard failure.
+
+| Backend | Result |
+|---|---|
+| Pure-Python verifier | 1000/1000 blocked, 0 escapes |
+| **Verified Rust engine** (via JSON wire) | 1000/1000 blocked, 0 escapes |
+
+Two real vulnerabilities were found and fixed by this harness: an unbounded-`**`
+calculator DoS (a `2**2**2**2**2**2` that hangs the process) and a Windows
+reserved-device-name sandbox bypass (`CON` blocks forever; `NUL`/`COMn` open
+devices). Both are now regression-covered.
+
+**Scope, honestly:** these are *self-authored* attack classes against the 3 MVP
+tools. A green result is evidence that the *known* vectors are closed on both
+backends — it is **not** a proof of security and **not** a substitute for
+independent external review (still an open milestone).
+
+### Benchmark — cost of a verified decision
+
+`verify()` latency per gated tool call (`redteam/bench_verify.py`):
+
+| Path | Latency | Throughput |
+|---|---|---|
+| Pure-Python `FreedomVerifier` | ~38 µs/decision | ~26,000 /s |
+| Verified Rust engine (JSON wire + ed25519 sign) | ~544 µs/decision | ~1,800 /s |
+
+Routing through the verified engine costs ~14× (JSON marshalling + per-call
+signing) but stays sub-millisecond. Figures from one machine (Windows, CPython
+3.13); treat as order-of-magnitude, not a spec.
+
+### Building the verified extension
+
+The Rust kernel exposes a PyO3 module (`authgate_kernel`). Build it with:
+
+```bash
+# ASCII build dir (the C toolchain mangles non-ASCII paths); GNU toolchain.
+CARGO_TARGET_DIR=/tmp/akbuild cargo build --lib --release
+# copy the cdylib next to the package as authgate_kernel.{pyd,so}
+```
+
+The runtime falls back to the pure-Python verifier wherever the extension is not
+built (CI included), so the runtime tests are green with or without it.
+
+---
+
 ## What it does NOT do
 
 | Not this | Why |
@@ -351,8 +434,10 @@ The gap between `Permit/Deny` and actual constrained execution:
 
 | Gap | Status | What closes it |
 |---|---|---|
+| **Runtime decides on the verified TCB** | Opt-in (`backend="rust"`); needs the built extension; **not run by CI** (CI uses Python) | A CI job that builds the extension and runs the runtime suite on the Rust backend |
+| **Real tool sandbox** (process isolation) | Opt-in (`SandboxPolicy`, off by default); bounds time/memory/output + isolates crashes; **not** a syscall/network jail | seccomp / namespaces / WASM for syscall + network confinement |
 | **WASM sandbox** (`cargo build --features sandbox`) | Blocked: Windows SDK kernel32.lib missing | Install Windows SDK 10.0.22621 or build on Linux |
-| **OS-level confinement** (seccomp-bpf) | Not implemented | Wrap tool subprocess with seccomp filter |
+| **OS-level confinement** (seccomp-bpf, network jail) | Partial: process isolation + rlimits done; no syscall/network jail | seccomp filter / namespaces / WASM around the tool subprocess |
 | **End-to-end integration test** | **Done** (`tests/test_integration_e2e.py`) | 18 assertions: tool call → gate → audit chain |
 | **TLC model checker** | Java not installed | `java -jar tla2tools.jar -tool MC_AuthGateV3` |
 | **CLI** | Exists; not packaged | `pip install authgate-kernel` |

@@ -0,0 +1,2 @@
+hello from sandbox
+revenue Q1: 1200000
@@ -0,0 +1,66 @@
+"""
+Benchmark: cost of an authorization decision, pure-Python verifier vs the
+verified Rust engine reached over the JSON wire (RustBackedVerifier).
+
+This is the decision-relevant number for the runtime: every gated tool call pays
+one verify(). It answers "what does routing through the verified TCB cost per
+call?" Run:  AUTHGATE_BACKEND=python python redteam/bench_verify.py
+"""
+from __future__ import annotations
+
+import os
+import sys
+import time
+from pathlib import Path
+
+os.environ.setdefault("AUTHGATE_BACKEND", "python")
+sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "src"))
+
+from authgate.kernel.entities import (  # noqa: E402
+    AgentType,
+    Entity,
+    Resource,
+    ResourceType,
+    RightsClaim,
+)
+from authgate.kernel.registry import OwnershipRegistry  # noqa: E402
+from authgate.kernel.verifier import Action, FreedomVerifier  # noqa: E402
+from authgate.runtime.rust_backend import RustBackedVerifier, rust_backend_available  # noqa: E402
+
+
+def _scenario():
+    owner = Entity("operator", AgentType.HUMAN)
+    agent = Entity("agent-1", AgentType.MACHINE)
+    resource = Resource("compute", ResourceType.COMPUTE_SLOT)
+    registry = OwnershipRegistry()
+    registry.register_machine(agent, owner)
+    registry.add_claim(RightsClaim(owner, resource, can_read=True, can_delegate=True))
+    registry.delegate(RightsClaim(agent, resource, can_read=True), delegated_by=owner)
+    return registry, Action("t1", agent, resources_read=[resource])
+
+
+def _bench(label: str, verify, action, n: int) -> None:
+    verify(action)  # warm up
+    t0 = time.perf_counter()
+    for _ in range(n):
+        verify(action)
+    elapsed = time.perf_counter() - t0
+    per = elapsed / n * 1e6  # microseconds per decision
+    print(f"  {label:32} {per:9.2f} us/decision   {n/elapsed:12,.0f} decisions/s")
+
+
+def main() -> int:
+    registry, action = _scenario()
+    n = 20000
+    print(f"verify() latency over {n:,} permitted decisions (AUTHGATE_BACKEND="
+          f"{os.environ.get('AUTHGATE_BACKEND')}):")
+    _bench("pure-Python FreedomVerifier", FreedomVerifier(registry).verify, action, n)
+    if rust_backend_available():
+        _bench("verified Rust engine (JSON wire)", RustBackedVerifier(registry).verify, action, n)
+    else:
+        print("  verified Rust engine: SKIPPED (authgate_kernel extension not built)")
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())