Agentic bot for any business — BitBot bootstrap with category classification using ModernBERT (fine-tuned MoritzLaurer/ModernBERT-base-zeroshot-v2.0), a staged procedure-driven LangGraph API flow, and Docker Compose for frontend (Streamlit), backend (FastAPI), PostgreSQL, Elasticsearch, and a BentoML classifier service.
```mermaid
flowchart LR
    fe[Streamlit_frontend] --> be[FastAPI_backend]
    be --> lg[LangGraph_staged_issue_pipeline]
    lg --> bento[ModernBERT_Bento]
    lg --> stageIntent["LLM_intent_plus_DB_allowlist"]
    lg --> stageRouter[Deterministic_specialist_router]
    lg --> yaml[ProcedureYAMLLibrary]
    lg --> stagePolicy[Policy_load_and_constraints]
    lg --> stageValidate[Data_plus_eligibility_gates]
    lg --> stageOutcome[Outcome_validator_and_escalation]
    lg --> llm[Ollama_or_Cerebras]
    lg --> es[(ElasticsearchPolicyDocs)]
    be --> pg[(Postgres)]
    stageIntent --> pg
    train[Dataset_and_finetune_scripts] --> artifacts[Local_model_dir]
    artifacts --> bento
```
- Frontend (`frontend/`): Streamlit chat UI calling `POST /classify` with `full_flow` for session + LangGraph + LLM branches.
- Backend (`backend/`): FastAPI + staged LangGraph (`backend/agent/issue_graph.py`): ModernBERT via Bento for category → no-issue/low-confidence branch or intent resolution → deterministic specialist routing → YAML procedure load (`backend/procedures/*.yaml`) → policy load/constraints → data + eligibility gates → structured step execution → explicit outcome validation/escalation (see docs/agent.md).
- Tool APIs (`backend/api/routes/tools.py`): DB-backed HTTP tools used by procedure steps (order, product, refund, payment, invoice, subscription, contact, complaint, delivery); see API Surface below for the full list.
- Escalation API (`backend/api/routes/escalations.py`): in-chat escalation decision endpoint (`/escalations/decision`) for accept/reject UX.
- Procedures (`backend/procedures/`): one YAML blueprint per intent. Blueprints define `required_data` and ordered steps (`retrieval`, `logic_gate`, `tool_call`, `llm_response`, `interrupt`) used to enforce deterministic control flow.
- ModernBERT (`services/modernbert_bento/`): BentoML service loading a local fine-tuned checkpoint.
- Policy source: policy constraints/content remain external to procedures and come from Elasticsearch-backed policy documents through retrieval steps.
- Data stores: Postgres for sessions, orders, products, and related app data (`infra/postgres/init.sql`). Policy retrieval uses Elasticsearch only (no vector search in Postgres).
- Sessions (`POST /classify` with `full_flow=true`): messages and the active issue (intent, category, confidence) live in Postgres. While an issue is active, intent classification is locked to the stored intent on later turns. Users can confirm resolution in natural language to end the issue without re-running the graph; the graph can also mark the session resolved when a procedure completes cleanly (see `graph_suggests_session_resolved` and `classify.py`).
- Product catalog: the `products` table in Postgres backs the `product_catalog_lookup` step inside the LangGraph executor.
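As a rough illustration of the blueprint shape described under Procedures, a loader can gate unknown step kinds before the graph executes anything. The field names below are assumptions for illustration, not the project's actual schema; the authoritative blueprints live in `backend/procedures/`.

```python
# A procedure blueprint, shown as the dict a YAML parser would produce.
# Field names ("intent", "steps", "kind", ...) are illustrative assumptions.
BLUEPRINT = {
    "intent": "refund_request",
    "required_data": ["order_id", "account_email"],
    "steps": [
        {"name": "fetch_policy", "kind": "retrieval", "query": "refund policy"},
        {"name": "check_window", "kind": "logic_gate", "condition": "days_since_purchase <= 30"},
        {"name": "create_refund", "kind": "tool_call", "endpoint": "/tools/create-refund-request"},
        {"name": "confirm", "kind": "llm_response"},
    ],
}

# The five step kinds named in this README.
ALLOWED_KINDS = {"retrieval", "logic_gate", "tool_call", "llm_response", "interrupt"}

def validate_blueprint(bp: dict) -> list[str]:
    """Return the ordered step kinds, rejecting any kind the executor cannot run."""
    kinds = [step["kind"] for step in bp["steps"]]
    unknown = [k for k in kinds if k not in ALLOWED_KINDS]
    if unknown:
        raise ValueError(f"unknown step kinds: {unknown}")
    return kinds

print(validate_blueprint(BLUEPRINT))  # → ['retrieval', 'logic_gate', 'tool_call', 'llm_response']
```

Validating up front keeps control flow deterministic: a typo in a blueprint fails at load time rather than mid-conversation.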
| Topic | Document |
|---|---|
| LangGraph issue agent (nodes, procedures, session flow) | docs/agent.md |
| Dataset creation, binary split, fine-tuning, evaluation, serving | docs/finetuning-modernbert.md |
| Index Foodpanda policy Markdown in Elasticsearch | docs/elasticsearch-foodpanda-policy-docs.md |
| Run the agent testing simulator (CLI, env, evaluators) | docs/how_to_simulate.md |
| Simulator contracts (schemas, modules, CLI reference) | specs/simulator-spec.md |
Policy retrieval (`backend/rag/policy_retriever.py`) queries Elasticsearch with a `multi_match` on title, content, and tags. Configure the cluster with `.env` (see `.env.example`): `ES_HOST`, `ES_PORT`, `ES_SCHEME`, `ES_POLICY_INDEX` (default: `policy_docs`), and `ES_TIMEOUT_SECONDS`. With Docker Compose, these are passed into the backend service.
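The query shape can be sketched as a plain dict; the fields and `title^2` boost mirror the verification example in this document, while the authoritative version lives in `backend/rag/policy_retriever.py`:

```python
import json

def build_policy_query(text: str, size: int = 3) -> dict:
    """Multi-field policy search matching the app query shape."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # title is boosted 2x relative to content and tags
                "fields": ["title^2", "content", "tags"],
            }
        },
    }

# The JSON body you would POST to /{index}/_search:
print(json.dumps(build_policy_query("refund"), indent=2))
```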
Foodpanda sample policies: to index the Markdown under `data/policy_docs/foodpanda/policy_docs/`, use the bundled script (see docs/elasticsearch-foodpanda-policy-docs.md):

```shell
python scripts/upload_foodpanda_policy_docs.py --create-index --host localhost
```

The generic curl / `_bulk` flow below uses `docker compose exec` into the `elasticsearch` service so `curl` runs inside the container; Elasticsearch listens on 9200 in that container (Compose maps it to the host for debugging, but these examples stay fully in Docker).
- Ensure Elasticsearch is running (e.g. `docker compose up` with the `elasticsearch` service healthy).

- Create an index (optional; dynamic mapping is enough for a quick start). Replace `policy_docs` if you use a custom `ES_POLICY_INDEX`:

  ```shell
  docker compose exec -T elasticsearch curl -s -X PUT "http://localhost:9200/policy_docs" -H "Content-Type: application/json" -d "{}"
  ```

- Index documents with `_bulk`. Each source document must include the fields the retriever searches. Bash can stream a heredoc into the container (stdin is forwarded to `curl`):

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- <<'EOF'
  {"index":{"_index":"policy_docs","_id":"refund-overview"}}
  {"title":"Refund policy overview","content":"Customers may request a refund within 30 days of purchase.","tags":["refund","policy"]}
  {"index":{"_index":"policy_docs","_id":"shipping-late"}}
  {"title":"Late delivery","content":"If your order is late, contact support with your order number.","tags":["shipping","order"]}
  EOF
  ```

  PowerShell or any shell: save the NDJSON (action line + JSON line per document) to `bulk.ndjson` in the project root and run:

  ```shell
  Get-Content bulk.ndjson -Raw | docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @-
  ```

  Bash one-liner equivalent:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- < bulk.ndjson
  ```

- Verify with a search that matches the app query shape:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/policy_docs/_search" -H "Content-Type: application/json" -d "{\"size\":3,\"query\":{\"multi_match\":{\"query\":\"refund\",\"fields\":[\"title^2\",\"content\",\"tags\"]}}}"
  ```

- Optional source material: Markdown under `data/policy_docs/` can be loaded with `scripts/upload_foodpanda_policy_docs.py`. The Foodpanda sample set is 12 files (`01`–`12` under `data/policy_docs/foodpanda/policy_docs/`; see docs/elasticsearch-foodpanda-policy-docs.md for filenames), or adapt the script for your own bulk NDJSON.
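If you prefer generating `bulk.ndjson` instead of writing it by hand, the `_bulk` body is easy to build from plain dicts. A minimal stdlib sketch (the sample document mirrors the heredoc above):

```python
import json

def to_bulk_ndjson(index: str, docs: dict[str, dict]) -> str:
    """Serialize {doc_id: source} pairs into Elasticsearch _bulk NDJSON."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # _bulk requires each line to end with a newline, including the last one
    return "\n".join(lines) + "\n"

ndjson = to_bulk_ndjson("policy_docs", {
    "refund-overview": {
        "title": "Refund policy overview",
        "content": "Customers may request a refund within 30 days of purchase.",
        "tags": ["refund", "policy"],
    },
})
print(ndjson)
```

Write the result to `bulk.ndjson` and feed it to either the PowerShell or Bash command shown above.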
If `ES_HOST` is unset, retrieval returns no documents. The readiness endpoint only records that Elasticsearch env vars are configured, not cluster health.
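That fallback behaves roughly as follows (`retrieve_policies` is a hypothetical name for this sketch; the real logic lives in `backend/rag/policy_retriever.py`):

```python
import os

def retrieve_policies(query: str) -> list[dict]:
    """Hypothetical sketch of the documented no-ES_HOST fallback."""
    host = os.environ.get("ES_HOST")
    if not host:
        return []  # ES not configured: retrieval silently yields no documents
    # With a host configured, the real retriever would POST the multi_match
    # query to {ES_SCHEME}://{host}:{ES_PORT}/{ES_POLICY_INDEX}/_search here.
    raise NotImplementedError("network call elided in this sketch")

os.environ.pop("ES_HOST", None)
print(retrieve_policies("refund"))  # → []
```

The practical consequence: a green readiness check does not prove policy retrieval works; confirm `ES_HOST` is set and the index is populated.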
- Create `.env` from `.env.example` (example uses a tiny Alpine container so you do not rely on host `cp`):

  ```shell
  docker run --rm -v "$PWD:/w" -w /w alpine cp .env.example .env
  ```

  PowerShell (repository root):

  ```shell
  docker run --rm -v "${PWD}:/w" -w /w alpine cp .env.example .env
  ```

  Set `POSTGRES_USER`, `POSTGRES_PASSWORD`, and any overrides.

- Train and export a model to `training/models/modernbert_finetuned/` (see docs/finetuning-modernbert.md). The `modernbert` container mounts this path read-only; without valid tokenizer + model files, that service will not start.

- Start services:

  ```shell
  docker compose up --build
  ```

- Open the UI at http://localhost:8501 (backend API: http://localhost:8000).

- Try classification (Bento only, no Postgres/LLM). The `elasticsearch` image includes `curl` and shares the Compose network with `backend`, so call the API by service name:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/classify -H "Content-Type: application/json" -d "{\"text\":\"My order is late\",\"full_flow\":false}"
  ```

- Full conversation flow (Postgres + LangGraph + local Ollama): set `NO_ISSUE_MODEL_*`, `VALIDATION_MODEL_*`, `INTENT_MODEL_PROVIDER`/`INTENT_MODEL` (defaults match docker-compose.yml), and `OLLAMA_BASE_URL` in `.env` (see `.env.example`). Ensure Postgres is up and Ollama is reachable from the backend (e.g. `host.docker.internal:11434` on Docker Desktop). Then:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/classify -H "Content-Type: application/json" -d "{\"text\":\"Hello\",\"full_flow\":true}"
  ```

- Escalation decision action (for pending `interrupt` steps in procedures):

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/escalations/decision -H "Content-Type: application/json" -d "{\"session_id\":\"<session-uuid>\",\"action_id\":\"<action-id>\",\"decision\":\"accept\"}"
  ```
- `GET /health`: liveness probe for the backend.
- `GET /health/ready`: readiness; reports configured Postgres/Elasticsearch and tries the classifier health when `CLASSIFIER_BENTOML_URL` is set.
- `POST /classify`: classification-only (`full_flow=false`) or full LangGraph orchestration (`full_flow=true`).
- `POST /escalations/decision`: accept/reject a pending escalation action for a session.
Tools (POST unless noted; JSON bodies per route in `backend/api/routes/tools.py`):

- `POST /tools/order-status` — order status lookup
- `POST /tools/product-lookup` — product name search
- `POST /tools/product-info` — product details
- `POST /tools/product-price` — product price
- `POST /tools/product-availability` — stock / availability
- `POST /tools/refund-context` — refund context for an order
- `POST /tools/cancel-order` — cancel order (mutating)
- `POST /tools/create-refund-request` — create refund request (mutating)
- `POST /tools/update-shipping-address` — update shipping address (mutating)
- `POST /tools/payment` — payment by transaction id
- `GET /tools/payment-methods` — list configured payment methods
- `POST /tools/payment-track-refund` — refund status for an order
- `POST /tools/invoice` — invoice lookup
- `POST /tools/subscription-status` — subscription by account email
- `POST /tools/subscription-unsubscribe` — unsubscribe (mutating)
- `POST /tools/contact-handoff` — human handoff ticket
- `POST /tools/complaint` — complaint ticket
- `POST /tools/delivery-period` — delivery window by order or tracking id
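The tool routes share one calling convention (JSON body in, JSON body out), so a thin client covers them all. A stdlib-only sketch; the payload field names in the example comment are assumptions, so check `backend/api/routes/tools.py` for each route's actual body:

```python
import json
from urllib import request

BACKEND = "http://localhost:8000"  # host-mapped backend port from Compose

def build_tool_request(name: str, payload: dict) -> request.Request:
    """Build a POST to a /tools/* route with a JSON body."""
    return request.Request(
        f"{BACKEND}/tools/{name}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def call_tool(name: str, payload: dict) -> dict:
    """Send the request and decode the JSON reply."""
    with request.urlopen(build_tool_request(name, payload)) as resp:
        return json.load(resp)

# e.g. call_tool("order-status", {"order_id": "<uuid>"})  # field name assumed
```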
See docs/how_to_simulate.md for prerequisites, environment variables, and failure triage. Normative schemas and module layout: specs/simulator-spec.md.
The simulator CLI lives in `testing/simulator/runner.py` and supports:
- bounded loops (`--iterations N`) and continuous loops (`--forever`)
- randomized selection (`--randomize`) with `--persona`, `--category`, and `--intent` filters
- optional LLM-as-judge (`eval_targets: [llm_judge]`)
- optional Postgres persistence (`--persist-db`) for runs/turns/messages/evaluations/training rows
The simulator service now starts in idle mode when brought up with Compose (`docker compose up`).
Trigger runs manually from the running container:

```shell
docker compose exec simulator python -m testing.simulator.runner --suite testing/simulator/suites/smoke.yaml --iterations 1
```

Run it as a one-off Compose task:

```shell
docker compose run --rm simulator
```

Example with overrides:

```shell
SIMULATOR_SUITE_PATH=testing/simulator/suites/regression.yaml \
SIMULATOR_ITERATIONS=5 \
SIMULATOR_EXTRA_ARGS="--randomize --persist-db" \
docker compose run --rm simulator
```

| Path | Purpose |
|---|---|
| `backend/` | FastAPI app, LangGraph flow, classifier HTTP client |
| `frontend/` | Streamlit demo |
| `services/modernbert_bento/` | BentoML ModernBERT binary classifier |
| `training/scripts/` | Dataset preparation scripts (`create_bitext_dataset.py`, `build_is_issue_dataset.py`) |
| `training/experiments/src/` | ModernBERT training entrypoints (`train_multiclass_modernbert.py`, `train_modernbert.py`) |
| `training/data/samples/` | Small committed examples for smoke tests |
| `infra/postgres/` | Postgres init SQL (core tables) used by Docker Compose |
| `db/postgres/` | Rerunnable dummy schema + idempotent seed (ecommerce demos + sessions/messages/observability aligned with `infra/postgres/init.sql`) |
| `docs/` | Detailed guides |
Docker Compose initializes the database from `infra/postgres/init.sql`. For expanded test users, orders, refunds, and products, plus seeded chat sessions (unresolved / resolved / escalated) and observability samples, apply the SQL in order via `psql` in the `postgres` container (from the repo root, with the stack running):

- `db/postgres/01_schema.sql` — drops and recreates the dummy ecommerce tables and the session/messaging/observability objects that match the app (`sessions`, `messages`, `tickets`, `agent_spans`, `outcomes`, `evaluation_scores`, analytics views).

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/01_schema.sql
  ```

- `db/postgres/02_seed.sql` — loads data using `INSERT … ON CONFLICT … DO UPDATE` (safe to run multiple times without duplicate-key failures).

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/02_seed.sql
  ```

- Optional: `db/postgres/03_smoke_checks.sql` — sanity queries after load.

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/03_smoke_checks.sql
  ```

PowerShell (repo root; set variables if they differ from defaults):

```shell
Get-Content db/postgres/01_schema.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
Get-Content db/postgres/02_seed.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
Get-Content db/postgres/03_smoke_checks.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
```

The ecommerce portion (VARCHAR `order_id`, `support_tickets`, etc.) is not the same as the UUID `orders` table in `infra/postgres/init.sql`. The chat session tables and observability schema are aligned with `infra/postgres/init.sql`, so sessions / messages and related analytics match the backend. Use a dedicated database or run these scripts when you want SQL-level fixtures; wiring the backend may require matching column names to the `backend/db/` repos.
Run the test suite inside the backend service. The app imports the backend package from `/app`, so set `PYTHONPATH=/app` when invoking tests:

```shell
docker compose exec backend env PYTHONPATH=/app pytest backend/tests
```

Add your license here.