Agentic bot for any business — BitBot bootstrap with category classification using ModernBERT (fine-tuned MoritzLaurer/ModernBERT-base-zeroshot-v2.0), a staged procedure-driven LangGraph API flow, and Docker Compose for frontend (Streamlit), backend (FastAPI), PostgreSQL, Elasticsearch, and a BentoML classifier service.
```mermaid
flowchart LR
    fe[Streamlit_frontend] --> be[FastAPI_backend]
    be --> lg[LangGraph_staged_issue_pipeline]
    lg --> bento[ModernBERT_Bento]
    lg --> stageIntent["LLM_intent_plus_DB_allowlist"]
    lg --> stageRouter[Deterministic_specialist_router]
    lg --> yaml[ProcedureYAMLLibrary]
    lg --> stagePolicy[Policy_load_and_constraints]
    lg --> stageValidate[Data_plus_eligibility_gates]
    lg --> stageOutcome[Outcome_validator_and_escalation]
    lg --> llm[Ollama_or_Cerebras]
    lg --> es[(ElasticsearchPolicyDocs)]
    be --> pg[(Postgres)]
    stageIntent --> pg
    train[Dataset_and_finetune_scripts] --> artifacts[Local_model_dir]
    artifacts --> bento
```
- Frontend (`frontend/`): Streamlit chat UI calling `POST /classify` with `full_flow` for session + LangGraph + LLM branches.
- Backend (`backend/`): FastAPI + staged LangGraph (`backend/agent/issue_graph.py`): ModernBERT via Bento for category → no-issue/low-confidence branch or intent resolution → deterministic specialist routing → YAML procedure load (`backend/procedures/*.yaml`) → policy load/constraints → data + eligibility gates → structured step execution → explicit outcome validation/escalation (see docs/agent.md).
- Tool APIs (`backend/api/routes/tools.py`): DB-backed HTTP tools used by procedure steps (order, product, refund, payment, invoice, subscription, contact, complaint, delivery); see API Surface below for the full list.
- Escalation API (`backend/api/routes/escalations.py`): in-chat escalation decision endpoint (`/escalations/decision`) for accept/reject UX.
- Procedures (`backend/procedures/`): one YAML blueprint per intent. Blueprints define `required_data` and ordered steps (`retrieval`, `logic_gate`, `tool_call`, `llm_response`, `interrupt`) used to enforce deterministic control flow.
- ModernBERT (`services/modernbert_bento/`): BentoML service loading a local fine-tuned checkpoint.
- Policy source: policy constraints/content remain external to procedures and come from Elasticsearch-backed policy documents through retrieval steps.
- Data stores: Postgres for sessions, orders, products, and related app data (`infra/postgres/init.sql`). Policy retrieval uses Elasticsearch only (no vector search in Postgres).
- Sessions (`POST /classify` with `full_flow=true`): messages and the active issue (intent, category, confidence) live in Postgres. While an issue is active, intent classification is locked to the stored intent on later turns. Users can confirm resolution in natural language to end the issue without re-running the graph; the graph can also mark the session resolved when a procedure completes cleanly (see `graph_suggests_session_resolved` and `classify.py`).
- Product catalog: the `products` table in Postgres backs the `product_catalog_lookup` step inside the LangGraph executor.
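As a rough illustration of the blueprint shape described under Procedures, a loader can gate unknown step kinds before the graph executes anything. The field names below are assumptions for illustration, not the project's actual schema; the authoritative blueprints live in `backend/procedures/`.

```python
# A procedure blueprint, shown as the dict a YAML parser would produce.
# Field names ("intent", "steps", "kind", ...) are illustrative assumptions.
BLUEPRINT = {
    "intent": "refund_request",
    "required_data": ["order_id", "account_email"],
    "steps": [
        {"name": "fetch_policy", "kind": "retrieval", "query": "refund policy"},
        {"name": "check_window", "kind": "logic_gate", "condition": "days_since_purchase <= 30"},
        {"name": "create_refund", "kind": "tool_call", "endpoint": "/tools/create-refund-request"},
        {"name": "confirm", "kind": "llm_response"},
    ],
}

# The five step kinds named in this README.
ALLOWED_KINDS = {"retrieval", "logic_gate", "tool_call", "llm_response", "interrupt"}

def validate_blueprint(bp: dict) -> list[str]:
    """Return the ordered step kinds, rejecting any kind the executor cannot run."""
    kinds = [step["kind"] for step in bp["steps"]]
    unknown = [k for k in kinds if k not in ALLOWED_KINDS]
    if unknown:
        raise ValueError(f"unknown step kinds: {unknown}")
    return kinds

print(validate_blueprint(BLUEPRINT))  # → ['retrieval', 'logic_gate', 'tool_call', 'llm_response']
```

Validating up front keeps control flow deterministic: a typo in a blueprint fails at load time rather than mid-conversation.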
| Topic | Document |
|---|---|
| LangGraph issue agent (nodes, procedures, session flow) | docs/agent.md |
| Dataset creation, binary split, fine-tuning, evaluation, serving | docs/finetuning-modernbert.md |
| Index Foodpanda policy Markdown in Elasticsearch | docs/elasticsearch-foodpanda-policy-docs.md |
| Run the agent testing simulator (CLI, env, evaluators) | docs/how_to_simulate.md |
| Simulator contracts (schemas, modules, CLI reference) | specs/simulator-spec.md |
Policy retrieval (`backend/rag/policy_retriever.py`) queries Elasticsearch with a `multi_match` on title, content, and tags. Configure the cluster with `.env` (see `.env.example`): `ES_HOST`, `ES_PORT`, `ES_SCHEME`, `ES_POLICY_INDEX` (default: `policy_docs`), and `ES_TIMEOUT_SECONDS`. With Docker Compose, these are passed into the backend service.
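The query shape can be sketched as a plain dict; the fields and `title^2` boost mirror the verification example in this document, while the authoritative version lives in `backend/rag/policy_retriever.py`:

```python
import json

def build_policy_query(text: str, size: int = 3) -> dict:
    """Multi-field policy search matching the app query shape."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # title is boosted 2x relative to content and tags
                "fields": ["title^2", "content", "tags"],
            }
        },
    }

# The JSON body you would POST to /{index}/_search:
print(json.dumps(build_policy_query("refund"), indent=2))
```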
Foodpanda sample policies: to index the Markdown under `data/policy_docs/foodpanda/policy_docs/`, use the bundled script (see docs/elasticsearch-foodpanda-policy-docs.md):

```shell
python scripts/upload_foodpanda_policy_docs.py --create-index --host localhost
```

The generic curl / `_bulk` flow below uses `docker compose exec` into the `elasticsearch` service so `curl` runs inside the container; Elasticsearch listens on 9200 in that container (Compose maps it to the host for debugging, but these examples stay fully in Docker).
- Ensure Elasticsearch is running (e.g. `docker compose up` with the `elasticsearch` service healthy).

- Create an index (optional; dynamic mapping is enough for a quick start). Replace `policy_docs` if you use a custom `ES_POLICY_INDEX`:

  ```shell
  docker compose exec -T elasticsearch curl -s -X PUT "http://localhost:9200/policy_docs" -H "Content-Type: application/json" -d "{}"
  ```

- Index documents with `_bulk`. Each source document must include the fields the retriever searches. Bash can stream a heredoc into the container (stdin is forwarded to `curl`):

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- <<'EOF'
  {"index":{"_index":"policy_docs","_id":"refund-overview"}}
  {"title":"Refund policy overview","content":"Customers may request a refund within 30 days of purchase.","tags":["refund","policy"]}
  {"index":{"_index":"policy_docs","_id":"shipping-late"}}
  {"title":"Late delivery","content":"If your order is late, contact support with your order number.","tags":["shipping","order"]}
  EOF
  ```

  PowerShell or any shell: save the NDJSON (action line + JSON line per document) to `bulk.ndjson` in the project root and run:

  ```shell
  Get-Content bulk.ndjson -Raw | docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @-
  ```

  Bash one-liner equivalent:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/_bulk" -H "Content-Type: application/x-ndjson" --data-binary @- < bulk.ndjson
  ```

- Verify with a search that matches the app query shape:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST "http://localhost:9200/policy_docs/_search" -H "Content-Type: application/json" -d "{\"size\":3,\"query\":{\"multi_match\":{\"query\":\"refund\",\"fields\":[\"title^2\",\"content\",\"tags\"]}}}"
  ```

- Optional source material: Markdown under `data/policy_docs/` can be loaded with `scripts/upload_foodpanda_policy_docs.py`. The Foodpanda sample set is 12 files (`01`–`12` under `data/policy_docs/foodpanda/policy_docs/`; see docs/elasticsearch-foodpanda-policy-docs.md for filenames), or adapt the script for your own bulk NDJSON.
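If you prefer generating `bulk.ndjson` instead of writing it by hand, the `_bulk` body is easy to build from plain dicts. A minimal stdlib sketch (the sample document mirrors the heredoc above):

```python
import json

def to_bulk_ndjson(index: str, docs: dict[str, dict]) -> str:
    """Serialize {doc_id: source} pairs into Elasticsearch _bulk NDJSON."""
    lines = []
    for doc_id, source in docs.items():
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps(source))
    # _bulk requires each line to end with a newline, including the last one
    return "\n".join(lines) + "\n"

ndjson = to_bulk_ndjson("policy_docs", {
    "refund-overview": {
        "title": "Refund policy overview",
        "content": "Customers may request a refund within 30 days of purchase.",
        "tags": ["refund", "policy"],
    },
})
print(ndjson)
```

Write the result to `bulk.ndjson` and feed it to either the PowerShell or Bash command shown above.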
If `ES_HOST` is unset, retrieval returns no documents. The readiness endpoint only records that Elasticsearch env vars are configured, not cluster health.
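That fallback behaves roughly as follows (`retrieve_policies` is a hypothetical name for this sketch; the real logic lives in `backend/rag/policy_retriever.py`):

```python
import os

def retrieve_policies(query: str) -> list[dict]:
    """Hypothetical sketch of the documented no-ES_HOST fallback."""
    host = os.environ.get("ES_HOST")
    if not host:
        return []  # ES not configured: retrieval silently yields no documents
    # With a host configured, the real retriever would POST the multi_match
    # query to {ES_SCHEME}://{host}:{ES_PORT}/{ES_POLICY_INDEX}/_search here.
    raise NotImplementedError("network call elided in this sketch")

os.environ.pop("ES_HOST", None)
print(retrieve_policies("refund"))  # → []
```

The practical consequence: a green readiness check does not prove policy retrieval works; confirm `ES_HOST` is set and the index is populated.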
- Create `.env` from `.env.example` (example uses a tiny Alpine container so you do not rely on host `cp`):

  ```shell
  docker run --rm -v "$PWD:/w" -w /w alpine cp .env.example .env
  ```

  PowerShell (repository root):

  ```shell
  docker run --rm -v "${PWD}:/w" -w /w alpine cp .env.example .env
  ```

  Set `POSTGRES_USER`, `POSTGRES_PASSWORD`, and any overrides.

- Train and export a model to `training/models/modernbert_finetuned/` (see docs/finetuning-modernbert.md). The `modernbert` container mounts this path read-only; without valid tokenizer + model files, that service will not start.

- Start services:

  ```shell
  docker compose up --build
  ```

- Open the UI at http://localhost:8501 (backend API: http://localhost:8000).

- Try classification (Bento only, no Postgres/LLM). The `elasticsearch` image includes `curl` and shares the Compose network with `backend`, so call the API by service name:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/classify -H "Content-Type: application/json" -d "{\"text\":\"My order is late\",\"full_flow\":false}"
  ```

- Full conversation flow (Postgres + LangGraph + local Ollama): set `NO_ISSUE_MODEL_*`, `VALIDATION_MODEL_*`, `INTENT_MODEL_PROVIDER`/`INTENT_MODEL` (defaults match docker-compose.yml), and `OLLAMA_BASE_URL` in `.env` (see `.env.example`). Ensure Postgres is up and Ollama is reachable from the backend (e.g. `host.docker.internal:11434` on Docker Desktop). Then:

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/classify -H "Content-Type: application/json" -d "{\"text\":\"Hello\",\"full_flow\":true}"
  ```

- Escalation decision action (for pending `interrupt` steps in procedures):

  ```shell
  docker compose exec -T elasticsearch curl -s -X POST http://backend:8000/escalations/decision -H "Content-Type: application/json" -d "{\"session_id\":\"<session-uuid>\",\"action_id\":\"<action-id>\",\"decision\":\"accept\"}"
  ```
- `GET /health`: liveness probe for the backend.
- `GET /health/ready`: readiness; reports configured Postgres/Elasticsearch and tries the classifier health when `CLASSIFIER_BENTOML_URL` is set.
- `POST /classify`: classification-only (`full_flow=false`) or full LangGraph orchestration (`full_flow=true`).
- `POST /escalations/decision`: accept/reject a pending escalation action for a session.
Tools (POST unless noted; JSON bodies per route in `backend/api/routes/tools.py`):

- `POST /tools/order-status` — order status lookup
- `POST /tools/product-lookup` — product name search
- `POST /tools/product-info` — product details
- `POST /tools/product-price` — product price
- `POST /tools/product-availability` — stock / availability
- `POST /tools/refund-context` — refund context for an order
- `POST /tools/cancel-order` — cancel order (mutating)
- `POST /tools/create-refund-request` — create refund request (mutating)
- `POST /tools/update-shipping-address` — update shipping address (mutating)
- `POST /tools/payment` — payment by transaction id
- `GET /tools/payment-methods` — list configured payment methods
- `POST /tools/payment-track-refund` — refund status for an order
- `POST /tools/invoice` — invoice lookup
- `POST /tools/subscription-status` — subscription by account email
- `POST /tools/subscription-unsubscribe` — unsubscribe (mutating)
- `POST /tools/contact-handoff` — human handoff ticket
- `POST /tools/complaint` — complaint ticket
- `POST /tools/delivery-period` — delivery window by order or tracking id
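The tool routes share one calling convention (JSON body in, JSON body out), so a thin client covers them all. A stdlib-only sketch; the payload field names in the example comment are assumptions, so check `backend/api/routes/tools.py` for each route's actual body:

```python
import json
from urllib import request

BACKEND = "http://localhost:8000"  # host-mapped backend port from Compose

def build_tool_request(name: str, payload: dict) -> request.Request:
    """Build a POST to a /tools/* route with a JSON body."""
    return request.Request(
        f"{BACKEND}/tools/{name}",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def call_tool(name: str, payload: dict) -> dict:
    """Send the request and decode the JSON reply."""
    with request.urlopen(build_tool_request(name, payload)) as resp:
        return json.load(resp)

# e.g. call_tool("order-status", {"order_id": "<uuid>"})  # field name assumed
```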
See docs/how_to_simulate.md for prerequisites, environment variables, and failure triage. Normative schemas and module layout: specs/simulator-spec.md.
The simulator CLI lives in `testing/simulator/runner.py` and supports:
- bounded loops (`--iterations N`) and continuous loops (`--forever`)
- randomized selection (`--randomize`) with `--persona`, `--category`, and `--intent` filters
- optional LLM-as-judge (`eval_targets: [llm_judge]`)
- optional Postgres persistence (`--persist-db`) for runs/turns/messages/evaluations/training rows
The simulator service now starts in idle mode when brought up with Compose (`docker compose up`).
Trigger runs manually from the running container:

```shell
docker compose exec simulator python -m testing.simulator.runner --suite testing/simulator/suites/smoke.yaml --iterations 1
```

Run it as a one-off Compose task:

```shell
docker compose run --rm simulator
```

Example with overrides:

```shell
SIMULATOR_SUITE_PATH=testing/simulator/suites/regression.yaml \
SIMULATOR_ITERATIONS=5 \
SIMULATOR_EXTRA_ARGS="--randomize --persist-db" \
docker compose run --rm simulator
```

| Path | Purpose |
|---|---|
| `backend/` | FastAPI app, LangGraph flow, classifier HTTP client |
| `frontend/` | Streamlit demo |
| `services/modernbert_bento/` | BentoML ModernBERT binary classifier |
| `training/scripts/` | Dataset preparation scripts (`create_bitext_dataset.py`, `build_is_issue_dataset.py`) |
| `training/experiments/src/` | ModernBERT training entrypoints (`train_multiclass_modernbert.py`, `train_modernbert.py`) |
| `training/data/samples/` | Small committed examples for smoke tests |
| `infra/postgres/` | Postgres init SQL (core tables) used by Docker Compose |
| `db/postgres/` | Rerunnable dummy schema + idempotent seed (ecommerce demos + sessions/messages/observability aligned with `infra/postgres/init.sql`) |
| `docs/` | Detailed guides |
Docker Compose initializes the database from `infra/postgres/init.sql`. For expanded test users, orders, refunds, and products, plus seeded chat sessions (unresolved / resolved / escalated) and observability samples, apply the SQL in order via `psql` in the `postgres` container (from the repo root, with the stack running):

- `db/postgres/01_schema.sql` — drops and recreates the dummy ecommerce tables and the session/messaging/observability objects that match the app (`sessions`, `messages`, `tickets`, `agent_spans`, `outcomes`, `evaluation_scores`, analytics views).

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/01_schema.sql
  ```

- `db/postgres/02_seed.sql` — loads data using `INSERT … ON CONFLICT … DO UPDATE` (safe to run multiple times without duplicate-key failures).

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/02_seed.sql
  ```

- Optional: `db/postgres/03_smoke_checks.sql` — sanity queries after load.

  ```shell
  docker compose exec -T postgres psql -U "${POSTGRES_USER:-admin}" -d "${POSTGRES_DB:-ecom_support}" -f - < db/postgres/03_smoke_checks.sql
  ```

PowerShell (repo root; set variables if they differ from defaults):

```shell
Get-Content db/postgres/01_schema.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
Get-Content db/postgres/02_seed.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
Get-Content db/postgres/03_smoke_checks.sql -Raw | docker compose exec -T postgres psql -U admin -d ecom_support -f -
```

The ecommerce portion (VARCHAR `order_id`, `support_tickets`, etc.) is not the same as the UUID `orders` table in `infra/postgres/init.sql`. The chat session tables and observability schema are aligned with `infra/postgres/init.sql`, so sessions / messages and related analytics match the backend. Use a dedicated database or run these scripts when you want SQL-level fixtures; wiring the backend may require matching column names to the `backend/db/` repos.
Run the test suite inside the backend service. The app imports the backend package from `/app`, so set `PYTHONPATH=/app` when invoking tests:

```shell
docker compose exec backend env PYTHONPATH=/app pytest backend/tests
```

Add your license here.