feat(olmocr): add chandra-ocr-2 runner by Khurdhula-Harshavardhan · Pull Request #3 · JigsawStack/interfaze-complete-benchmarks

Khurdhula-Harshavardhan · 2026-05-12T03:49:01Z

Summary

- Adds `olmocr_bench_chandra_ocr2.py` and `bench/runners/run_chandra_ocr2.py`, following the same template as the openai_mini, grok, and reducto runners
- Dispatches each rendered PDF page to a self-hosted `datalab-to/chandra-ocr-2` vLLM endpoint (HTTPS `POST /ocr`), authenticated via the `x-api-admin-key` header
- Apples-to-apples with the other candidates: `temperature=0`, `task="ocr_layout"`, and no reasoning/thinking mode (Chandra OCR 2 is a fine-tuned OCR VLM and has none)
- Harness supports `--sample`, `--skip-generation`, `--generate-only`, and `--limit N` (smoke test)
- README updated under the olmOCR section to list the runner and the two required env vars
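
As a rough illustration of the dispatch described above, the following sketch builds (but does not send) one `POST /ocr` request for a rendered page. It is not the runner's actual code: the function name `build_ocr_request` and the `image_b64` payload field are assumptions made for the example, and the real request schema may differ. Only the `/ocr` path, the `x-api-admin-key` header, `task="ocr_layout"`, and `temperature=0` come from this PR.

```python
import base64
import json
import urllib.request


def build_ocr_request(page_png: bytes, base_url: str, admin_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST /ocr request for one rendered PDF page."""
    payload = {
        # Hypothetical field name; the endpoint's real schema is not shown in the PR.
        "image_b64": base64.b64encode(page_png).decode("ascii"),
        "task": "ocr_layout",  # stated in the PR description
        "temperature": 0.0,    # stated in the PR description
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ocr",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-api-admin-key": admin_key,  # auth header named in the PR
        },
        method="POST",
    )
```

Sending it would then be a matter of passing the result to `urllib.request.urlopen(req, timeout=...)` and decoding the response, whose shape is likewise not specified here.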

Required env

CHANDRA_MODAL_URL=https://<workspace>--mlt-chandra-ocr-chandraocr-api.modal.run
CHANDRA_MODAL_ADMIN_KEY=<admin-key>
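
A minimal sketch of how a runner might read these two variables and fail fast when either is unset. `ChandraConfig` and `load_chandra_config` are hypothetical names used only for illustration; the actual runner may load its configuration differently.

```python
import os
from typing import Mapping, NamedTuple

REQUIRED_VARS = ("CHANDRA_MODAL_URL", "CHANDRA_MODAL_ADMIN_KEY")


class ChandraConfig(NamedTuple):
    url: str
    admin_key: str


def load_chandra_config(env: Mapping[str, str] = os.environ) -> ChandraConfig:
    """Read the two required env vars, raising early if any is missing or empty."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required env vars: " + ", ".join(missing))
    return ChandraConfig(
        url=env["CHANDRA_MODAL_URL"].rstrip("/"),
        admin_key=env["CHANDRA_MODAL_ADMIN_KEY"],
    )
```

Passing the environment in as a mapping keeps the check easy to exercise without touching the real process environment.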

Notes

- `RATE_LIMIT=50` matches a 10-H100 deployment; turn it down if running against a single warm container
- Existing-file checkpointing is inherited from the openai_mini template, so re-running resumes from the last completed page
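
The two notes above can be sketched together as follows. This is an illustration under stated assumptions, not the template's code: `RATE_LIMIT` is treated here as a cap on in-flight requests (the PR does not say whether it is a concurrency or per-second limit), and the `<page_id>.md` output layout plus the helper names `output_path`, `pending_pages`, and `run_pending` are hypothetical.

```python
import asyncio
from pathlib import Path
from typing import Awaitable, Callable, List, Sequence

RATE_LIMIT = 50  # value from the PR; interpreted here as max concurrent requests (assumption)


def output_path(out_dir: Path, page_id: str) -> Path:
    # Hypothetical layout: one Markdown file per page.
    return out_dir / f"{page_id}.md"


def pending_pages(page_ids: Sequence[str], out_dir: Path) -> List[str]:
    """Existing-file checkpointing: keep only pages with no output file yet."""
    return [p for p in page_ids if not output_path(out_dir, p).exists()]


async def run_pending(
    page_ids: Sequence[str],
    out_dir: Path,
    ocr_page: Callable[[str], Awaitable[str]],
    limit: int = RATE_LIMIT,
) -> List[str]:
    """OCR only the pages that are not done yet, at most `limit` at a time."""
    sem = asyncio.Semaphore(limit)

    async def worker(page_id: str) -> None:
        async with sem:
            text = await ocr_page(page_id)
        # Written only after success, so a crash leaves no file for this page
        # and the next run picks it up again.
        output_path(out_dir, page_id).write_text(text, encoding="utf-8")

    todo = pending_pages(page_ids, out_dir)
    await asyncio.gather(*(worker(p) for p in todo))
    return todo
```

Lowering `limit` is the equivalent of turning `RATE_LIMIT` down for a single warm container.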

Test plan

- Both files parse via `python -m py_compile`
- Full bench run (1,403 pages, 8,413 assertions): 84.3% ± 0.9% overall, 1.6 points below Datalab's headline 85.9%
- Existing-file skip verified across a crashed-run resume
- CI: none configured for this repo yet
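
Since no CI is configured yet, the parse check in the first item could also be scripted. The sketch below uses the standard-library `py_compile` module, which is what `python -m py_compile` runs; the helper name `files_that_fail_to_compile` is hypothetical.

```python
import py_compile
from typing import Iterable, List


def files_that_fail_to_compile(paths: Iterable[str]) -> List[str]:
    """Return the subset of `paths` that does not byte-compile cleanly."""
    failures = []
    for path in paths:
        try:
            # doraise=True turns compile errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError:
            failures.append(path)
    return failures
```

For this PR it would be called with `olmocr_bench_chandra_ocr2.py` and `bench/runners/run_chandra_ocr2.py`; an empty result means both files parse.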

Mirrors the existing per-provider olmOCR runners (openai_mini, grok, etc.) but dispatches each rendered PDF page to a self-hosted `datalab-to/chandra-ocr-2` vLLM endpoint over HTTPS. Auth via the `x-api-admin-key` header; runner reads `CHANDRA_MODAL_URL` and `CHANDRA_MODAL_ADMIN_KEY` from env. Request payload uses `task="ocr_layout"` and `temperature=0.0` to stay apples-to-apples with the other candidates (no reasoning/thinking mode — Chandra OCR 2 is a fine-tuned OCR VLM with none). Harness supports `--sample`, `--skip-generation`, `--generate-only`, and `--limit N` (smoke). `RATE_LIMIT=50` matches the throughput of a 10-H100 deployment; turn down if running against a single warm container. README updated under the olmOCR section to list the new runner and the two required env vars.
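
For readers unfamiliar with the harness, the four flags named above might be wired up roughly as follows. This is a sketch, not the harness's actual parser: the flag names come from this PR, but their exact semantics, the choice to treat `--sample` as a boolean switch, and the choice to make `--skip-generation` and `--generate-only` mutually exclusive are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="olmOCR bench harness (illustrative sketch)")
    # Assumed to be a boolean switch; the real flag may take a value.
    parser.add_argument("--sample", action="store_true",
                        help="run on a sample of the benchmark")
    # Assumed to be mutually exclusive: one skips OCR, the other skips scoring.
    phase = parser.add_mutually_exclusive_group()
    phase.add_argument("--skip-generation", action="store_true",
                       help="reuse existing outputs and only score them")
    phase.add_argument("--generate-only", action="store_true",
                       help="run OCR generation but skip scoring")
    parser.add_argument("--limit", type=int, default=None, metavar="N",
                        help="process at most N pages (smoke test)")
    return parser
```

A smoke run would then correspond to something like `build_parser().parse_args(["--limit", "5", "--generate-only"])`.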

Khurdhula-Harshavardhan wants to merge 1 commit into main from feat/olmocr-chandra-ocr-2
