This document explains how to run the Criterion benchmarks, how datasets are chosen/created, and how to generate persistent sample datasets for reproducible measurements.
The benchmark suite measures:
- Sequential vs parallel processing
- With and without line-numbered code blocks
- Multiple dataset sizes (tiny, small, optionally medium)
By default, runs are silent to avoid skewing timings with console I/O.
Run (parallel by default):
- Linux/macOS: `cargo bench --bench context_bench`
- Windows PowerShell: `cargo bench --bench context_bench`
Include the medium dataset (heavier, disabled by default):
- Linux/macOS: `CB_BENCH_MEDIUM=1 cargo bench --bench context_bench`
- Windows PowerShell: `$env:CB_BENCH_MEDIUM=1; cargo bench --bench context_bench`
HTML reports:
- Open: `target/criterion/report/index.html`
- Or per-benchmark: `target/criterion/context_builder/*/report/index.html`
Parallel processing is enabled by default via the `parallel` feature (rayon).
- Force sequential: `cargo bench --no-default-features --bench context_bench`
- Force parallel (even if defaults change): `cargo bench --features parallel --bench context_bench`
Note: Benchmarks compare both `line_numbers` and `no_line_numbers` modes. Line numbering does additional formatting work and is expected to be slower.
Benchmarks set CB_SILENT=1 once at startup so logs and prompts don’t impact timings.
To see output during benchmarks:
- Linux/macOS: `CB_SILENT=0 cargo bench --bench context_bench`
- Windows PowerShell: `$env:CB_SILENT=0; cargo bench --bench context_bench`
Prompts are auto-confirmed inside benches, so runs are fully non-interactive.
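The silencing step can be sketched as follows. This is an illustrative snippet, not the crate's actual harness code: `bench_setup` and the `Once` guard are assumptions, but they show the pattern of setting `CB_SILENT` exactly once, before any timed section, while still honoring an explicit user override.

```rust
use std::sync::Once;

static INIT: Once = Once::new();

// Hypothetical bench setup: silence output once, before any measurement,
// so console I/O never lands inside a timed section.
fn bench_setup() {
    INIT.call_once(|| {
        // Respect an explicit user override (e.g. CB_SILENT=0) if present.
        if std::env::var_os("CB_SILENT").is_none() {
            // `set_var` is `unsafe` on newer editions; this runs once,
            // before any benchmark threads are spawned.
            unsafe { std::env::set_var("CB_SILENT", "1") };
        }
    });
}

fn main() {
    bench_setup();
    bench_setup(); // idempotent: the closure runs only once
    println!("CB_SILENT = {:?}", std::env::var("CB_SILENT"));
}
```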
Each scenario picks an input dataset with the following precedence:
1. If `./samples/<dataset>/project` exists, it is used.
2. Else, if `CB_BENCH_DATASET_DIR` is set, `<CB_BENCH_DATASET_DIR>/<dataset>/project` is used.
3. Else, a synthetic dataset is generated in a temporary directory for the run.
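The precedence can be sketched in Rust as a small helper. `resolve_dataset_dir` is hypothetical (the benches' real code may differ); it only illustrates the three-step fallback described above.

```rust
use std::path::PathBuf;

// Hypothetical sketch of the dataset-selection precedence.
// `dataset` is a preset name such as "tiny" or "small".
fn resolve_dataset_dir(dataset: &str) -> Option<PathBuf> {
    // 1. Prefer a persistent dataset under ./samples.
    let local = PathBuf::from("samples").join(dataset).join("project");
    if local.is_dir() {
        return Some(local);
    }
    // 2. Fall back to an external root named by CB_BENCH_DATASET_DIR.
    if let Ok(root) = std::env::var("CB_BENCH_DATASET_DIR") {
        let external = PathBuf::from(root).join(dataset).join("project");
        if external.is_dir() {
            return Some(external);
        }
    }
    // 3. None: the caller generates a synthetic dataset in a temp dir.
    None
}

fn main() {
    match resolve_dataset_dir("tiny") {
        Some(dir) => println!("using {}", dir.display()),
        None => println!("no persistent dataset; a synthetic one is generated"),
    }
}
```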
Datasets used:
- tiny: ~100 text files (fast sanity checks)
- small: ~1,000 text files (default performance checks)
- medium: ~5,000 text files (only when `CB_BENCH_MEDIUM=1` is set)
Default filters in the benches focus on text/code extensions: `rs`, `md`, `txt`, `toml`. Commonly ignored directories: `target`, `node_modules`. Binary files are generated but skipped by the filters.
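As a sketch, the default filtering amounts to something like the following. `included` is a hypothetical helper, not the crate's API; it mirrors the extension allow-list and directory ignore-list just described.

```rust
use std::path::Path;

// Illustrative sketch of the default bench filters:
// keep rs/md/txt/toml files, skip anything under target/ or node_modules/.
fn included(path: &Path) -> bool {
    const EXTS: [&str; 4] = ["rs", "md", "txt", "toml"];
    const IGNORED_DIRS: [&str; 2] = ["target", "node_modules"];

    // Skip files inside an ignored directory anywhere in the path.
    if path
        .components()
        .any(|c| IGNORED_DIRS.contains(&c.as_os_str().to_string_lossy().as_ref()))
    {
        return false;
    }
    // Keep only the text/code extensions the benches focus on;
    // this is also why generated .bin files are skipped.
    path.extension()
        .map(|e| EXTS.contains(&e.to_string_lossy().as_ref()))
        .unwrap_or(false)
}

fn main() {
    for p in ["src/main.rs", "target/debug/app.rs", "assets/logo.bin"] {
        println!("{p}: included = {}", included(Path::new(p)));
    }
}
```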
For more stable and reproducible measurements:
- Generate persistent datasets into `./samples/` (see below).
- Keep your machine's background activity low during runs.
- Run each scenario multiple times and compare Criterion reports.
You have two options to generate datasets into `./samples`.
Option A: the repository provides a generator binary gated behind the `samples-bin` feature.
- Linux/macOS: `cargo run --no-default-features --features samples-bin --bin generate_samples -- --help`
- Windows PowerShell: `cargo run --no-default-features --features samples-bin --bin generate_samples -- --help`
Examples:
- Generate default presets (tiny, small) into `./samples`:
  `cargo run --no-default-features --features samples-bin --bin generate_samples`
- Include medium and large:
  `cargo run --no-default-features --features samples-bin --bin generate_samples -- --presets tiny,small,medium --include-large`
- Only one preset with custom parameters:
  `cargo run --no-default-features --features samples-bin --bin generate_samples -- --only small --files 5000 --depth 4 --width 4 --size 1024`
- Clean output before generating:
  `cargo run --no-default-features --features samples-bin --bin generate_samples -- --clean`
- Dry run (print plan only):
  `cargo run --no-default-features --features samples-bin --bin generate_samples -- --dry-run`
Option B: if you prefer not to use Cargo feature gating, compile the script directly.
- Linux/macOS: `rustc scripts/generate_samples.rs -O -o generate_samples && ./generate_samples --help`
- Windows PowerShell: `rustc scripts/generate_samples.rs -O -o generate_samples.exe; .\generate_samples.exe --help`
Examples mirror Option A; just replace the leading command with `./generate_samples` (or `.\generate_samples.exe` on Windows).
The generator produces datasets under `./samples/<preset>/project`, which the benches discover automatically.
Each project tree contains:
- `src/`, `docs/`, `assets/` with nested subdirectories and text files
- `target/`, `node_modules/` populated with noise (ignored by default)
- Top-level `README.md` and `Cargo.toml`
- Binary `.bin` files sprinkled to validate binary handling
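Put together, a generated preset looks roughly like this (illustrative layout; exact nesting and file counts vary by preset):

```text
samples/<preset>/project/
├── Cargo.toml
├── README.md
├── src/            # nested subdirectories with text files
├── docs/
├── assets/         # includes scattered .bin files
├── target/         # noise, ignored by default filters
└── node_modules/   # noise, ignored by default filters
```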
It’s recommended to add `/samples` to `.gitignore` if not already present.
Sequential vs parallel:
- Sequential (no rayon): `cargo bench --no-default-features --bench context_bench`
- Parallel (rayon): `cargo bench --features parallel --bench context_bench`
With vs without line numbers:
- Both modes are exercised in each run; consult the per-benchmark report pages for timings.
- Benchmarks produce no output:
  - Expected. They run with `CB_SILENT=1`. Set `CB_SILENT=0` to see logs.
- Medium dataset missing:
  - Set the flag explicitly: `CB_BENCH_MEDIUM=1`.
  - Or pre-generate samples so the benches find `./samples/medium/project`.
- Reports are empty or unchanged:
  - Remove previous results and re-run: `rm -rf target/criterion` (Linux/macOS) or `Remove-Item -Recurse -Force target\criterion` (Windows PowerShell).
- Sequential vs parallel deltas are small:
  - On tiny datasets, overheads dominate. Use small or medium for more signal.
  - Try enabling/disabling line numbers to observe formatting costs.
Happy benchmarking!