Welcome! This document is for contributors and maintainers of Context Builder. It covers how to set up a development environment, build, test, lint, benchmark, and release the project.
For user-facing documentation and examples, see README.md. For performance work, see BENCHMARKS.md. For release history, see CHANGELOG.md.
- Rust toolchain (stable, 1.85 or newer) with support for the 2024 edition.
  - Install via rustup: https://rustup.rs
  - Keep your toolchain up-to-date:
    rustup update
- Git
Optional but recommended:
- An IDE or editor with rust-analyzer
- Just or Make for local task automation (if you prefer)
- Node.js (only needed if you want to serve Criterion’s HTML reports locally; not required for development)
git clone https://github.com/igorls/context-builder.git
cd context-builder
- Cargo.toml — crate metadata, dependencies, features
- README.md — user-facing documentation
- CHANGELOG.md — release notes
- DEVELOPMENT.md — this file
- BENCHMARKS.md — running and understanding benchmarks
- scripts/
  - generate_samples.rs — synthetic dataset generator for benchmarking
- benches/
  - context_bench.rs — Criterion benchmark suite
- src/
  - main.rs — binary entry point
  - lib.rs — core orchestration and run() implementation
  - cli.rs — clap parser and CLI arguments
  - file_utils.rs — directory traversal, filter/ignore collection, prompts
  - markdown.rs — core rendering logic, streaming, line numbering, binary/text sniffing
  - tree.rs — file tree structure building and printing
- samples/ — optional persistent datasets (ignored in VCS) for benchmarking
Build:
cargo build
Run the CLI:
cargo run -- --help
cargo run -- -d . -o out.md -f rs -f toml -i target --line-numbers
Notes:
- By default, parallel processing is enabled via the parallel feature (uses rayon).
- Logging uses env_logger; set RUST_LOG to control verbosity:
  - Linux/macOS: RUST_LOG=info cargo run -- ...
  - Windows PowerShell: $env:RUST_LOG='info'; cargo run -- ...
- parallel (enabled by default)
  - Enables parallel file processing in markdown generation via rayon.
  - Disable defaults (sequential run):
    - Build/Run: cargo run --no-default-features -- ...
    - As a dependency in another crate: set default-features = false in Cargo.toml.
- samples-bin
  - Exposes the dataset generator as a cargo bin (development-only).
  - Usage:
    - Linux/macOS: cargo run --no-default-features --features samples-bin --bin generate_samples -- --help
    - Windows PowerShell: cargo run --no-default-features --features samples-bin --bin generate_samples -- --help
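As a rough illustration of how a default-on feature like parallel can switch between rayon and a sequential path, here is a minimal sketch. The function names (render_len, total_lines) are hypothetical and are not taken from the crate; only the feature name and the use of rayon come from the notes above.

```rust
// Sketch only: `render_len` and `total_lines` are hypothetical names used for
// illustration; the crate's real rendering code lives in markdown.rs.

// Pulled in only when the `parallel` feature is enabled.
#[cfg(feature = "parallel")]
use rayon::prelude::*;

/// Stand-in for per-file work: counts the lines in one file's content.
fn render_len(content: &str) -> usize {
    content.lines().count()
}

/// Parallel path: files are processed on rayon's thread pool.
#[cfg(feature = "parallel")]
pub fn total_lines(files: &[String]) -> usize {
    files.par_iter().map(|f| render_len(f)).sum()
}

/// Sequential path: same result, used with --no-default-features.
#[cfg(not(feature = "parallel"))]
pub fn total_lines(files: &[String]) -> usize {
    files.iter().map(|f| render_len(f)).sum()
}
```

Keeping both variants behind the same signature means callers do not need their own cfg checks, and the sequential build stays useful for comparing timings.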
Run all tests:
cargo test
Tips:
- Unit tests cover CLI parsing, file filtering/ignoring, markdown formatting (including line numbers and binary handling), and tree building.
- Avoid adding interactive prompts inside tests. The library is structured so that prompts can be injected/mocked (see the Prompter trait).
- For additional diagnostics during tests:
  - Linux/macOS: RUST_LOG=info cargo test
  - Windows PowerShell: $env:RUST_LOG='info'; cargo test
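To make the prompt-injection tip concrete, the sketch below shows one way a confirmation trait can be replaced by a non-interactive test double. The trait shape, the method name confirm_overwrite, the FixedPrompter type, and the may_write helper are all assumptions for illustration; check the real Prompter trait in the source before relying on any of them.

```rust
use std::io;

/// Hypothetical shape of a confirmation trait. The crate's real `Prompter`
/// trait may use different method names and signatures.
pub trait Prompter {
    fn confirm_overwrite(&self, path: &str) -> io::Result<bool>;
}

/// Test double that answers every prompt with a fixed value, so stdin is
/// never read during tests or benches.
pub struct FixedPrompter(pub bool);

impl Prompter for FixedPrompter {
    fn confirm_overwrite(&self, _path: &str) -> io::Result<bool> {
        Ok(self.0)
    }
}

/// Hypothetical caller: decides whether the output file may be written.
pub fn may_write(output_exists: bool, path: &str, prompter: &dyn Prompter) -> io::Result<bool> {
    if !output_exists {
        // Nothing to overwrite, so no prompt is needed.
        return Ok(true);
    }
    prompter.confirm_overwrite(path)
}
```

In a test, passing &FixedPrompter(true) exercises the overwrite path and &FixedPrompter(false) exercises the abort path, both without user input.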
Format:
cargo fmt --all
Clippy (lints):
cargo clippy --all-targets --all-features -- -D warnings
Please ensure code is formatted and clippy-clean before opening a PR.
We use Criterion for micro/meso benchmarks and dataset-driven performance checks.
- See BENCHMARKS.md for details, including dataset generation, silent runs, and HTML report navigation.
- Quick start:
cargo bench --bench context_bench
- CB_SILENT
  - When set to “1” or “true” (case-insensitive), suppresses user-facing prints in the CLI.
  - The benchmark harness sets this to “1” by default to avoid skewing timings with console I/O.
  - Override locally:
    - Linux/macOS: CB_SILENT=0 cargo bench --bench context_bench
    - Windows PowerShell: $env:CB_SILENT=0; cargo bench --bench context_bench
- CB_BENCH_MEDIUM
  - When set to “1”, enables the heavier “medium” dataset scenarios during benches.
- CB_BENCH_DATASET_DIR
  - Allows pointing the benchmark harness to an external root containing datasets: <CB_BENCH_DATASET_DIR>/<preset>/project
  - If not set and no ./samples/<preset>/project is present, benches will synthesize datasets in a temp dir.
- RUST_LOG
  - Controls log verbosity (env_logger). Example:
    - Linux/macOS: RUST_LOG=info cargo run -- ...
    - Windows PowerShell: $env:RUST_LOG='info'; cargo run -- ...
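The rules for CB_SILENT and CB_BENCH_DATASET_DIR described above can be sketched as two small helpers. This is an illustration under stated assumptions, not the harness's actual code: the function names are hypothetical, and the lookup order (external root first, then ./samples, then a synthesized dataset) is inferred from the descriptions above.

```rust
use std::path::{Path, PathBuf};

/// Interprets a CB_SILENT-style value: "1" or "true" (case-insensitive)
/// turns the flag on; anything else, or an unset variable, leaves it off.
pub fn is_truthy(value: Option<&str>) -> bool {
    match value {
        Some(v) => v == "1" || v.eq_ignore_ascii_case("true"),
        None => false,
    }
}

/// Hypothetical wrapper that reads the real environment variable.
pub fn silent_from_env() -> bool {
    is_truthy(std::env::var("CB_SILENT").ok().as_deref())
}

/// Hypothetical dataset lookup. Returns the directory to use, or `None` when
/// the caller should synthesize a dataset in a temp dir instead.
/// Assumed order: external root (CB_BENCH_DATASET_DIR), then local samples.
pub fn resolve_dataset_dir(
    external_root: Option<&Path>,
    samples_root: &Path,
    preset: &str,
) -> Option<PathBuf> {
    if let Some(root) = external_root {
        return Some(root.join(preset).join("project"));
    }
    let local = samples_root.join(preset).join("project");
    if local.is_dir() {
        Some(local)
    } else {
        None
    }
}
```

Taking the values as parameters rather than reading the environment inside each function keeps the logic testable without mutating process-wide state.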
- Edition: 2024
- Error handling:
  - Use io::Result where appropriate; prefer returning errors over panicking.
  - It’s okay to use unwrap() and expect() in tests/benches and small setup helpers, but not in core library logic.
- Performance:
  - Prefer streaming reads/writes for large files (see markdown.rs).
  - Keep binary detection lightweight (current sniff logic checks for NUL bytes and UTF-8 validity).
  - Keep language detection simple and deterministic; add new mappings in one place.
- Cross-platform:
  - Normalize path separators in tests where string comparisons are used.
- Logging:
  - Use log::{info, warn, error}; let env_logger control emission.
- CLI:
  - Add new flags in cli.rs. Ensure you update tests covering parsing, and propagate options cleanly through run_with_args.
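Several of the guidelines above (io::Result over panics, streaming I/O, a lightweight binary sniff, and separator normalization in tests) can be illustrated together. The sketch below is a hedged example, not the code in markdown.rs: the function names and the line-number format are assumptions chosen for illustration.

```rust
use std::io::{self, BufRead, Write};

/// Lightweight binary sniff over a leading sample of a file: treats the data
/// as binary if it contains a NUL byte or is not valid UTF-8.
/// Caveat: a sample cut in the middle of a multi-byte character also fails
/// the UTF-8 check, so callers should pick the sample boundary with care.
pub fn looks_binary(sample: &[u8]) -> bool {
    sample.contains(&0) || std::str::from_utf8(sample).is_err()
}

/// Streams `reader` to `writer` one line at a time, prefixing 1-based line
/// numbers. The "{:>4} | " prefix is a hypothetical format.
/// Returns the number of lines written; I/O errors are propagated.
pub fn write_numbered<R: BufRead, W: Write>(reader: R, writer: &mut W) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        let line = line?; // return the error instead of panicking
        count += 1;
        writeln!(writer, "{:>4} | {}", count, line)?;
    }
    Ok(count)
}

/// Normalizes Windows separators so path strings compare equal in tests
/// regardless of the platform that produced them.
pub fn normalize_separators(path: &str) -> String {
    path.replace('\\', "/")
}
```

Because write_numbered only holds one line in memory at a time, its memory use stays flat for large inputs, which is the point of the streaming guideline.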
- Fork and create a feature branch:
  git checkout -b feat/my-improvement
- Make changes, add tests, and keep the code formatted and clippy-clean:
  cargo fmt --all
  cargo clippy --all-targets --all-features -- -D warnings
  cargo test
- If you modified performance-sensitive code, run benches (see BENCHMARKS.md).
- Update CHANGELOG.md if the change is user-visible or noteworthy.
- Open a PR with:
  - A concise title
  - Description of changes and rationale
  - Notes on performance impact (if any)
  - Any relevant screenshots or benchmark snippets
Suggested commit message convention: a short, imperative subject, optionally prefixed with a scope (e.g., feat(cli): add --no-parallel flag).
- Ensure the tree is green:
  - cargo fmt --all
  - cargo clippy --all-targets --all-features -- -D warnings
  - cargo test
  - Optionally: cargo bench
- Update versions and docs:
  - Bump version in Cargo.toml.
  - Add a new entry to CHANGELOG.md.
  - Verify README.md and DEVELOPMENT.md are up to date.
- Tag the release:
  git commit -am "chore(release): vX.Y.Z"
  git tag vX.Y.Z
  git push && git push --tags
- Publish to crates.io:
  cargo publish --dry-run
  cargo publish
- Create a GitHub release, paste changelog highlights, and attach links to benchmarks if relevant.
- Prompts during runs
  - The library uses a Prompter trait for confirmation flows. Inject a test-friendly prompter to avoid interactive I/O in tests and benches.
- Output file overwrites
  - The CLI confirms overwrites by default. In tests/benches, use the injected prompter that auto-confirms.
- Large datasets
  - Prefer builds with the parallel feature enabled (the default) for performance.
  - Consider dataset size and line-numbering effects when measuring.
Open an issue or start a discussion on GitHub. Thanks for contributing!