Allow external callers to configure the hook directory by shreyaskommuri · Pull Request #951 · NVIDIA/cloudai

shreyaskommuri · 2026-06-29T17:42:08Z

Summary

Closes #952.

Add an optional --hook-dir argument to the shared scenario command options used by run, dry-run, install, uninstall, and generate-report.
Pass the selected hook directory into Parser instead of always resolving hooks from conf/hook relative to the process working directory.
Preserve conf/hook as the default for existing in-repository workflows.
Add CLI and parser regression coverage for the configurable hook directory.

CloudAI Autotune exposed this while validating CloudAI's supplied NCCL scenario from outside the CloudAI checkout. Autotune already passed absolute paths for the scenario, system config, tests, and output directory:

autotune smoke-cloudai \
  ../cloudai/conf/common/test_scenario/nccl_test.toml \
  --cloudai-bin ../cloudai/.venv/bin/cloudai \
  --system-config ../cloudai/conf/common/system/example_slurm_cluster.toml \
  --tests-dir ../cloudai/conf/common/test \
  --runs-dir runs/integration \
  --timeout-sec 60

CloudAI still resolved hooks from conf/hook under Autotune's working directory and exited with:

[ERROR] Pre-test hook 'nccl_test' not found in hook mapping. A corresponding hook should exist under 'conf/hook'. Ensure that a proper hook directory is set under the working directory.

Changing the caller's working directory would make output and orchestration behavior depend on where CloudAI was launched. This change instead makes the remaining configuration root explicit, consistent with --system-config, --tests-dir, and --test-scenario.

Test Plan

Testing environment:

macOS local development environment
Python managed through the repository's uv environment
CloudAI dry-run mode; no Slurm cluster was available
Invocation performed from the sibling CloudAI-Autotune repository, not the CloudAI working directory

Regression and repository checks:

uv run pytest tests/test_cli.py tests/test_parser.py tests/test_handlers.py -q
# 60 passed

uv run pytest -q
# 1636 passed, 4 skipped, 490 deselected

uv run pytest -q -m ci_only
# 490 passed, 1640 deselected

uv run pytest --dead-fixtures
# Cool, every declared fixture is being used.

uv run pre-commit run --all-files
# All hooks passed, including pyright, ruff, vulture, import-linter, and taplo.

The original NCCL dry-run was then repeated from the Autotune working directory with the new option:

PYTHONPATH=/Users/shreyas/cloudai/src ../cloudai/.venv/bin/cloudai dry-run \
  --test-scenario /Users/shreyas/cloudai/conf/common/test_scenario/nccl_test.toml \
  --system-config /Users/shreyas/cloudai/conf/common/system/example_slurm_cluster.toml \
  --tests-dir /Users/shreyas/cloudai/conf/common/test \
  --hook-dir /Users/shreyas/cloudai/conf/hook \
  --output-dir /tmp/cloudai-hook-dir-validation

Result:

[INFO] Test Scenario Name: nccl-test
[INFO] Starting test: Tests.all_reduce
[INFO] Starting test: Tests.all_gather
[INFO] All jobs are complete.
exit code: 0

The dry run emitted expected local warnings for unavailable Slurm commands, but hook discovery succeeded and the complete NCCL scenario was generated.

Additional Notes

Live NCCL execution on a Slurm cluster was not tested.
AI was used only for context and guidance.

Expose an optional --hook-dir across scenario commands and pass the resolved directory into Parser so tools such as CloudAI Autotune can invoke hook-bearing scenarios outside the CloudAI repository. Constraint: Preserve conf/hook as the parser default and keep hook-free external invocations backward compatible. Rejected: Require callers to change into the CloudAI checkout | output paths and orchestration should not depend on process working directory. Confidence: high Scope-risk: narrow Directive: Keep scenario config roots explicit at the CLI boundary when external orchestration depends on them. Tested: uv run pytest -q (1636 passed, 4 skipped); uv run pytest -q -m ci_only (490 passed); uv run pytest --dead-fixtures; uv run pre-commit run --all-files; external NCCL dry-run from CloudAI-Autotune cwd exits 0 with --hook-dir Not-tested: Live NCCL execution on a Slurm cluster. Signed-off-by: shreyaskommuri <shreyaskommuri@gmail.com>

coderabbitai · 2026-06-29T17:42:14Z

📝 Walkthrough

Walkthrough

Adds a --hook-dir optional CLI argument to all CloudAI commands (install, uninstall, dry-run, run, generate-report). The value flows through argparse.Namespace into the three handler functions, which pass it to Parser. Parser.__init__ gains a hook_root parameter and parse() uses instance paths instead of module-level constants.

Hook directory configuration

Layer / File(s)	Summary
Parser hook_root parameter and parse() usage `src/cloudai/parser.py`	`Parser.__init__` accepts `hook_root` (defaults to `HOOK_ROOT`), stores `self.hook_root` and derives `self.hook_test_root`; `parse()` replaces constant references with instance paths and adds existence guards.
CLI --hook-dir option and command signatures `src/cloudai/cli/cli.py`	`common_options` decorator gains `--hook-dir` Click option; all five command functions updated to accept and forward `hook_dir` in their `argparse.Namespace`.
Handlers pass hook_dir to Parser `src/cloudai/cli/handlers.py`	`handle_install_and_uninstall`, `_setup_system_and_scenario`, and `handle_generate_report` each construct `Parser` with `args.hook_dir or HOOK_ROOT`.
Tests and copyright `tests/test_cli.py`, `tests/test_parser.py`, `src/cloudai/cli/cli.py`	New CLI test checks `--hook-dir DIRECTORY` in `dry-run --help`; new parser test verifies `hook_root`/`hook_test_root` from constructor; copyright year bumped to 2026.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Allow external callers to configure the hook directory #952: This PR directly implements the --hook-dir plumbing through CLI, handlers, and parser described in that issue.

Poem

🐇 Hop hop, a new path appears,
--hook-dir now steers,
No more hardcoded root alone,
The hook finds its own home!
This rabbit digs the flexible den~ 🏠

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: making the hook directory configurable for external callers.
Description check	✅ Passed	The description clearly matches the code changes and explains the configurable hook directory behavior and tests.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli.py`:
- Around line 49-55: The hook-dir help check only covers the dry-run command, so
it can miss regressions in other commands that should also include the shared
option. Update the test in test_hook_dir_is_available_on_run_commands to be
parameterized over the full command set covered by the common_options decorator,
including install, uninstall, run, dry-run, and generate-report, and invoke main
for each command to assert the help output still contains --hook-dir DIRECTORY.

In `@tests/test_parser.py`:
- Around line 42-47: The current test only checks Parser constructor state, so
it won’t catch regressions in path resolution inside parse(). Update
test_custom_hook_root_is_used to exercise Parser.parse() with the injected
hook_root and assert the parsed hook file resolution comes from that custom
directory, using Parser.parse and the parser.hook_root/hook_test_root setup to
verify the runtime behavior instead of just object fields.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 43241b99-55f5-4ceb-8a26-047f57f413e5

📥 Commits

Reviewing files that changed from the base of the PR and between 79f8bdb and 5d971f4.

📒 Files selected for processing (5)

src/cloudai/cli/cli.py
src/cloudai/cli/handlers.py
src/cloudai/parser.py
tests/test_cli.py
tests/test_parser.py

coderabbitai · 2026-06-29T17:50:39Z

+def test_hook_dir_is_available_on_run_commands():
+    runner = CliRunner()
+
+    result = runner.invoke(main, ["dry-run", "--help"])
+
+    assert result.exit_code == 0
+    assert "--hook-dir DIRECTORY" in result.output


📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover all commands that now accept --hook-dir.

This only locks the help text for dry-run, but the PR contract is broader (install, uninstall, run, dry-run, generate-report). If one command later drops @common_options, this regression test still passes. Please parametrize this check over the full command set.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_cli.py` around lines 49 - 55, The hook-dir help check only covers the dry-run command, so it can miss regressions in other commands that should also include the shared option. Update the test in test_hook_dir_is_available_on_run_commands to be parameterized over the full command set covered by the common_options decorator, including install, uninstall, run, dry-run, and generate-report, and invoke main for each command to assert the help output still contains --hook-dir DIRECTORY.

coderabbitai · 2026-06-29T17:50:39Z

+    def test_custom_hook_root_is_used(self, parser: Parser, tmp_path: Path):
+        hook_root = tmp_path / "hooks"
+        parser = Parser(parser.system_config_path, hook_root)
+
+        assert parser.hook_root == hook_root
+        assert parser.hook_test_root == hook_root / "test"


🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Exercise parse() with the custom hook root, not just constructor state.

These assertions still pass if Parser.parse() regresses to consulting HOOK_ROOT directly. The behavior change in this PR is path resolution during parsing, so the regression test should drive parse() and verify the injected hook directory is the one actually used.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/test_parser.py` around lines 42 - 47, The current test only checks Parser constructor state, so it won’t catch regressions in path resolution inside parse(). Update test_custom_hook_root_is_used to exercise Parser.parse() with the injected hook_root and assert the parsed hook file resolution comes from that custom directory, using Parser.parse and the parser.hook_root/hook_test_root setup to verify the runtime behavior instead of just object fields.

podkidyshev · 2026-07-03T12:21:11Z

we're quite busy with release + I'll be on vacations for couple weeks. I'll respond as soon as I get some spare time 🙏

shreyaskommuri · 2026-07-03T16:35:07Z

@podkidyshev No worries! Take your time, happy to wait but I will update branch and merge conflicts once this is active again to avoid redundant work.

shreyaskommuri marked this pull request as ready for review June 29, 2026 17:44

shreyaskommuri requested review from jeffnvidia, podkidyshev and srivatsankrishnan as code owners June 29, 2026 17:44

coderabbitai Bot reviewed Jun 29, 2026

View reviewed changes

shreyaskommuri mentioned this pull request Jun 29, 2026

Keep CloudAI integration failures actionable shreyaskommuri/CloudAI-Autotune#5

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Allow external callers to configure the hook directory#951

Allow external callers to configure the hook directory#951
shreyaskommuri wants to merge 1 commit into
NVIDIA:mainfrom
shreyaskommuri:fix/configurable-hook-directory

shreyaskommuri commented Jun 29, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading

Walkthrough

Estimated code review effort

Possibly related issues

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Uh oh!

coderabbitai Bot Jun 29, 2026

Uh oh!

podkidyshev commented Jul 3, 2026

Uh oh!

shreyaskommuri commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

shreyaskommuri commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Additional Notes

Uh oh!

coderabbitai Bot commented Jun 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Possibly related issues

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 29, 2026

Choose a reason for hiding this comment

Uh oh!

podkidyshev commented Jul 3, 2026

Uh oh!

shreyaskommuri commented Jul 3, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

shreyaskommuri commented Jun 29, 2026 •

edited

Loading

coderabbitai Bot commented Jun 29, 2026 •

edited

Loading