Skip to content

Allow external callers to configure the hook directory#951

Open
shreyaskommuri wants to merge 1 commit into
NVIDIA:mainfrom
shreyaskommuri:fix/configurable-hook-directory
Open

Allow external callers to configure the hook directory#951
shreyaskommuri wants to merge 1 commit into
NVIDIA:mainfrom
shreyaskommuri:fix/configurable-hook-directory

Conversation

@shreyaskommuri

@shreyaskommuri shreyaskommuri commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Summary

Closes #952.

  • Add an optional --hook-dir argument to the shared scenario command options used by run, dry-run, install, uninstall, and generate-report.
  • Pass the selected hook directory into Parser instead of always resolving hooks from conf/hook relative to the process working directory.
  • Preserve conf/hook as the default for existing in-repository workflows.
  • Add CLI and parser regression coverage for the configurable hook directory.

CloudAI Autotune exposed this while validating CloudAI's supplied NCCL scenario from outside the CloudAI checkout. Autotune already passed absolute paths for the scenario, system config, tests, and output directory:

autotune smoke-cloudai \
  ../cloudai/conf/common/test_scenario/nccl_test.toml \
  --cloudai-bin ../cloudai/.venv/bin/cloudai \
  --system-config ../cloudai/conf/common/system/example_slurm_cluster.toml \
  --tests-dir ../cloudai/conf/common/test \
  --runs-dir runs/integration \
  --timeout-sec 60

CloudAI still resolved hooks from conf/hook under Autotune's working directory and exited with:

[ERROR] Pre-test hook 'nccl_test' not found in hook mapping. A corresponding hook should exist under 'conf/hook'. Ensure that a proper hook directory is set under the working directory.

Changing the caller's working directory would make output and orchestration behavior depend on where CloudAI was launched. This change instead makes the remaining configuration root explicit, consistent with --system-config, --tests-dir, and --test-scenario.

Test Plan

Testing environment:

  • macOS local development environment
  • Python managed through the repository's uv environment
  • CloudAI dry-run mode; no Slurm cluster was available
  • Invocation performed from the sibling CloudAI-Autotune repository, not the CloudAI working directory

Regression and repository checks:

uv run pytest tests/test_cli.py tests/test_parser.py tests/test_handlers.py -q
# 60 passed

uv run pytest -q
# 1636 passed, 4 skipped, 490 deselected

uv run pytest -q -m ci_only
# 490 passed, 1640 deselected

uv run pytest --dead-fixtures
# Cool, every declared fixture is being used.

uv run pre-commit run --all-files
# All hooks passed, including pyright, ruff, vulture, import-linter, and taplo.

The original NCCL dry-run was then repeated from the Autotune working directory with the new option:

PYTHONPATH=/Users/shreyas/cloudai/src ../cloudai/.venv/bin/cloudai dry-run \
  --test-scenario /Users/shreyas/cloudai/conf/common/test_scenario/nccl_test.toml \
  --system-config /Users/shreyas/cloudai/conf/common/system/example_slurm_cluster.toml \
  --tests-dir /Users/shreyas/cloudai/conf/common/test \
  --hook-dir /Users/shreyas/cloudai/conf/hook \
  --output-dir /tmp/cloudai-hook-dir-validation

Result:

[INFO] Test Scenario Name: nccl-test
[INFO] Starting test: Tests.all_reduce
[INFO] Starting test: Tests.all_gather
[INFO] All jobs are complete.
exit code: 0

The dry run emitted expected local warnings for unavailable Slurm commands, but hook discovery succeeded and the complete NCCL scenario was generated.

Additional Notes

  • Live NCCL execution on a Slurm cluster was not tested.
  • AI was used only for context and guidance.

Expose an optional --hook-dir across scenario commands and pass the resolved directory into Parser so tools such as CloudAI Autotune can invoke hook-bearing scenarios outside the CloudAI repository.

Constraint: Preserve conf/hook as the parser default and keep hook-free external invocations backward compatible.

Rejected: Require callers to change into the CloudAI checkout | output paths and orchestration should not depend on process working directory.

Confidence: high

Scope-risk: narrow

Directive: Keep scenario config roots explicit at the CLI boundary when external orchestration depends on them.

Tested: uv run pytest -q (1636 passed, 4 skipped); uv run pytest -q -m ci_only (490 passed); uv run pytest --dead-fixtures; uv run pre-commit run --all-files; external NCCL dry-run from CloudAI-Autotune cwd exits 0 with --hook-dir

Not-tested: Live NCCL execution on a Slurm cluster.
Signed-off-by: shreyaskommuri <shreyaskommuri@gmail.com>
@coderabbitai

coderabbitai Bot commented Jun 29, 2026

Copy link
Copy Markdown
Contributor

Review Change Stack

📝 Walkthrough

Walkthrough

Adds a --hook-dir optional CLI argument to all CloudAI commands (install, uninstall, dry-run, run, generate-report). The value flows through argparse.Namespace into the three handler functions, which pass it to Parser. Parser.__init__ gains a hook_root parameter and parse() uses instance paths instead of module-level constants.

Hook directory configuration

Layer / File(s) Summary
Parser hook_root parameter and parse() usage
src/cloudai/parser.py
Parser.__init__ accepts hook_root (defaults to HOOK_ROOT), stores self.hook_root and derives self.hook_test_root; parse() replaces constant references with instance paths and adds existence guards.
CLI --hook-dir option and command signatures
src/cloudai/cli/cli.py
common_options decorator gains --hook-dir Click option; all five command functions updated to accept and forward hook_dir in their argparse.Namespace.
Handlers pass hook_dir to Parser
src/cloudai/cli/handlers.py
handle_install_and_uninstall, _setup_system_and_scenario, and handle_generate_report each construct Parser with args.hook_dir or HOOK_ROOT.
Tests and copyright
tests/test_cli.py, tests/test_parser.py, src/cloudai/cli/cli.py
New CLI test checks --hook-dir DIRECTORY in dry-run --help; new parser test verifies hook_root/hook_test_root from constructor; copyright year bumped to 2026.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Poem

🐇 Hop hop, a new path appears,
--hook-dir now steers,
No more hardcoded root alone,
The hook finds its own home!
This rabbit digs the flexible den~ 🏠

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: making the hook directory configurable for external callers.
Description check ✅ Passed The description clearly matches the code changes and explains the configurable hook directory behavior and tests.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@shreyaskommuri shreyaskommuri marked this pull request as ready for review June 29, 2026 17:44

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@tests/test_cli.py`:
- Around line 49-55: The hook-dir help check only covers the dry-run command, so
it can miss regressions in other commands that should also include the shared
option. Update the test in test_hook_dir_is_available_on_run_commands to be
parameterized over the full command set covered by the common_options decorator,
including install, uninstall, run, dry-run, and generate-report, and invoke main
for each command to assert the help output still contains --hook-dir DIRECTORY.

In `@tests/test_parser.py`:
- Around line 42-47: The current test only checks Parser constructor state, so
it won’t catch regressions in path resolution inside parse(). Update
test_custom_hook_root_is_used to exercise Parser.parse() with the injected
hook_root and assert the parsed hook file resolution comes from that custom
directory, using Parser.parse and the parser.hook_root/hook_test_root setup to
verify the runtime behavior instead of just object fields.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 43241b99-55f5-4ceb-8a26-047f57f413e5

📥 Commits

Reviewing files that changed from the base of the PR and between 79f8bdb and 5d971f4.

📒 Files selected for processing (5)
  • src/cloudai/cli/cli.py
  • src/cloudai/cli/handlers.py
  • src/cloudai/parser.py
  • tests/test_cli.py
  • tests/test_parser.py

Comment thread tests/test_cli.py
Comment on lines +49 to +55
def test_hook_dir_is_available_on_run_commands():
runner = CliRunner()

result = runner.invoke(main, ["dry-run", "--help"])

assert result.exit_code == 0
assert "--hook-dir DIRECTORY" in result.output

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Cover all commands that now accept --hook-dir.

This only locks the help text for dry-run, but the PR contract is broader (install, uninstall, run, dry-run, generate-report). If one command later drops @common_options, this regression test still passes. Please parametrize this check over the full command set.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_cli.py` around lines 49 - 55, The hook-dir help check only covers
the dry-run command, so it can miss regressions in other commands that should
also include the shared option. Update the test in
test_hook_dir_is_available_on_run_commands to be parameterized over the full
command set covered by the common_options decorator, including install,
uninstall, run, dry-run, and generate-report, and invoke main for each command
to assert the help output still contains --hook-dir DIRECTORY.

Comment thread tests/test_parser.py
Comment on lines +42 to +47
def test_custom_hook_root_is_used(self, parser: Parser, tmp_path: Path):
hook_root = tmp_path / "hooks"
parser = Parser(parser.system_config_path, hook_root)

assert parser.hook_root == hook_root
assert parser.hook_test_root == hook_root / "test"

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🎯 Functional Correctness | 🟡 Minor | ⚡ Quick win

Exercise parse() with the custom hook root, not just constructor state.

These assertions still pass if Parser.parse() regresses to consulting HOOK_ROOT directly. The behavior change in this PR is path resolution during parsing, so the regression test should drive parse() and verify the injected hook directory is the one actually used.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_parser.py` around lines 42 - 47, The current test only checks
Parser constructor state, so it won’t catch regressions in path resolution
inside parse(). Update test_custom_hook_root_is_used to exercise Parser.parse()
with the injected hook_root and assert the parsed hook file resolution comes
from that custom directory, using Parser.parse and the
parser.hook_root/hook_test_root setup to verify the runtime behavior instead of
just object fields.

@podkidyshev

Copy link
Copy Markdown
Contributor

we're quite busy with release + I'll be on vacations for couple weeks. I'll respond as soon as I get some spare time 🙏

@shreyaskommuri

Copy link
Copy Markdown
Contributor Author

@podkidyshev No worries! Take your time, happy to wait but I will update branch and merge conflicts once this is active again to avoid redundant work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Allow external callers to configure the hook directory

2 participants