Skip to content

Add Primus pretrain integration (primus_pretrain)#146

Open
coketaste wants to merge 9 commits into
ROCm:developfrom
coketaste:coketaste/primus
Open

Add Primus pretrain integration (primus_pretrain)#146
coketaste wants to merge 9 commits into
ROCm:developfrom
coketaste:coketaste/primus

Conversation

@coketaste
Copy link
Copy Markdown
Contributor

  • Add scripts/Primus submodule (AMD-AGI/Primus) and docker/primus.ubuntu.amd.Dockerfile
  • Add scripts/primus_pretrain (run.sh, get_models_json.py, extract_primus_perf.py)
  • Register primus_pretrain in models.json before existing primus_pyt_* entries
  • Un-ignore tracked perf manifests and models.json in .gitignore

run.sh: resolve local submodule via ../Primus (sibling of primus_pretrain under scripts/).

- Add scripts/Primus submodule (AMD-AGI/Primus) and docker/primus.ubuntu.amd.Dockerfile
- Add scripts/primus_pretrain (run.sh, get_models_json.py, extract_primus_perf.py)
- Register primus_pretrain in models.json before existing primus_pyt_* entries
- Un-ignore tracked perf manifests and models.json in .gitignore

run.sh: resolve local submodule via ../Primus (sibling of primus_pretrain under scripts/).
@coketaste coketaste self-assigned this Apr 15, 2026
@coketaste coketaste requested a review from gargrahul as a code owner April 15, 2026 19:59
Copilot AI review requested due to automatic review settings April 15, 2026 19:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Integrates Primus pretraining into the MAD/madengine workflow by adding a dedicated runner script bundle, a new Primus-based Docker image, and registering a new primus_pretrain model entry so Primus pretrain jobs can be launched consistently across environments.

Changes:

  • Added scripts/primus_pretrain/ wrappers to run Primus pretrain and emit multiple_results perf CSV.
  • Added a new docker/primus.ubuntu.amd.Dockerfile and registered primus_pretrain in models.json.
  • Added scripts/Primus as a git submodule and adjusted .gitignore to track key perf/model manifests.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
scripts/primus_pretrain/run.sh Wrapper to resolve Primus root/config, run pretrain, and post-process perf metrics.
scripts/primus_pretrain/get_models_json.py Optional discovery of Primus configs as runnable models via globbing.
scripts/primus_pretrain/extract_primus_perf.py Parses training logs and writes a multiple_results CSV.
models.json Registers primus_pretrain ahead of existing Primus PyTorch training entries.
docker/primus.ubuntu.amd.Dockerfile Defines a Primus image intended to bake in the Primus submodule.
.gitmodules Adds the scripts/Primus submodule definition.
.gitignore Stops ignoring tracked perf manifests and models.json.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread scripts/primus_pretrain/run.sh
Comment thread scripts/primus_pretrain/run.sh Outdated
Comment thread scripts/primus_pretrain/get_models_json.py Outdated
Comment thread docker/primus.ubuntu.amd.Dockerfile
Copilot AI review requested due to automatic review settings May 6, 2026 21:30
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 5 comments.

Comment thread scripts/primus_pretrain/run.sh Outdated
Comment thread docker/primus.ubuntu.amd.Dockerfile
Comment thread README.md Outdated
Comment thread models.json
Comment thread .gitmodules
coketaste and others added 3 commits May 21, 2026 10:43
- README.md: clarify Docker build and discovery notes as madengine-only;
  document that tools/run_models.py is unsupported for primus_pretrain
  due to ./docker build context incompatibility
- models.json: add multiple_results: primus_perf_output.csv to
  primus_pretrain so the runner picks up tps/tflops/mfu from the CSV
- run.sh: validate PRIMUS_ROOT path when set via env var; use set +e/-e
  around run_pretrain.sh so exitcode is captured correctly; fix comment
  on CSV row format
- get_models_json.py: pass recursive=True to glob so CONFIGS_GLOB
  pattern matches nested YAML paths correctly

Co-Authored-By: Claude Sonnet 4 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 15:17
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

Comment thread models.json
coketaste and others added 3 commits May 29, 2026 11:41
Aligns with the schema documented in README.md and the convention used
by all other entries in models.json.

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 29, 2026 15:53
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 9 changed files in this pull request and generated 1 comment.

Comment on lines +90 to +92
if [[ -f "$LOG_PATH" ]]; then
python3 "$RUN_DIR/extract_primus_perf.py" "$LOG_PATH" "$RUN_DIR/primus_perf_output.csv" || true
fi
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants