Skip to content

[WIP] Update DSv4 B300 vllm image tag#1588

Open
wzhao18 wants to merge 3 commits into
mainfrom
nv/dsv4-b300-vllm-0.22.0
Open

[WIP] Update DSv4 B300 vllm image tag#1588
wzhao18 wants to merge 3 commits into
mainfrom
nv/dsv4-b300-vllm-0.22.0

Conversation

@wzhao18
Copy link
Copy Markdown
Collaborator

@wzhao18 wzhao18 commented May 29, 2026

Note

Low Risk
Single-field image version bump in benchmark metadata with no application code or scenario changes.

Overview
Bumps the DeepSeek-V4-Pro FP4 B300 vLLM benchmark recipe (dsv4-fp4-b300-vllm) container image from vllm/vllm-openai:v0.21.0 to v0.22.0 in nvidia-master.yaml. Model, runner, scenarios, and search-space entries are unchanged.

perf-changelog.yaml records the image tag update for this config key.

Reviewed by Cursor Bugbot for commit 82584fa. Bugbot is set up for automated code reviews on this repo. Configure here.

@github-actions
Copy link
Copy Markdown
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@github-actions
Copy link
Copy Markdown
Contributor

@github-actions
Copy link
Copy Markdown
Contributor

Comment thread perf-changelog.yaml Outdated
Comment on lines +3204 to +3207
- config-keys:
- dsv4-fp4-b300-vllm
description:
- "Update DSv4 FP4 B300 vLLM image tag to v0.22.0" No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 The new dsv4-fp4-b300-vllm entry at the bottom of perf-changelog.yaml is missing the pr-link field that every other one of the 400 entries in the file carries, breaking changelog→PR traceability for this update. The file also no longer ends with a trailing newline (\ No newline at end of file in the diff). Adding pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1588 plus a trailing newline fixes both at once.

Extended reasoning...

What the bug is

perf-changelog.yaml follows a strict, fully-uniform convention: each entry has config-keys, description, and pr-link. The new entry added by this PR (lines 3204-3207) breaks that convention in two ways:

  1. Missing pr-link field. A quick count confirms the asymmetry: the file has 401 - config-keys: entries but only 400 pr-link: lines after this PR — the new dsv4-fp4-b300-vllm block is the sole exception. The five most recently added entries above it (PRs [NV] B300 (Agg): migrate model path #1539, [AMD] Add Qwen3.5 FP8 MI355X SGLang disaggregated benchmark #1570, [MoRI short term temp patch] GLM-5 FP8 MI355X SGLang disaggregated #1572, [MoRI short term patch] add qwen3.5-fp4 MI355X SGLang PD-disaggregation #1579, and the older ones) all end with a pr-link line such as pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1579.
  2. No trailing newline. The diff explicitly shows \ No newline at end of file under the new block. Every prior entry in the file ends with a newline.

Why it matters / impact

The changelog is the canonical map from a config-key change to the PR that introduced it. Dropping pr-link for this entry means a future reader looking at dsv4-fp4-b300-vllm in the changelog has no in-file pointer back to PR #1588 — they have to git log/git blame to recover the link that every other row gives them for free. For a file that has held this convention across 400 entries, the inconsistency is the bug, not the missing newline per se.

Proof — step by step

  1. grep -c '^- config-keys:' perf-changelog.yaml401 after this PR.
  2. grep -c '^ pr-link:' perf-changelog.yaml400 after this PR.
  3. Diff against main: pre-PR both counts are 400. The PR adds exactly one config-keys block (lines 3204-3207) and zero pr-link lines, producing the 401/400 mismatch.
  4. tail -c1 perf-changelog.yaml | xxd → would show the last byte is " (closing the description string), not \n, matching the diff's \ No newline at end of file marker. Pre-PR the file's last byte was \n (line 3203 was a blank separator).

Why existing code/process doesn't prevent it

There is no schema or pre-commit check enforcing the presence of pr-link or a final newline on this file; the convention is maintained by author discipline and review. The PR is marked [WIP], which is consistent with the field simply not having been filled in yet.

How to fix

Append the missing field plus a newline to the entry:

- config-keys:
    - dsv4-fp4-b300-vllm
  description:
    - "Update DSv4 FP4 B300 vLLM image tag to v0.22.0"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1588

…and make sure the file ends with a trailing \n. Note that adding pr-link on its own line and saving with a final newline resolves the trailing-newline complaint at the same time, so this is a single one-line edit on the author's side.

On the refutation of the trailing-newline issue

A reviewer argued the missing trailing newline is too trivial to surface. We agree it is purely a style/POSIX nit in isolation — that's why it is folded into this single nit-severity report alongside the pr-link finding rather than filed separately. Fixing pr-link will incidentally fix the newline, so there is no extra ask on the author; the note is just so they don't accidentally re-introduce the same trailing-byte situation when adding the link.

@github-actions
Copy link
Copy Markdown
Contributor

@wzhao18
Copy link
Copy Markdown
Collaborator Author

wzhao18 commented May 29, 2026

@Oseltamivir Could you help take a look at the error in the sweep related to the new vllm image?

@github-actions
Copy link
Copy Markdown
Contributor

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

1 participant