
[enhancement] : use only major version for rhel-specific driver images#2497

Open
rahulait wants to merge 1 commit into NVIDIA:main from rahulait:use-major-version-for-rhel

rahulait (Contributor) commented May 27, 2026

Description

Fixes: #2471

Checklist

  • No secrets, sensitive information, or unrelated changes
  • Lint checks passing (make lint)
  • Generated assets in-sync (make validate-generated-assets)
  • Go mod artifacts in-sync (make validate-modules)
  • Test cases are added for new code paths

Testing

Brought up a cluster using the following values.yaml file:

cat values.yaml
operator:
  repository: ghcr.io/nvidia
  image: gpu-operator
  version: ddf7c20b
  imagePullPolicy: Always
validator:
  repository: ghcr.io/nvidia
  image: gpu-operator
  version: ddf7c20b
  imagePullPolicy: Always
driver:
  enabled: true
  version: "595.71.05"

Installed gpu-operator using it:

helm install gpu-operator -n gpu-operator --create-namespace nvidia/gpu-operator -f values.yaml

Once installed, verified that all operands came up successfully:

NAME                                                          READY   STATUS      RESTARTS   AGE
gpu-feature-discovery-hwsvg                                   1/1     Running     0          2m18s
gpu-operator-54955ddd6-5vxwt                                  1/1     Running     0          2m48s
gpu-operator-node-feature-discovery-gc-8fb8d5d8d-5zlrz        1/1     Running     0          2m48s
gpu-operator-node-feature-discovery-master-5bbc6d887b-tf6pl   1/1     Running     0          2m48s
gpu-operator-node-feature-discovery-worker-5sjjm              1/1     Running     0          2m48s
nvidia-container-toolkit-daemonset-72ncv                      1/1     Running     0          2m18s
nvidia-cuda-validator-6q8gv                                   0/1     Completed   0          71s
nvidia-dcgm-exporter-whqpl                                    1/1     Running     0          2m17s
nvidia-device-plugin-daemonset-ddfvj                          1/1     Running     0          2m17s
nvidia-driver-daemonset-52k6r                                 1/1     Running     0          2m24s
nvidia-mig-manager-w67wn                                      1/1     Running     0          2m17s
nvidia-operator-validator-2zpbq                               1/1     Running     0          2m17s
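
The manual check above can also be scripted. Below is a minimal sketch, assuming the pod listing has been captured as text in the default `kubectl get pods` column layout (NAME, READY, STATUS, ...). The helper name `unhealthy_pods` is hypothetical and is not part of the gpu-operator repository.

```python
def unhealthy_pods(table: str) -> list[str]:
    """Return names of pods whose STATUS is neither Running nor Completed.

    Assumes whitespace-separated columns with STATUS in the third column.
    """
    lines = [ln for ln in table.strip().splitlines() if ln.strip()]
    bad = []
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        name, status = fields[0], fields[2]
        if status not in ("Running", "Completed"):
            bad.append(name)
    return bad


sample = """
NAME                               READY   STATUS      RESTARTS   AGE
nvidia-cuda-validator-6q8gv        0/1     Completed   0          71s
nvidia-driver-daemonset-52k6r      1/1     Running     0          2m24s
"""
print(unhealthy_pods(sample))  # an empty list means every pod is healthy
```

An empty result corresponds to the state shown in the listing above, where every operand is either Running or Completed.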

Verified that the driver container is using the generic (major-version-only) image:

kubectl get daemonset/nvidia-driver-daemonset -n gpu-operator -o yaml | grep image | grep rhel
        image: nvcr.io/nvidia/driver:595.71.05-rhel9
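
To illustrate the tag shape this change produces, here is a minimal sketch of the mapping, not the operator's actual Go implementation. The function name `driver_image_tag` and its parameters are hypothetical. It assumes the OS ID and version ID are available as strings (for example from `/etc/os-release`) and that non-RHEL distributions keep their full version in the tag.

```python
def driver_image_tag(driver_version: str, os_id: str, os_version_id: str) -> str:
    """Build a driver image tag such as '595.71.05-rhel9'.

    Hypothetical helper: for RHEL only the major version is kept, so
    9.4 and 9.6 both map to 'rhel9'. Other distributions keep the full
    version ID (an assumption made for this sketch).
    """
    if os_id == "rhel":
        major = os_version_id.split(".", 1)[0]
        os_suffix = f"{os_id}{major}"
    else:
        os_suffix = f"{os_id}{os_version_id}"
    return f"{driver_version}-{os_suffix}"


print(driver_image_tag("595.71.05", "rhel", "9.4"))  # 595.71.05-rhel9
```

With this mapping, a node running any RHEL 9 minor release resolves to the same `-rhel9` tag seen in the daemonset output above.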

copy-pr-bot (Bot) commented May 27, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

rahulait (Contributor, Author) commented:

/ok to test 53e32d6

Signed-off-by: Rahul Sharma <rahulsharm@nvidia.com>
rahulait force-pushed the use-major-version-for-rhel branch from 53e32d6 to ddf7c20 May 27, 2026 18:50
rahulait marked this pull request as ready for review May 27, 2026 18:50
rahulait changed the title from "[WIP] use only major version for rhel-specific driver images" to "[enhancement] : use only major version for rhel-specific driver images" May 27, 2026
rahulait enabled auto-merge May 27, 2026 19:36
rahulait disabled auto-merge May 27, 2026 19:40
rahulait (Contributor, Author) commented:

This still won't work for GDS/GDRCopy images. We'll need to have generic tags for them as well 😢


Development

Successfully merging this pull request may close these issues.

[Enhancement]: Enhance RHEL Driver Image Selection to Use Major-Version Tags for RHEL 8 and RHEL 9

2 participants