Summary
When deploying GPU Operator in an air-gapped (offline) cluster, the `nvidia-driver-daemonset` init container fails to start.
Root cause: the driver image ships with a public YUM repo enabled by default, which triggers `yum` errors in offline environments.
Additionally, the pod spec tries to mount `/etc/yum.repos.d/redhat.repo` (hostPath, type `File`), but the file is absent on the node, so the kubelet rejects the volume with `hostPath type check failed`.
Node OS: RHEL 8.10
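The rejection is consistent with a hostPath volume of type `File`, which the kubelet only admits when the path already exists on the node. A minimal sketch of the relevant daemonset fragment follows; the volume name is an assumption, following the `subscription-config-%d` naming pattern the operator uses:

```yaml
# Hypothetical fragment of the nvidia-driver-daemonset pod spec.
volumes:
  - name: subscription-config-0          # name assumed from the subscription-config-%d pattern
    hostPath:
      path: /etc/yum.repos.d/redhat.repo
      type: File                         # kubelet rejects the pod with "hostPath type check failed" if the file is absent
# With `type: FileOrCreate` the kubelet would instead create an empty file,
# avoiding the admission failure on hosts without an RHSM subscription.
```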
We rebuilt the driver image and removed `/etc/yum.repos.d/redhat.repo`.
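The rebuild can be a single layer on top of the stock driver image; the base image reference below is a placeholder, not the exact image we used:

```dockerfile
# Placeholder base image; substitute the driver image/tag actually deployed.
FROM nvcr.io/nvidia/driver:<version>-rhel8.10

# Remove the public repo definition so yum never reaches out from the
# air-gapped cluster, and drop cached metadata that references it.
RUN rm -f /etc/yum.repos.d/redhat.repo && yum clean all
```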
The mount name is generated in `gpu-operator/internal/state/driver_volumes.go`, line 203 (commit cc4abab):

```go
volMountSubscriptionName := fmt.Sprintf("subscription-config-%d", num)
```

Related issue