Add nvidia-vss-nims 3.1.15 (NVIDIA VSS Blueprint 3.1.0) #241
Open
blik616287 wants to merge 12 commits into
…base) Drop the gated nvcr cosmos-reason2-8b + vss-rt-embed images from pack content; content.images is now just cgr.dev/chainguard/wolfi-base (0/0). Each gated NIM rootfs is crane-fetched as runtime DATA at deploy and its native entrypoint (start_server.sh / start_rtvi_embed.sh) is run NON-PRIVILEGED on wolfi-base via the SAME matched-ld + CUDA toolchain block GB10-validated for the vLLM pack (real ptxas/cuda paths, gcc --sysroot wrappers, comprehensive lib path, triton paths). NOTE: live functional validation of these two GPU NIMs is pending a free GPU slot — the GB10s two GPUs are held by the running route (cosmos is the live VLM). The build-on-deploy mechanics are those proven serving nemotron via vLLM (HTTP 200). The running containers carry NVIDIA CUDA-stack CVEs as runtime data, not pack images. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…out) GPU NIMs under RollingUpdate deadlock (new pod Pending on Insufficient nvidia.com/gpu while the old holds the slot); Recreate terminates the old first. Found on live GB10 deploy. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
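The deadlock this commit describes can be illustrated with a toy model. This is a minimal sketch, not the chart's code: `rollout` and its `gpu_capacity` parameter are hypothetical, and it assumes a single-replica Deployment where RollingUpdate starts the replacement pod before removing the old one, while every GPU slot is already taken.

```python
def rollout(strategy: str, gpu_capacity: int = 1) -> list[str]:
    """Return the ordered events of replacing one GPU pod (toy model)."""
    # Assumption: the old pod already holds one GPU slot on the node.
    free = gpu_capacity - 1
    events = []

    if strategy == "Recreate":
        # Recreate terminates the old pod first, which frees its GPU slot.
        events.append("old pod terminated")
        free += 1

    # Under either strategy the new pod now needs one free GPU slot.
    if free >= 1:
        events.append("new pod Running")
    else:
        # RollingUpdate keeps the old pod until the new one is Ready,
        # so with no spare slot the new pod can never be scheduled.
        events.append("new pod Pending: Insufficient nvidia.com/gpu")
    return events


if __name__ == "__main__":
    for strategy in ("RollingUpdate", "Recreate"):
        print(strategy, "->", rollout(strategy))
```

With a spare slot (for example `gpu_capacity=2` and only one GPU in use) both strategies succeed in this model, which is why the problem only appears when the node's GPUs are fully allocated.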
cosmos nvidia_entrypoint.sh does exec /bin/bash + start_server.sh is #!/bin/bash; the rootfs bash was only swapped to /bin/sh. cp it to /bin/bash too. Found on live GB10 deploy. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ths) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…lves nim_llm_sdk) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
3.1.14 validated serving cosmos VLM (nim 3.1.14, bf16) on a GB10 edge cluster, 200 on /v1/health/ready. Merged upstream/main (crane-manifest validator fix). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a self-contained dGPU NVDEC decode backend for rtvi-embed (PyNvVideoCodec via libnvcuvid, chroot build-on-deploy + cosmos.enabled gate) so the search-profile embedder runs on DGX-Spark dGPU-mode nodes where the stock DeepStream/pyds decoder can't load. rtvi-embed reached 1/1 ready with the NVDEC DecoderProcess warmed up while pyds plugins failed — confirming the NVDEC path. Live-validated on a GB10 (DGX Spark) edge cluster via the cluster-profile pipeline. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
nvidia-vss-nims: VSS 3.x, pack 3.1.1

VSS 3.x NIM model backends as a Helm chart: the Cosmos-Reason2-8B VLM (nvcr.io/nim/nvidia/cosmos-reason2-8b) that the vss-agent calls for visual understanding, plus the rt-embed NIM. Gated nvcr.io images (pulled on-cluster via ngc-pull-secret).

Versioning: chart/pack version: 3.1.1 (our packaging) · appVersion: 3.1.0 (upstream NVIDIA VSS Blueprint 3.x). Helm chart; images pinned in values.yaml pack.content.images.

Tested on NVIDIA GB10 / DGX Spark (arm64 SBSA)

Deployed via Palette add-on cluster profile vss-dgx-spark-3x on edge cluster edge-gx10 (single GB10). Full VSS 3.x route green: all 5 packs report "Pack services are ready", cluster Running:

Validation: pack.json JSON-syntax/schema/version, logo, README, and pack.content.images all pass. The content.images pull (crane) fails for the gated nvcr.io/nim/* and nvcr.io/nvidia/vss-core/* images because the CI runner has no NGC credentials (same image-pull exception as the 2.4 PRs #233–236; the cluster pulls them via ngc-pull-secret).
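To make the image-pull exception concrete, here is a minimal sketch of how a validator could separate gated images from publicly pullable ones before attempting a pull. It is not the CI validator's actual logic: `GATED_PREFIXES`, `is_gated`, and `split_images` are hypothetical names, and the prefixes are taken only from the two registry paths named above.

```python
# Hypothetical helper: registry prefixes that need NGC credentials to pull.
GATED_PREFIXES = ("nvcr.io/nim/", "nvcr.io/nvidia/vss-core/")


def is_gated(image: str) -> bool:
    """True if the image reference sits under a gated nvcr.io path."""
    return image.startswith(GATED_PREFIXES)


def split_images(images: list[str]) -> tuple[list[str], list[str]]:
    """Split image references into (pullable without NGC creds, gated)."""
    pullable = [img for img in images if not is_gated(img)]
    gated = [img for img in images if is_gated(img)]
    return pullable, gated


if __name__ == "__main__":
    images = [
        "nvcr.io/nim/nvidia/cosmos-reason2-8b",
        "cgr.dev/chainguard/wolfi-base",
    ]
    pullable, gated = split_images(images)
    print("pull in CI:", pullable)
    print("skip in CI (cluster pulls via ngc-pull-secret):", gated)
```

A runner without NGC credentials would attempt only the first list and report the second as a known exception rather than a failure.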