[NV] Add MiniMax-M2.5 FP4 B200 Dynamo vLLM recipes by jasonlizhengjian · Pull Request #1643 · SemiAnalysisAI/InferenceX

jasonlizhengjian · 2026-06-02T18:58:49Z

Summary

Add B200 MiniMax-M2.5 FP4 Dynamo vLLM recipes.

Note

Low Risk
Adds benchmark and CI launch configuration only; no application runtime, auth, or data-path changes.

Overview
Adds MiniMax-M2.5 NVFP4 disaggregated Dynamo + vLLM multinode benchmarks on B200, parallel to the existing FP8 B200 entry.

Registers minimaxm2.5-fp4-b200-dynamo-vllm in nvidia-master.yaml with fixed-seq-len scenarios at 1k/1k and 8k/1k, mapping concurrency sweeps to prefill/decode worker layouts (TP4, TP4+EP, dep2/dep4/dep8, multi-decode workers, etc.) via CONFIG_FILE paths under recipes/vllm/minimax-m2.5-b200-fp4/.
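
As a rough illustration of how a scenario and worker layout could resolve to a CONFIG_FILE path, here is a minimal Python sketch. It is not code from this PR. The `config_file` helper and the `SCENARIO_DIRS` table are hypothetical, and reading "1k" as 1024 tokens and "8k" as 8192 tokens is an assumption. Only the recipes/vllm/minimax-m2.5-b200-fp4/ prefix, the 1k1k and 8k1k variants, and the tp4-1p1d.yaml file name come from this page.

```python
from pathlib import PurePosixPath

# Prefix quoted in the PR description; everything else here is illustrative.
RECIPE_ROOT = PurePosixPath("recipes/vllm/minimax-m2.5-b200-fp4")

# The two fixed-seq-len scenarios from the PR, keyed by (input, output) length.
# Assumption: "1k" means 1024 tokens and "8k" means 8192 tokens.
SCENARIO_DIRS = {(1024, 1024): "1k1k", (8192, 1024): "8k1k"}


def config_file(isl: int, osl: int, layout: str) -> str:
    """Build a CONFIG_FILE path for one scenario and worker layout (hypothetical helper)."""
    try:
        scenario = SCENARIO_DIRS[(isl, osl)]
    except KeyError:
        raise ValueError(f"no recipe directory for isl={isl}, osl={osl}") from None
    return str(RECIPE_ROOT / scenario / f"{layout}.yaml")


print(config_file(1024, 1024, "tp4-1p1d"))
```

Keeping the scenario-to-directory table explicit means an unsupported sequence length fails loudly instead of producing a path that does not exist.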

Introduces the corresponding srt-slurm recipe tree under benchmarks/multi_node/srt-slurm-recipes/vllm/minimax-m2.5-b200-fp4/ (1k1k and 8k1k variants) and wires the B200 DGXC launcher to clone srt-slurm and copy those recipes when running FP4 minimax. Also aligns FP4 SRT_SLURM_MODEL_PREFIX to minimax-m2.5-nvfp4 and records the change in perf-changelog.yaml.
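
The copy step described above can be pictured roughly as follows. The real launcher is a shell script, so this Python version is only a sketch under stated assumptions: `copy_recipes` is a hypothetical helper, the clone step is omitted, and the destination layout (`recipes/` inside the srt-slurm checkout) is inferred from the CONFIG_FILE prefix rather than confirmed by the PR.

```python
import shutil
from pathlib import Path


def copy_recipes(repo_root: Path, srt_slurm_dir: Path) -> Path:
    """Copy the FP4 MiniMax recipe tree into an srt-slurm checkout (hypothetical helper)."""
    rel = Path("vllm") / "minimax-m2.5-b200-fp4"
    src = repo_root / "benchmarks" / "multi_node" / "srt-slurm-recipes" / rel
    # Assumption: recipes live under <srt-slurm>/recipes/, matching the CONFIG_FILE prefix.
    dst = srt_slurm_dir / "recipes" / rel
    # dirs_exist_ok lets the copy merge into a directory that already exists (Python 3.8+).
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Calling `copy_recipes(Path("."), Path("srt-slurm"))` from the repository root would place the 1k1k and 8k1k variants under the cloned checkout, assuming both directories exist.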

Reviewed by Cursor Bugbot for commit ae62e58. Bugbot is set up for automated code reviews on this repo.

github-actions · 2026-06-02T18:59:01Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook.

If they are not, please create a PR first before we can merge your single node PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. Failures are often just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-06-02T19:08:56Z

+    - minimaxm2.5-fp4-b200-dynamo-vllm
+  description:
+    - "Add MiniMax-M2.5 NVFP4 B200 disaggregated multinode vLLM benchmarks via Dynamo"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/TBD


🟡 The new perf-changelog entry has pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/TBD (line 3403) instead of the actual PR number (1643). After merge, this link will 404 permanently, while every other entry in the file points to a real PR. Please update TBD → 1643 before merging so the changelog stays traceable.

Extended reasoning...

What the bug is

In the new entry added to perf-changelog.yaml:

    - config-keys:
        - minimaxm2.5-fp4-b200-dynamo-vllm
      description:
        - "Add MiniMax-M2.5 NVFP4 B200 disaggregated multinode vLLM benchmarks via Dynamo"
      pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/TBD

the pr-link value still contains the literal string TBD instead of the actual PR number. The placeholder was clearly added when the PR was drafted and never updated before requesting review.

Why this matters

Every other entry in perf-changelog.yaml resolves to a real PR. For example, immediately above this new block (lines 3373, 3379, 3385, 3391, 3397) the entries use real numbers like pull/1626, pull/1627, pull/1630, pull/1631, pull/1588. The whole point of the pr-link field is to let someone reading the changelog jump straight to the PR that introduced a given config key. With TBD baked in, the link https://github.com/SemiAnalysisAI/InferenceX/pull/TBD will permanently 404: GitHub does not interpret TBD as anything and there is no branch named TBD.

Step-by-step proof

1. Open the diff for perf-changelog.yaml: only one block was added (the minimaxm2.5-fp4-b200-dynamo-vllm entry).
2. The last line of that block is pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/TBD.
3. This PR's own number is #1643 (visible in the PR metadata).
4. Click https://github.com/SemiAnalysisAI/InferenceX/pull/TBD after merge → GitHub returns 404 since TBD is not a valid PR id.
5. Compare to the immediately preceding entry (pull/1627 for the DSv4-Pro FP4 MI355X ATOM MTP3 changelog row): that link resolves correctly.

Impact

No runtime impact: the file is metadata for the changelog, not consumed by the benchmark runner. The only damage is loss of traceability: anyone trying to figure out which PR introduced the minimaxm2.5-fp4-b200-dynamo-vllm key from the changelog will hit a dead link and have to fall back to git log / git blame.

How to fix

Replace pull/TBD with pull/1643 on line 3403 before merging:

      pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1643
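
A check along these lines could catch such a placeholder before merge. This is a minimal sketch and not part of the repository: the function name `find_placeholder_pr_links` and the line-based scan are assumptions, and it relies only on the `pr-link:` field format quoted in the review above.

```python
import re

# Matches a changelog line such as
#   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1643
# and captures whatever follows "/pull/".
PR_LINK = re.compile(r"^\s*pr-link:\s*https://github\.com/[^/\s]+/[^/\s]+/pull/(\S+)\s*$")


def find_placeholder_pr_links(text: str) -> list[tuple[int, str]]:
    """Return (line_number, suffix) for every pr-link whose suffix is not numeric."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PR_LINK.match(line)
        if match and not match.group(1).isdigit():
            bad.append((lineno, match.group(1)))
    return bad


# The entry quoted in the review, with the placeholder still in place.
sample = """\
- config-keys:
    - minimaxm2.5-fp4-b200-dynamo-vllm
  description:
    - "Add MiniMax-M2.5 NVFP4 B200 disaggregated multinode vLLM benchmarks via Dynamo"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/TBD
"""

print(find_placeholder_pr_links(sample))
```

Scanning line by line avoids a YAML dependency and reports the line number directly, which is what a reviewer needs to locate the entry.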

github-actions · 2026-06-02T19:17:35Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26842048264
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26842048264

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

There are 2 total unresolved issues (including 1 from previous review).

Reviewed by Cursor Bugbot for commit 00ef06a.

github-actions · 2026-06-03T01:52:05Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26842607912
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26842607912

functionstackx · 2026-06-03T03:21:48Z

/reuse-sweep-run

github-actions · 2026-06-03T03:24:07Z

see unofficial run visualizer at https://inferencex.semianalysis.com/inference?unofficialRun=26861732820
see unofficial run visualizer at https://inferencex.semianalysis.com/evaluation?unofficialRun=26861732820

Add MiniMax-M2.5 FP4 B200 Dynamo vLLM recipes

300adb4

jasonlizhengjian requested a review from a team June 2, 2026 18:58

jasonlizhengjian requested review from jgangani and kedarpotdar-nv as code owners June 2, 2026 18:58

github-project-automation Bot added this to InferenceMAX Board Jun 2, 2026

Update B200 MiniMax changelog PR link

9dfc6c8

cursor Bot reviewed Jun 2, 2026

Comment thread runners/launch_b200-dgxc-slurm.sh

claude Bot reviewed Jun 2, 2026

jasonlizhengjian added the full-sweep-enabled label Jun 2, 2026

Fix B200 MiniMax Slurm account defaults

00ef06a

cursor Bot reviewed Jun 2, 2026

Comment thread benchmarks/multi_node/srt-slurm-recipes/vllm/minimax-m2.5-b200-fp4/1k1k/tp4-1p1d.yaml

Merge branch 'main' into nv/jasonli/minimaxm2.5-fp4-b200-dynamo-vllm

ae62e58

functionstackx merged commit 0316b19 into main Jun 3, 2026
5 of 6 checks passed

functionstackx deleted the nv/jasonli/minimaxm2.5-fp4-b200-dynamo-vllm branch June 3, 2026 03:23

github-project-automation Bot moved this to Done in InferenceMAX Board Jun 3, 2026

functionstackx merged 4 commits into main from nv/jasonli/minimaxm2.5-fp4-b200-dynamo-vllm
