
Update new fixed-AR-MTP CI workflow for kimik2.5_int4, kimik2.5_fp4, … #1633

Draft
haic0 wants to merge 2 commits into main from haichen/Fixed-AR-MTP-benchmark


haic0 (Collaborator) commented Jun 1, 2026

[Summary] Implemented and validated CI support for the new Eagle3 and fixed-AR MTP benchmark paths.

For amd-master.yaml, added matrix coverage for:

- kimik2.5-int4-mi355x-vllm-eagle3
- kimik2.5-mxfp4-mi355x-vllm-eagle3
- minimaxm2.5-fp8-mi355x-vllm-eagle3
- kimik2.5-int4-mi355x-vllm-fixed-ar-mtp
- kimik2.5-fp4-mi355x-vllm-fixed-ar-mtp
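For reviewers unfamiliar with the key layout, here is a minimal sketch of how the matrix keys above could be split into their parts. It assumes the pattern `model-precision-hardware-framework-scenario`, which is only inferred from the five names listed and is not confirmed by the repository. `MatrixKey` and `parse_matrix_key` are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MatrixKey:
    """Hypothetical container for the parts of one matrix key."""
    model: str
    precision: str
    hardware: str
    framework: str
    scenario: str


def parse_matrix_key(key: str) -> MatrixKey:
    """Split a key such as 'kimik2.5-int4-mi355x-vllm-fixed-ar-mtp'.

    Assumes the first four hyphen-separated fields are model, precision,
    hardware and framework, and that everything after them is the scenario.
    """
    # Split at most four times so a hyphenated scenario such as
    # "fixed-ar-mtp" stays in one piece.
    parts = key.split("-", 4)
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"unexpected matrix key format: {key!r}")
    return MatrixKey(*parts)


if __name__ == "__main__":
    for name in (
        "kimik2.5-int4-mi355x-vllm-eagle3",
        "kimik2.5-fp4-mi355x-vllm-fixed-ar-mtp",
    ):
        print(parse_matrix_key(name))
```

Limiting the split to four separators is the one design choice that matters here: a plain `split("-")` would break `fixed-ar-mtp` into three fields.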

Added benchmark scripts for:

- Kimi INT4 Eagle3
- Kimi FP4/MXFP4 Eagle3
- MiniMax FP8 Eagle3
- Kimi INT4 fixed-AR MTP
- Kimi FP4 fixed-AR MTP
- MiniMax FP8 Eagle3 fixed-AR support

Updated CI workflow plumbing:

- Added fixed-ar-mtp scenario support in matrix validation and generation.
- Updated e2e-tests.yml to route fixed-AR MTP jobs through benchmark-tmpl.yml.
- Updated benchmark-tmpl.yml to pass draft model, speculative token count, rejection method, and synthetic acceptance rates into benchmark scripts.
- Updated launch_mi355x-amds.sh to resolve Eagle3 and fixed-AR MTP script names correctly.
- Added mtp-fixed-ar-amd.yml for LiveCodeBench-based synthetic acceptance-rate generation.
- Installed and ran actionlint; fixed workflow lint issues.
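The validation and parameter-passing steps above can be pictured with a rough sketch like the following. It is not the code in this PR: the field names, the environment-variable names, the allowed-scenario set, and the assumption that acceptance rates lie in [0, 1] are all hypothetical, chosen only to mirror the four inputs the description says benchmark-tmpl.yml now forwards.

```python
# Hypothetical scenario set; the real validator may well accept more values.
ALLOWED_SCENARIOS = {"eagle3", "fixed-ar-mtp"}

# Hypothetical field names mirroring the four inputs named in the description.
REQUIRED_FIXED_AR_FIELDS = (
    "draft_model",
    "num_speculative_tokens",
    "rejection_method",
    "synthetic_acceptance_rates",
)


def validate_entry(entry: dict) -> None:
    """Raise ValueError if a matrix entry is not usable for its scenario."""
    scenario = entry.get("scenario")
    if scenario not in ALLOWED_SCENARIOS:
        raise ValueError(f"unknown scenario: {scenario!r}")
    if scenario != "fixed-ar-mtp":
        return
    missing = [f for f in REQUIRED_FIXED_AR_FIELDS if f not in entry]
    if missing:
        raise ValueError(f"fixed-ar-mtp entry is missing: {missing}")
    rates = entry["synthetic_acceptance_rates"]
    # Assumption: each rate is a probability, so it must lie in [0, 1].
    if not rates or any(not 0.0 <= r <= 1.0 for r in rates):
        raise ValueError("acceptance rates must be a non-empty list in [0, 1]")


def to_env(entry: dict) -> dict:
    """Flatten a validated fixed-ar-mtp entry into string environment values.

    The variable names are placeholders, not the ones used by the workflow.
    """
    validate_entry(entry)
    return {
        "DRAFT_MODEL": entry["draft_model"],
        "NUM_SPEC_TOKENS": str(entry["num_speculative_tokens"]),
        "REJECTION_METHOD": entry["rejection_method"],
        "SYNTHETIC_ACCEPTANCE_RATES": ",".join(
            str(r) for r in entry["synthetic_acceptance_rates"]
        ),
    }


if __name__ == "__main__":
    example = {
        "scenario": "fixed-ar-mtp",
        "draft_model": "example/draft-model",       # placeholder value
        "num_speculative_tokens": 3,                 # placeholder value
        "rejection_method": "placeholder-method",    # placeholder value
        "synthetic_acceptance_rates": [0.8, 0.6, 0.4],
    }
    print(to_env(example))
```

Validating before flattening keeps malformed matrix entries from reaching a benchmark script as empty or partial variables.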
Validation completed:

…and minimaxm2.5_fp8 models

Signed-off-by: root <root@gbt350-odcdh5-wbb3.png-odc.dcgpu>

github-actions bot (Contributor) commented Jun 1, 2026

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your single-node PR into the master branch. Let's ensure that the documentation is first-class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get PR approval from their respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Drop local benchmark outputs and logs from version control so the PR only contains CI workflow and benchmark script changes.

Co-authored-by: Cursor <cursoragent@cursor.com>