fix benchmark template #552

Open
Dogacel wants to merge 1 commit into sgl-project:main from Dogacel:fix-benchmark-template
Dogacel commented Apr 30, 2026

Motivation

The chat templates of some models (gpt-oss, qwen3...) are not rendered correctly in SGLang, which causes
accuracy to be reported incorrectly.

This patch checks whether SGLang has a template for the given model. If it does not, the patch applies the chat formatting manually using the HF tokenizer and sends the templated text directly to SGLang.
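A minimal sketch of that fallback, not the code in this patch. It assumes a tokenizer object exposing the Hugging Face `apply_chat_template` method; `build_request`, the `server_has_template` flag, and the payload keys `messages` and `text` are hypothetical names used only for illustration.

```python
def build_request(messages, tokenizer, server_has_template):
    """Return a request payload for one benchmark prompt.

    `server_has_template` is a hypothetical flag standing in for whatever
    check decides that SGLang recognized the model's chat template.
    """
    if server_has_template:
        # Pass the raw messages through and let the server apply its template.
        return {"messages": messages}

    # Fallback: render the prompt locally with the Hugging Face tokenizer.
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,              # return a string instead of token ids
        add_generation_prompt=True,  # end the prompt at the assistant turn
    )
    # Send the already-templated text so the server does not template it again.
    return {"text": text}
```

With a real model, the tokenizer would typically come from `transformers.AutoTokenizer.from_pretrained(...)`. Rendering with `add_generation_prompt=True` matters for benchmarks because the model should start generating at the assistant turn rather than continuing the user message.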

Modifications

  1. Auto-detect the chat template format and override it if SGLang fails to auto-detect it.
  2. Fix the GSM8K benchmark not applying the chat format correctly.

Also adds 3 new benchmarks:

Related Issues

Fixes #551

Accuracy Test

Tested on 3 model families with different template formats: gpt-oss-20b, qwen3 8B, and llama3.1 8B.

Anyone who reported a model's accuracy using the previous benchmark should confirm that SGLang was able to auto-detect the template. Moreover, GSM8K results should be re-reported for all models, because not applying the template correctly causes accuracy drops.

Checklist

Dogacel requested a review from FrankLeeeee as a code owner April 30, 2026 04:47
@gemini-code-assist
Contributor

Warning

You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

Development

Successfully merging this pull request may close these issues.

[Bug] Chat Template Not Applied Correctly to Recent Models
