Add optional cost & token-efficiency columns to the leaderboard #41
Open
lakshvantb wants to merge 1 commit into
Adds a "Show Cost & Tokens" toggle that appends two columns, Output Tokens (avg output incl. reasoning, per question) and Cost / Question ($), to the existing table for all models, plus an "Only Models With Cost Data" filter.

- Cost data is published per-date as an optional public/cost_<date>.csv and merged into each row, so the columns sort via the existing SortTable logic.
- Coverage is intentionally partial (top models): models without an entry render a muted "—" (tooltip explains coverage) and, being null, sort to the bottom.
- The cost dataset is a no-op when absent, so other dates are unaffected.
- cost_2026_01_08.csv covers 14 top models; values = billed output_tokens + reconstructed input_tokens (tiktoken o200k / count_tokens API / provider tokenizers) x per-model official API prices.

Tested: production build compiles; a data-layer test against the real CSVs verifies merge, missing -> "—", the cost-only filter (14 rows), and null-to-bottom sorting (cheapest first = deepseek-v4-pro).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
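The merge and null-handling behaviour described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the `CostEntry` and `Row` types, the helper names (`parseCostCsv`, `mergeCosts`, `sortNullsLast`, `formatCell`), and the sample models and values are all hypothetical, and the real sorting lives in the existing SortTable logic, which is not shown here.

```typescript
// Hypothetical types and helpers; the PR's real implementation is not shown in this page.
interface CostEntry {
  avgInputTokens: number;
  avgOutputTokens: number;
  costPerQuestion: number;
}

interface Row {
  model: string;
  outputTokens: number | null;
  costPerQuestion: number | null;
}

// Parse a cost CSV with header: model,avg_input_tokens,avg_output_tokens,cost_per_question.
// Assumes no quoted fields or embedded commas.
function parseCostCsv(text: string): Map<string, CostEntry> {
  const costs = new Map<string, CostEntry>();
  const lines = text.trim().split(/\r?\n/).slice(1); // drop the header row
  for (const line of lines) {
    if (line.trim() === "") continue;
    const [model, inTok, outTok, cost] = line.split(",").map((s) => s.trim());
    costs.set(model, {
      avgInputTokens: Number(inTok),
      avgOutputTokens: Number(outTok),
      costPerQuestion: Number(cost),
    });
  }
  return costs;
}

// Merge cost data into rows by model id. Passing null (no CSV for this date)
// leaves every cost field null, so the feature is a no-op for that date.
function mergeCosts(models: string[], costs: Map<string, CostEntry> | null): Row[] {
  return models.map((model) => {
    const entry = costs ? costs.get(model) : undefined;
    return {
      model,
      outputTokens: entry ? entry.avgOutputTokens : null,
      costPerQuestion: entry ? entry.costPerQuestion : null,
    };
  });
}

// Sort by a numeric column; null values go last in both directions.
function sortNullsLast(
  rows: Row[],
  key: "outputTokens" | "costPerQuestion",
  ascending: boolean
): Row[] {
  return [...rows].sort((a, b) => {
    const x = a[key];
    const y = b[key];
    if (x === null && y === null) return 0;
    if (x === null) return 1;
    if (y === null) return -1;
    return ascending ? x - y : y - x;
  });
}

// Missing values render as a dash placeholder.
const formatCell = (v: number | null): string => (v === null ? "—" : String(v));

// Illustrative data only; not taken from cost_2026_01_08.csv.
const csv =
  "model,avg_input_tokens,avg_output_tokens,cost_per_question\n" +
  "model-a,1000,2000,0.05\n" +
  "model-b,1200,800,0.02\n";
const rows = mergeCosts(["model-a", "model-b", "model-c"], parseCostCsv(csv));
const asc = sortNullsLast(rows, "costPerQuestion", true).map((r) => r.model);
const desc = sortNullsLast(rows, "costPerQuestion", false).map((r) => r.model);
console.log(asc, desc);
```

In this sketch `model-c` has no CSV entry, so it stays last whether the cost column is sorted ascending or descending, which is the "null sorts to the bottom" behaviour the commit message relies on.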
What
Adds an optional cost / token-efficiency view to the leaderboard table:
`?cost=true`, `?costonly=true`) and reset by Clear Filters, matching the existing toggle pattern (`showProvider`, `showReasoners`, …).
How partial coverage is handled (top-models-only)
Cost/token metrics are published only for the top set of models. Missing data is presented deliberately, so it does not look like breakage:
`null`, so they sort to the bottom regardless of direction (existing `SortTable` null-handling).
Data
`public/cost_<date>.csv` (`model, avg_input_tokens, avg_output_tokens, cost_per_question`), merged into each row by model id so columns sort with the existing machinery.
`cost_2026_01_08.csv` covers 14 top models. Values = billed `output_tokens` + reconstructed `input_tokens` (tiktoken `o200k_base` / Gemini `count_tokens` API / provider tokenizers for Qwen·DeepSeek·Kimi) × per-model official API prices.
Testing
`npm run build` compiles cleanly (no new warnings; feature present in the bundle).
`—` (null), cost-only filter (14 rows), and null-to-bottom sorting (cheapest first = `deepseek-v4-pro`, $0.029/q).
Notes / follow-ups
`cost_<date>.csv` coverage as more models are validated.
🤖 Generated with Claude Code
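For readers checking the `cost_per_question` column, one plausible reading of the formula in the Data section is sketched below: average input and output tokens are each multiplied by the model's price and summed. The PR does not state the price unit or whether input and output are priced separately, so the `Pricing` shape (USD per million tokens, separate input and output rates), the function name, and the numbers are assumptions for illustration only.

```typescript
// Assumption: prices are quoted in USD per one million tokens,
// with separate rates for input and output tokens.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

// Hypothetical helper: average cost of one question for one model.
function costPerQuestion(
  avgInputTokens: number,
  avgOutputTokens: number,
  price: Pricing
): number {
  const inputCost = avgInputTokens * price.inputPerMTok;
  const outputCost = avgOutputTokens * price.outputPerMTok;
  return (inputCost + outputCost) / 1e6;
}

// Illustrative numbers only; not taken from cost_2026_01_08.csv.
const exampleCost = costPerQuestion(2000, 10000, {
  inputPerMTok: 1.0,
  outputPerMTok: 4.0,
});
console.log(exampleCost.toFixed(3));
```

With these made-up inputs, output tokens dominate the total, which is why the leaderboard surfaces Output Tokens next to Cost / Question.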