Add debate highlights analysis: vote flips, rhetoric, and unexpected patterns by emregucerr · Pull Request #6 · emregucerr/ai-intelligence-squared

emregucerr · 2026-04-09T07:17:18Z

Summary

Comprehensive analysis of the 45 AI² benchmark debates, surfacing the most interesting and unexpected findings from the data: dramatic vote reversals, notable rhetorical strategies, self-judging bias patterns, and structural anomalies.

Key Highlights

Biggest Comeback: Debate 013 — Claude Opus 4.6 reversed a 1-8 deficit to win 8-0 on "space colonization over climate change," deploying extinction probability calculus and diminishing-returns arguments.
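One way to quantify a reversal like this is the change in vote margin between the opening and closing polls. The sketch below is only an illustration: the `margin` and `swing` helpers are hypothetical names, not code from the repository, and it assumes each tally is a pair of (votes for the eventual winner, votes for the opponent). The source does not say why the two tallies have different totals (9 votes, then 8).

```python
def margin(votes_for: int, votes_against: int) -> int:
    """Vote margin from the eventual winner's point of view."""
    return votes_for - votes_against


def swing(before: tuple[int, int], after: tuple[int, int]) -> int:
    """Change in margin between the opening and closing tallies."""
    return margin(*after) - margin(*before)


# Tallies quoted in the highlight above: 1-8 before, 8-0 after.
before = (1, 8)
after = (8, 0)
print(swing(before, after))  # 15
```

Under this definition, a move from 1-8 to 8-0 is a 15-point margin swing.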

GPT-5.4 High's 100% Self-Opposition: The model voted against its own debating side in all 9 self-judging instances. The cause is that its debater returned empty responses in nearly all debates, and its judge component honestly acknowledged the failure.

Only Perfect 10-0 Vote: Debate 031 — Gemini 3 Pro achieved unanimous support against GPT-5.4 High with lines like "The 'human in the loop' is rapidly becoming the 'human observing the loop,' and soon, the 'human outside the loop.'"

Self-Judging Bias Spectrum: Ranges from Grok Multi-Agent (89% loyal) to Claude Opus Thinking (89% abstainer) to GPT-5.4 High (100% self-critic).
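The percentages in this spectrum can be read as shares of a model's self-judging votes. The following is a minimal sketch of that calculation, not the analysis code from this PR: the `self_vote_profile` function and the `"own"` / `"opponent"` / `"abstain"` labels are hypothetical, and the input list is made up for illustration. It assumes 9 self-judging instances per model, the figure given above for GPT-5.4 High.

```python
from collections import Counter

LABELS = ("own", "opponent", "abstain")


def self_vote_profile(self_votes: list[str]) -> dict[str, int]:
    """Return each self-judging outcome as a whole-number percentage."""
    if not self_votes:
        raise ValueError("need at least one self-judging vote")
    counts = Counter(self_votes)
    total = len(self_votes)
    return {label: round(100 * counts[label] / total) for label in LABELS}


# Illustrative input only: 8 votes for the model's own side, 1 against.
example = ["own"] * 8 + ["opponent"]
print(self_vote_profile(example))
```

Under the 9-instance assumption, 8 of 9 votes rounds to 89% and 9 of 9 is 100%, which matches the shape of the figures quoted above.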

The Art of the Concession: Claude models developed a distinctive pattern of strategic concession followed by reframing that judges consistently cited as credibility-building.

Topic Asymmetry: Motions with critical or negative framings (AI jobs, social media harm, open-source AI) were won by the FOR side 67% of the time, versus 33% for aspirational motions (space colonization, UBI).
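A FOR win rate per framing group can be computed by counting FOR wins within each group. The sketch below assumes a hypothetical record shape of `(framing, winner)` pairs. The `for_win_rates` function, the group labels, and the sample records are illustrative and are not taken from the benchmark results.

```python
from collections import defaultdict


def for_win_rates(debates: list[tuple[str, str]]) -> dict[str, float]:
    """Map each framing group to the fraction of its debates won by FOR."""
    wins: dict[str, int] = defaultdict(int)
    totals: dict[str, int] = defaultdict(int)
    for framing, winner in debates:
        totals[framing] += 1
        if winner == "FOR":
            wins[framing] += 1
    return {framing: wins[framing] / totals[framing] for framing in totals}


# Made-up sample records, chosen only to show the calculation.
sample = [
    ("critical", "FOR"),
    ("critical", "FOR"),
    ("critical", "AGAINST"),
    ("aspirational", "FOR"),
    ("aspirational", "AGAINST"),
    ("aspirational", "AGAINST"),
]
for framing, rate in for_win_rates(sample).items():
    print(f"{framing}: {rate:.0%}")
```

How motions are assigned to a framing group is a judgment call that this sketch leaves to the caller.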

Files Changed

benchmark/results/DEBATE_HIGHLIGHTS.md — New analysis document covering vote flips, rhetorical patterns, self-judging bias, topic analysis, and notable transcript excerpts

…patterns

Comprehensive analysis of the 45 AI² benchmark debates covering:

- The biggest comeback (1-8 to 8-0 in debate 013)
- GPT-5.4 High's 100% self-opposition rate explained by empty responses
- The only perfect 10-0 vote (debate 031)
- Self-judging bias patterns across all 10 models
- Topic asymmetry analysis (FOR vs AGAINST win rates)
- 205 individual vote flip distribution
- Notable rhetorical moves and cross-examination moments
- The 'Art of the Concession' as Claude's rhetorical fingerprint

Co-authored-by: Emre Gucer <emregucerr@users.noreply.github.com>

vercel · 2026-04-09T07:17:26Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
ai-squared	Ready	Preview, Comment	Apr 9, 2026 7:17am

emregucerr wants to merge 1 commit into main from cursor/debate-highlights-adb0
