MatchCaster

An AI football commentary engine. Replays real StatsBomb match data with live synthesized audio commentary from two LLM-powered voices (play-by-play + analyst), orchestrated by a tick-driven Director and displayed on an interactive pitch visualizer.

flowchart LR
    A[📦 Match data<br/>StatsBomb JSON] --> B[⏱ Replay Engine<br/>Clock + event emitter]
    B --> C[🧠 Analysis<br/>Momentum, xG, patterns]
    C --> D[🎬 Director<br/>Tick-driven orchestrator]
    D --> E[🗣 Commentary<br/>PBP + Analyst voices]
    E --> F[🔊 TTS<br/>Text → audio]
    B --> G[📡 Live Events]
    F --> H[🖥 Frontend<br/>Pitch + controls + audio]
    G --> H

In plain words: MatchCaster reads a real match file, "replays" the game second by second, prepares commentary slightly ahead of time, converts it to audio, and sends both visuals + sound to the live interface.

What is StatsBomb data? StatsBomb open-data is a free collection of detailed event-level football match records — every pass, shot, tackle, and dribble with pitch coordinates, timestamps, and metadata. MatchCaster uses these JSON files as its replay source.
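As a rough illustration of the "StatsBomb JSON to event records" step, the sketch below parses a couple of event objects into a small dataclass. The `MatchEvent` fields, the helper names, and the sample records are assumptions made for this example and are not taken from the repository. The JSON keys (`period`, `timestamp`, `type.name`, `team.name`, `player.name`, `location`) follow the published open-data event layout.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class MatchEvent:
    """Hypothetical, trimmed-down event record; the real class may differ."""
    period: int
    game_sec: float                          # seconds since the period started
    type_name: str                           # e.g. "Pass", "Shot"
    team: Optional[str]
    player: Optional[str]
    location: Optional[tuple[float, float]]  # pitch coordinates, if present


def parse_timestamp(ts: str) -> float:
    """Convert an 'HH:MM:SS.mmm' timestamp to seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def load_events(raw: list[dict]) -> list[MatchEvent]:
    events = []
    for item in raw:
        loc = item.get("location")
        events.append(MatchEvent(
            period=item["period"],
            game_sec=parse_timestamp(item["timestamp"]),
            type_name=item["type"]["name"],
            team=item.get("team", {}).get("name"),
            player=item.get("player", {}).get("name"),
            location=(loc[0], loc[1]) if loc else None,
        ))
    # Replay needs events in chronological order.
    events.sort(key=lambda e: (e.period, e.game_sec))
    return events


# Made-up sample records, shaped like open-data events.
sample = json.loads("""[
  {"period": 1, "timestamp": "00:00:04.250", "type": {"name": "Pass"},
   "team": {"name": "Home FC"}, "player": {"name": "Player A"},
   "location": [60.0, 40.0]},
  {"period": 1, "timestamp": "00:00:00.000", "type": {"name": "Half Start"},
   "team": {"name": "Home FC"}}
]""")
events = load_events(sample)
```

Events without a player or location (such as "Half Start") simply get `None` in those fields.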

Screenshots

Select a match	Live replay with stats & overlays	Live event feed

Quick Start

1) Prerequisites

Python 3.11+
Node.js 18+
make (usually available on macOS/Linux; on Windows, use WSL)

Optional:

Groq API key (default cloud mode)
Ollama (for fully local mode)

2) Install dependencies

# from repo root
make setup

This will:

Create a Python virtual environment (.venv)
Install Python dependencies from backend/requirements.txt
Install frontend dependencies from frontend/package.json
Download match data from StatsBomb open-data

3) Run

# Cloud mode (default)
export GROQ_API_KEY=your_key_here
make run

# Local mode (offline)
# brew install ollama
# ollama pull gemma2:2b-instruct-q4_K_M
make run-local

Open http://localhost:5173

Commands

Command	Description
`make help`	Show all available commands
`make setup`	Full setup (venv, deps, data)
`make install`	Install dependencies only
`make data`	Download match data
`make run`	Run in cloud mode (Groq, default)
`make run-cloud`	Same as `make run`
`make run-local`	Run in local mode (Ollama)
`make dev-backend`	Run only backend with live reload
`make dev-frontend`	Run only frontend dev server
`make clean`	Remove venv and cleanup
`make stop`	Stop all running processes
`make test`	Run test suite
`make verify`	Verify setup is complete

Legacy start.sh

The Makefile is the recommended interface. start.sh is the original launcher kept for backward compatibility — make run and make run-local call it under the hood with the appropriate mode flag.

./start.sh          # cloud (groq, default)
./start.sh groq     # explicit cloud
./start.sh local    # local ollama

Notes

Backend deps: backend/requirements.txt (Python / pip)
Frontend deps: frontend/package.json (Node / npm)
The Makefile automatically handles virtual environment creation and dependency installation
Run make verify to check if your setup is complete

Using the app

Select a match — the launch screen appears automatically. Pick a match, choose a commentary style, then click Watch Live →.
Controls — the video player bar at the bottom:
- ▶ / ⏸ — play and pause
- −30s −10s +10s +30s — jump backward or forward
- Click the seek bar to jump to any point in the match
- Speed buttons 0.5× 1× 2× 4× 8× — control replay speed
- 🔊 — mute/unmute audio commentary
- ⚙ — open the Overlay Panel (pitch view and settings)
- Change — go back to the match selection screen
Overlay Panel (opened with ⚙):
- Live — real-time event markers and pass trails on the pitch
- Formation — starting lineup with jersey numbers and player names
- Heatmap — territory map for home or away team
- Shots — all shot locations, sized by xG, colored by outcome
- Build-up — directional pass flow arrows by zone
Sidebar tabs:
- Stats — momentum bar, possession, shots, xG, passes, fouls, cards
- Live — key events feed (goals, cards, big chances) or full event log
- Squad — starting lineup with positions and goal contributions

Commentary styles:

Style	Character
🎙 Neutral	Balanced, professional
🔥 Enthusiastic	High energy, emotional
📐 Analytical	Tactical depth, data-driven
🏠 Home Fan	Biased toward the home side
✈️ Away Fan	Biased toward the away side

Architecture

System Overview (non-technical)

Think of MatchCaster like a live TV production team:

Replay Engine = the control room replaying the match timeline
Director = the producer deciding when each commentator speaks and what they see
AI Commentators = the voices (live action narrator + expert analyst)
Speech Engine = turns scripts into spoken audio
Frontend = what the viewer sees and hears in real time

How Commentary Stays in Sync

The app keeps commentary in sync by generating it before it is needed. The system works like a buffer:

 Game timeline (seconds):
  0s        15s       30s       45s       60s       75s
  ├─────────┼─────────┼─────────┼─────────┼─────────┤
  │ Block 1 │ Block 2 │ Block 3 │ Block 4 │         │
  │ ready ✓ │ ready ✓ │ ready ✓ │ generating...     │
  └─────────┴─────────┴─────────┴─────────┴─────────┘
       ↑ playing now                ↑ frontier
       Clock = 10s                  Always stays ahead

Lifecycle of a session:

Loading: The user selects a match. The backend warms up TTS and the LLM, classifies all events, and pre-computes analyst quiet windows (time slots far from goals, shots, and cards). It then pre-generates 4 PBP blocks, each covering 15 game-seconds. The play button appears only once 2 blocks are fully ready (text generated and audio synthesized).
Playing: The match clock advances in real time. Every 50 ms, the Director checks whether any blocks are due. If a block's start time has passed, it dispatches the audio and text to the frontend. Meanwhile, it keeps generating new PBP blocks at the frontier so the buffer does not run dry. When the clock nears an analyst window (60 game-sec lead time), it spawns analyst generation; if the result is ready before the slot, it replaces PBP for that window.
Seek: The user jumps to a new time. The Director increments its epoch (a version counter), clears the buffer, recomputes analyst windows from the new position, and starts generating blocks. Any in-flight work from the old position carries the old epoch and is silently discarded when it completes. The app shows a loading state until 2 blocks are ready, then resumes playback.
Speed change: Handled like a seek. The epoch increments, the buffer clears, analyst windows are recomputed, and blocks regenerate (because block duration changes with speed). After a brief loading state, playback continues at the new pace.
Buffer safety net: If the LLM is too slow and the buffer empties during playback, the clock pauses automatically, a loading state appears, and playback resumes once blocks are ready again.
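The loading, playing, and safety-net steps above can be sketched as a small buffer that is polled on every tick. This is a simplified model, not the project's `TimeBlockQueue`: the `Block` and `BlockBuffer` names are made up for the example, and "buffer empties" is interpreted here as "the block that is due is not ready yet".

```python
from dataclasses import dataclass, field

MIN_READY = 2  # mirrors LOADING_MIN_BLOCKS_READY


@dataclass
class Block:
    start: float          # game-second at which the block becomes due
    ready: bool = False   # True once text and audio both exist


@dataclass
class BlockBuffer:
    blocks: list = field(default_factory=list)   # kept sorted by start
    paused: bool = True                          # begin in the loading state

    def tick(self, clock: float) -> list:
        """Return the blocks to dispatch at this tick (possibly none)."""
        if self.paused:
            if sum(b.ready for b in self.blocks) < MIN_READY:
                return []            # still loading
            self.paused = False      # enough blocks ready: resume
        sent = []
        while self.blocks and self.blocks[0].start <= clock:
            if not self.blocks[0].ready:
                self.paused = True   # safety net: the due block is not ready
                break
            sent.append(self.blocks.pop(0))
        return sent
```

In this model the caller is expected to stop advancing the match clock while `paused` is true and to keep calling `tick` with the frozen clock value until playback resumes.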

The epoch mechanism is what makes seek/speed changes robust. Every block is tagged with the epoch it was born in. On dispatch, stale-epoch blocks are discarded — no race conditions, no ghost commentary from a previous position.
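A minimal sketch of the epoch idea, assuming an asyncio-based director: each generation task remembers the epoch it started in and drops its result if a seek happened in the meantime. The class and method names are illustrative only, `asyncio.sleep` stands in for the LLM and TTS calls, and the staleness check is done on completion rather than on dispatch to keep the example short.

```python
import asyncio


class EpochDirector:
    """Illustrative only; not the project's director.router."""

    def __init__(self):
        self.epoch = 0
        self.buffer = []   # (epoch, start_sec, text) tuples

    def seek(self):
        # A seek or speed change invalidates all in-flight work.
        self.epoch += 1
        self.buffer.clear()

    async def generate(self, start_sec: float, delay: float) -> bool:
        born = self.epoch             # tag the work with its birth epoch
        await asyncio.sleep(delay)    # stand-in for LLM + TTS latency
        if born != self.epoch:
            return False              # stale: silently discard
        self.buffer.append((born, start_sec, f"block@{start_sec}"))
        return True


async def demo():
    director = EpochDirector()
    old = asyncio.create_task(director.generate(0.0, 0.05))
    await asyncio.sleep(0)            # let the old task start in epoch 0
    director.seek()                   # user jumps: epoch becomes 1
    new = asyncio.create_task(director.generate(600.0, 0.01))
    results = await asyncio.gather(old, new)
    return results, director.buffer
```

Running `asyncio.run(demo())` shows the old task finishing after the seek and being dropped, while only the block generated in the current epoch reaches the buffer.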

Technical Flow

flowchart TD
    subgraph Data
        M[StatsBomb Match JSON]
    end

    subgraph Backend
        P[player.loader + player.emitter]
        A[analyser.engine + state + classifier]
        S[director.analyst_scheduler<br/>Pre-computes quiet windows]
        D[director.router<br/>Tick-driven, epoch-based]
        C[commentator.agents<br/>PBP + Analyst]
        L[commentator.llm<br/>Groq or Ollama]
        T[commentator.tts.engine<br/>Piper / macOS say fallback]
        Q[commentator.queue<br/>TimeBlockQueue]
        W[ws.handler]
    end

    subgraph Frontend
        F[React UI<br/>Pitch + overlays + controls]
    end

    M --> P
    P --> A
    P --> S
    S --> D
    A --> D
    P --> D
    D --> C
    C --> L
    L --> C
    C --> T
    T --> Q
    P --> W
    A --> W
    Q --> W
    W --> F

Commentator Roles

Play-by-Play — narrates the action as flowing paragraphs. One block per 15 game-seconds (scales with playback speed). Handles everything: goals, shots, cards, substitutions, quiet build-up. Receives analyst context to weave into narration. Handles the opening scene-setter. Always in sync with the game clock.

Analyst — macro reflection voice. Speaks infrequently (every 5–7 game-minutes) during pre-computed quiet windows — periods far from goals, shots, and cards. Silent for the first 5 minutes (PBP owns the opening). Reads from the AnalysisEngine's statistical snapshot (momentum, possession, xG, pressing patterns) and turns it into a brief tactical observation. Feeds context back to PBP for richer narration.

The analyst never interrupts PBP. During loading, the system scans the full event timeline and pre-schedules analyst slots in safe quiet periods. If the analyst's LLM generation is ready before its slot, it replaces the PBP block for that window (which would be generic filler). If not ready in time, PBP plays normally — no gap, no delay.
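The pre-scheduling of quiet windows could look roughly like the greedy pass below. It is a sketch under stated assumptions: the function names and the 5-second search step are invented for the example, the default values mirror the `ANALYST_*` settings in the Configuration section, and the analyst block's own duration is ignored. When no quiet moment exists inside a window, this version simply moves on, which can stretch the gap beyond the configured maximum.

```python
def is_quiet(t, critical, pre=30.0, post=45.0):
    """True if t is outside [c - pre, c + post] for every critical event c."""
    return all(not (c - pre <= t <= c + post) for c in critical)


def schedule_slots(match_end, critical, first=300.0,
                   min_gap=300.0, max_gap=420.0, step=5.0):
    """Pick analyst slot start times (game-seconds) in quiet periods."""
    slots = []
    earliest = first                   # analyst is silent before this point
    while earliest <= match_end:
        latest = min(earliest + (max_gap - min_gap), match_end)
        chosen = None
        t = earliest
        while t <= latest:             # scan the allowed window
            if is_quiet(t, critical):
                chosen = t
                break
            t += step
        if chosen is None:
            earliest = latest + step   # no quiet moment: skip this window
            continue
        slots.append(chosen)
        earliest = chosen + min_gap    # next slot at least min_gap later
    return slots


# Hypothetical critical events (goals, shots, cards) at 310 s and 700 s.
slots = schedule_slots(match_end=1200.0, critical=[310.0, 700.0])
```

With these inputs the first slot is pushed past the exclusion zone around 310 s, and later slots follow at the minimum gap.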

File Structure

backend/
├── config.py                All tunables
├── main.py                  FastAPI app + HTTP routes
│
├── player/
│   ├── clock.py             Async accelerated match clock (50 ms ticks)
│   ├── loader.py            StatsBomb JSON → MatchEvent dataclasses
│   └── emitter.py           Replay session management + seek support
│
├── analyser/
│   ├── classifier.py        Event priority: critical / notable / routine
│   ├── state.py             SharedMatchState (score, possession, stats)
│   ├── engine.py            Real-time match analysis (momentum, xG, vectors)
│   ├── spatial.py           Coordinate → pitch zone descriptions
│   └── enrichment/
│       ├── match_meta.py    Stadium, date, manager lookup
│       ├── weather.py       Historical weather via Open-Meteo
│       └── team_colors.py   Kit colors for ~40 teams
│
├── director/
│   ├── router.py            Tick-driven orchestrator: epoch-based block
│   │                        generation, dispatch, analyst scheduling
│   └── analyst_scheduler.py Pre-computes quiet windows for analyst slots
│
├── commentator/
│   ├── agents/
│   │   ├── base.py          BaseAgent ABC + prompt assembly
│   │   ├── play_by_play.py  Live action narration (flow-block output)
│   │   ├── analyst.py       Expert macro commentary (replaces tactical+stats)
│   │   └── prompts.py       System prompts + user prompt builders
│   ├── llm/
│   │   ├── __init__.py      Backend singleton (get_backend / init_backend)
│   │   ├── backend.py       LLMBackend ABC
│   │   ├── groq.py          Groq cloud backend (OpenAI-compatible SSE)
│   │   └── ollama.py        Ollama local backend
│   ├── tts/
│   │   ├── engine.py        Piper TTS wrapper → WAV bytes (+ macOS say fallback)
│   │   └── voices.py        Agent → voice model mapping
│   └── queue.py             AudioQueue + TimeBlockQueue (epoch-tagged dispatch)
│
└── ws/
    └── handler.py           WebSocket session: events, audio, state, seek

Configuration

All tunables live in backend/config.py:

Key	Default	Description
`DEFAULT_SPEED_MULTIPLIER`	`1.0`	Replay speed on startup
`LLM_BACKEND`	`groq`	`"groq"` (cloud) or `"local"` (Ollama)
`GROQ_MODEL`	`llama-3.1-8b-instant`	Groq model
`OLLAMA_MODEL`	`gemma2:2b-instruct-q4_K_M`	Ollama model (local mode only)
`OLLAMA_TIMEOUT_SEC`	`90.0`	Per-call timeout for Ollama streaming
`MAX_OUTPUT_TOKENS`	`50`	Hard token cap per commentary line
`PBP_BLOCK_DURATION_GAME_SEC`	`15.0`	Game-seconds per commentary block (scales with speed)
`PBP_BLOCKS_AHEAD`	`4`	Buffer depth: blocks kept pre-generated ahead
`LOADING_MIN_BLOCKS_READY`	`2`	Blocks required before playback can start
`ANALYST_MIN_GAP_GAME_SEC`	`300.0`	Minimum gap between analyst windows
`ANALYST_MAX_GAP_GAME_SEC`	`420.0`	Maximum gap between analyst windows
`ANALYST_BLOCK_FIRST_SEC`	`300.0`	Analyst silent for first 5 game-minutes
`ANALYST_EXCLUSION_PRE`	`30.0`	No analyst within 30s before critical events
`ANALYST_EXCLUSION_POST`	`45.0`	No analyst within 45s after critical events
`ANALYST_LEAD_TIME_GAME_SEC`	`60.0`	Start analyst generation 60 game-sec before slot
`MAX_EVENTS_PER_BATCH`	`8`	Max events sent to LLM per block
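To show how two of these tunables might interact, the sketch below groups events into 15-game-second blocks and caps each block at 8 events, keeping the highest-priority ones. It assumes 1× speed, a tuple layout of `(game_sec, priority, description)`, and a ranking of the classifier's three priority levels; the actual batching logic in the backend may differ.

```python
from collections import defaultdict

PBP_BLOCK_DURATION_GAME_SEC = 15.0
MAX_EVENTS_PER_BATCH = 8
# Assumed ordering of the classifier's priority levels.
PRIORITY_RANK = {"critical": 0, "notable": 1, "routine": 2}


def batch_events(events):
    """Group (game_sec, priority, description) tuples into capped blocks."""
    blocks = defaultdict(list)
    for ev in events:
        index = int(ev[0] // PBP_BLOCK_DURATION_GAME_SEC)
        blocks[index].append(ev)

    batches = {}
    for index, evs in blocks.items():
        # Keep the most important events (stable sort preserves time order
        # within a priority level), then restore chronological order.
        kept = sorted(evs, key=lambda e: PRIORITY_RANK[e[1]])
        kept = kept[:MAX_EVENTS_PER_BATCH]
        batches[index] = sorted(kept, key=lambda e: e[0])
    return batches
```

For example, a block holding ten routine passes and one goal would keep the goal and the first seven passes, so the prompt stays small without losing the critical event.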

Graceful degradation

Failure	Fallback
LLM unavailable / slow	Template commentary ("Shot — great save!")
LLM too slow (buffer empties)	Clock pauses, loading state, auto-resumes when buffer refills
Analyst LLM too slow	PBP plays normally for that window — analyst simply absent
Piper TTS not installed	macOS `say` built-in voices
Piper TTS crashes	macOS `say` built-in voices
Audio queue overflow	Oldest items dropped
WebSocket disconnect	Auto-reconnect after 2 s
Unknown match ID	No metadata shown, colors use defaults
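The first row of the table (falling back to template commentary when the LLM is unavailable or slow) can be sketched with `asyncio.wait_for`. The template strings, the function names, and the 2-second default timeout are placeholders chosen for this example, and `llm_call` stands for any async function that returns a commentary line.

```python
import asyncio

# Hypothetical canned lines, keyed by event type.
TEMPLATES = {
    "Shot": "A shot comes in!",
    "Pass": "The ball is moved on.",
}


def template_line(event_type: str) -> str:
    return TEMPLATES.get(event_type, "Play continues.")


async def commentary_for(event_type, llm_call, timeout_sec=2.0):
    """Try the LLM first; fall back to a template on timeout or failure."""
    try:
        return await asyncio.wait_for(llm_call(event_type), timeout=timeout_sec)
    except (asyncio.TimeoutError, ConnectionError):
        return template_line(event_type)


async def slow_llm(event_type):
    await asyncio.sleep(1.0)              # simulates a stalled backend
    return "never used"


async def broken_llm(event_type):
    raise ConnectionError("backend unreachable")


async def fast_llm(event_type):
    return f"Live call on a {event_type.lower()}."
```

A call such as `asyncio.run(commentary_for("Shot", slow_llm, timeout_sec=0.01))` returns the template line, while a responsive backend's text is passed through unchanged.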
