hippograndet/MatchCaster


MatchCaster

An AI football commentary engine. Replays real StatsBomb match data with live synthesized audio commentary from two LLM-powered voices (play-by-play + analyst), orchestrated by a tick-driven Director and displayed on an interactive pitch visualizer.

flowchart LR
    A[📦 Match data<br/>StatsBomb JSON] --> B[⏱ Replay Engine<br/>Clock + event emitter]
    B --> C[🧠 Analysis<br/>Momentum, xG, patterns]
    C --> D[🎬 Director<br/>Tick-driven orchestrator]
    D --> E[🗣 Commentary<br/>PBP + Analyst voices]
    E --> F[🔊 TTS<br/>Text → audio]
    B --> G[📡 Live Events]
    F --> H[🖥 Frontend<br/>Pitch + controls + audio]
    G --> H

In plain words: MatchCaster reads a real match file, "replays" the game second by second, prepares commentary slightly ahead of time, converts it to audio, and sends both visuals + sound to the live interface.

What is StatsBomb data? StatsBomb open-data is a free collection of detailed event-level football match records — every pass, shot, tackle, and dribble with pitch coordinates, timestamps, and metadata. MatchCaster uses these JSON files as its replay source.
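
For orientation, here is a minimal sketch of reading StatsBomb-style events in Python. The inline sample mimics the shape of an open-data events file (a JSON array of event objects with `minute`, `second`, `type`, `team`, `player`, and `location`). The `MatchEvent` fields and the `parse_events` helper are illustrative assumptions, not MatchCaster's actual loader, and the team and player names are made up.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Two made-up events shaped like entries in a StatsBomb open-data events file.
SAMPLE = """
[
  {"minute": 1, "second": 3, "type": {"name": "Shot"},
   "team": {"name": "Away FC"}, "player": {"name": "B. Sample"},
   "location": [108.5, 36.2], "shot": {"statsbomb_xg": 0.12}},
  {"minute": 0, "second": 12, "type": {"name": "Pass"},
   "team": {"name": "Home FC"}, "player": {"name": "A. Example"},
   "location": [60.0, 40.0]}
]
"""


@dataclass
class MatchEvent:  # hypothetical shape, for illustration only
    game_sec: float
    kind: str
    team: str
    player: Optional[str]
    x: Optional[float]
    y: Optional[float]
    xg: Optional[float] = None


def parse_events(raw: str) -> list[MatchEvent]:
    """Turn a JSON array of event objects into time-ordered MatchEvent records."""
    events = []
    for e in json.loads(raw):
        loc = e.get("location") or [None, None]
        events.append(MatchEvent(
            game_sec=e["minute"] * 60 + e["second"],
            kind=e["type"]["name"],
            team=e["team"]["name"],
            player=e.get("player", {}).get("name"),
            x=loc[0],
            y=loc[1],
            xg=e.get("shot", {}).get("statsbomb_xg"),  # only shots carry xG
        ))
    return sorted(events, key=lambda ev: ev.game_sec)
```

Sorting by game time is what lets a replay clock walk the list and emit each event when its timestamp comes due.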

Screenshots

| Select a match | Live replay with stats & overlays | Live event feed |
| --- | --- | --- |
| [Image: Match selection] | [Image: Stats and overlay] | [Image: Live events] |

Quick Start

1) Prerequisites

  • Python 3.11+
  • Node.js 18+
  • make (pre-installed on macOS/Linux, or install via WSL on Windows)

Optional:

  • Groq API key (default cloud mode)
  • Ollama (for fully local mode)

2) Install dependencies

# from repo root
make setup

This will:

  • Create a Python virtual environment (.venv)
  • Install Python dependencies from backend/requirements.txt
  • Install frontend dependencies from frontend/package.json
  • Download match data from StatsBomb open-data

3) Run

# Cloud mode (default)
export GROQ_API_KEY=your_key_here
make run

# Local mode (offline)
# brew install ollama
# ollama pull gemma2:2b-instruct-q4_K_M
make run-local

Open http://localhost:5173


Commands

| Command | Description |
| --- | --- |
| make help | Show all available commands |
| make setup | Full setup (venv, deps, data) |
| make install | Install dependencies only |
| make data | Download match data |
| make run | Run in cloud mode (Groq, default) |
| make run-cloud | Same as make run |
| make run-local | Run in local mode (Ollama) |
| make dev-backend | Run only backend with live reload |
| make dev-frontend | Run only frontend dev server |
| make clean | Remove venv and cleanup |
| make stop | Stop all running processes |
| make test | Run test suite |
| make verify | Verify setup is complete |

Legacy start.sh

The Makefile is the recommended interface. start.sh is the original launcher kept for backward compatibility — make run and make run-local call it under the hood with the appropriate mode flag.

./start.sh          # cloud (groq, default)
./start.sh groq     # explicit cloud
./start.sh local    # local ollama

Notes

  • Backend deps: backend/requirements.txt (Python / pip)
  • Frontend deps: frontend/package.json (Node / npm)
  • The Makefile automatically handles virtual environment creation and dependency installation
  • Run make verify to check if your setup is complete

Using the app

  1. Select a match — the launch screen appears automatically. Pick a match, choose a commentary style, then click Watch Live →.

  2. Controls — the video player bar at the bottom:

    • ▶ / ⏸ — play and pause
    • −30s −10s +10s +30s — jump backward or forward
    • Click the seek bar to jump to any point in the match
    • Speed buttons 0.5× 1× 2× 4× 8× — control replay speed
    • 🔊 — mute/unmute audio commentary
    • Overlay button — open the Overlay Panel (pitch view and settings)
    • Change — go back to the match selection screen
  3. Overlay Panel (opened from the player bar):

    • Live — real-time event markers and pass trails on the pitch
    • Formation — starting lineup with jersey numbers and player names
    • Heatmap — territory map for home or away team
    • Shots — all shot locations, sized by xG, colored by outcome
    • Build-up — directional pass flow arrows by zone
  4. Sidebar tabs:

    • Stats — momentum bar, possession, shots, xG, passes, fouls, cards
    • Live — key events feed (goals, cards, big chances) or full event log
    • Squad — starting lineup with positions and goal contributions
  5. Commentary styles:

    | Style | Character |
    | --- | --- |
    | 🎙 Neutral | Balanced, professional |
    | 🔥 Enthusiastic | High energy, emotional |
    | 📐 Analytical | Tactical depth, data-driven |
    | 🏠 Home Fan | Biased toward the home side |
    | ✈️ Away Fan | Biased toward the away side |

Architecture

System Overview (non-technical)

Think of MatchCaster like a live TV production team:

  • Replay Engine = the control room replaying the match timeline
  • Director = the producer deciding when each commentator speaks and what they see
  • AI Commentators = the voices (live action narrator + expert analyst)
  • Speech Engine = turns scripts into spoken audio
  • Frontend = what the viewer sees and hears in real time

How Commentary Stays in Sync

The app keeps commentary seamless by generating it before it is needed. The system works like a buffer that stays a few blocks ahead of the match clock:

 Game timeline (seconds):
  0s        15s       30s       45s       60s       75s
  ├─────────┼─────────┼─────────┼─────────┼─────────┤
  │ Block 1 │ Block 2 │ Block 3 │ Block 4 │         │
  │ ready ✓ │ ready ✓ │ ready ✓ │ generating...     │
  └─────────┴─────────┴─────────┴─────────┴─────────┘
       ↑ playing now                ↑ frontier
       Clock = 10s                  Always stays ahead
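
The diagram can be read as simple arithmetic over block indices. The sketch below assumes fixed 15 game-second blocks and a 4-block buffer (the defaults listed under Configuration), and it assumes the block currently playing counts toward the buffer, as the diagram suggests. The function names are illustrative and not taken from the codebase.

```python
BLOCK_SEC = 15.0    # PBP_BLOCK_DURATION_GAME_SEC default
BLOCKS_AHEAD = 4    # PBP_BLOCKS_AHEAD default


def block_index(clock_sec: float) -> int:
    """Index of the block that contains the given game time."""
    return int(clock_sec // BLOCK_SEC)


def blocks_to_keep_ready(clock_sec: float) -> list[tuple[float, float]]:
    """(start, end) game-second ranges the buffer should cover right now."""
    first = block_index(clock_sec)
    return [
        (i * BLOCK_SEC, (i + 1) * BLOCK_SEC)
        for i in range(first, first + BLOCKS_AHEAD)
    ]


# With the clock at 10s, the buffer spans Blocks 1-4 from the diagram.
print(blocks_to_keep_ready(10.0))
```

As the clock crosses into the next block, the range shifts right by one block, which is the point at which a new block has to be generated at the frontier.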

Lifecycle of a session:

  1. Loading — User selects a match. The backend warms up TTS + LLM, classifies all events, and pre-computes analyst quiet windows (time slots far from goals/shots/cards). It then pre-generates 4 PBP blocks (each covering 15 game-seconds). The play button appears only once 2 blocks are fully ready (text generated and audio synthesized).

  2. Playing — The match clock advances in real time. Every 50 ms, the Director checks whether any blocks are due. If a block's start time has passed, the Director dispatches its audio + text to the frontend. Meanwhile, it keeps generating new PBP blocks at the frontier to keep the buffer from running dry. When the clock comes within 60 game-seconds of an analyst window (the lead time), the Director spawns analyst generation; if the analyst block is ready before its slot, it replaces the PBP block for that window.

  3. Seek — User jumps to a new time. The Director increments its epoch (a version counter), clears the buffer, recomputes analyst windows from the new position, and starts generating blocks. Any in-flight work from the old position carries the old epoch and is silently discarded when it completes. The app shows a loading state until 2 blocks are ready, then resumes seamlessly.

  4. Speed change — Same as seek: epoch increments, buffer clears, analyst windows recompute, and blocks regenerate (because block duration changes with speed). Brief loading, then seamless playback at the new pace.

  5. Buffer safety net — If the LLM is too slow and the buffer empties during playback, the clock pauses automatically, a loading state appears, and playback resumes once blocks are ready again.
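
The Playing and Buffer safety net steps can be sketched as a single tick function. This is a simplified, synchronous illustration under stated assumptions: `Block` and `MiniDirector` are hypothetical names, and the real Director is asynchronous and also handles analyst slots and epochs.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    start_sec: float
    text: str
    ready: bool = False  # True once text and audio both exist


@dataclass
class MiniDirector:
    blocks: list[Block] = field(default_factory=list)  # sorted by start_sec
    paused: bool = False
    sent: list[str] = field(default_factory=list)

    def tick(self, clock_sec: float) -> None:
        """One tick: dispatch due blocks, or pause if the next due block is not ready."""
        while self.blocks and self.blocks[0].start_sec <= clock_sec:
            if not self.blocks[0].ready:
                self.paused = True  # buffer underrun: hold the clock, show loading
                return
            self.sent.append(self.blocks.pop(0).text)
        self.paused = False


director = MiniDirector([Block(0.0, "kick-off", True), Block(15.0, "build-up", False)])
director.tick(10.0)    # dispatches "kick-off"
director.tick(15.05)   # "build-up" is due but not ready, so playback pauses
director.blocks[0].ready = True
director.tick(15.05)   # block is ready now, so playback resumes
```

Calling `tick` on every clock tick keeps dispatch aligned with game time, and the pause flag is the hook a UI would use to show the loading state.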

The epoch mechanism is what makes seek/speed changes robust. Every block is tagged with the epoch it was born in. On dispatch, stale-epoch blocks are discarded — no race conditions, no ghost commentary from a previous position.
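
A minimal sketch of the epoch idea, using asyncio to stand in for slow LLM + TTS work. `EpochBuffer` and its methods are hypothetical names chosen for this example; only the technique (tag work with the epoch it started in, drop it if the epoch has moved on) comes from the description above.

```python
import asyncio


class EpochBuffer:
    def __init__(self) -> None:
        self.epoch = 0
        self.ready: list[str] = []

    def seek(self) -> None:
        """A seek or speed change: bump the epoch and clear the buffer."""
        self.epoch += 1
        self.ready.clear()

    async def generate(self, text: str, delay: float) -> None:
        born = self.epoch            # tag the work with the epoch it started in
        await asyncio.sleep(delay)   # stand-in for LLM + TTS latency
        if born != self.epoch:
            return                   # stale: a seek happened while we were working
        self.ready.append(text)


async def demo() -> list[str]:
    buf = EpochBuffer()
    old = asyncio.create_task(buf.generate("old position", 0.05))
    await asyncio.sleep(0.01)        # let the old task start under epoch 0
    buf.seek()                       # user jumps; epoch becomes 1
    new = asyncio.create_task(buf.generate("new position", 0.01))
    await asyncio.gather(old, new)
    return buf.ready


print(asyncio.run(demo()))
```

The old task is never cancelled. It finishes on its own schedule and its result is simply ignored, which avoids having to track and tear down in-flight work on every seek.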

Technical Flow

flowchart TD
    subgraph Data
        M[StatsBomb Match JSON]
    end

    subgraph Backend
        P[player.loader + player.emitter]
        A[analyser.engine + state + classifier]
        S[director.analyst_scheduler<br/>Pre-computes quiet windows]
        D[director.router<br/>Tick-driven, epoch-based]
        C[commentator.agents<br/>PBP + Analyst]
        L[commentator.llm<br/>Groq or Ollama]
        T[commentator.tts.engine<br/>Piper / macOS say fallback]
        Q[commentator.queue<br/>TimeBlockQueue]
        W[ws.handler]
    end

    subgraph Frontend
        F[React UI<br/>Pitch + overlays + controls]
    end

    M --> P
    P --> A
    P --> S
    S --> D
    A --> D
    P --> D
    D --> C
    C --> L
    L --> C
    C --> T
    T --> Q
    P --> W
    A --> W
    Q --> W
    W --> F

Commentator Roles

Play-by-Play — narrates the action as flowing paragraphs. One block per 15 game-seconds (scales with playback speed). Handles everything: goals, shots, cards, substitutions, quiet build-up. Receives analyst context to weave into narration. Handles the opening scene-setter. Always in sync with the game clock.

Analyst — macro reflection voice. Speaks infrequently (every 5–7 game-minutes) during pre-computed quiet windows — periods far from goals, shots, and cards. Silent for the first 5 minutes (PBP owns the opening). Reads from the AnalysisEngine's statistical snapshot (momentum, possession, xG, pressing patterns) and turns it into a brief tactical observation. Feeds context back to PBP for richer narration.

The analyst never interrupts PBP. During loading, the system scans the full event timeline and pre-schedules analyst slots in safe quiet periods. If the analyst's LLM generation is ready before its slot, it replaces the PBP block for that window (which would be generic filler). If not ready in time, PBP plays normally — no gap, no delay.
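
One way to picture the pre-scheduling step is the sketch below. It uses the default values from the Configuration section, but the search strategy is an assumption made for illustration: step through each 300 to 420 game-second window in 15-second increments and take the first moment outside every exclusion zone. The function names are hypothetical and the actual `analyst_scheduler` may work differently.

```python
FIRST_SEC = 300.0   # ANALYST_BLOCK_FIRST_SEC
MIN_GAP = 300.0     # ANALYST_MIN_GAP_GAME_SEC
MAX_GAP = 420.0     # ANALYST_MAX_GAP_GAME_SEC
PRE = 30.0          # ANALYST_EXCLUSION_PRE
POST = 45.0         # ANALYST_EXCLUSION_POST
STEP = 15.0         # search granularity (assumed: one PBP block)


def is_quiet(t: float, critical: list[float]) -> bool:
    """True if t is outside [e - PRE, e + POST] for every critical event e."""
    return all(not (e - PRE <= t <= e + POST) for e in critical)


def schedule_slots(critical: list[float], match_end: float) -> list[float]:
    slots: list[float] = []
    anchor = 0.0  # previous slot time, or kick-off
    while True:
        lo = max(anchor + MIN_GAP, FIRST_SEC)
        hi = min(anchor + MAX_GAP, match_end)
        if lo > match_end:
            return slots
        t, found = lo, None
        while t <= hi:
            if is_quiet(t, critical):
                found = t
                break
            t += STEP
        # If this window has no quiet moment, skip it and try the next one.
        anchor = found if found is not None else anchor + MAX_GAP
        if found is not None:
            slots.append(found)


# Critical events at 310s and 650s push the first two slots later.
print(schedule_slots([310.0, 650.0], match_end=1200.0))
```

Because the whole event timeline is known before playback starts, this kind of scan can run once during loading and again after each seek.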

File Structure

backend/
├── config.py                All tunables
├── main.py                  FastAPI app + HTTP routes
│
├── player/
│   ├── clock.py             Async accelerated match clock (50 ms ticks)
│   ├── loader.py            StatsBomb JSON → MatchEvent dataclasses
│   └── emitter.py           Replay session management + seek support
│
├── analyser/
│   ├── classifier.py        Event priority: critical / notable / routine
│   ├── state.py             SharedMatchState (score, possession, stats)
│   ├── engine.py            Real-time match analysis (momentum, xG, vectors)
│   ├── spatial.py           Coordinate → pitch zone descriptions
│   └── enrichment/
│       ├── match_meta.py    Stadium, date, manager lookup
│       ├── weather.py       Historical weather via Open-Meteo
│       └── team_colors.py   Kit colors for ~40 teams
│
├── director/
│   ├── router.py            Tick-driven orchestrator: epoch-based block
│   │                        generation, dispatch, analyst scheduling
│   └── analyst_scheduler.py Pre-computes quiet windows for analyst slots
│
├── commentator/
│   ├── agents/
│   │   ├── base.py          BaseAgent ABC + prompt assembly
│   │   ├── play_by_play.py  Live action narration (flow-block output)
│   │   ├── analyst.py       Expert macro commentary (replaces tactical+stats)
│   │   └── prompts.py       System prompts + user prompt builders
│   ├── llm/
│   │   ├── __init__.py      Backend singleton (get_backend / init_backend)
│   │   ├── backend.py       LLMBackend ABC
│   │   ├── groq.py          Groq cloud backend (OpenAI-compatible SSE)
│   │   └── ollama.py        Ollama local backend
│   ├── tts/
│   │   ├── engine.py        Piper TTS wrapper → WAV bytes (+ macOS say fallback)
│   │   └── voices.py        Agent → voice model mapping
│   └── queue.py             AudioQueue + TimeBlockQueue (epoch-tagged dispatch)
│
└── ws/
    └── handler.py           WebSocket session: events, audio, state, seek
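
To make the classifier's role more concrete, here is a hedged sketch of three-level event priority combined with a per-block event cap. The type-to-priority mapping is an assumption made for this example (the README only names the three levels and says analyst windows avoid goals, shots, and cards), and `classify` and `pick_batch` are hypothetical helpers, not the contents of `analyser/classifier.py`.

```python
from enum import Enum


class Priority(Enum):
    CRITICAL = 0
    NOTABLE = 1
    ROUTINE = 2


# Assumed mapping, using StatsBomb event type names.
CRITICAL_TYPES = {"Shot"}
NOTABLE_TYPES = {"Substitution", "Foul Committed", "Dribble"}


def classify(event_type: str, has_card: bool = False) -> Priority:
    if event_type in CRITICAL_TYPES or has_card:
        return Priority.CRITICAL
    if event_type in NOTABLE_TYPES:
        return Priority.NOTABLE
    return Priority.ROUTINE


def pick_batch(events: list[tuple[float, str]], limit: int = 8) -> list[tuple[float, str]]:
    """Keep the `limit` highest-priority (time, type) events, then restore time order."""
    ranked = sorted(events, key=lambda e: (classify(e[1]).value, e[0]))
    return sorted(ranked[:limit])


block_events = [(1.0, "Pass"), (2.0, "Shot"), (3.0, "Pass"), (4.0, "Substitution")]
print(pick_batch(block_events, limit=2))
```

Capping by priority rather than by arrival order means a busy block of routine passes cannot crowd a shot out of the prompt.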

Configuration

All tunables live in backend/config.py:

| Key | Default | Description |
| --- | --- | --- |
| DEFAULT_SPEED_MULTIPLIER | 1.0 | Replay speed on startup |
| LLM_BACKEND | groq | "groq" (cloud) or "local" (Ollama) |
| GROQ_MODEL | llama-3.1-8b-instant | Groq model |
| OLLAMA_MODEL | gemma2:2b-instruct-q4_K_M | Ollama model (local mode only) |
| OLLAMA_TIMEOUT_SEC | 90.0 | Per-call timeout for Ollama streaming |
| MAX_OUTPUT_TOKENS | 50 | Hard token cap per commentary line |
| PBP_BLOCK_DURATION_GAME_SEC | 15.0 | Game-seconds per commentary block (scales with speed) |
| PBP_BLOCKS_AHEAD | 4 | Buffer depth: blocks kept pre-generated ahead |
| LOADING_MIN_BLOCKS_READY | 2 | Blocks required before playback can start |
| ANALYST_MIN_GAP_GAME_SEC | 300.0 | Minimum gap between analyst windows |
| ANALYST_MAX_GAP_GAME_SEC | 420.0 | Maximum gap between analyst windows |
| ANALYST_BLOCK_FIRST_SEC | 300.0 | Analyst silent for first 5 game-minutes |
| ANALYST_EXCLUSION_PRE | 30.0 | No analyst within 30s before critical events |
| ANALYST_EXCLUSION_POST | 45.0 | No analyst within 45s after critical events |
| ANALYST_LEAD_TIME_GAME_SEC | 60.0 | Start analyst generation 60 game-sec before slot |
| MAX_EVENTS_PER_BATCH | 8 | Max events sent to LLM per block |
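
Several of these tunables only make sense in combination. The sketch below is not the contents of `backend/config.py`. It is a hypothetical `Tunables` dataclass that copies a few defaults from the table and adds two derived values plus sanity checks, which are assumptions about how the settings relate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tunables:
    pbp_block_duration_game_sec: float = 15.0
    pbp_blocks_ahead: int = 4
    loading_min_blocks_ready: int = 2
    analyst_min_gap_game_sec: float = 300.0
    analyst_max_gap_game_sec: float = 420.0

    def buffer_game_sec(self) -> float:
        """Game time covered by a full buffer."""
        return self.pbp_blocks_ahead * self.pbp_block_duration_game_sec

    def startup_game_sec(self) -> float:
        """Game time that must be ready before playback can start."""
        return self.loading_min_blocks_ready * self.pbp_block_duration_game_sec

    def validate(self) -> None:
        # Assumed constraints: playback cannot wait for more blocks than the
        # buffer holds, and the analyst gap range must not be inverted.
        if self.loading_min_blocks_ready > self.pbp_blocks_ahead:
            raise ValueError("LOADING_MIN_BLOCKS_READY exceeds PBP_BLOCKS_AHEAD")
        if self.analyst_min_gap_game_sec > self.analyst_max_gap_game_sec:
            raise ValueError("ANALYST_MIN_GAP_GAME_SEC exceeds ANALYST_MAX_GAP_GAME_SEC")


defaults = Tunables()
defaults.validate()
print(defaults.buffer_game_sec(), defaults.startup_game_sec())
```

With the defaults, a full buffer covers 60 game-seconds and playback can begin once 30 game-seconds of commentary are ready.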

Graceful degradation

| Failure | Fallback |
| --- | --- |
| LLM unavailable / slow | Template commentary ("Shot — great save!") |
| LLM too slow (buffer empties) | Clock pauses, loading state, auto-resumes when buffer refills |
| Analyst LLM too slow | PBP plays normally for that window; the analyst is simply absent |
| Piper TTS not installed | macOS say built-in voices |
| Piper TTS crashes | macOS say built-in voices |
| Audio queue overflow | Oldest items dropped |
| WebSocket disconnect | Auto-reconnect after 2 s |
| Unknown match ID | No metadata shown, colors use defaults |
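
The first fallback (LLM unavailable or slow, so use template commentary) follows a common timeout-then-fallback pattern. A minimal sketch with asyncio is shown below. The template strings, `slow_llm`, and `commentary_with_fallback` are hypothetical stand-ins and not the project's actual code.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical canned lines keyed by event type.
TEMPLATES = {"Shot": "A shot on goal!", "Pass": "Possession is recycled."}
DEFAULT_LINE = "Play continues."


async def slow_llm(prompt: str) -> str:
    """Stand-in for a real LLM call that takes 0.2 s to answer."""
    await asyncio.sleep(0.2)
    return "LLM line"


async def commentary_with_fallback(
    event_type: str,
    llm: Callable[[str], Awaitable[str]],
    timeout_sec: float,
) -> str:
    try:
        # Give the LLM a bounded amount of time to answer.
        return await asyncio.wait_for(llm(f"Describe: {event_type}"), timeout=timeout_sec)
    except (asyncio.TimeoutError, ConnectionError):
        # Too slow or unreachable: fall back to a canned template.
        return TEMPLATES.get(event_type, DEFAULT_LINE)


print(asyncio.run(commentary_with_fallback("Shot", slow_llm, timeout_sec=0.01)))
print(asyncio.run(commentary_with_fallback("Shot", slow_llm, timeout_sec=1.0)))
```

The same shape applies to the TTS row: try the preferred engine first, and switch to the fallback voice when it fails.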
