In this work, we present ArbItro, a multi-task CNN-RNN framework for football foul recognition on the SoccerNet-MVFouls benchmark, designed as a controlled study of what structurally classical architectures can recover under strong visual ambiguity and long-tail supervision.
The proposed system decouples offence detection, action classification, and disciplinary severity assessment into separate prediction heads, with the aim of disentangling infringement recognition from sanction estimation.
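With three separate prediction heads, training typically minimizes a weighted sum of the per-head losses. The sketch below is purely illustrative — the head names match the design above, but the loss values and weights are hypothetical, not the project's actual configuration:

```python
def multitask_loss(losses: dict, weights: dict) -> float:
    """Weighted sum of per-head losses (weights here are illustrative)."""
    return sum(weights[name] * losses[name] for name in losses)

# Hypothetical per-head batch losses and task weights
losses = {"offence": 0.40, "action": 1.20, "severity": 0.90}
weights = {"offence": 1.0, "action": 1.0, "severity": 0.5}

total = multitask_loss(losses, weights)  # 0.40 + 1.20 + 0.45 = 2.05
```

Decoupling the heads this way lets each task keep its own weight, so the dominant class structure of one label (e.g. severity) does not drown out the others.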
Detailed Model Architecture & Parameters (56.1M params)
Model Statistics:
- Total parameters: 56,159,663
- Trainable: 32,622,191
- Non-trainable: 23,537,472
| Layer (type) | Output Shape | Param # | Connected to |
|---|---|---|---|
| video_input (InputLayer) | (None, 4, 16, 224, 398, 3) | 0 | - |
| td_cnn_clips (TimeDistributed) | (None, 4, 16, 5, 11, 1536) | 54,336,736 | video_input[0][0] |
| td_gap_clips (TimeDistributed) | (None, 4, 16, 1536) | 0 | td_cnn_clips[0][0] |
| dropout_video (Dropout) | (None, 4, 16, 1536) | 0 | td_gap_clips[0][0] |
| speed_input (InputLayer) | (None, 1) | 0 | - |
| td_lstm_per_clip (TimeDistributed) | (None, 4, 256) | 1,704,960 | dropout_video[0][0] |
| clip_mask (InputLayer) | (None, 4) | 0 | - |
| speed_embed (Dense) | (None, 32) | 64 | speed_input[0][0] |
| clip_fusion (Lambda) | (None, 256) | 0 | td_lstm_per_clip[0][0], clip_mask[0][0] |
| dropout_speed (Dropout) | (None, 32) | 0 | speed_embed[0][0] |
| fusion (Concatenate) | (None, 288) | 0 | clip_fusion[0][0], dropout_speed[0][0] |
| dense_shared (Dense) | (None, 256) | 73,984 | fusion[0][0] |
| ln_shared (LayerNormalization) | (None, 256) | 512 | dense_shared[0][0] |
| dropout_shared (Dropout) | (None, 256) | 0 | ln_shared[0][0] |
| act_dense (Dense) | (None, 64) | 16,448 | dropout_shared[0][0] |
| off_dense (Dense) | (None, 32) | 8,224 | dropout_shared[0][0] |
| sev_dense (Dense) | (None, 64) | 16,448 | dropout_shared[0][0] |
| act_dropout (Dropout) | (None, 64) | 0 | act_dense[0][0] |
| off_dropout (Dropout) | (None, 32) | 0 | off_dense[0][0] |
| sev_dropout (Dropout) | (None, 64) | 0 | sev_dense[0][0] |
| aux_bodypart (Dense) | (None, 3) | 771 | dropout_shared[0][0] |
| aux_contact (Dense) | (None, 1) | 257 | dropout_shared[0][0] |
| aux_handball (Dense) | (None, 1) | 257 | dropout_shared[0][0] |
| aux_touch_ball (Dense) | (None, 1) | 257 | dropout_shared[0][0] |
| aux_try_play (Dense) | (None, 1) | 257 | dropout_shared[0][0] |
| head_action (Dense) | (None, 4) | 260 | act_dropout[0][0] |
| head_offence (Dense) | (None, 1) | 33 | off_dropout[0][0] |
| head_severity (Dense) | (None, 3) | 195 | sev_dropout[0][0] |
The ArbItro project is trained and evaluated on the official SoccerNet Challenge 2025 Multi-View Foul Recognition dataset.
This dataset provides a realistic and challenging benchmark for identifying fouls from multiple synchronized camera angles. As is typical with sporting event data, there is a significant class imbalance across the various labels, which is addressed through specialized data augmentation and balancing techniques.
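One standard balancing technique for such long-tail labels (illustrative — the project's exact scheme may differ) is inverse-frequency class weighting, equivalent to scikit-learn's "balanced" heuristic:

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by total / (n_classes * count): rare classes weigh more."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Hypothetical severity labels with a heavy skew toward "no card"
labels = ["no_card"] * 80 + ["yellow"] * 15 + ["red"] * 5
weights = inverse_frequency_weights(labels)
# "no_card" ~0.42, "yellow" ~2.22, "red" ~6.67
```

The resulting weights can be passed to the loss so that the few red-card examples are not ignored during training.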
- Class Distribution: see the accompanying class-distribution figure for detailed metrics on Severity, Offence Type, Action, and Body Part involved.
- Clip Duration: Approximately 5 seconds per clip, centered precisely on the moment of the action.
- Views per Action: Multiple synchronized camera angles are available for each event, providing a comprehensive view for the model.
To evaluate the effectiveness of ArbItro as a Video Assistant Referee System (VARS) support tool, we measured performance across all four multi-task output heads. Given the standard class imbalance in football foul data, we focus on Balanced Accuracy and Recall per head.
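Balanced accuracy is simply the unweighted mean of per-class recall, which is why it is robust to the skew described above. A minimal reference implementation:

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, each class counted equally regardless of size."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        support = sum(1 for t in y_true if t == c)
        recalls.append(tp / support)
    return sum(recalls) / len(classes)

y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
balanced_accuracy(y_true, y_pred)  # (4/4 + 1/2) / 2 = 0.75
```

Plain accuracy on the same example would be 5/6 ≈ 0.83, hiding that half of the minority class was missed.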
The heatmap below provides a detailed breakdown of the per-class recall for Pipeline 1, Pipeline 2, and our Ensemble Model across the three multi-task heads:
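A common way to combine two pipelines into an ensemble — and our hedged assumption for the sketch below, not a confirmed description of the project's ensemble code — is to average the per-head predicted probabilities before taking the argmax:

```python
import numpy as np

def ensemble_probs(p1: np.ndarray, p2: np.ndarray) -> np.ndarray:
    """Average the softmax outputs of two pipelines for one head."""
    return (p1 + p2) / 2.0

# Hypothetical action-head probabilities from each pipeline
p1 = np.array([[0.6, 0.2, 0.1, 0.1]])
p2 = np.array([[0.3, 0.5, 0.1, 0.1]])
avg = ensemble_probs(p1, p2)   # [[0.45, 0.35, 0.1, 0.1]]
pred = avg.argmax(axis=1)      # predicted class index: 0
```

Averaging probabilities (soft voting) tends to outperform majority voting with only two members, since there is no tie-breaking majority to rely on.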
All training and evaluation notebooks are located under model/src/ and are fully
compatible with both local execution and Google Colab.
Clone the repository:

```bash
git clone https://github.com/gallocarmine/ArbItro.git
cd ArbItro
```

Prerequisites (local only):

```bash
pip install -r requirements.txt
```

Pipeline 1:

```
# Training
model/src/pipeline1/arbitro_train.ipynb

# Evaluation
model/src/pipeline1/arbitro_test.ipynb
```

Pipeline 2 follows the identical structure:

```
model/src/pipeline2/arbitro_train.ipynb
model/src/pipeline2/arbitro_test.ipynb
```

Both pipelines share the same notebook conventions. `data_loader.py` and `model.py` in each folder define the data pipeline and architecture respectively. Trained weights are saved to `ArbItro_Training/models/`.

Ensemble Evaluation:

```
model/src/ensemble_test.ipynb
```

Requires both `pipeline1.keras` and `pipeline2.keras` to be present in `ArbItro_Training/models/`.

Mount your Google Drive and update the dataset path in `data_loader.py` accordingly, or store the dataset locally and update the path directly.
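The Colab/local switch can be as simple as preferring the mounted Drive copy when it exists. The snippet below is a sketch of that logic — the variable names and folder layout are hypothetical, so adjust them to match `data_loader.py`:

```python
from pathlib import Path

# Hypothetical roots: Drive as mounted by google.colab.drive, and a local fallback.
COLAB_ROOT = Path("/content/drive/MyDrive")
LOCAL_ROOT = Path.home() / "datasets"

def resolve_dataset_dir(name: str = "SoccerNet-MVFouls") -> Path:
    """Prefer the mounted Google Drive copy when present, else a local folder."""
    colab_path = COLAB_ROOT / name
    return colab_path if colab_path.exists() else LOCAL_ROOT / name

dataset_dir = resolve_dataset_dir()
```

On Colab, run `from google.colab import drive; drive.mount('/content/drive')` first so the Drive path is visible.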
Prerequisites
- Python 3.12
- Node.js ≥ 18
1. Clone the repository

```bash
git clone https://github.com/gallocarmine/ArbItro.git
cd ArbItro
```

2. Install Python dependencies

```bash
pip install -r requirements.txt
```

3. Install Electron dependencies

```bash
cd app
npm install
```

4. Start the inference server

```bash
cd app/server
source ../../.venv/bin/activate
python3 server.py
```

5. Launch the desktop app (in a separate terminal)

```bash
cd app
npm start
```

The app connects to the Flask server at `http://127.0.0.1:5000`. Make sure the server is running before clicking Analyze Action.
The following animated sequences provide a comprehensive overview of the ArbItro system, from its theoretical operation to a practical walkthrough of the VARS interface analyzing a specific foul event in real time.
The VARS interface allows the human operator to manually select and load the video feeds for a specific foul event. The user is responsible for uploading the synchronized camera angles required by the multi-view Deep Learning model for accurate evaluation.
Before triggering the inference phase, the user can set the desired playback speed for manual visual review. Once the analysis is initiated, the system processes the multi-stream video data through the selected pipeline and subsequently outputs the classification results across all four heads.
```
ArbItro/
├── app/
│   ├── client/
│   │   ├── static/
│   │   │   ├── renderer.js
│   │   │   └── style.css
│   │   └── index.html
│   │
│   ├── server/
│   │   └── server.py
│   ├── main.js
│   ├── package.json
│   └── package-lock.json
│
├── asset/
│
└── model/
    └── src/
        ├── pipeline1/
        │   ├── arbitro_test.ipynb
        │   ├── arbitro_train.ipynb
        │   ├── data_loader.py
        │   └── model.py
        │
        ├── pipeline2/
        │   ├── arbitro_test.ipynb
        │   ├── arbitro_train.ipynb
        │   ├── data_loader.py
        │   └── model.py
        │
        └── ensemble_test.ipynb
```
This project is licensed under the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC 4.0).
Copyright (c) 2026 Carmine Gallo, Andrea De Simone, Matteo Trinchese, Salvatore Frontoso, Salvatore Pio Ruggiero
For full legal details, please refer to the LICENSE file included in this repository or visit the Official Creative Commons page.





