This repository contains the analysis workflow used to generate the results reported in the manuscript 'Using ssVEPs to Characterise Wide-Ranging Retinopathy linked to CRB1: Implications for Clinical Trials' (Stäubli et al., 2026) and serves two purposes:
- Reproducibility - key analyses can be replicated using anonymised derivative datasets included in the repository.
- Methodological transparency - the full preprocessing pipeline is included for completeness, though parts of it require access to raw EEG recordings which cannot be shared due to data protection and privacy restrictions.
The workflow can be run in two ways depending on the data available to you:
| Goal | Data needed | Where to start |
|---|---|---|
| Replicate manuscript results (most users) | Publicly available derivative dataset | Module 02 → Step 2.2 |
| Run the full pipeline from raw EEG recordings | Raw EEG traces | Module 01 → then Module 02 from Step 2.1 |
! Note: Running the full pipeline requires access to raw EEG recordings, which are not publicly available.
This repository is organized into two separate modules:
| Module | Purpose |
|---|---|
| 01_eeg_preprocessing/ | Converts raw EEG recordings into subject-level derivative datasets |
| 02_data_analysis/ | Performs statistical analyses and generates figures for the manuscript |
! Note: Each module uses its own software environment and should be set up separately.
crb1_eeg_workflow/
│
├─ 01_eeg_preprocessing/
│ ├─ pipelines/
│ ├─ src/
│ ├─ pyproject.toml
│ ├─ setup.sh
│ ├─ config.yml
│ ├─ environment.yaml
│ └─ README.md
│
├─ 02_data_analysis/
│ ├─ data/
│ │ ├─ raw/
│ │ └─ derivatives/
│ │
│ └─ src/
│    ├─ 01_preprocessing/
│    ├─ 02_analysis/
│    ├─ 03_plotting/
│    ├─ 04_reliability/
│    ├─ 05_supplementary/
│    ├─ utils/
│    ├─ config.yml
│    └─ environment.yaml
│
└─ README.md
01_eeg_preprocessing/
This module preprocesses subject-level raw EEG traces. It re-references, band-pass filters, and segments the data into individual epochs, then applies a Fast Fourier Transform (FFT) and calculates the signal-to-noise ratio (SNR). The output is subject-level derivative data ready for further analysis in Module 02.
Set up and run preprocessing: follow the instructions in 01_eeg_preprocessing/README.md
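The FFT and SNR steps described above can be illustrated with a short sketch. It is a minimal example under stated assumptions, not the module's implementation: SNR is defined here as the amplitude at the target frequency bin divided by the mean amplitude of neighbouring bins (a common ssVEP convention that this README does not confirm), and every name and value (`amplitude_spectrum`, `snr_at_bin`, the 8 Hz stimulus, the 64 Hz sampling rate) is hypothetical. A naive DFT from the standard library stands in for the FFT so the example has no dependencies, and a weaker 11 Hz component stands in for background noise so the result is deterministic.

```python
import cmath
import math


def amplitude_spectrum(epoch):
    """Return single-sided amplitudes for bins 0..N/2 using a naive DFT.

    An FFT routine gives the same values, only faster.
    """
    n = len(epoch)
    amplitudes = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(epoch))
        # DC and (for even n) Nyquist bins have no mirrored partner.
        scale = 1 if k == 0 or 2 * k == n else 2
        amplitudes.append(scale * abs(coeff) / n)
    return amplitudes


def snr_at_bin(amplitudes, target, n_neighbours=4, skip=1):
    """Amplitude at `target` divided by the mean of nearby noise bins.

    Assumed convention: `skip` bins next to the target are excluded on
    each side, then `n_neighbours` bins per side are averaged.
    """
    noise = []
    for offset in range(skip + 1, skip + 1 + n_neighbours):
        for idx in (target - offset, target + offset):
            if 0 < idx < len(amplitudes):
                noise.append(amplitudes[idx])
    return amplitudes[target] / (sum(noise) / len(noise))


# Hypothetical 1-second epoch sampled at 64 Hz, so bin k corresponds to k Hz.
fs = 64
epoch = [2.0 * math.sin(2 * math.pi * 8 * t / fs)
         + 0.5 * math.sin(2 * math.pi * 11 * t / fs)
         for t in range(fs)]

spectrum = amplitude_spectrum(epoch)
snr = snr_at_bin(spectrum, target=8)
print(f"Amplitude at 8 Hz: {spectrum[8]:.2f}, SNR: {snr:.1f}")
```

The actual filter settings, epoch length, and SNR definition are those implemented in 01_eeg_preprocessing/ and may differ from this sketch.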
02_data_analysis/
This module contains the statistical analyses and plotting scripts used to generate all results reported in the manuscript.
This module requires Conda to manage the analysis environment and dependencies. Install the Python dependencies using:
cd <absolute/path/to/crb1_ssvep_workflow>/02_data_analysis
conda env create -f src/environment.yaml
R packages currently have to be installed manually. In your R session, run:
install.packages(c(
"bbmle",
"car",
"coin",
"contrast",
"effectsize",
"emmeans",
"ggplot2",
"here",
"ICC",
"lme4",
"lmerTest",
"MKinfer",
"performance",
"perm",
"psych",
"psychTools",
"svDialogs",
"tidyverse",
"viridis",
"yaml"
))
Then edit the configuration file src/config.yml so the paths match your machine.
Make sure each path ends with a slash; otherwise path specifications may fail later on. Example:
# Set this to the root of the 02_data_analysis folder.
local_analysis_path: "/absolute/path/to/02_data_analysis/"
# This is only needed if you have access to the raw EEG traces.
local_data_path: "/absolute/path/to/subject/level/derivatives/"
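A missing trailing slash is easy to overlook, so a small check like the one below can catch it before any notebook runs. This is a sketch rather than part of the repository: `check_paths` is a hypothetical helper, and the dictionary stands in for the parsed contents of src/config.yml (which could be loaded with a YAML parser such as PyYAML).

```python
def check_paths(config, keys=("local_analysis_path", "local_data_path")):
    """Return the config keys whose path value lacks a trailing slash."""
    missing = []
    for key in keys:
        value = config.get(key)
        # Absent or empty keys are skipped: local_data_path is only
        # needed when the raw EEG traces are available.
        if value and not value.endswith("/"):
            missing.append(key)
    return missing


# Stand-in for the parsed src/config.yml; the second path is deliberately wrong.
example_config = {
    "local_analysis_path": "/absolute/path/to/02_data_analysis/",
    "local_data_path": "/absolute/path/to/subject/level/derivatives",
}

problems = check_paths(example_config)
for key in problems:
    print(f"{key} should end with '/': {example_config[key]!r}")
```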
- Public derivative datasets required for replication can be downloaded here
- The download contains
  - main datasets required for replication (generated in Step 2.1 from subject-level derivatives produced in Module 01)
  - datasets for reliability analyses in the subfolder /reliability_analysis (generated in Step 2.1 from subject-level derivatives produced in Module 01)
  - precomputed curve-fitting bootstraps in the subfolders /curve_fitting (for the main analysis) and /supplementary (for phenotype-level analyses reported in the supplementary material). Curve fitting uses 1,000 bootstrap samples for each group, which can take several hours to compute, so using the precomputed values speeds up replication considerably.
- To replicate the analyses, copy the downloaded files in their existing folder structure to 02_data_analysis/data/derivatives/ (make sure to unzip them first).
- Your derivatives folder should now look like this:
  derivatives
  └─ Tables
     │─ ssVEP_cohort_data.csv
     │─ AUC_BCVA.csv
     │─ reliability_analysis/
     │─ curve_fitting/
     └─ supplementary/
- If it does, you are good to go!
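The folder check can also be done programmatically. The sketch below assumes the five entries sit directly inside `Tables`, as the listing above suggests; `missing_entries` is a hypothetical helper, not something shipped with the repository.

```python
from pathlib import Path

# Entries named in this README, assumed to sit directly inside derivatives/Tables.
EXPECTED_FILES = ("ssVEP_cohort_data.csv", "AUC_BCVA.csv")
EXPECTED_DIRS = ("reliability_analysis", "curve_fitting", "supplementary")


def missing_entries(derivatives_dir):
    """Return the expected files and folders not found under derivatives/Tables."""
    tables = Path(derivatives_dir) / "Tables"
    missing = [name for name in EXPECTED_FILES
               if not (tables / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS
                if not (tables / name).is_dir()]
    return missing


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    gaps = missing_entries("02_data_analysis/data/derivatives")
    print("All expected entries found." if not gaps else f"Missing: {gaps}")
```

An empty result means the layout matches; anything listed usually points to an archive that was not unzipped or was copied one level too deep.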
Run the Python notebooks and R Markdown files located in 02_data_analysis/ from top to bottom.
| Step | Directory | Description | Language |
|---|---|---|---|
| 1 | src/01_preprocessing/ | Aggregate subject-level data | Python |
| 2 📌 | src/02_analysis/ | Core statistical analyses | R |
| 3 | src/03_plotting/ | Generate manuscript figures | Python |
| 4 | src/04_reliability/ | Test–retest reliability analysis | Python |
| 5 | src/05_supplementary/ | Supplementary analyses | Python |
📌 Replication using public data can begin at Step 2
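To see the run order at a glance, the step folders can be listed programmatically. This is a sketch under assumptions: the file extensions (`.ipynb`, `.Rmd`) and the idea that alphabetical order within a folder matches the intended order are guesses, and `ordered_scripts` is a hypothetical helper.

```python
from pathlib import Path


def ordered_scripts(src_dir, start_step=1, suffixes=(".ipynb", ".Rmd")):
    """List notebooks and R Markdown files step by step, in sorted order."""
    scripts = []
    # Step folders are named 01_... to 05_...; utils/ is ignored.
    for step_dir in sorted(Path(src_dir).glob("0[1-5]_*")):
        step = int(step_dir.name[:2])
        if not step_dir.is_dir() or step < start_step:
            continue
        scripts += sorted(p for p in step_dir.iterdir()
                          if p.suffix in suffixes)
    return scripts


if __name__ == "__main__":
    # Replication with the public data starts at Step 2.
    for script in ordered_scripts("02_data_analysis/src", start_step=2):
        print(script)
```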