RAPID — faster-than-SeisBench's `annotate()` benchmarking toolkit

This repository is part of a larger project focused on enabling real-time seismic phase picking for seismic event detection using deep learning models.

The preliminary work, EQCCTPro/RAPID, enabled sub-11 s processing of three-component (3-C) waveforms, using persistent model actors to handle 1-minute seismic data from 228 stations for production applications with the Texas Seismological Network (TexNet). This architecture was integrated into SCMLPick, a SeisComP module that brings deep learning models into the SeisComP interface for real-time seismic phase picking, and serves as the backbone of the processing approach currently in production at TexNet.

Further work focuses on pushing processing speeds beyond the persistent-actor approach by combining different levels of numerical precision with batching. Batching is already applied in SeisBench's annotate(), and preliminary trials show that these techniques can achieve faster processing than annotate(). Preliminary results can be found here; final trials are being completed for publication in the near future.

Models and backends

Models: PhaseNet, PhaseNetLight (3001-sample window), EQTransformer, EQT-NC (6000-sample window). EQCCT is a planned addition once it's integrated into SeisBench.

Backends:

baseline_annotate — unmodified SeisBench
lean_pytorch — FP32 / FP16 / BF16, with optional torch.compile
onnx — ONNX Runtime (optional; only registered if the package imports)
tensorrt — prebuilt .plan engines (optional; same)
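
The "only registered if the package imports" behaviour of the optional backends can be sketched as follows. This is a minimal illustration, not RAPID's actual registry: `BACKEND_REQUIREMENTS` and `available_backends` are hypothetical names, and the only assumption about the real packages is their import names (`onnxruntime`, `tensorrt`).

```python
import importlib.util

# Hypothetical registry: backend name -> module that must be importable.
# None means the backend has no optional dependency beyond the core stack.
BACKEND_REQUIREMENTS = {
    "baseline_annotate": None,
    "lean_pytorch": None,
    "onnx": "onnxruntime",
    "tensorrt": "tensorrt",
}


def available_backends(requirements=BACKEND_REQUIREMENTS):
    """Return the backend names whose optional dependency can be imported."""
    names = []
    for name, module in requirements.items():
        # find_spec returns None when a top-level module is not installed.
        if module is None or importlib.util.find_spec(module) is not None:
            names.append(name)
    return names


if __name__ == "__main__":
    print(available_backends())
```

Checking with `find_spec` avoids importing a heavy package just to learn whether it is present; a real registry might instead attempt the import inside a `try`/`except ImportError`.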

Setup: Conda environment

1. Create and activate an env. Make sure the Python version matches the PyTorch CUDA wheels you will install.

conda create -n rapid python=3.11 -y
conda activate rapid

2. Install the core library packages using environment.yml.

3. Install optional backend dependencies (ONNX, ONNX Runtime GPU, and related helpers; see ONNX / TensorRT - Optional backends):

cd RAPID
pip install -r requirements-extra.txt

This assumes the core stack is already installed. Swap onnxruntime-gpu for onnxruntime in that file if you only need CPU inference. TensorRT is installed separately from NVIDIA and must match your CUDA toolkit version. See the comments at the bottom of requirements-extra.txt for more info.

Quick start

cd RAPID
 
# Single config sanity check, runs in ~a minute on one GPU
python scripts/run_benchmark.py \
    --dataset-dir /path/to/data/20241215T120000Z_20241215T120100Z \
    --model PhaseNet --child original \
    --backend lean_pytorch --dtype fp16 \
    --device cuda:0 --n-stations 228 --batch-size 256 --repeats 3
 
# Pipelined single-GPU (the fast path: parallel CPU preprocess
# feeding megabatched GPU forward with CPU<->GPU overlap)
python scripts/run_pipelined.py \
    --dataset-dir "$DATA_DIR" --model PhaseNet --child original \
    --n-stations 580 --batch-size 256 --dtype fp16 \
    --mode single_gpu --n-cpu-workers 16 --repeats 3
 
# Fair dual-GPU baseline: SeisBench annotate() on 2 GPUs, stations split 50/50
python scripts/run_pipelined.py \
    --dataset-dir "$DATA_DIR" --model PhaseNet --child original \
    --n-stations 580 --mode baseline_dual_gpu --repeats 3
 
# Pipelined dual-GPU: each GPU shard runs its own CPU preprocess pool
python scripts/run_pipelined.py \
    --dataset-dir "$DATA_DIR" --model PhaseNet --child original \
    --n-stations 580 --batch-size 512 --dtype bf16 \
    --mode dual_gpu --n-cpu-workers 8 --repeats 3
 
# Full matrix (all 4 models x 4 station counts x 5 backends x 9 batch sizes x 3 repeats)
python scripts/run_matrix.py --config configs/full_matrix.json
 
# Generate plots from the outputted JSONL file
python scripts/make_plots.py --jsonl results/matrix.jsonl --out-dir figures
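
The pipelined mode described in the comments above (parallel CPU preprocessing feeding a batched forward pass) follows a producer/consumer pattern. The sketch below shows only that pattern, using stand-in functions and threads. `preprocess`, `forward`, and `run_pipeline` are hypothetical names, and the real scripts may use processes, actors, or a different queueing scheme.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


def preprocess(station_id, n_windows=3):
    """Stand-in for per-station CPU preprocessing; returns a list of windows."""
    return [(station_id, w) for w in range(n_windows)]


def forward(batch):
    """Stand-in for a batched forward pass; returns one result per window."""
    return [f"{station}:{window}" for station, window in batch]


def run_pipeline(station_ids, batch_size=4, n_cpu_workers=2):
    """Feed preprocessed windows to a single consumer that runs full batches."""
    ready = queue.Queue()
    results = []

    def consumer():
        pending = []
        while True:
            item = ready.get()
            if item is None:  # sentinel: no more stations will arrive
                break
            pending.extend(item)
            # Run the forward pass as soon as a full batch is available.
            while len(pending) >= batch_size:
                results.extend(forward(pending[:batch_size]))
                pending = pending[batch_size:]
        if pending:  # flush the final partial batch
            results.extend(forward(pending))

    worker = threading.Thread(target=consumer)
    worker.start()
    try:
        with ThreadPoolExecutor(max_workers=n_cpu_workers) as pool:
            # pool.map yields results in input order as they complete.
            for windows in pool.map(preprocess, station_ids):
                ready.put(windows)
    finally:
        ready.put(None)  # always release the consumer, even on error
        worker.join()
    return results


if __name__ == "__main__":
    print(len(run_pipeline(range(5))))  # 5 stations x 3 windows = 15
```

Because windows from several stations are pooled before each forward call, the batch size is decoupled from the number of windows any one station produces.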

ONNX / TensorRT - Optional backends

After installing the extra dependencies, you can export pretrained weights to ONNX and, optionally, build TensorRT engines:

# ONNX only
python scripts/export_models.py --onnx-dir models_exported/onnx --skip-trt
 
# ONNX + TRT engines (pick the opt batch for your most common shape)
python scripts/export_models.py \
    --onnx-dir models_exported/onnx \
    --trt-dir  models_exported/trt \
    --opt-batch 228 --max-batch 1024

Then add the exported paths to configs/full_matrix.json:

{ "name": "onnx",     "dtype": "fp32", "onnx_path": "models_exported/onnx/PhaseNet_original.onnx" },
{ "name": "tensorrt", "dtype": "fp16", "engine_path": "models_exported/trt/PhaseNet_original_fp16.plan", "max_batch_size": 1024 }
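
Before launching a long matrix run, it can help to confirm that the exported files referenced by these entries exist. The helper below is a hypothetical sketch: `missing_artifacts` is not part of RAPID, and it takes the list of backend entries directly because where that list sits inside configs/full_matrix.json is not shown here.

```python
import json
from pathlib import Path

# Keys that point at exported artifacts in the entries shown above.
PATH_KEYS = ("onnx_path", "engine_path")


def missing_artifacts(backend_entries, root="."):
    """Return (backend name, path) pairs whose exported file does not exist."""
    missing = []
    for entry in backend_entries:
        for key in PATH_KEYS:
            if key in entry and not (Path(root) / entry[key]).is_file():
                missing.append((entry["name"], entry[key]))
    return missing


entries = json.loads("""[
  {"name": "onnx", "dtype": "fp32",
   "onnx_path": "models_exported/onnx/PhaseNet_original.onnx"},
  {"name": "tensorrt", "dtype": "fp16",
   "engine_path": "models_exported/trt/PhaseNet_original_fp16.plan",
   "max_batch_size": 1024}
]""")

for name, path in missing_artifacts(entries):
    print(f"{name}: missing {path}")
```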

Pick quality

Pick quality is evaluated against catalog ground truth on the SeisBench evaluation traces. Every trial in the dtype / timing matrix appends a pick_quality record, which includes the median absolute onset offset relative to the catalog P and S arrivals (onset_delta_*_vs_catalog, in samples at the model sampling rate).

cd RAPID
python scripts/run_seisbench_matrix.py --config configs/seisbench_dtype_matrix.json

Use traces_per_dataset in the JSON config to control how many traces are drawn per dataset (100 is standard for the publication matrix).
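
As an illustration of the onset-offset metric, the sketch below computes a median absolute offset from paired pick and catalog sample indices. The function name, the use of `None` for a missed pick, and the choice to skip missed picks are assumptions made for the example, not a description of how RAPID computes onset_delta_*_vs_catalog.

```python
from statistics import median


def median_abs_onset_offset(picked, catalog):
    """Median |picked - catalog| onset offset, in samples.

    Both inputs are sample indices at the model sampling rate, paired by
    trace. Traces with no pick are passed as None and skipped (assumption).
    """
    deltas = [abs(p - c) for p, c in zip(picked, catalog) if p is not None]
    return median(deltas) if deltas else None


# Illustrative numbers only, not RAPID results.
catalog_p = [500, 1200, 830, 2040]
picked_p = [503, 1198, None, 2050]
print(median_abs_onset_offset(picked_p, catalog_p))  # 3
```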

For a quick same-waveform FP16 vs FP32 comparison on any local miniSEED chunk (no catalog needed):

python scripts/compare_fp16_fp32.py \
    --dataset-dir /path/to/timechunk \
    --model PhaseNet --child original \
    --device cuda:0 --n-stations 228 \
    --out-json results/fp16_vs_fp32_PhaseNet.json

The script reports probability trace drift (MAE, max absolute error, RMSE, Pearson correlation), pick-time delta at threshold (median, p95, and max, in samples at the model sampling rate), and FP16 speedup over FP32.
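
For reference, the four drift metrics named above can be computed as in the sketch below. `trace_drift` and `to_fp16` are hypothetical helpers, not the script's code. `to_fp16` rounds a single value through IEEE half precision using the standard library, which only illustrates quantization of the output values; it does not reproduce the error accumulated by running a network in FP16.

```python
import math
import struct


def to_fp16(x):
    """Round a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def trace_drift(reference, candidate):
    """MAE, max absolute error, RMSE, and Pearson r between two traces."""
    if len(reference) != len(candidate) or not reference:
        raise ValueError("traces must be non-empty and equal in length")
    n = len(reference)
    errors = [c - r for r, c in zip(reference, candidate)]
    mean_ref = sum(reference) / n
    mean_cand = sum(candidate) / n
    cov = sum((r - mean_ref) * (c - mean_cand)
              for r, c in zip(reference, candidate))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_cand = sum((c - mean_cand) ** 2 for c in candidate)
    denom = math.sqrt(var_ref * var_cand)
    return {
        "mae": sum(abs(e) for e in errors) / n,
        "max_abs_error": max(abs(e) for e in errors),
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
        # Pearson r is undefined when either trace is constant.
        "pearson_r": cov / denom if denom > 0 else float("nan"),
    }


# Illustrative probability trace, not RAPID output.
reference_trace = [0.01, 0.2, 0.95, 0.4, 0.03]
fp16_trace = [to_fp16(v) for v in reference_trace]
print(trace_drift(reference_trace, fp16_trace))
```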

For a broader sweep of probability and pick drift vs FP32 on miniSEED workloads, there's also scripts/run_quality_matrix.py.

What each timed stage means

| Stage | What happens |
| --- | --- |
| `merge_streams` | (baseline only) Concatenates all station ObsPy Streams for `model.annotate()`. |
| `annotate_end_to_end` | (baseline only) All of SeisBench's internal pipeline, end to end. |
| `preprocess` | SeisBench's `annotate_stream_pre` (filter, resample), run once per station. |
| `window_cut_and_stack` | Builds a single `(N_total_windows, 3, in_samples)` numpy array across all stations. |
| `forward` | Backend's `infer_chunked`: the model forward pass (CUDA-synchronized). |

Baseline collapses the lean stages into annotate_end_to_end; the lean backends expose them separately so we can see where the speedup comes from.
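
Per-stage timing of this kind can be collected with a small context manager. The sketch below is a generic pattern rather than RAPID's instrumentation: `StageTimer` is a hypothetical class, and the list comprehensions stand in for the real preprocessing, windowing, and forward steps.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulate wall-clock seconds per named stage."""

    def __init__(self):
        self.seconds = defaultdict(float)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            self.seconds[name] += time.perf_counter() - start


timer = StageTimer()
with timer.stage("preprocess"):
    cleaned = [x * 0.5 for x in range(1000)]  # stand-in for filter/resample
with timer.stage("window_cut_and_stack"):
    windows = [cleaned[i:i + 100] for i in range(0, 1000, 100)]
with timer.stage("forward"):
    # With a GPU backend, synchronize inside this block (for example with
    # torch.cuda.synchronize()) so queued kernels are counted.
    outputs = [sum(w) for w in windows]  # stand-in for the forward pass

for name, secs in timer.seconds.items():
    print(f"{name}: {secs * 1e3:.3f} ms")
```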

The method families (evolution of speedups)

Every row in results/matrix.jsonl falls into one of these families. They're recorded as distinct kind + variant combinations so analysis scripts can tell them apart and plot the evolution side by side.

| # | Kind | Variant suffix | What it is |
| --- | --- | --- | --- |
| 1 | `baseline` | (none) | SeisBench's `model.annotate()` on one device (CPU or CUDA). |
| 2 | `dual_gpu` | `2gpu_baseline` | SeisBench's `model.annotate()` run in parallel on 2 GPUs, stations split 50/50. |
| 3 | `single` | (none) | Lean path, 1 GPU, single-threaded preprocess. |
| 4 | `cpu_worker_sweep` | `cpuN` (device `cuda:0`) | Lean path, 1 GPU, N parallel CPU preprocess workers feeding one GPU inference actor. |
| 5 | `dual_gpu_serial` | `2gpu_serial` | Lean path, 2 GPUs, single-threaded preprocess per shard (no CPU pool). Kept for the evolution comparison; roughly equivalent to #3 on half the stations per shard. |
| 6 | `dual_gpu` | `2gpu_cpuN` | Lean path, 2 GPUs, each shard runs its own N-worker CPU preprocess pool (pipelined). |
| 7 | `cpu_worker_sweep` | `cpu_infer_poolN[_tT]` (device `cpu`) | Lean path, CPU inference, N parallel CPU preprocess workers feeding one CPU inference actor pinned to `T` BLAS threads (or auto-split when `T` is absent). |
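
An analysis script could map each row to its family number along the lines of the sketch below. It is based only on the kind, variant suffix, and device columns of the table; `method_family` is a hypothetical helper, and the assumption that a row exposes these three values as separate fields has not been checked against the actual JSONL schema.

```python
import re

# Suffix patterns taken from the "Variant suffix" column of the table.
_CPU_INFER = re.compile(r"cpu_infer_pool(\d+)(?:_t(\d+))?$")
_GPU_POOL = re.compile(r"2gpu_cpu(\d+)$")
_CPU_POOL = re.compile(r"cpu(\d+)$")


def method_family(kind, variant="", device=""):
    """Map a (kind, variant, device) triple to the family number, or None."""
    if kind == "baseline":
        return 1
    if kind == "single":
        return 3
    if kind == "dual_gpu_serial":
        return 5
    if kind == "dual_gpu":
        if variant.endswith("2gpu_baseline"):
            return 2
        if _GPU_POOL.search(variant):
            return 6
    if kind == "cpu_worker_sweep":
        # Check the CPU-inference suffix first; the device disambiguates.
        if device == "cpu" and _CPU_INFER.search(variant):
            return 7
        if device.startswith("cuda") and _CPU_POOL.search(variant):
            return 4
    return None  # combination not listed in the table


print(method_family("cpu_worker_sweep", "cpu16", "cuda:0"))  # 4
print(method_family("dual_gpu", "2gpu_cpu8"))  # 6
```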
