OmniUnet

Code associated with the article: "OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery".

Author: R. Castilla Arquillo

Supervisor: Carlos J. Pérez del Pulgar

Contact info: raulcastar@uni.lu

Links

paper: IEEE Xplore
preprint: arXiv
dataset: Zenodo

Citation

If this work was helpful for your research, please cite it as follows:

@INPROCEEDINGS{castilla2025omniunet,
  author={Castilla-Arquillo, R. and Pérez-Del-Pulgar, C. J. and Gerdes, L. and Garcia-Cerezo, A. and Olivares-Mendez, M.},
  booktitle={2025 International Conference on Space Robotics (iSpaRo)}, 
  title={OmniUnet: A Multimodal Network for Unstructured Terrain Segmentation on Planetary Rovers Using RGB, Depth, and Thermal Imagery}, 
  year={2025},
  volume={},
  number={},
  pages={161-167},
  keywords={Training;Space vehicles;Navigation;Semantic segmentation;Soil;Transformers;Three-dimensional printing;Robot sensing systems;Software;Safety},
  doi={10.1109/iSpaRo66239.2025.11436158}}

System information

This repository contains the code for training and executing OmniUNet, a multimodal neural network based on transformers designed for semantic segmentation of images that combine color, depth, and thermal data. OmniUNet enables robust perception in complex environments by leveraging complementary sensor modalities. It is especially suited for robotics applications.

A diagram of the OmniUNet architecture is shown below:

Docker configuration

An Ubuntu host with an NVIDIA GPU is required to run the files in this repo, since the network is trained on the GPU. First, install Docker Engine:

sudo apt-get update

sudo apt-get install \
    apt-transport-https \
    ca-certificates \
    curl \
    gnupg-agent \
    software-properties-common
    
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -

sudo apt-key fingerprint 0EBFCD88

sudo add-apt-repository \
   "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
   $(lsb_release -cs) \
   stable"
   
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io

Next, install the NVIDIA drivers and the packages that expose the GPU to Docker containers:

sudo ubuntu-drivers autoinstall
sudo apt-get install -y nvidia-docker2 nvidia-container-runtime
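Before building the image, it can help to confirm that the tools installed above are visible on the host. The following is a minimal sketch and not part of this repository: it only checks that the `docker` and `nvidia-smi` executables are on the PATH, and the function name `missing_tools` is hypothetical.

```python
import shutil

def missing_tools(tools=("docker", "nvidia-smi"), which=shutil.which):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [name for name in tools if which(name) is None]

# Hypothetical usage: report anything that still needs to be installed.
missing = missing_tools()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("docker and nvidia-smi are both available")
```

An empty result only shows that the executables exist. It does not prove that containers can reach the GPU.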

Then build the Docker image for this project:

docker build . -f ./dockerfile/omniunet.dockerfile -t omniunet

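Start a detached container from the image. Replace /home/pc/Escritorio/omnivore_orig_tests with the host directory you want to mount at /home/omnivore. All options must come before the image name, otherwise Docker passes them to the container as its command:

xhost +local:docker  # Let Docker containers use the host display

docker run -d -it --rm --gpus all \
  --ipc=host \
  -e DISPLAY=$DISPLAY \
  -v /home/pc/Escritorio/omnivore_orig_tests:/home/omnivore \
  -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
  omniunet \
  /bin/bash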

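Then open a shell inside the running container, replacing $ID$ with the container ID (listed by docker ps):

docker exec -it $ID$ /bin/bash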
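The host path in the docker run command is specific to the author's machine. As a minimal sketch, the same command can be assembled for another host directory. The function name `docker_run_command` and the example path are hypothetical. The code only builds the argument list and does not start a container.

```python
import os
import shlex

def docker_run_command(host_dir, image="omniunet", display=None):
    """Build a `docker run` argument list with every option before the image name."""
    if display is None:
        display = os.environ.get("DISPLAY", ":0")  # assumed fallback display
    return [
        "docker", "run", "-d", "-it", "--rm",
        "--gpus", "all", "--ipc=host",
        "-e", f"DISPLAY={display}",
        "-v", f"{host_dir}:/home/omnivore",
        "-v", "/tmp/.X11-unix:/tmp/.X11-unix:rw",
        image, "/bin/bash",
    ]

# Hypothetical host directory; print the command so it can be reviewed first.
print(shlex.join(docker_run_command("/path/to/your/workdir", display=":0")))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run` later without shell quoting problems.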

Commands

Train mode

The train.py script trains the network. It accepts the following arguments:

d: dataset to train on; see data_loading.py for the available options
b: batch size
e: number of epochs; a .pth file is saved at the end of every epoch so that progress is not lost if training is interrupted
l: learning rate
f: .pth file to load as pretrained weights; if omitted, training starts from scratch
s: downscaling factor applied to the images
v: percentage of the dataset images used for validation
c: number of classes in the dataset (required)
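The flags listed above can be pictured as a small argparse definition. This is a hypothetical sketch, not the parser used in train.py: the `dest` names and all default values are assumptions, and only the short flags and their meanings come from the list.

```python
import argparse

def build_parser():
    """Illustrative parser mirroring the documented short flags (defaults are assumed)."""
    p = argparse.ArgumentParser(description="Training options (illustrative sketch)")
    p.add_argument("-d", dest="dataset", required=True, help="dataset to train on")
    p.add_argument("-b", dest="batch_size", type=int, default=1, help="batch size")
    p.add_argument("-e", dest="epochs", type=int, default=1, help="number of epochs")
    p.add_argument("-l", dest="lr", type=float, default=1e-5, help="learning rate")
    p.add_argument("-f", dest="load", default=None, help=".pth file to start from")
    p.add_argument("-s", dest="scale", type=float, default=1.0, help="downscaling factor")
    p.add_argument("-v", dest="val_percent", type=float, default=10.0,
                   help="percentage of images used for validation")
    p.add_argument("-c", dest="classes", type=int, required=True, help="number of classes")
    return p

# argparse accepts the `-x=value` form used in this README's commands.
args = build_parser().parse_args(["-d=rgbdt", "-c=6", "-b=16", "-e=30", "-l=2e-5"])
print(args.dataset, args.classes, args.batch_size, args.epochs, args.lr)
```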

Example:

python3 train.py -u=1 -m=5 -d=rgbdt -c=6 -b=16 -e=30 -l=2e-5
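To get a feel for how the batch size and the validation percentage interact, the sketch below computes the split sizes and the number of batches per epoch. The dataset size of 1000 images and the 10% validation share are made-up values, and the rounding rule is an assumption. train.py may split the data differently.

```python
import math

def split_sizes(n_images, val_percent, batch_size):
    """Return (n_train, n_val, steps_per_epoch) for a percentage-based split."""
    if not 0 <= val_percent < 100:
        raise ValueError("val_percent must be in [0, 100)")
    n_val = int(n_images * val_percent / 100)  # assumed: round down
    n_train = n_images - n_val
    steps = math.ceil(n_train / batch_size)    # the last batch may be smaller
    return n_train, n_val, steps

# Hypothetical dataset of 1000 images with -b=16 as in the example.
print(split_sizes(n_images=1000, val_percent=10, batch_size=16))
```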

To resume training from a saved .pth file:

python3 train.py -d=rgbdt -u=1 -c=6 -b=16 -e=30 -l=2e-5 -f=path_to_your/file.pth
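Since a .pth file is written after every epoch, resuming usually means picking the most recent one. The sketch below selects the file with the highest epoch number in a folder. The naming scheme `checkpoint_epoch<N>.pth` is an assumption made for illustration and should be adapted to the file names train.py actually produces.

```python
import re
from pathlib import Path

# Assumed naming scheme; adjust the pattern to match the files train.py writes.
CKPT_PATTERN = re.compile(r"checkpoint_epoch(\d+)\.pth$")

def latest_checkpoint(folder):
    """Return the checkpoint path with the highest epoch number, or None."""
    best_epoch, best_path = -1, None
    for path in Path(folder).glob("*.pth"):
        match = CKPT_PATTERN.search(path.name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_path = int(match.group(1)), path
    return best_path
```

The returned path can then be passed to train.py through the f argument.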

Predict mode

m: Number of max channels architecture
i: RGB input
p: Depth input
T: Thermal input
d: Dataset
c: Number of classes in the dataset

Predict

python3 predict.py -m=path_to_your_weights.pth -d=rgbdt -c=6 -i=path_to_rgb_img.png  -p=path_to_depth_img.csv  -T=path_to_thermal_img.csv
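The depth and thermal inputs in the command above are CSV files, so a quick check of their shape and value range can catch a wrong file before prediction. The sketch below uses only the standard library. It assumes comma-separated values with one image row per line and no header. The helper names are hypothetical.

```python
import csv

def read_csv_image(path, delimiter=","):
    """Read a CSV file into a list of rows of floats and check it is rectangular."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        rows = [[float(value) for value in row] for row in reader if row]
    if not rows or any(len(row) != len(rows[0]) for row in rows):
        raise ValueError(f"{path} is empty or not rectangular")
    return rows

def summarize(rows):
    """Return (height, width, minimum, maximum) of a 2D list of floats."""
    flat = [value for row in rows for value in row]
    return len(rows), len(rows[0]), min(flat), max(flat)
```

Comparing the height and width of the depth and thermal files against the RGB image is a cheap way to spot mismatched inputs.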

Single-image metrics mode

python3 predict_metrics.py -m=path_to_your_weights.pth -d=rgbdt -c=6 -i=path_to_rgb_img.png -p=path_to_depth_img.csv  -T=path_to_thermal_img.csv -g=path_to_groundtruth_mask.png
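Which metrics predict_metrics.py reports is not stated here. As an illustration of comparing a predicted mask with a ground-truth mask, the sketch below computes two common segmentation metrics, pixel accuracy and per-class intersection over union (IoU), on small masks of class indices. The function names and the toy masks are hypothetical.

```python
def _pairs(pred, truth):
    """Flatten two equally sized 2D masks into (predicted, true) label pairs."""
    return [(p, t) for pred_row, true_row in zip(pred, truth)
            for p, t in zip(pred_row, true_row)]

def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted class equals the ground-truth class."""
    pairs = _pairs(pred, truth)
    return sum(p == t for p, t in pairs) / len(pairs)

def per_class_iou(pred, truth, n_classes):
    """IoU for each class; None for classes absent from both masks."""
    pairs = _pairs(pred, truth)
    ious = []
    for c in range(n_classes):
        inter = sum(p == c and t == c for p, t in pairs)
        union = sum(p == c or t == c for p, t in pairs)
        ious.append(inter / union if union else None)
    return ious

# Toy 2x3 masks with three classes.
pred = [[0, 0, 1], [1, 2, 2]]
truth = [[0, 1, 1], [1, 2, 0]]
print(pixel_accuracy(pred, truth), per_class_iou(pred, truth, n_classes=3))
```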

Multi-image metrics mode

python3 predict_multi_metrics.py -m=path_to_your_weights.pth -d=rgbdt -c=6 -i=path_to_rgb_img_folder -p=path_to_depth_img_folder -T=path_to_thermal_img_folder -g=path_to_groundtruth_mask_folder
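When evaluating whole folders, every RGB image needs a matching depth file, thermal file, and ground-truth mask. The sketch below lists which samples are complete. It assumes that matching files share the same file name stem across the four folders, which may not be how predict_multi_metrics.py pairs them. The function name is hypothetical.

```python
from pathlib import Path

def pair_by_stem(rgb_dir, depth_dir, thermal_dir, mask_dir):
    """Return (complete, incomplete): sorted stems found in all folders, and the rest."""
    stems = [
        {path.stem for path in Path(folder).iterdir() if path.is_file()}
        for folder in (rgb_dir, depth_dir, thermal_dir, mask_dir)
    ]
    complete = set.intersection(*stems)
    incomplete = set.union(*stems) - complete
    return sorted(complete), sorted(incomplete)
```

A non-empty second list points to samples that are missing at least one modality or their mask.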

Training missing modalities

The RGBDT dataset (-d=rgbdt) must be selected:

python3 train_missing_mod.py -u=1 -m=5 -d=rgbdt -b=16 -e=30 -l=2e-5 -c=6
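How train_missing_mod.py handles absent inputs is not described here. One common way to make a multimodal network tolerate missing sensors is to randomly hide modalities during training. The sketch below illustrates that general idea only and is not taken from this repository. The dictionary layout, the drop probability, and the function name are all assumptions.

```python
import random

MODALITIES = ("rgb", "depth", "thermal")

def drop_modalities(sample, drop_prob=0.3, rng=random):
    """Return a copy of `sample` with some modalities set to None, keeping at least one."""
    kept = [m for m in MODALITIES if rng.random() >= drop_prob]
    if not kept:  # never hide every modality at once
        kept = [rng.choice(MODALITIES)]
    return {m: (sample[m] if m in kept else None) for m in MODALITIES}

# Hypothetical sample; the strings stand in for image arrays.
sample = {"rgb": "rgb_image", "depth": "depth_map", "thermal": "thermal_map"}
print(drop_modalities(sample, rng=random.Random(0)))
```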
