HellexF/MoRe

MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer

Juntong Fang* · Zequn Chen* · Weiqi Zhang* · Donglin Di · Xuancheng Zhang · Chengmin Yang · Yu-Shen Liu†

CVPR 2026 Highlight

🚀 News

  • [2026.03] Refactored inference scripts and pretrained weights are released.
  • [2026.02] MoRe has been accepted by CVPR 2026!

📖 Introduction

MoRe is a feed-forward 4D reconstruction transformer that efficiently recovers dynamic 3D scenes from monocular videos.

  • Motion-Structure Disentanglement: Employs an attention-forcing strategy to separate dynamic motion from static structure.
  • Grouped Causal Attention: Captures temporal dependencies and adapts to varying token lengths for coherent geometry.
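
To make the idea of grouped causal attention more concrete, here is a minimal sketch of a group-wise causal mask. It is not taken from the MoRe code, which relies on MagiAttention. It assumes that tokens are laid out group by group (for example, one group per frame), that group sizes may differ, and that a token may attend to its own group and all earlier groups. The function name `grouped_causal_mask` and the group sizes are hypothetical.

```python
from typing import List


def grouped_causal_mask(group_sizes: List[int]) -> List[List[bool]]:
    """Return mask[q][k] == True where query token q may attend to key token k.

    Assumption (not taken from the MoRe code): a token may see every token
    in its own group and in all earlier groups, but none in later groups.
    """
    # Map each token index to the index of the group it belongs to.
    group_of = [g for g, size in enumerate(group_sizes) for _ in range(size)]
    n = len(group_of)
    return [[group_of[k] <= group_of[q] for k in range(n)] for q in range(n)]


# Three hypothetical groups with different token counts.
mask = grouped_causal_mask([2, 1, 3])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# xx....
# xx....
# xxx...
# xxxxxx
# xxxxxx
# xxxxxx
```

Unlike a token-level causal mask, the blocks on the diagonal are fully visible, so tokens within the same group attend to each other in both directions.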


🛠️ Setup

Installation

Clone the repository and create a conda environment:

# Clone the repository
git clone https://github.com/HellexF/MoRe
cd MoRe

# Create and activate environment
conda create -n more python=3.10 -y
conda activate more

# Install PyTorch and CUDA toolkit
conda install pytorch=2.9.0 torchvision=0.24.0 cudatoolkit=11.8 -c pytorch
conda install cudatoolkit-dev=11.8 -c conda-forge

# Install remaining dependencies
pip install -r requirements.txt

Required Extension: We use MagiAttention to implement grouped causal attention. Please follow its installation guide to enable stream inference.

Pretrained Model

We provide pretrained full-attention and stream models. Please download them from Google Drive and place them in the ./pretrained directory.

💻 Inference

python inference.py \
    --config_path training/config/omniworld_full.yaml \
    --ckpt_path pretrained/more_full.pt \
    --image_path ./data/example_video \
    --output_dir ./results/full_res \
    --conf_thres 50.0 \
    --predict_motion
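
Before launching the command above, it can help to confirm that the referenced files are in place. The following is a small pre-flight sketch, not part of the repository. The helper name `missing_inputs` is hypothetical, and the paths are copied from the example command.

```python
from pathlib import Path
from typing import Dict, List


def missing_inputs(paths: Dict[str, str]) -> List[str]:
    """Return the flag names whose path does not exist on disk."""
    return [flag for flag, value in paths.items() if not Path(value).exists()]


# Paths copied from the example inference command.
inference_paths = {
    "--config_path": "training/config/omniworld_full.yaml",
    "--ckpt_path": "pretrained/more_full.pt",
    "--image_path": "./data/example_video",
}

missing = missing_inputs(inference_paths)
if missing:
    print("Missing inputs for:", ", ".join(missing))
else:
    print("All inference inputs found.")
```

Run it from the repository root so that the relative paths resolve the same way they do for inference.py.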

🏋️ Training

We provide training configs for both full-attention and stream training on the Omniworld-Game dataset. Please refer to Omniworld for download instructions and place the dataset in the ./dataset directory. To train the full-attention version, run:

torchrun --nproc_per_node=$GPU_NUM training/launch.py --config omniworld_full

Similarly, to train the stream version, run:

torchrun --nproc_per_node=$GPU_NUM training/launch.py --config omniworld_stream
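
Both commands expect `$GPU_NUM` to hold the number of processes (typically one per GPU) that torchrun should start on the machine. As an illustration only, the sketch below assembles the two launch commands for a given GPU count. The helper `build_train_command` and the value `gpu_num=8` are hypothetical; the script path and config names come from the commands above.

```python
import shlex
from typing import List


def build_train_command(config: str, gpu_num: int) -> List[str]:
    """Assemble the torchrun launch command for one training config."""
    if gpu_num < 1:
        raise ValueError("gpu_num must be at least 1")
    return [
        "torchrun",
        f"--nproc_per_node={gpu_num}",
        "training/launch.py",
        "--config",
        config,
    ]


# Print the launch command for each provided config (8 GPUs is an example).
for config in ("omniworld_full", "omniworld_stream"):
    print(shlex.join(build_train_command(config, gpu_num=8)))
```

The resulting argument list can be passed to `subprocess.run` as is, or the printed line can be pasted into a shell.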

📊 Evaluation

Run the following scripts to evaluate benchmarks for camera poses and video depth:

# Camera Pose Evaluation
bash eval/relpose/run.sh

# Video Depth Evaluation
bash eval/video_depth/run.sh

📑 Acknowledgements

This project is built upon VGGT and MagiAttention. We thank the authors for their great repositories.

✒️ Citation

If you find our code or paper useful, please consider citing:

@inproceedings{fang2026moremotionawarefeedforward4d,
      title={MoRe: Motion-aware Feed-forward 4D Reconstruction Transformer}, 
      author={Juntong Fang and Zequn Chen and Weiqi Zhang and Donglin Di and Xuancheng Zhang and Chengmin Yang and Yu-Shen Liu},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      year={2026}
}
