Authors: Kaiyu Li*, Zixuan Jiang*, Xiangyong Cao✉, Jiayu Wang, Yuchen Xiao, Jing Yao, Chen Wu, Deyu Meng, Zhi Wang
DescribeEarth is a remote-sensing MLLM framework for object-level fine-grained image description, rather than only image-level captioning.
It targets practical Earth observation scenarios such as environmental monitoring, urban planning, and disaster analysis.
We introduce the Geo-DLC task and provide a complete research stack:
- DE-Dataset: a large-scale dataset with 25 categories and 261,806 object instances.
- DE-Benchmark: an LLM-assisted QA benchmark for factuality, richness, and language quality.
- DescribeEarth model: a domain-aware MLLM with scale-adaptive and fusion strategies for remote-sensing imagery.
Released on 2025-10-01: paper, code, dataset, benchmark, and checkpoints.
Follow the environment setup instructions in environments/README.md.
Download pretrained checkpoints from Hugging Face, then place them under:
weights/
└── DescribeEarth_xxx
python scripts/inference.py \
--model_dir <model_dir> \
--image <image_path> \
--bbox <4-point bbox or 2-point bbox>

Example:
python scripts/inference.py \
--model_dir ./weights/DescribeEarth_0930 \
--image ./example1/image.jpg \
--bbox 36.0 332.0 311.0 325.0 317.0 584.0 42.0 591.0

Example output:
The object of category baseball_field within the specified polygon bounding box is a well-defined outdoor sports facility designed for baseball...
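The `--bbox` argument accepts either a 2-point or a 4-point box. As a rough illustration of how the two forms relate, the sketch below normalizes a flat list of coordinates into four (x, y) vertices. The helper name `bbox_to_polygon` is hypothetical, and reading the 2-point form as opposite corners (x1, y1, x2, y2) is an assumption; scripts/inference.py may interpret it differently.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bbox_to_polygon(values: Sequence[float]) -> List[Point]:
    """Turn a flat coordinate list into four (x, y) vertices.

    Assumption: 4 values mean two opposite corners (x1, y1, x2, y2);
    8 values mean four polygon vertices given in order.
    """
    if len(values) == 4:
        x1, y1, x2, y2 = values
        # Expand the axis-aligned box into its four corners.
        return [(x1, y1), (x2, y1), (x2, y2), (x1, y2)]
    if len(values) == 8:
        # Pair consecutive values as (x, y) vertices.
        return [(values[i], values[i + 1]) for i in range(0, 8, 2)]
    raise ValueError(f"expected 4 or 8 coordinates, got {len(values)}")


if __name__ == "__main__":
    # The 4-point box from the example command above.
    print(bbox_to_polygon([36.0, 332.0, 311.0, 325.0, 317.0, 584.0, 42.0, 591.0]))
```

Rejecting any other coordinate count early gives a clearer error than letting a malformed box reach the model.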
Download from DE-Dataset on Hugging Face.
Expected structure:
DE-Dataset
- {DIOR, DOTA}
- - image
- - description

Format data for training:
bash scripts/format_data.sh

Download from DE-Benchmark on Hugging Face.
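Before running the formatting script, it may help to confirm that the downloaded DE-Dataset matches the expected structure listed earlier (DIOR and DOTA, each with image and description folders). The sketch below is one way to do that; the function name `missing_dirs` is hypothetical and is not part of the repository.

```python
from pathlib import Path
from typing import List, Sequence


def missing_dirs(
    root: str,
    sources: Sequence[str] = ("DIOR", "DOTA"),
    subdirs: Sequence[str] = ("image", "description"),
) -> List[str]:
    """Return the expected sub-directories that are absent under root."""
    base = Path(root)
    return [
        f"{source}/{sub}"
        for source in sources
        for sub in subdirs
        if not (base / source / sub).is_dir()
    ]


if __name__ == "__main__":
    # Assumes the dataset was downloaded to ./DE-Dataset.
    missing = missing_dirs("DE-Dataset")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```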
Following the Qwen2.5-VL baseline, train on DE-Dataset (or your own data) with these steps:
- Set the formatted dataset path in DescribeEarth_Qwen2.5-VL/qwen-vl-finetune/qwenvl/data/__init__.py.
- Download the merged pretrained weights from Qwen2.5-VL-3B-RC-1120.
- Run:
cd DescribeEarth_Qwen2.5-VL/qwen-vl-finetune
bash scripts/sft.sh

Use scripts/openai_valid.py to evaluate DescribeEarth or other models:
python scripts/openai_valid.py \
<path_to_QA.json> \
<path_to_image_dataset> \
-o <output_dir> \
--generator <api|local> \
--api-key <api_key> \
--model_dir <model_dir>

Then compute final scores:
python scripts/calculate_score.py

Launch the local demo app:
cd scripts
python app.py

If you find this project useful, please cite:
@article{li2025describeearth,
title={DescribeEarth: Describe Anything for Remote Sensing Images},
author={Li, Kaiyu and Jiang, Zixuan and Cao, Xiangyong and Wang, Jiayu and Xiao, Yuchen and Wu, Chen and Meng, Deyu and Wang, Zhi},
journal={arXiv preprint arXiv:2509.25654},
year={2025}
}