DescribeEarth

Describe Anything for Remote Sensing Images

Authors: Kaiyu Li*, Zixuan Jiang*, Xiangyong Cao✉, Jiayu Wang, Yuchen Xiao, Jing Yao, Chen Wu, Deyu Meng, Zhi Wang

Overview

DescribeEarth is a multimodal large language model (MLLM) framework for remote sensing that generates fine-grained, object-level descriptions instead of only image-level captions.
It targets practical Earth observation scenarios such as environmental monitoring, urban planning, and disaster analysis.

We introduce the Geo-DLC task and provide a complete research stack:

- DE-Dataset: a large-scale dataset with 25 categories and 261,806 object instances.
- DE-Benchmark: an LLM-assisted QA benchmark for factuality, richness, and language quality.
- DescribeEarth model: a domain-aware MLLM with scale-adaptive and fusion strategies for remote-sensing imagery.

News

2025-10-01: Paper, code, dataset, benchmark, and checkpoints released.

Installation

Follow the environment setup instructions in environments/README.md.

Quick Start

1) Download model weights

Download pretrained checkpoints from Hugging Face, then place them under:

weights/
└── DescribeEarth_xxx

2) Run inference

python scripts/inference.py \
  --model_dir <model_dir> \
  --image <image_path> \
  --bbox <4-point bbox or 2-point bbox>

Example:

python scripts/inference.py \
  --model_dir ./weights/DescribeEarth_0930 \
  --image ./example1/image.jpg \
  --bbox 36.0 332.0 311.0 325.0 317.0 584.0 42.0 591.0

Example output:

The object of category baseball_field within the specified polygon bounding box is a well-defined outdoor sports facility designed for baseball...
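
The two accepted `--bbox` forms can be normalized to one representation before further processing. The sketch below is illustrative and not taken from scripts/inference.py. It assumes that eight values are four (x, y) corners in the order given, and that four values are (x1, y1, x2, y2), two opposite corners of an axis-aligned box. The helper names `bbox_to_polygon` and `polygon_extent` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def bbox_to_polygon(values: Sequence[float]) -> List[Point]:
    """Turn a flat --bbox list into four (x, y) corner points.

    Assumption: 8 values are four (x, y) corners in the order given;
    4 values are (x1, y1, x2, y2), two opposite corners of an
    axis-aligned box.
    """
    if len(values) == 8:
        return [(float(values[i]), float(values[i + 1])) for i in range(0, 8, 2)]
    if len(values) == 4:
        x1, y1, x2, y2 = (float(v) for v in values)
        left, right = min(x1, x2), max(x1, x2)
        top, bottom = min(y1, y2), max(y1, y2)
        # Corners clockwise from top-left in image coordinates (y grows downward).
        return [(left, top), (right, top), (right, bottom), (left, bottom)]
    raise ValueError(f"expected 4 or 8 numbers, got {len(values)}")


def polygon_extent(polygon: Sequence[Point]) -> Tuple[float, float, float, float]:
    """Axis-aligned extent (xmin, ymin, xmax, ymax) enclosing the polygon."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    # The 4-point bbox from the example command above.
    quad = bbox_to_polygon([36.0, 332.0, 311.0, 325.0, 317.0, 584.0, 42.0, 591.0])
    print(quad)
    print(polygon_extent(quad))  # (36.0, 325.0, 317.0, 591.0)
```

Checking the extent against the image size before running inference is a cheap way to catch swapped or out-of-range coordinates.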

Data Preparation

DE-Dataset

Download the dataset from the DE-Dataset page on Hugging Face.

Expected structure:

DE-Dataset/
├── DIOR/
│   ├── image/
│   └── description/
└── DOTA/
    ├── image/
    └── description/
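
Before formatting the data, it can be worth confirming that the download matches the expected structure. The following is a minimal sketch, assuming each of DIOR and DOTA contains an image and a description folder as listed above. The root path and the function names `missing_dirs` and `count_files` are hypothetical, and nothing here is part of the repository's scripts.

```python
from pathlib import Path
from typing import Dict, List

SOURCES = ("DIOR", "DOTA")
SUBDIRS = ("image", "description")


def missing_dirs(root: Path) -> List[Path]:
    """Return the expected sub-directories that do not exist under root."""
    expected = [root / source / sub for source in SOURCES for sub in SUBDIRS]
    return [path for path in expected if not path.is_dir()]


def count_files(root: Path) -> Dict[str, int]:
    """Count regular files in each expected sub-directory (0 if missing)."""
    counts: Dict[str, int] = {}
    for source in SOURCES:
        for sub in SUBDIRS:
            folder = root / source / sub
            if folder.is_dir():
                counts[f"{source}/{sub}"] = sum(1 for p in folder.iterdir() if p.is_file())
            else:
                counts[f"{source}/{sub}"] = 0
    return counts


if __name__ == "__main__":
    root = Path("DE-Dataset")  # hypothetical location; adjust to your download
    for path in missing_dirs(root):
        print(f"missing: {path}")
    for name, n in count_files(root).items():
        print(f"{name}: {n} files")
```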

Format data for training:

bash scripts/format_data.sh

DE-Benchmark

Download the benchmark from the DE-Benchmark page on Hugging Face.

Training

Following the Qwen2.5-VL baseline, train on DE-Dataset (or your own data) with these steps:

- Set the formatted dataset path in DescribeEarth_Qwen2.5-VL/qwen-vl-finetune/qwenvl/data/__init__.py.
- Download the merged pretrained weights from Qwen2.5-VL-3B-RC-1120.
- Run the fine-tuning script:

cd DescribeEarth_Qwen2.5-VL/qwen-vl-finetune
bash scripts/sft.sh

Evaluation

Use scripts/openai_valid.py to evaluate DescribeEarth or other models:

python scripts/openai_valid.py \
  <path_to_QA.json> \
  <path_to_image_dataset> \
  -o <output_dir> \
  --generator <api|local> \
  --api-key <api_key> \
  --model_dir <model_dir>

Then compute final scores:

python scripts/calculate_score.py
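
When several checkpoints are evaluated in a row, the scripts/openai_valid.py command can be assembled programmatically. The sketch below uses only the flags shown above and makes no claim about which of `--api-key` and `--model_dir` the script requires for each generator; it passes whichever are supplied. The function name `build_eval_command` and the example paths are hypothetical.

```python
import shlex
from typing import List, Optional


def build_eval_command(
    qa_json: str,
    image_root: str,
    output_dir: str,
    generator: str,
    api_key: Optional[str] = None,
    model_dir: Optional[str] = None,
) -> List[str]:
    """Build the argument list for scripts/openai_valid.py."""
    if generator not in ("api", "local"):
        raise ValueError("generator must be 'api' or 'local'")
    cmd = [
        "python", "scripts/openai_valid.py",
        qa_json, image_root,
        "-o", output_dir,
        "--generator", generator,
    ]
    # Optional flags are appended only when a value is supplied.
    if api_key is not None:
        cmd += ["--api-key", api_key]
    if model_dir is not None:
        cmd += ["--model_dir", model_dir]
    return cmd


if __name__ == "__main__":
    cmd = build_eval_command(
        "QA.json", "images", "results", "local",
        model_dir="./weights/DescribeEarth_0930",
    )
    # Print a copy-pasteable shell line; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```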

Demo

Launch the local demo app:

cd scripts
python app.py

Citation

If you find this project useful, please cite:

@article{li2025describeearth,
  title={DescribeEarth: Describe Anything for Remote Sensing Images},
  author={Li, Kaiyu and Jiang, Zixuan and Cao, Xiangyong and Wang, Jiayu and Xiao, Yuchen and Wu, Chen and Meng, Deyu and Wang, Zhi},
  journal={arXiv preprint arXiv:2509.25654},
  year={2025}
}
