AlainKwishima/Aurora

[Animated figures: high-resolution (0.1°) 2 m temperature predictions; nitrogen dioxide predictions; ocean wave direction predictions; tropical cyclone track predictions]

🌍 Aurora: A Foundation Model for the Earth System

State-of-the-art AI for weather, air pollution, and ocean wave forecasting

🌟 Overview

Aurora is a cutting-edge foundation model capable of predicting atmospheric variables, air pollution, and ocean waves with unprecedented accuracy. Trained on vast amounts of Earth system data, Aurora can be adapted to specialized forecasting tasks with minimal additional training.

🎯 Key Capabilities

  • 🌤️ Weather Forecasting: High-resolution weather predictions at 0.1° and 0.25° resolutions
  • 💨 Air Pollution: Predicts PM1, PM2.5, PM10, NO₂, SO₂, O₃ and other pollutants
  • 🌊 Ocean Waves: Forecasts significant wave height, direction, and period
  • 🌀 Tropical Cyclones: Tracks hurricane paths and intensities
  • ⚡ Real-time: Fast inference for operational forecasting
  • 🌍 Global Coverage: Works anywhere on Earth with consistent performance

🏆 Performance Highlights

  • Superior Accuracy: Outperforms traditional numerical weather prediction in many scenarios
  • Multi-Resolution: Supports both 0.1° (1801×3600) and 0.25° (721×1440) global grids
  • Fast Inference: Generate global forecasts in minutes, not hours
  • Energy Efficient: 1000x less energy consumption than traditional methods
  • Foundation Model: Pre-trained on diverse datasets, adaptable to new regions with minimal fine-tuning
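
The grid sizes quoted above follow directly from the resolution, assuming a regular latitude-longitude grid that includes both poles and omits the duplicate 360° meridian. A minimal sketch of that arithmetic (the helper name `grid_shape` is illustrative and not part of the Aurora package):

```python
def grid_shape(resolution_deg: float) -> tuple[int, int]:
    """Return (n_lat, n_lon) for a regular global grid at the given spacing.

    Assumes latitudes run from 90 to -90 inclusive (both poles) and
    longitudes run from 0 up to, but not including, 360.
    """
    n_lat = round(180 / resolution_deg) + 1  # +1 because both poles are grid points
    n_lon = round(360 / resolution_deg)      # 360 wraps around to 0, so no +1
    return n_lat, n_lon


print(grid_shape(0.25))  # (721, 1440)
print(grid_shape(0.1))   # (1801, 3600)
```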

Cite us as follows:

@article{bodnar2025aurora,
    title = {A Foundation Model for the Earth System},
    author = {Cristian Bodnar and Wessel P. Bruinsma and Ana Lucic and Megan Stanley and Anna Allen and Johannes Brandstetter and Patrick Garvan and Maik Riechert and Jonathan A. Weyn and Haiyu Dong and Jayesh K. Gupta and Kit Thambiratnam and Alexander T. Archibald and Chun-Chieh Wu and Elizabeth Heider and Max Welling and Richard E. Turner and Paris Perdikaris},
    journal = {Nature},
    year = {2025},
    month = {May},
    day = {21},
    issn = {1476-4687},
    doi = {10.1038/s41586-025-09005-y},
    url = {https://doi.org/10.1038/s41586-025-09005-y},
}

Please email AIWeatherClimate@microsoft.com if you are interested in using Aurora for commercial applications. For research-related questions or technical support with the code here, please open an issue or reach out to the authors of the paper.

What is Aurora?

Aurora is a machine learning model that can predict atmospheric variables, such as temperature. It is a foundation model, which means that it was first generally trained on a lot of data, and then can be adapted to specialised atmospheric forecasting tasks with relatively little data. We provide four such specialised versions: one for medium-resolution weather prediction, one for high-resolution weather prediction, one for air pollution prediction, and one for ocean wave prediction.

🚀 Quick Start

📦 Installation

Choose your preferred installation method:

With pip (recommended):

pip install microsoft-aurora

With conda/mamba:

mamba install microsoft-aurora -c conda-forge

From source (for development):

git clone https://github.com/microsoft/aurora.git
cd aurora
make install  # Installs with dev dependencies and pre-commit hooks

🔬 Your First Prediction

The example below runs the small pretrained model on random data, so it needs no dataset:

from datetime import datetime
import torch
from aurora import AuroraSmallPretrained, Batch, Metadata

# Initialize model (downloads ~500MB checkpoint)
model = AuroraSmallPretrained()
model.load_checkpoint()

# Create sample data batch
batch = Batch(
    surf_vars={k: torch.randn(1, 2, 17, 32) for k in ("2t", "10u", "10v", "msl")},
    static_vars={k: torch.randn(17, 32) for k in ("lsm", "z", "slt")},
    atmos_vars={k: torch.randn(1, 2, 4, 17, 32) for k in ("z", "u", "v", "t", "q")},
    metadata=Metadata(
        lat=torch.linspace(90, -90, 17),
        lon=torch.linspace(0, 360, 32 + 1)[:-1],
        time=(datetime(2020, 6, 1, 12, 0),),
        atmos_levels=(100, 250, 500, 850),
    ),
)

# Generate a prediction (evaluation mode, no gradient tracking)
model.eval()
with torch.inference_mode():
    prediction = model(batch)
print(f"Temperature prediction shape: {prediction.surf_vars['2t'].shape}")
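
The `lat` and `lon` arguments in the example above describe the grid axes: latitudes run from 90 down to -90 with both poles included, while longitudes cover 0 to 360 with the endpoint dropped, because 360° is the same meridian as 0°. The sketch below reproduces that construction in plain Python so the values can be inspected without PyTorch. The helpers `linspace` and `make_axes` are illustrative stand-ins, not Aurora functions.

```python
def linspace(start: float, stop: float, num: int) -> list[float]:
    """Evenly spaced values from start to stop inclusive (mirrors torch.linspace)."""
    if num == 1:
        return [float(start)]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def make_axes(n_lat: int, n_lon: int) -> tuple[list[float], list[float]]:
    """Build latitude and longitude axes in the layout used by the example."""
    lat = linspace(90.0, -90.0, n_lat)          # north to south, poles included
    lon = linspace(0.0, 360.0, n_lon + 1)[:-1]  # drop 360, which duplicates 0
    return lat, lon


lat, lon = make_axes(17, 32)
print(len(lat), lat[0], lat[-1])  # 17 90.0 -90.0
print(len(lon), lon[0], lon[-1])  # 32 0.0 348.75
```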

🎯 Real-World Examples

Weather Forecasting with ERA5:

from aurora import AuroraPretrained, Batch

# Use the full model for production
model = AuroraPretrained()
model.load_checkpoint()
model.eval()

# Your ERA5 data processing here...
# batch = process_era5_data(...)
# prediction = model(batch)

High-Resolution Weather (0.1°):

from aurora import AuroraHighRes

model = AuroraHighRes()
model.load_checkpoint()
# Best for IFS HRES analysis data at 0.1° resolution

Air Pollution Forecasting:

from aurora import AuroraAirPollution

model = AuroraAirPollution()
model.load_checkpoint()
# Predicts PM1, PM2.5, PM10, NO₂, SO₂, O₃

Ocean Wave Prediction:

from aurora import AuroraWave

model = AuroraWave()
model.load_checkpoint()
# Forecasts wave height, direction, and period

🎯 Available Models

Aurora comes in several specialized variants optimized for different forecasting tasks:

| Model | Resolution | Best For | Variables | Use Case |
|---|---|---|---|---|
| Aurora Pretrained | 0.25° | General weather, ERA5 | 2t, 10u, 10v, msl, t, u, v, q, z | Research, custom datasets |
| Aurora HighRes | 0.1° | High-res weather, IFS HRES | 2t, 10u, 10v, msl, t, u, v, q, z | Operational forecasting |
| Aurora Air Pollution | 0.4° | Air quality, CAMS | PM1, PM2.5, PM10, NO₂, SO₂, O₃, CO | Environmental monitoring |
| Aurora Wave | 0.25° | Ocean waves, HRES-WAM | swh, mwd, mwp, shww, mdww | Maritime safety |
| Aurora Small | 0.25° | Development, testing | Basic weather vars | Debugging, prototyping |

🎯 Choosing the Right Model

  • 🌍 General Weather: Start with AuroraPretrained for maximum flexibility
  • 📍 High-Resolution: Use AuroraHighRes for detailed regional forecasts
  • 💨 Air Quality: Choose AuroraAirPollution for pollution monitoring
  • 🌊 Maritime: Select AuroraWave for ocean and coastal applications
  • 🔬 Research: All models support fine-tuning on your specific datasets

⚡ Advanced Features

🔄 Autoregressive Forecasting

Generate multi-step forecasts automatically:

from aurora import rollout

# Generate a 10-day forecast: 40 autoregressive steps of 6 hours each
predictions = [pred.to("cpu") for pred in rollout(model, batch, steps=40)]
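
The value `steps=40` corresponds to 10 days only if each model step advances the state by 6 hours (40 × 6 h = 240 h). The sketch below makes that conversion explicit and lists the valid time of each prediction. The 6-hour default and both helper names are assumptions for illustration; check the step length of the model variant you use.

```python
from datetime import datetime, timedelta


def rollout_steps(horizon_days: float, step_hours: int = 6) -> int:
    """Number of autoregressive steps needed to cover the forecast horizon.

    step_hours=6 is an assumption, consistent with steps=40 covering 10 days.
    """
    total_hours = horizon_days * 24
    if total_hours % step_hours != 0:
        raise ValueError("horizon must be a whole number of model steps")
    return int(total_hours // step_hours)


def valid_times(start: datetime, steps: int, step_hours: int = 6) -> list[datetime]:
    """Valid time of each prediction, one entry per rollout step."""
    return [start + timedelta(hours=step_hours * (i + 1)) for i in range(steps)]


steps = rollout_steps(10)
times = valid_times(datetime(2020, 6, 1, 12, 0), steps)
print(steps)      # 40
print(times[0])   # 2020-06-01 18:00:00
print(times[-1])  # 2020-06-11 12:00:00
```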

🔧 Fine-tuning & Customization

Adapt Aurora to your specific domain:

# Load pretrained weights
model = AuroraPretrained()
model.load_checkpoint()

# Fine-tune on your data
# ... your training loop here ...

📊 Batch Processing

Process multiple forecasts efficiently:

# Process multiple regions/time steps
batch = Batch(
    surf_vars={...},    # Shape: (batch_size, time_steps, lat, lon)
    static_vars={...},  # Shape: (lat, lon)
    atmos_vars={...},   # Shape: (batch_size, time_steps, levels, lat, lon)
    metadata=Metadata(...),
)
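
Shape mismatches are easier to diagnose before the batch reaches the model. The sketch below checks plain shape tuples against the conventions stated in the comments above, with static variables taken as (lat, lon) as in the Quick Start example. `check_batch_shapes` is a hypothetical helper, not part of Aurora; with PyTorch tensors you would pass `tuple(t.shape)` for each variable.

```python
Shape = tuple[int, ...]


def check_batch_shapes(
    surf: dict[str, Shape],
    static: dict[str, Shape],
    atmos: dict[str, Shape],
    n_levels: int,
    n_lat: int,
    n_lon: int,
) -> tuple[int, int]:
    """Validate variable shapes and return the shared (batch_size, time_steps).

    Raises ValueError on the first mismatch found.
    """
    leading: set[Shape] = set()
    for name, shape in surf.items():
        if len(shape) != 4 or shape[2:] != (n_lat, n_lon):
            raise ValueError(f"surface variable {name!r} has shape {shape}")
        leading.add(shape[:2])
    for name, shape in atmos.items():
        if len(shape) != 5 or shape[2:] != (n_levels, n_lat, n_lon):
            raise ValueError(f"atmospheric variable {name!r} has shape {shape}")
        leading.add(shape[:2])
    for name, shape in static.items():
        if shape != (n_lat, n_lon):
            raise ValueError(f"static variable {name!r} has shape {shape}")
    if len(leading) != 1:
        raise ValueError(f"inconsistent (batch, time) dimensions: {sorted(leading)}")
    return leading.pop()


# Same shapes as the Quick Start example.
surf = {k: (1, 2, 17, 32) for k in ("2t", "10u", "10v", "msl")}
static = {k: (17, 32) for k in ("lsm", "z", "slt")}
atmos = {k: (1, 2, 4, 17, 32) for k in ("z", "u", "v", "t", "q")}
print(check_batch_shapes(surf, static, atmos, n_levels=4, n_lat=17, n_lon=32))  # (1, 2)
```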

🚀 GPU Acceleration

Optimized for NVIDIA GPUs:

import torch

# Move the model and the batch to the GPU, then run without gradient tracking
model = model.to("cuda")
with torch.inference_mode():
    prediction = model(batch.to("cuda"))

📈 System Requirements

| Model | GPU Memory | CPU Memory | Inference Time |
|---|---|---|---|
| Aurora Small | 8GB | 16GB | ~30 seconds |
| Aurora Pretrained | 40GB | 64GB | ~2 minutes |
| Aurora HighRes | 60GB | 128GB | ~5 minutes |

Benchmarks on NVIDIA A100, global 0.25° resolution

🤝 Contributing

We welcome contributions from the community! Please see our Contributing Guide for details.

Quick Setup for Contributors

# Clone and install in development mode
git clone https://github.com/microsoft/aurora.git
cd aurora
make install  # Installs dev dependencies + pre-commit hooks

# Run tests
make test

# Build documentation  
make docs

📚 Resources

📖 Documentation & Examples

📊 Model Weights & Data

🔬 Research & Publications

⚖️ License & Legal

📄 License

This project is licensed under the MIT License - see the LICENSE.txt file for details.

🔒 Security

For security concerns, please see SECURITY.md and follow responsible disclosure practices.

🏢 Trademarks

This project may contain Microsoft trademarks or logos. Authorized use must follow Microsoft's Trademark & Brand Guidelines.

🏛️ Responsible AI

🎯 Our Commitment

Microsoft is committed to responsible AI development. This project follows our AI principles of fairness, reliability, privacy, security, inclusiveness, transparency, and accountability.

⚠️ Important Limitations

🔬 Research Use: This code is intended for research and academic purposes. Commercial applications require separate licensing - contact AIWeatherClimate@microsoft.com.

🎯 Accuracy: Aurora's forecasts carry no accuracy guarantee. Predictions should not be used directly for critical decisions without proper validation and expert review.

📊 Training Data: Models inherit potential biases from training data (ERA5, CMIP6, HRES, CAMS, etc.). Performance may vary for extreme events or unprecedented conditions.

🔧 Operational Use: Additional verification, post-processing, and expert analysis are essential before operational deployment.

📈 Model Evaluations

Aurora underwent extensive evaluation on held-out test data, including:

  • ✅ Standard accuracy metrics (RMSE, ACC, CRPS)
  • ✅ Extreme weather events (heatwaves, cold snaps, storms)
  • ✅ Rare events (Storm Ciarán 2023, unusual patterns)
  • ✅ Regional performance across different climate zones

See the paper for complete evaluation details.

❓ FAQ

💻 System Requirements

  • Minimum: 16GB RAM, 8GB VRAM (for Aurora Small)
  • Recommended: 64GB RAM, 40GB VRAM (for full Aurora)
  • Optimal: 128GB RAM, 80GB VRAM (for high-resolution models)

⚡ Performance Tips

  • Use GPU acceleration for 10-100x speedup
  • Batch multiple forecasts for better throughput
  • Move predictions to CPU to manage GPU memory
  • Use torch.inference_mode() for faster inference

🔧 Troubleshooting

  • Out of Memory: Reduce batch size or use Aurora Small
  • Slow Inference: Enable GPU acceleration and optimize data loading
  • Poor Predictions: Ensure correct input data format and normalization
  • Import Errors: Check PyTorch installation and CUDA compatibility

🚀 Getting Help

🌟 Star this repo if you find Aurora useful!

📊 Browse Examples to see Aurora in action
