AMDResearch/ai4science-studio

AI4Science Studio

Agent-first recipes for open AI-for-science models. Tell your AI coding agent what to run — it reads the metadata, picks the scripts, and handles the rest. Works on AMD Instinct accelerators or wherever you work.

What is this?

AI4Science Studio is an agent-first collection of runnable recipes for open AI-for-science models. Point an AI coding agent — Cursor, Claude Code, or similar — at this repo and tell it what you want: run a weather forecast, generate novel crystals, fold a protein, or train a molecular design agent. The agent reads machine-readable metadata (model.yaml), picks the right scripts, and handles the rest.

Every model comes from Hugging Face or a leading research group and ships as a self-contained recipe folder: a model card, ready-to-run examples, container setup, and optional AMD/ROCm tuning notes validated on real hardware. You can also use the repo without an agent; see Manual usage (without an agent) below.

Science domains

🌍 Earth Science

[Figure: ORBIT-2 climate downscaling, input vs. prediction]

Weather forecasting, climate modeling, and Earth-system ML. ORBIT-2 — a scalable vision foundation model for global weather and climate downscaling, developed in collaboration with ORNL and validated on AMD Instinct. Also: StormCast, NeuralGCM, ArchesWeather, PanguWeather, GenCast, Aurora.

Browse earth science recipes →

🔬 Material Science

[Figure: HydraGNN architecture overview]

Crystal structure generation, property prediction, and simulation surrogates. HydraGNN — a multi-task graph neural network for materials property prediction, developed at ORNL. Also: MatterGen.

Browse material science recipes →

🧬 Protein Folding

[Figure: AlphaFold3 protein structure prediction]

Structure prediction, folding, and protein language models.

Browse protein folding recipes →

🏥 Healthcare & Life Sciences

[Figure: DNA double helix]

Molecular design, medical imaging segmentation, and healthcare-adjacent ML. Models: REINVENT4, SemlaFlow, SwinUNETR, GP-MoLFormer.

Browse healthcare & life sciences recipes →

Content is for research and engineering only—not medical advice or clinical use.

⚛️ Physics Simulation

Surrogate models and neural operators for continuum dynamics, fluid mechanics, turbulence, and multiphysics systems. MATEY — ORNL multiscale adaptive transformer for spatiotemporal physical systems, validated on Frontier/MI250X. Also: Walrus — Polymathic AI 1.3B cross-domain continuum dynamics foundation model.

Browse physics simulation recipes →


Using this repo with an agent

Here's what agents can do for you out of the box.

Run any model

Just describe what you want:

"Run StormCast ensemble inference on MI300X with 4 members for 12 hours starting 2025-08-09T12"

Or use a slash command (Claude Code):

/run-stormcast SC_SIF=/path/to/sif
/run-mattergen unconditional generation
/run-gpmolformer scaffold c1ccccc1

All 15 models have a /run-* command: stormcast, orbit2, archesweather, aurora, gencast, neuralgcm, panguweather, mattergen, hydragnn, gpmolformer, swinunetr, semlaflow, reinvent4, matey, walrus.

Discover and compare models

/list-models                          # show all models
/list-models earth_science            # filter by domain
/list-models finetune                 # filter by task
/audit-models                         # readiness audit for all models

Or just ask:

"What models in this repo support fine-tuning?" "Which models are MIT licensed?" "Compare StormCast and ORBIT-2"

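The filters above (by domain, by task, by license) amount to a lookup over the model index. The following is a minimal Python sketch of that idea, not the repo's implementation. The entry keys (`slug`, `domain`, `tasks`, `license`), the function name `list_models`, and every entry in `INDEX` are hypothetical placeholders; they say nothing about the real models or the actual `models.yaml` schema.

```python
from __future__ import annotations

# Placeholder index entries. Slugs, tasks, and licenses are invented for
# illustration and do not describe the real models in this repo.
INDEX = [
    {"slug": "model-a", "domain": "earth_science", "tasks": ["inference"], "license": "MIT"},
    {"slug": "model-b", "domain": "earth_science", "tasks": ["inference", "finetune"], "license": "Apache-2.0"},
    {"slug": "model-c", "domain": "material_science", "tasks": ["finetune"], "license": "MIT"},
]


def list_models(index, domain=None, task=None, license_id=None) -> list[str]:
    """Return the slugs of entries that match every filter that is not None."""
    matches = []
    for entry in index:
        if domain is not None and entry["domain"] != domain:
            continue
        if task is not None and task not in entry["tasks"]:
            continue
        if license_id is not None and entry["license"] != license_id:
            continue
        matches.append(entry["slug"])
    return matches
```

With the placeholder data, `list_models(INDEX, domain="earth_science", task="finetune")` returns `["model-b"]`. Filters combine with AND, so passing no filters returns every slug.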
Add and audit models

/add-model microsoft/aurora → earth_science
/add-recipe StormCast ensemble inference on MI300X
/check-model NeuralGCM

Machine-readable metadata

Agents read these files to understand the repo:

| File | Purpose |
|------|---------|
| models.yaml | Index of all 15 models across 5 domains |
| <model>/model.yaml | Per-model manifest: HF id, license, recipes, env vars, hardware |
| ACKNOWLEDGEMENTS.md | Per-model attribution: upstream authors, papers, ROCm blog credits |
| .cursor/skills/ | Agent skills for Cursor (run models, discover, domain conventions) |
| .cursor/rules/ | Contextual rules that fire when editing specific file types |
| .claude/commands/ | Slash commands for Claude Code |
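To make the manifest idea concrete, here is a minimal Python sketch of what an agent might do once a per-model model.yaml has been loaded. The field names (`hf_id`, `license`, `recipes`, `env_vars`, `hardware`) and all values are assumptions chosen to mirror the categories listed above; the repo's actual schema may differ. The manifest is written as a plain dict so the sketch runs without a YAML parser.

```python
from __future__ import annotations

# Hypothetical manifest contents. Field names and values are illustrative
# assumptions, not the repo's actual model.yaml schema.
EXAMPLE_MANIFEST = {
    "hf_id": "example-org/example-model",
    "license": "MIT",
    "recipes": [{"name": "inference", "script": "examples/run_inference.sh"}],
    "env_vars": ["EXAMPLE_DATA_DIR"],
    "hardware": ["MI300X"],
}

# Assumed required keys, one per category named in the table above.
REQUIRED_FIELDS = ("hf_id", "license", "recipes", "env_vars", "hardware")


def missing_fields(manifest: dict) -> list[str]:
    """Return the required fields absent from the manifest, in REQUIRED_FIELDS order."""
    return [field for field in REQUIRED_FIELDS if field not in manifest]


def pick_recipe(manifest: dict, name: str) -> str | None:
    """Return the script path of the recipe called `name`, or None if no recipe matches."""
    for recipe in manifest.get("recipes", []):
        if recipe.get("name") == name:
            return recipe.get("script")
    return None
```

An agent following this pattern would first confirm that `missing_fields(manifest)` is empty, then resolve the requested recipe to a script with `pick_recipe`.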

Manual usage (without an agent)

  1. Browse the domain folder for the model you want.
  2. Read <domain>/models/<model>/README.md for the HF model id, license, and upstream links.
  3. Run the example scripts in <domain>/models/<model>/examples/. Each examples folder has a docker_run.sh that sets up the container automatically.
# Example: launch the StormCast container
cd earth_science/models/StormCast/examples
./docker_run.sh

No build step, no compiled code. The scripts pull public container images and model weights on first run.
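The StormCast example suggests a layout of `<domain>/models/<model>/examples/docker_run.sh`. Assuming other models follow the same layout (an inference from that one example, not something confirmed for every model), a small Python helper could locate and launch the script. The function names `docker_run_path` and `launch` are hypothetical.

```python
import subprocess
from pathlib import Path


def docker_run_path(domain: str, model: str) -> Path:
    """Build <domain>/models/<model>/examples/docker_run.sh relative to the repo root."""
    return Path(domain) / "models" / model / "examples" / "docker_run.sh"


def launch(domain: str, model: str, repo_root: str = ".") -> None:
    """Run docker_run.sh from inside its examples folder, like `cd ... && ./docker_run.sh`."""
    script = Path(repo_root) / docker_run_path(domain, model)
    # check=True raises CalledProcessError if the script exits with a non-zero status.
    subprocess.run(["./" + script.name], cwd=script.parent, check=True)
```

For example, `docker_run_path("earth_science", "StormCast").as_posix()` gives `earth_science/models/StormCast/examples/docker_run.sh`, the same script the shell example runs.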

Contributing

The fastest way to add a model is to let the agent do it:

/add-model microsoft/aurora → earth_science

This walks through folder creation, README.md, model.yaml, recipe stubs, example scripts, and models.yaml registration — all in one pass.

If you prefer to do it manually:

  1. Fork the repo and create a branch.
  2. Copy _template/ to a new model folder under your domain.
  3. Fill in the model README, create a model.yaml, and add at least one runnable recipe.
  4. Add the model to models.yaml.
  5. Add an entry to ACKNOWLEDGEMENTS.md crediting the upstream authors, paper, and any ROCm blog post.
  6. Open a pull request.

See each domain's models/README.md for slug conventions and domain-specific notes.
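Before opening a pull request, the manual steps can be sanity-checked with a short script. This is a minimal Python sketch under assumptions: it treats README.md, model.yaml, and a non-empty examples/ folder as the minimum contents of a model folder, which is inferred from the steps above rather than taken from an official checklist. The function name `check_model_folder` is hypothetical.

```python
from __future__ import annotations

from pathlib import Path

# Assumed minimum files in a model folder, inferred from the contributing steps.
REQUIRED_FILES = ("README.md", "model.yaml")


def check_model_folder(model_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means the folder passes."""
    problems = []
    for name in REQUIRED_FILES:
        if not (model_dir / name).is_file():
            problems.append(f"missing {name}")
    examples = model_dir / "examples"
    # An examples/ folder with at least one entry stands in for "one runnable recipe".
    if not examples.is_dir() or not any(examples.iterdir()):
        problems.append("examples/ is missing or empty")
    return problems
```

The check only looks at file presence. It does not verify that the recipe runs, that models.yaml lists the model, or that ACKNOWLEDGEMENTS.md has an entry.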

Disclaimers

  • Each model is under its upstream license; check the model card on Hugging Face before use.
  • Healthcare & Life Sciences content is for research and engineering only. Do not commit patient-identifiable data or PHI.
  • AMD/ROCm notes in individual recipes reflect what maintainers have tested—they do not replace upstream install matrices or official product documentation.
  • Full attribution for upstream authors, papers, and ROCm blog contributors is in ACKNOWLEDGEMENTS.md.