Chi-Meditron is an end-to-end engineering workflow for medical LLM development, covering data preparation, supervised fine-tuning, preference alignment, and benchmark evaluation.
This repository integrates the full training and evaluation lifecycle:
- SFT data cleaning and format conversion
- SFT training with Axolotl and DeepSpeed
- DPO preference data generation
- DPO alignment training
- Multi-benchmark evaluation (MedQA / CMMLU / OpenCompass)
This top-level README provides the global workflow. Detailed parameters and scripts are documented in each subdirectory.
- Modular data pipeline for both SFT and DPO workflows
- HPC-ready training scripts with SLURM support
- Distributed training configuration with DeepSpeed ZeRO-3
- Evaluation support for common medical benchmarks
- Reproducible project layout with per-module documentation
```
Chi-Meditron/
├── dataset/
│   ├── sft/        # SFT data conversion and cleaning
│   └── dpo/        # DPO preference data generation pipeline
├── training/
│   ├── sft/        # SFT training configs and SLURM scripts
│   └── dpo/        # DPO training configs and SLURM scripts
├── evaluation/     # Evaluation configs, scripts, and OpenCompass jobs
├── figure/         # Figures and visualization assets
└── README.md       # This overview document
```
- Local environment: Python 3.8+ (for data processing or question generation)
- Cluster environment: Linux + SLURM + GPU (for training and evaluation)
- Training stack: Axolotl + DeepSpeed
Cluster scripts contain placeholders such as `<account_name>`, `<env_config_path>`, and `/path/to/...`. Replace them with your actual values before execution.
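A quick scan before submission can catch scripts that still contain unreplaced placeholders. The sketch below is illustrative; the pattern list only covers the placeholders mentioned above and is not exhaustive.

```python
import re
from pathlib import Path

# Patterns for common unreplaced placeholders (illustrative, not exhaustive).
PLACEHOLDER_RE = re.compile(r"<account_name>|<env_config_path>|/path/to/")

def find_placeholders(script_path):
    """Return the line numbers in a script that still contain placeholders."""
    text = Path(script_path).read_text()
    return [i for i, line in enumerate(text.splitlines(), 1)
            if PLACEHOLDER_RE.search(line)]

# Example: flag every SLURM script that still needs editing.
# for script in Path("training").rglob("*.sh"):
#     hits = find_placeholders(script)
#     if hits:
#         print(f"{script}: placeholders on lines {hits}")
```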
Recommended execution order:
- Build SFT training data in `dataset/sft/`
- Run SFT training in `training/sft/`
- Generate DPO preference data in `dataset/dpo/`
- Run DPO training in `training/dpo/`
- Evaluate models in `evaluation/`
Reference: dataset/sft/README.md
```
pip install -r dataset/sft/requirement.txt
python dataset/sft/main.py dataset/sft/raw_data_samples/medical-dialogue_4options/MedQA-train.jsonl multiquestion
```

The output is unified JSONL in the OpenAI-compatible chat messages format.
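For orientation, one converted record in the OpenAI-compatible chat messages format looks roughly like the sketch below. The prompt wording and the system message are assumptions for illustration; see dataset/sft/README.md for the actual template used by the converter.

```python
import json

# Illustrative example of one converted MedQA-style record in the
# OpenAI-compatible chat "messages" format (prompt text is a placeholder,
# not the project's real template).
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful medical assistant."},
        {"role": "user", "content": (
            "A 45-year-old patient presents with ...\n"
            "A. Option one\nB. Option two\nC. Option three\nD. Option four"
        )},
        {"role": "assistant", "content": "B"},
    ]
}

# Each record is written as one line of the output JSONL file.
line = json.dumps(record, ensure_ascii=False)
parsed = json.loads(line)
assert [m["role"] for m in parsed["messages"]] == ["system", "user", "assistant"]
```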
Reference: training/sft/Readme.md
Key files:
- `training/sft/meditron-3-8b.yaml`
- `training/sft/deepspeed.json`
- `training/sft/launch_axolotl_meditron3_8b.sh`
```
# Interactive debug run
torchrun --nproc_per_node=4 -m axolotl.cli.train /path/to/meditron-3-8b.yaml

# Batch submission
sbatch --nodes <num_nodes> training/sft/launch_axolotl_meditron3_8b.sh
```

Reference: dataset/dpo/README.md
```
sbatch dataset/dpo/run_dpo_data_generation.sh
```

Reference: training/dpo/README.md
Key files:
- `training/dpo/dpo_config.yml`
- `training/dpo/zero3_bf16.json`
- `training/dpo/launch_dpo.sh`
- `training/dpo/dpo.jsonl`
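DPO training consumes preference pairs. A line of a preference JSONL file like dpo.jsonl typically carries a prompt plus a chosen and a rejected response; the `prompt`/`chosen`/`rejected` field names below follow the common convention and are an assumption, so check training/dpo/README.md for the actual schema.

```python
import json

# Sketch of one DPO preference pair, assuming the common
# prompt/chosen/rejected JSONL schema (real field names may differ).
pair = {
    "prompt": "What is the first-line treatment for uncomplicated hypertension?",
    "chosen": ("Guidelines generally recommend a thiazide diuretic, ACE inhibitor, "
               "ARB, or calcium channel blocker as first-line therapy."),
    "rejected": "Take whatever medication you can find.",
}

def validate_pair(obj):
    """Minimal schema check for one preference pair."""
    return (all(k in obj for k in ("prompt", "chosen", "rejected"))
            and obj["chosen"] != obj["rejected"])

# Round-trip through JSONL serialization and validate.
assert validate_pair(json.loads(json.dumps(pair)))
```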
```
sbatch training/dpo/launch_dpo.sh
```

Reference: evaluation/README.md
Supported benchmarks:
- MedQA / PubMedQA / MedMCQA
- CMMLU medical subsets
- OpenCompass configuration-based evaluation
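For the multiple-choice benchmarks above, scoring usually reduces to extracting a predicted option letter and comparing it with the gold answer. The sketch below shows the idea; the extraction regex is an assumption for illustration, not OpenCompass's actual answer parser.

```python
import re

def extract_choice(model_output):
    """Pull the first standalone option letter A-E from a model response."""
    m = re.search(r"\b([A-E])\b", model_output)
    return m.group(1) if m else None

def accuracy(predictions, golds):
    """Fraction of items whose extracted choice matches the gold letter."""
    correct = sum(extract_choice(p) == g for p, g in zip(predictions, golds))
    return correct / len(golds)

# Example: two of three extracted choices match the gold letters.
preds = ["The answer is B.", "C", "I think the best option is A"]
golds = ["B", "C", "D"]
```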
Key files:
- `evaluation/eval_config.yaml`
- `evaluation/eval_config_benchmark.yaml`
- `evaluation/eval_config_dpo.yaml`
- `evaluation/run_opencompass.slurm`
```
sbatch evaluation/run_opencompass.slurm
```

- SFT data processing: `dataset/sft/README.md`
- DPO data generation: `dataset/dpo/README.md`
- SFT training: `training/sft/Readme.md`
- DPO training: `training/dpo/README.md`
- Evaluation: `evaluation/README.md`
- Prefer absolute paths in SLURM scripts.
- Avoid relying on variable expansion that may not work inside `#SBATCH` directives.
- Run a small interactive smoke test before large-scale training.
- Before training, verify that data paths, output paths, the DeepSpeed config, and environment variables (e.g. Hugging Face and WandB credentials) are consistent.
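Those checks can be scripted as a pre-flight step. A hedged sketch follows; the environment variable names `HF_HOME` and `WANDB_API_KEY` are common defaults, not necessarily what your cluster or scripts use.

```python
import os
from pathlib import Path

def preflight(data_path, output_dir, deepspeed_cfg,
              env_vars=("HF_HOME", "WANDB_API_KEY")):
    """Collect problems before submitting a training job; empty list means OK."""
    problems = []
    if not Path(data_path).exists():
        problems.append(f"data path missing: {data_path}")
    if not Path(deepspeed_cfg).exists():
        problems.append(f"DeepSpeed config missing: {deepspeed_cfg}")
    if not Path(output_dir).parent.exists():
        problems.append(f"output parent dir missing: {output_dir}")
    for var in env_vars:
        if not os.environ.get(var):
            problems.append(f"environment variable unset: {var}")
    return problems

# Example: run before sbatch and abort on any problem (paths are placeholders).
# issues = preflight("dataset/sft/out.jsonl", "/scratch/run1",
#                    "training/sft/deepspeed.json")
# assert not issues, "\n".join(issues)
```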
This repository is intended for research and engineering use. When working with medical data, follow:
- dataset licensing requirements
- privacy and de-identification policies
- your institution's ethics and compliance process