
Chi-Meditron

Chi-Meditron is an end-to-end engineering workflow for medical LLM development, covering data preparation, supervised fine-tuning, preference alignment, and benchmark evaluation.

Overview

This repository integrates the full training and evaluation lifecycle:

  • SFT data cleaning and format conversion
  • SFT training with Axolotl and DeepSpeed
  • DPO preference data generation
  • DPO alignment training
  • Multi-benchmark evaluation (MedQA / CMMLU / OpenCompass)

This top-level README provides the global workflow. Detailed parameters and scripts are documented in each subdirectory.

Features

  • Modular data pipeline for both SFT and DPO workflows
  • HPC-ready training scripts with SLURM support
  • Distributed training configuration with DeepSpeed ZeRO-3
  • Evaluation support for common medical benchmarks
  • Reproducible project layout with per-module documentation

Repository Layout

Chi-Meditron/
├── dataset/
│   ├── sft/                 # SFT data conversion and cleaning
│   └── dpo/                 # DPO preference data generation pipeline
├── training/
│   ├── sft/                 # SFT training configs and SLURM scripts
│   └── dpo/                 # DPO training configs and SLURM scripts
├── evaluation/              # Evaluation configs, scripts, and OpenCompass jobs
├── figure/                  # Figures and visualization assets
└── README.md                # This overview document

Environment

  • Local environment: Python 3.8+ (for data processing or question generation)
  • Cluster environment: Linux + SLURM + GPU (for training and evaluation)
  • Training stack: Axolotl + DeepSpeed

Cluster scripts contain placeholders such as <account_name>, <env_config_path>, and /path/to/.... Replace them with your actual values before execution.
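A quick scan can catch placeholders that were never replaced before a job is submitted. The sketch below only knows the placeholder conventions named above (<account_name>-style tokens and /path/to/... stubs); adjust the pattern if your scripts use others.

```python
import re
from pathlib import Path

# Placeholder conventions used in this repo's cluster scripts:
# angle-bracket tokens such as <account_name> and /path/to/ stubs.
PLACEHOLDER = re.compile(r"<[a-z_]+>|/path/to/")

def find_placeholders(script_text: str) -> list[str]:
    """Return every unreplaced placeholder token found in a script."""
    return PLACEHOLDER.findall(script_text)

def scan(root: str) -> dict[str, list[str]]:
    """Map each .sh/.slurm file under `root` to its remaining placeholders."""
    hits = {}
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in {".sh", ".slurm"}:
            found = find_placeholders(path.read_text())
            if found:
                hits[str(path)] = found
    return hits
```

Running scan(".") before sbatch and refusing to submit on a non-empty result is a cheap way to avoid failed jobs.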

Workflow

Recommended execution order:

  1. Build SFT training data in dataset/sft/
  2. Run SFT training in training/sft/
  3. Generate DPO preference data in dataset/dpo/
  4. Run DPO training in training/dpo/
  5. Evaluate models in evaluation/

Quick Start

1. Prepare SFT Data

Reference: dataset/sft/README.md

pip install -r dataset/sft/requirement.txt
python dataset/sft/main.py dataset/sft/raw_data_samples/medical-dialogue_4options/MedQA-train.jsonl multiquestion

The output is a unified JSONL file in the OpenAI-compatible chat messages format.
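Each output line is one JSON object carrying a messages list. The record below is an illustrative sketch of that convention: the role/content field names follow the OpenAI chat format, but the actual system prompt and answer wording emitted by dataset/sft/main.py are assumptions here.

```python
import json

# Hypothetical example of one JSONL line in the OpenAI-compatible chat
# format; the actual prompts produced by the converter are pipeline-specific.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful medical assistant."},
        {"role": "user", "content": "Which drug is first-line for ...? A) ... B) ..."},
        {"role": "assistant", "content": "The answer is A, because ..."},
    ]
}

def is_valid_chat_record(obj: dict) -> bool:
    """Check the minimal structure expected of one converted JSONL line."""
    msgs = obj.get("messages")
    if not isinstance(msgs, list) or not msgs:
        return False
    return all(
        isinstance(m, dict)
        and m.get("role") in {"system", "user", "assistant"}
        and isinstance(m.get("content"), str)
        for m in msgs
    )

line = json.dumps(record, ensure_ascii=False)  # one line of the output JSONL
```

A validator like this is handy as a sanity check before handing the file to Axolotl.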

2. Run SFT Training

Reference: training/sft/Readme.md

Key files:

  • training/sft/meditron-3-8b.yaml
  • training/sft/deepspeed.json
  • training/sft/launch_axolotl_meditron3_8b.sh

# Interactive debug run
torchrun --nproc_per_node=4 -m axolotl.cli.train /path/to/meditron-3-8b.yaml

# Batch submission
sbatch --nodes <num_nodes> training/sft/launch_axolotl_meditron3_8b.sh

3. Generate DPO Preference Data

Reference: dataset/dpo/README.md

sbatch dataset/dpo/run_dpo_data_generation.sh
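DPO training consumes preference triples: a prompt plus a preferred and a dispreferred response. The sketch below uses the common prompt/chosen/rejected field names (the convention used by TRL-style trainers); whether dataset/dpo/ emits exactly this schema is an assumption, so check its README.

```python
import json

# Hypothetical preference pair in the common prompt/chosen/rejected layout;
# the content shown is illustrative only.
pair = {
    "prompt": "A 54-year-old patient presents with ... What is the most likely diagnosis?",
    "chosen": "The findings are most consistent with ...",      # preferred answer
    "rejected": "It is impossible to say without more data.",   # dispreferred answer
}

def is_valid_preference_pair(obj: dict) -> bool:
    """Each JSONL line must carry the three non-empty string fields DPO needs."""
    return all(
        isinstance(obj.get(k), str) and obj[k]
        for k in ("prompt", "chosen", "rejected")
    )

line = json.dumps(pair, ensure_ascii=False)  # one line of dpo.jsonl
```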

4. Run DPO Training

Reference: training/dpo/README.md

Key files:

  • training/dpo/dpo_config.yml
  • training/dpo/zero3_bf16.json
  • training/dpo/launch_dpo.sh
  • training/dpo/dpo.jsonl

sbatch training/dpo/launch_dpo.sh

5. Evaluate Models

Reference: evaluation/README.md

Supported benchmarks:

  • MedQA / PubMedQA / MedMCQA
  • CMMLU medical subsets
  • OpenCompass configuration-based evaluation
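The benchmarks above are multiple-choice (or, for PubMedQA, yes/no/maybe), so the headline metric is exact-match accuracy over predicted labels. A minimal aggregation sketch, with illustrative predictions rather than real results:

```python
def accuracy(preds: list[str], golds: list[str]) -> float:
    """Exact-match accuracy over option letters (A/B/C/D) or yes/no labels."""
    assert len(preds) == len(golds), "one prediction per question"
    correct = sum(p.strip().upper() == g.strip().upper() for p, g in zip(preds, golds))
    return correct / len(golds)

def per_benchmark(results: dict[str, tuple[list[str], list[str]]]) -> dict[str, float]:
    """Aggregate {benchmark: (predictions, gold_labels)} into per-benchmark accuracy."""
    return {name: accuracy(p, g) for name, (p, g) in results.items()}
```

OpenCompass computes these scores itself; a standalone helper like this is mainly useful for spot-checking a run.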

Key files:

  • evaluation/eval_config.yaml
  • evaluation/eval_config_benchmark.yaml
  • evaluation/eval_config_dpo.yaml
  • evaluation/run_opencompass.slurm

sbatch evaluation/run_opencompass.slurm

Documentation

  • SFT data processing: dataset/sft/README.md
  • DPO data generation: dataset/dpo/README.md
  • SFT training: training/sft/Readme.md
  • DPO training: training/dpo/README.md
  • Evaluation: evaluation/README.md

Notes

  • Prefer absolute paths in SLURM scripts.
  • #SBATCH directives are parsed before the shell runs, so shell variable expansion does not work inside them; avoid relying on it.
  • Run a small interactive smoke test before large-scale training.
  • Before training, verify consistency for data paths, output paths, DeepSpeed config, and environment variables such as HF and WandB.
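The last note can be automated as a small pre-flight check. The paths and environment variable names below (HF_HOME, WANDB_API_KEY) are typical choices rather than something these scripts are confirmed to require; adapt them to your setup.

```python
import os
from pathlib import Path

def preflight(data_path: str, output_dir: str, deepspeed_config: str,
              env=os.environ) -> list[str]:
    """Return a list of problems; an empty list means ready to submit."""
    problems = []
    if not Path(data_path).exists():
        problems.append(f"data path missing: {data_path}")
    if not Path(deepspeed_config).exists():
        problems.append(f"DeepSpeed config missing: {deepspeed_config}")
    if not Path(output_dir).parent.exists():
        problems.append(f"output parent dir missing: {output_dir}")
    for var in ("HF_HOME", "WANDB_API_KEY"):  # assumed names; adjust to your setup
        if not env.get(var):
            problems.append(f"environment variable unset: {var}")
    return problems
```

Calling preflight(...) at the top of a launch script and aborting on a non-empty result catches most misconfigurations before any GPU time is spent.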

License and Compliance

This repository is intended for research and engineering use. When working with medical data, follow:

  • dataset licensing requirements
  • privacy and de-identification policies
  • your institution's ethics and compliance process

About

A Chinese-enhanced Meditron-8B model fine-tuned via SFT and DPO
