Sphere-AI-Lab/pion

Pion: A Spectrum-Preserving Optimizer via Orthogonal Equivalence Transformation

This repository is the official PyTorch implementation of the Pion optimizer, by Kexuan Shi, Hanxuan Li, Zeju Qiu, Yandong Wen, Simon Buchholz, and Weiyang Liu.

The code is coming soon. Stay tuned. :)

Running RL Experiments

Environment Setup

The RL experiments are built on top of verl. Please follow the installation instructions in verl/README.md to set up the environment.

Before running, edit the scripts and replace the placeholder paths:

  • /path/to/your/dataset/: path to the preprocessed dataset (see verl data preparation)
  • /path/to/your/model: path to the pretrained model
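The replacement can also be scripted instead of done by hand. The following is a minimal sketch, assuming the placeholders appear in the scripts exactly as written above; `fill_placeholders` is a hypothetical helper name, not something shipped with this repository, and the paths in the usage line are only examples.

```shell
# Hypothetical helper (not part of the repository): replace both placeholder
# paths in a single training script.
# Usage: fill_placeholders <script> <dataset_dir> <model_dir>
fill_placeholders() {
    local script="$1" dataset="$2" model="$3"
    # '|' is the sed delimiter because the paths themselves contain '/'.
    # "${dataset%/}/" keeps exactly one trailing slash on the dataset path.
    sed -e "s|/path/to/your/dataset/|${dataset%/}/|g" \
        -e "s|/path/to/your/model|${model}|g" \
        "$script" > "$script.tmp" \
        && cat "$script.tmp" > "$script" \
        && rm -f "$script.tmp"
}
```

For example, `fill_placeholders examples/grpo_trainer/run_qwen3_1.7b_pion_deepmath.sh /data/deepmath /models/Qwen3-1.7B` would rewrite that script in place. The sketch does not escape `|` or `&`, so paths containing those characters need manual editing.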

Running GRPO Training with Pion Optimizer

We provide ready-to-use scripts for training Qwen3-1.7B and DeepSeek-R1-Distill-Qwen-1.5B on the DeepMath dataset using GRPO with the Pion optimizer:

cd verl
bash examples/grpo_trainer/run_qwen3_1.7b_pion_deepmath.sh # for Qwen3-1.7B
bash examples/grpo_trainer/run_distilled_pion_deepmath.sh # for DeepSeek-R1-Distill-Qwen-1.5B

To run baseline comparisons with AdamW and Muon:

# Qwen3-1.7B
bash examples/grpo_trainer/run_qwen3_1.7b_adamw_deepmath.sh   # AdamW
bash examples/grpo_trainer/run_qwen3_1.7b_muon_deepmath.sh    # Muon

# DeepSeek-R1-Distill-Qwen-1.5B
bash examples/grpo_trainer/run_distilled_adamw_deepmath.sh    # AdamW
bash examples/grpo_trainer/run_distilled_muon_deepmath.sh     # Muon
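To compare all three optimizers on one model, the scripts listed above can be run back to back. The following is a minimal sketch, assuming it is run from inside the `verl` directory and that the script names follow the pattern shown above; `script_for`, `run_all_optimizers`, and the log file names are hypothetical and not part of this repository.

```shell
# Hypothetical helper: build the script path for a model prefix
# ("qwen3_1.7b" or "distilled") and an optimizer ("pion", "adamw", "muon").
script_for() {
    printf 'examples/grpo_trainer/run_%s_%s_deepmath.sh\n' "$1" "$2"
}

# Hypothetical helper: run Pion, AdamW, and Muon in sequence for one model
# prefix, sending each run's output to its own log file (assumed naming).
run_all_optimizers() {
    local prefix="$1" opt script
    for opt in pion adamw muon; do
        script="$(script_for "$prefix" "$opt")"
        # Stop early if a script is missing rather than skipping it silently.
        if [ ! -f "$script" ]; then
            echo "missing script: $script" >&2
            return 1
        fi
        # Stop if a run fails so later runs do not mask the error.
        bash "$script" > "${prefix}_${opt}.log" 2>&1 || return 1
    done
}
```

For example, `run_all_optimizers qwen3_1.7b` would run the three Qwen3-1.7B scripts in order, and `run_all_optimizers distilled` would do the same for the distilled model.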
