Custom trl.SFTTrainer that adds a KL divergence loss between a LoRA-adapted model and its base model.
Updated Jul 25, 2025 - Python
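As a rough illustration of the kind of penalty this description refers to, the sketch below computes a KL divergence between the next-token distributions of an adapted model and its base model at a single position, then adds it to a task loss. It is not the repository's code. The function names, the `beta` weight, and the direction KL(adapted || base) are all assumptions made for illustration, and plain Python lists stand in for model logits.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of raw scores.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_divergence(adapted_logits, base_logits):
    # KL(adapted || base) for one token position.
    # The direction is an assumption; the description does not state it.
    log_p = log_softmax(adapted_logits)
    log_q = log_softmax(base_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def penalized_loss(task_loss, adapted_logits, base_logits, beta=0.1):
    # Hypothetical combination: task loss plus a weighted KL penalty.
    # beta is an illustrative hyperparameter, not taken from the repository.
    return task_loss + beta * kl_divergence(adapted_logits, base_logits)
```

When the two sets of logits are identical the penalty is zero, so the combined loss reduces to the task loss. The penalty grows as the adapted distribution drifts away from the base distribution.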
Originally computational physics projects for the statistical mechanics course I took as an undergraduate. The original branch contains the first revision after graduation, incorporating my improved Python skills. The recent animated branches are milestones toward my ultimate goal of creating a Langevin dynamics simulator.
A Proactive Statistical Framework Using Distributional Divergence Metrics
This project explores the implementation of active learning techniques, focusing on various query strategies to optimize the selection of informative data points for model training. It aims to reduce the amount of labeled data required while improving model performance, especially in scenarios with limited labeled data.
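One common query strategy is uncertainty sampling by predictive entropy. The description does not say which strategies the project implements, so the sketch below is only an assumed example, with hypothetical function names. It ranks unlabeled pool items by the entropy of the model's predicted class probabilities and returns the indices of the most uncertain ones.

```python
import math

def entropy(probs):
    # Shannon entropy (in nats) of one predicted class distribution.
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(pool_probs, k):
    # Return indices of the k pool items with the highest predictive entropy.
    # pool_probs: list of per-item class-probability lists (assumed input shape).
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,
    )
    return ranked[:k]
```

The selected indices would then be sent for labeling and added to the training set before the model is retrained.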