Fine-tuning Qwen2.5-1.5B-Instruct with LoRA to rewrite sloppy git commit messages into clean, professional ones.
Input : fix login bug
Output: Fix authentication failure caused by invalid session token handling
Developers often write vague commit messages under pressure. This project fine-tunes a 1.5B-parameter LLM with LoRA, a parameter-efficient fine-tuning (PEFT) method, on 5,000 real GitHub commit messages. The model learns to rewrite informal inputs into descriptive, professional commits without updating all of the model's weights.
| Component | Tool |
|---|---|
| Base Model | Qwen/Qwen2.5-1.5B-Instruct |
| Fine-Tuning | LoRA via 🤗 PEFT |
| Trainer | TRL SFTTrainer |
| Quantization | 4-bit (bitsandbytes) |
| Dataset | GitHub Commit Messages — Kaggle |
| Framework | PyTorch + HuggingFace Transformers |
commit-message-lora/
├── data/
│ └── prepare_dataset.py # Download, clean & sample from Kaggle
├── src/
│ ├── config.py # All hyperparameters in one place
│ ├── dataset.py # HuggingFace Dataset + tokenization
│ ├── train.py # End-to-end training pipeline
│ ├── inference.py # Load adapter & generate messages
│ └── utils.py # Prompt templates & text helpers
├── commit_lora/ # Saved LoRA adapter weights (post-training)
├── requirements.txt
pip install -r requirements.txt

This also requires a Kaggle API key at ~/.kaggle/kaggle.json.
python data/prepare_dataset.py

Downloads the dataset, filters by message length, and samples 5,000 rows to data/prepared.csv.
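The filter-and-sample step can be sketched roughly as below. This is a minimal illustration, not the actual contents of prepare_dataset.py: the function name `filter_and_sample`, the length bounds (10 to 200 characters), and the fixed seed are assumptions chosen for the example. Only the 5,000-row sample size comes from the description above.

```python
import random


def filter_and_sample(messages, min_len=10, max_len=200, n=5000, seed=42):
    """Keep messages whose stripped length is within bounds, then sample up to n.

    min_len, max_len and seed are illustrative assumptions, not project values.
    """
    # Drop non-string rows (e.g. missing values) and trim surrounding whitespace.
    cleaned = [m.strip() for m in messages if isinstance(m, str)]
    # Filter by message length, as the preparation step is described to do.
    kept = [m for m in cleaned if min_len <= len(m) <= max_len]
    if len(kept) <= n:
        return kept
    # A seeded generator keeps the sample reproducible across runs.
    rng = random.Random(seed)
    return rng.sample(kept, n)


if __name__ == "__main__":
    raw = [
        "fix",
        "fix login bug in session handler",
        "  update README with setup steps  ",
        "x" * 500,
    ]
    print(filter_and_sample(raw, n=2))
```

Filtering before sampling means the 5,000 rows are all drawn from usable messages, rather than sampling first and ending up with fewer rows after cleaning.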
python -m src.train

Trains for 2 epochs with 4-bit quantization + LoRA. The adapter is saved to ./commit_lora.

python -m src.inference

Or use it directly in code:
from src.inference import load_model_for_inference, generate_commit_message
model, tokenizer = load_model_for_inference()
print(generate_commit_message("add dark mode", model, tokenizer))
# → "Add dark mode support with system preference detection"

| Hyperparameter | Value |
|---|---|
| LoRA rank (r) | 16 |
| LoRA alpha | 32 |
| Dropout | 0.05 |
| Learning rate | 2e-4 |
| Batch size | 4 |
| Gradient accumulation | 4 steps |
| Epochs | 2 |
| Max sequence length | 256 tokens |
| Quantization | 4-bit NF4 |
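To make the interaction between these values concrete, here is a small sketch that collects them in a dataclass and derives a few quantities. The class name `TrainConfig` and its helper methods are hypothetical and not taken from src/config.py. The step count is only an approximation: it assumes a single GPU, that all 5,000 rows are used for training, and it ignores how the trainer rounds partial accumulation steps.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical container; field values are copied from the table above."""

    lora_r: int = 16
    lora_alpha: int = 32
    lora_dropout: float = 0.05
    learning_rate: float = 2e-4
    batch_size: int = 4
    grad_accum_steps: int = 4
    epochs: int = 2
    max_seq_len: int = 256

    @property
    def effective_batch_size(self) -> int:
        # Gradients are accumulated over several micro-batches per optimizer step.
        return self.batch_size * self.grad_accum_steps

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_r

    def approx_optimizer_steps(self, num_examples: int) -> int:
        # Rough estimate: optimizer steps per epoch times number of epochs.
        return math.ceil(num_examples / self.effective_batch_size) * self.epochs


cfg = TrainConfig()
print(cfg.effective_batch_size)          # 16
print(cfg.lora_scaling)                  # 2.0
print(cfg.approx_optimizer_steps(5000))  # 626
```

With a batch size of 4 and 4 accumulation steps, each optimizer update sees 16 examples, which keeps per-step memory low while still averaging gradients over a reasonably sized batch.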
Full fine-tuning of a 1.5B-parameter model requires significant GPU memory and time. LoRA inserts small trainable rank-decomposition matrices into the attention layers, cutting trainable parameters by ~99% while typically reaching task performance comparable to full fine-tuning. This makes the project reproducible on a single consumer GPU (e.g. a T4 on Colab).
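The saving for a single adapted weight matrix can be worked out directly. A LoRA adapter for a matrix of shape d_out × d_in adds two factors, A (r × d_in) and B (d_out × r), so it trains r · (d_in + d_out) parameters instead of d_in · d_out. The sketch below uses r = 16 from the table; the 1536 × 1536 projection shape is an assumption for illustration, and the function names are hypothetical. The whole-model figure of ~99% is larger than this per-matrix saving because the denominator is every weight in the model, most of which receive no adapter; the exact number depends on which modules are targeted.

```python
def full_param_count(d_in: int, d_out: int) -> int:
    """Parameters in a dense weight matrix of shape d_out x d_in."""
    return d_in * d_out


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters in the low-rank factors A (r x d_in) and B (d_out x r)."""
    return r * (d_in + d_out)


# Assumed projection shape for illustration; r matches the hyperparameter table.
d_in, d_out, r = 1536, 1536, 16

full = full_param_count(d_in, d_out)
lora = lora_param_count(d_in, d_out, r)

print(full)                  # 2359296
print(lora)                  # 49152
print(f"{lora / full:.2%}")  # 2.08%
```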