A production-ready Thai sentiment classifier: PhayaThaiBERT fine-tuned on 21k social media messages, reaching 82% overall accuracy.
Try it now: Live Demo
Or run locally:
pip install transformers torch gradio
python app.py

- Multiple Interfaces: CLI, Web UI, REST API
- Production Ready: Docker, FastAPI, HuggingFace hosting
- High Performance: 82% accuracy, ONNX optimization
- Explainable: LIME interpretability
- Well Tested: Unit tests, CI/CD
- Hardware Agnostic: GPU/Mac/CPU support
| Metric | Score |
|---|---|
| Overall Accuracy | 82% |
| Positive F1 | 0.72 |
| Neutral F1 | 0.85 |
| Negative F1 | 0.85 |
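
For readers who want to reproduce figures like those in the table, the sketch below shows how overall accuracy and per-class F1 are conventionally computed from true and predicted labels. It is not taken from `evaluate.py`; the function names are hypothetical, and any labels you pass in are your own data, not the project's results.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def per_class_f1(y_true, y_pred):
    """Return {label: F1}, where F1 = 2*TP / (2*TP + FP + FN) per class."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores
```

Per-class F1 is worth reporting alongside accuracy because a class with few examples can score noticeably lower than the overall figure suggests, as the Positive row does here.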
from transformers import pipeline
classifier = pipeline(
    "sentiment-analysis",
    model="SiemonCha/thai-sentiment-phayabert",
)
result = classifier("อาหารอร่อยมาก")
print(result)
# [{'label': 'POSITIVE', 'score': 0.98}]

git clone https://github.com/SiemonCha/thai-sentiment
cd thai-sentiment
pip install -r requirements.txt
python app.py

python api.py
# API docs: http://localhost:8000/docs

curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{"text": "อาหารอร่อยมาก"}'

docker build -t thai-sentiment .
docker run -p 7860:7860 thai-sentiment

# 1. Install dependencies
pip install -r requirements.txt
# 2. Download dataset
python data_download.py
# 3. Train model (~40 min Mac M1 / ~15 min GPU)
python train.py
# 4. Evaluate
python evaluate.py

thai-sentiment/
├── app.py # Gradio web UI
├── api.py # FastAPI REST API
├── demo.py # Interactive CLI
├── predict.py # Batch predictions
├── train.py # Training script
├── evaluate.py # Evaluation with metrics
├── explain.py # LIME explainability
├── optimize.py # ONNX conversion
├── benchmark.py # Model comparison
├── test_model.py # Unit tests
├── Dockerfile # Container
└── requirements.txt # Dependencies
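
The `/predict` endpoint served by `api.py` (shown with curl earlier) could also be called from Python. The sketch below uses only the standard library and assumes the server is running on `localhost:8000`. The helper names `build_predict_request` and `predict` are hypothetical, and the response schema is not documented here, so the decoded JSON is returned as-is.

```python
import json
import urllib.request

# URL taken from the curl example; assumes `python api.py` is running locally.
API_URL = "http://localhost:8000/predict"

def build_predict_request(text, url=API_URL):
    """Build the same POST request as the curl example, without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def predict(text, url=API_URL, timeout=10.0):
    """Send the request and return the decoded JSON response unchanged."""
    request = build_predict_request(text, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the API running, `predict("อาหารอร่อยมาก")` should return whatever JSON the endpoint produces for that sentence.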
| Platform | Training Time | Status |
|---|---|---|
| Mac M1/M2 | ~40 min | Tested |
| NVIDIA GPU | ~15 min | Tested |
| AMD GPU (Linux) | ~15 min | Tested |
| CPU | ~4-5 hours | Works |
| AMD GPU (Windows) | ~4-5 hours | CPU fallback |
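
The fallback order implied by the table can be sketched as follows. This is an assumption about how device selection might work, not the logic in `train.py`; `pick_device` and `detect_device` are hypothetical names. ROCm builds of PyTorch on Linux report AMD GPUs through the `cuda` device type, which is why there is no separate AMD branch.

```python
def pick_device(cuda_available, mps_available):
    """Prefer a CUDA/ROCm GPU, then Apple Silicon (MPS), then the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"

def detect_device():
    """Query PyTorch for available backends; fall back to CPU if torch is absent."""
    try:
        import torch
    except ImportError:
        return "cpu"
    mps = getattr(torch.backends, "mps", None)  # missing on older torch builds
    mps_ok = mps is not None and mps.is_available()
    return pick_device(torch.cuda.is_available(), mps_ok)
```

The returned string can be passed to `model.to(...)` or `torch.device(...)`.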
# Run tests
pytest test_model.py -v
pytest test_model.py test_model_advanced.py -v
# Benchmark
python benchmark.py
# Explain predictions
python explain.py
# Generates explanation.html showing word contributions

python optimize.py
# Creates optimized model in ./model_onnx/

python batch_predict.py
# Processes input.csv → output.csv

- Base Model: PhayaThaiBERT (110M parameters)
- Dataset: Wisesight Sentiment (21k messages)
- Training: 5 epochs, class-weighted loss, AdamW
- Architecture: BERT with classification head
- Max Length: 128 tokens
- Classes: Positive, Neutral, Negative
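
The exact weighting used for the class-weighted loss is not stated above. One common choice is the "balanced" heuristic, where each class weight is inversely proportional to its frequency. The sketch below illustrates that heuristic and how a weight scales the per-example cross-entropy. Both functions are hypothetical and may differ from what `train.py` does.

```python
import math
from collections import Counter

def balanced_class_weights(labels):
    """weight[c] = n_samples / (n_classes * count[c]); rarer classes weigh more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * counts[c]) for c in counts}

def weighted_cross_entropy(probs, target, weights):
    """Cross-entropy for one example, scaled by the weight of its true class.

    probs maps each class to the model's predicted probability.
    """
    return -weights[target] * math.log(probs[target])
```

In PyTorch, the same idea is typically expressed by passing a weight tensor to `torch.nn.CrossEntropyLoss(weight=...)`.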
@misc{thai-sentiment-2025,
author = {SiemonCha},
title = {Thai Sentiment Analysis with PhayaThaiBERT},
year = {2025},
publisher = {GitHub},
url = {https://github.com/SiemonCha/thai-sentiment}
}

Dataset:
@software{wisesight_sentiment,
author = {Suriyawongkul, Arthit and Chuangsuwanich, Ekapol},
title = {PyThaiNLP/wisesight-sentiment},
year = 2019,
doi = {10.5281/zenodo.3457447}
}

MIT License