Thai Sentiment Analysis with PhayaThaiBERT

Production-ready Thai sentiment classifier with 82% accuracy. Fine-tuned PhayaThaiBERT on 21k social media messages.

Quick Demo

Try it now: Live Demo

Or run locally:

pip install transformers torch gradio
python app.py

Features

Multiple Interfaces: CLI, Web UI, REST API
Production Ready: Docker, FastAPI, HuggingFace hosting
High Performance: 82% accuracy, ONNX optimization
Explainable: LIME interpretability
Well Tested: Unit tests, CI/CD
Hardware Agnostic: GPU/Mac/CPU support

Performance

Metric	Score
Overall Accuracy	82%
Positive F1	0.72
Neutral F1	0.85
Negative F1	0.85

Quick Start

Use Pre-trained Model (No Training)

from transformers import pipeline

classifier = pipeline("sentiment-analysis",
                     model="SiemonCha/thai-sentiment-phayabert")

result = classifier("อาหารอร่อยมาก")
print(result)
# [{'label': 'POSITIVE', 'score': 0.98}]

Run Web UI

git clone https://github.com/SiemonCha/thai-sentiment
cd thai-sentiment
pip install -r requirements.txt
python app.py

REST API

python api.py
# API docs: http://localhost:8000/docs

curl -X POST "http://localhost:8000/predict" \
  -H "Content-Type: application/json" \
  -d '{"text": "อาหารอร่อยมาก"}'

Docker

docker build -t thai-sentiment .
docker run -p 7860:7860 thai-sentiment

Training from Scratch

# 1. Install dependencies
pip install -r requirements.txt

# 2. Download dataset
python data_download.py

# 3. Train model (~40 min Mac M1 / ~15 min GPU)
python train.py

# 4. Evaluate
python evaluate.py

Project Structure

thai-sentiment/
├── app.py                 # Gradio web UI
├── api.py                 # FastAPI REST API
├── demo.py                # Interactive CLI
├── predict.py             # Batch predictions
├── train.py               # Training script
├── evaluate.py            # Evaluation with metrics
├── explain.py             # LIME explainability
├── optimize.py            # ONNX conversion
├── benchmark.py           # Model comparison
├── test_model.py          # Unit tests
├── Dockerfile             # Container
└── requirements.txt       # Dependencies

Hardware Support

Platform	Training Time	Status
Mac M1/M2	~40 min	>> Tested
NVIDIA GPU	~15 min	>> Tested
AMD GPU (Linux)	~15 min	>> Tested
CPU	~4-5 hours	>> Works
AMD GPU (Windows)	~4-5 hours	XX CPU fallback

Testing

# Run tests
pytest test_model.py -v
pytest test_model.py test_model_advanced.py -v

# Benchmark
python benchmark.py

# Explain predictions
python explain.py

Advanced Features

Model Explainability

python explain.py
# Generates explanation.html showing word contributions

ONNX Optimization (3x faster)

python optimize.py
# Creates optimized model in ./model_onnx/

Batch Processing

python batch_predict.py
# Processes input.csv → output.csv

Technical Details

Base Model: PhayaThaiBERT (110M parameters)
Dataset: Wisesight Sentiment (21k messages)
Training: 5 epochs, class-weighted loss, AdamW
Architecture: BERT with classification head
Max Length: 128 tokens
Classes: Positive, Neutral, Negative

Citation

@misc{thai-sentiment-2025,
  author = {SiemonCha},
  title = {Thai Sentiment Analysis with PhayaThaiBERT},
  year = {2025},
  publisher = {GitHub},
  url = {https://github.com/SiemonCha/thai-sentiment}
}

Dataset:

@software{wisesight_sentiment,
  author = {Suriyawongkul, Arthit and Chuangsuwanich, Ekapol},
  title = {PyThaiNLP/wisesight-sentiment},
  year = 2019,
  doi = {10.5281/zenodo.3457447}
}

License

MIT License

Links

Live Demo: https://huggingface.co/spaces/SiemonCha/thai-sentiment-demo
Model Hub: https://huggingface.co/SiemonCha/thai-sentiment-phayabert
GitHub: https://github.com/SiemonCha/thai-sentiment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Thai Sentiment Analysis with PhayaThaiBERT

Quick Demo

Features

Performance

Quick Start

Use Pre-trained Model (No Training)

Run Web UI

REST API

Docker

Training from Scratch

Project Structure

Hardware Support

Testing

Advanced Features

Model Explainability

ONNX Optimization (3x faster)

Batch Processing

Technical Details

Citation

License

Links

About

Uh oh!

Releases

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
.github/workflows		.github/workflows
.vscode		.vscode
.DS_Store		.DS_Store
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
api.py		api.py
app.py		app.py
app_spaces.py		app_spaces.py
batch_predict.py		batch_predict.py
benchmark.py		benchmark.py
data_download.py		data_download.py
demo.py		demo.py
evaluate.py		evaluate.py
explain.py		explain.py
explore.py		explore.py
optimize.py		optimize.py
predict.py		predict.py
requirements.txt		requirements.txt
test_hf_model.py		test_hf_model.py
test_model.py		test_model.py
test_model_advanced.py		test_model_advanced.py
train.py		train.py
upload_model.py		upload_model.py

Folders and files

Latest commit

History

Repository files navigation

Thai Sentiment Analysis with PhayaThaiBERT

Quick Demo

Features

Performance

Quick Start

Use Pre-trained Model (No Training)

Run Web UI

REST API

Docker

Training from Scratch

Project Structure

Hardware Support

Testing

Advanced Features

Model Explainability

ONNX Optimization (3x faster)

Batch Processing

Technical Details

Citation

License

Links

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Contributors

Uh oh!

Languages