Skip to content

surrealier/ssook

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

132 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

ssook Logo

ssook

All-in-one desktop toolkit for AI model inference, evaluation, analysis, data management, training & deployment

Python 3.10+ FastAPI ONNX Runtime License: MIT Version


ssook Screenshot

Demo

ssook demo GIF

Screenshots

ssook viewer with sample inference frame ssook evaluation workflow ssook model comparison workflow ssook dataset explorer preview ssook VLM workflow ssook settings and model configuration

Do You Need…?

Feature What it does
🎬 Real-time Inference Viewer? Load an ONNX model, open a video or image, see detections/classifications live
πŸ“Š Multi-Model Evaluation? Compare multiple models side-by-side with mAP, Precision, Recall, F1
πŸ”¬ Inference Analysis? Inspect letterbox, tensor heatmap, and detection results on a single image
βš”οΈ Model A/B Compare? Run two models on the same images, navigate with a slider
🎯 FP/FN Error Analysis? Auto-classify false positives & negatives by size (S/M/L) and position
πŸ“ˆ Confidence Optimizer? Sweep thresholds per class, find the F1-maximizing confidence with PR curves
πŸ—ΊοΈ Embedding Visualization? t-SNE / UMAP / PCA 2D scatter plots from any feature extractor
⚑ Benchmark? Measure FPS, latency (P50/P95/P99), CPU/GPU usage with system info export
πŸ–ΌοΈ Segmentation Evaluation? mIoU, mDice, per-class IoU/Dice against GT masks
πŸ”€ CLIP Zero-Shot? Load image + text encoders, evaluate zero-shot classification
🧠 Vision-Language (VLM)? Caption & VQA via CLIP, local Qwen-VL (transformers), or OpenAI-compatible API
🧲 Embedder Evaluation? Retrieval@1/@K, cosine similarity, multi-image comparison
πŸ“ Dataset Explorer? Gallery with multi-class filter, box filter, class/size/aspect distribution charts
βœ‚οΈ Dataset Splitter? Random or stratified train/val/test split with progress tracking
πŸ”„ Format Converter? YOLO ↔ COCO JSON ↔ Pascal VOC XML batch conversion
🏷️ Class Remapper? Remap, merge, or delete class IDs in bulk
πŸ”— Dataset Merger? Combine datasets with dHash duplicate detection
πŸ“Š Smart Sampler? Balanced (equal per-class + diversity), Random, Stratified sampling
πŸ›‘οΈ Label Anomaly Detector? Find OOB boxes, size outliers, excessive overlaps
πŸ–ΌοΈ Image Quality Checker? Detect blur, brightness issues, overexposure, abnormal aspect ratios
πŸ‘― Near-Duplicate Detector? dHash perceptual hashing with configurable threshold
πŸ” Leaky Split Detector? Cross-split (train/val/test) duplicate detection
πŸ”Ž Similarity Search? Query any image β†’ top-K most similar results
🎨 Augmentation Preview? Mosaic, flip, rotate, Albumentations β€” preview before applying
πŸ‹οΈ Train a Model? YOLO detect/segment/pose/classify + timm/torchvision classifiers, live per-epoch metrics (docs)
πŸš€ Export / Deploy? .pt/.onnx β†’ onnxruntime (CPU/CUDA/DirectML) / TensorRT / OpenVINO / CoreML / TorchScript with configurable opset/batch/input/precision (docs)

All in one window. No code required.

Training & export frameworks (torch / ultralytics / openvino / …) are optional β€” install with pip install -r requirements-train.txt. Unavailable trainers/targets are greyed-out with an explanation; the ONNX-runtime-only core never breaks.


πŸ€– Supported Models

Task Model Format Metrics
Detection YOLO v5/v8/v9/v11, CenterNet (Darknet), Custom ONNX mAP@50, mAP@50:95, P/R/F1
Classification ONNX (2D output) Accuracy, per-class P/R/F1
Segmentation ONNX (CΓ—HΓ—W output) mIoU, mDice, per-class IoU/Dice
VLM / CLIP CLIP ONNX, local transformers (Qwen-VL), or OpenAI-compatible API Zero-shot Classification, Captioning, VQA
Embedder ONNX (feature extractor) Retrieval@1/@K, Cosine Similarity

Fixed-batch models (e.g., batch=4) are automatically detected and handled.


🧠 VLM Backends

The VLM tab supports three pluggable backends. Pick one in the Backend dropdown:

Backend What it runs Tasks Setup
clip (default) CLIP image + text encoder ONNX Zero-shot classification, template-based caption / VQA None β€” works out of the box, no extra deps
transformers Local generative VLM (e.g. Qwen2.5-VL) via πŸ€— transformers Captioning, VQA pip install -r requirements-vlm.txt (CUDA build of torch for GPU)
openai Any OpenAI-compatible chat endpoint β€” Ollama, vLLM, LM Studio, etc. Captioning, VQA pip install httpx (or requirements-vlm.txt); set endpoint URL + optional API key
  • clip is fully self-contained: just supply the image and text encoder ONNX files.
  • transformers downloads / loads a HuggingFace image-text-to-text model by ID or local path and runs it locally (GPU auto-detected via CUDA).
  • openai sends frames as base64 JPEG to {endpoint_url}/chat/completions. Point it at a local server (http://localhost:11434/v1 for Ollama, http://localhost:8000/v1 for vLLM) or a remote one with a Bearer API key.
# Enable the transformers + OpenAI-compatible backends (CLIP needs nothing):
pip install -r requirements-vlm.txt

πŸš€ Getting Started

Option 1: Download Release (Recommended)

Download the latest release from Releases:

  • Windows: .msi installer or .zip portable
  • macOS: .dmg disk image

Just run β€” no Python needed.

Option 2: Run from Source

Requires Python 3.10+.

git clone https://github.com/surrealier/ssook.git
cd ssook

pip install -r requirements-web.txt

# Optional extras
pip install matplotlib scikit-learn openpyxl   # charts & Excel export
pip install umap-learn                          # UMAP embedding
pip install pywebview                           # native desktop window
pip install onnxruntime-gpu                     # CUDA acceleration
pip install -r requirements-vlm.txt            # transformers / OpenAI VLM backends

# EP venv 격리 μ„€μΉ˜ (GPU/DirectML/OpenVINO/CoreML λ™μ‹œ 곡쑴)
python scripts/setup_ep.py                      # ν”Œλž«νΌ 전체 EP μ„€μΉ˜
python scripts/setup_ep.py cuda cpu             # νŠΉμ • EP만 μ„€μΉ˜
python scripts/setup_ep.py --status             # μ„€μΉ˜ μƒνƒœ 확인

python run_web.py
Flag Description
--port 9000 Custom port (default: 8765)
--browser Force browser mode instead of native window

πŸ“– Quick Start

1. Launch  β†’  Settings tab  β†’  Download test models & sample data
2. Viewer tab  β†’  Open video/image  β†’  See real-time inference
3. Evaluation tab  β†’  Add models, set GT labels  β†’  Run evaluation
4. Analysis tab  β†’  Dive into FP/FN, confidence optimization, embeddings
5. Data tab  β†’  Explore, split, convert, clean your dataset

πŸ“– Documentation

🌐 English | ν•œκ΅­μ–΄ | ζ—₯本θͺž | δΈ­ζ–‡

Document Topics
Model Optimization Quantization (INT8, FP16, Mixed Precision), Pruning, Graph Optimization
Model Analysis Model Diagnosis, Profiler, Inspector
Evaluation Metrics mAP, IoU, P/R/F1, Confidence Optimizer, FP/FN Error Analysis
Embedding & CLIP t-SNE / UMAP / PCA, CLIP Zero-Shot, Embedder Evaluation
Execution Providers Auto EP Selection, venv Isolation, GPU Acceleration
Tracking & Sampling ByteTrack / SORT, Smart Sampler, dHash Duplicate Detection
General Features Viewer, Explorer, Splitter, Converter, Quality Tools, Benchmark

βš™ Configuration

Settings are stored in settings/app_config.yaml and persist across sessions:

model_type: yolo
conf_threshold: 0.25
batch_size: 1
box_thickness: 2
label_size: 0.55
show_labels: true
show_confidence: true

πŸ“¦ Dependencies

Required (requirements-web.txt)

Package Purpose
fastapi Web backend
uvicorn ASGI server
opencv-python Image/video processing
numpy Numerical operations
onnxruntime ONNX model inference
psutil System resource monitoring
PyYAML Configuration management

Optional

Package Purpose
pywebview Native desktop window (instead of browser)
matplotlib Charts, scatter plots, PR curves
scikit-learn t-SNE, PCA dimensionality reduction
openpyxl Excel report export
umap-learn UMAP embedding visualization
onnxruntime-gpu CUDA/TensorRT acceleration
transformers / torch / accelerate Local generative VLM backend (Qwen-VL) β€” see requirements-vlm.txt
httpx OpenAI-compatible VLM backend (Ollama / vLLM / LM Studio)

πŸ§ͺ Testing

python -m pytest tests/ -v

πŸ“‹ Changelog

v1.6.0

  • QC release: P0 crash fixes across viewer, evaluation, data, and analysis flows
  • Security hardening: path-safety validation on all user-supplied paths (traversal prevention)
  • 7 specialized tabs now reachable: CLIP Zero-Shot, Embedder, Segmentation, Tracking, VLM, and more are registered in the sidebar
  • Pluggable VLM backends: choose clip (dependency-free), transformers (local Qwen-VL), or openai (OpenAI-compatible β€” Ollama / vLLM / LM Studio) β€” see VLM Backends

v1.4.0

  • EP venv Isolation: onnxruntime 변쒅별 독립 venv 격리 (ep_venvs/) β€” GPU/DirectML/OpenVINO/CoreML/CPU λ™μ‹œ 곡쑴
  • Auto EP Selection: ν”Œλž«νΌΒ·ν•˜λ“œμ›¨μ–΄ 기반 졜적 Execution Provider μžλ™ 선택
  • CoreML Support: macOS Apple Silicon CoreMLExecutionProvider 지원
  • OpenVINO GPU-first: OpenVINO EPκ°€ Intel iGPU μš°μ„  μ‹œλ„, λΆˆκ°€ μ‹œ OpenVINO CPU 폴백
  • Cross-platform Setup: python scripts/setup_ep.py 단일 슀크립트둜 Windows/Linux/macOS EP μ„€μΉ˜

v1.3.2

  • Bugfix: Fix Internal Server Error (index.html missing from build)
  • Bugfix: Fix frozen exe path resolution (sys._MEIPASS)
  • pywebview: Native desktop window as default, browser as fallback

v1.3.1

  • Sample Data: Built-in test images (bus.jpg, zidane.jpg) and video (people.mp4)
  • COCO128: Dataset download link in Settings tab
  • Bugfix: Fix frozen exe crash (sys.stderr=None in PyInstaller)

v1.3.0

  • Smart Sampler: Balanced mode now distributes target count equally across classes with farthest-point sampling for spatial diversity
  • Progress Bars: All tabs unified to explorer-style progress bar (20px height, % text overlay)
  • Remapper: Converted to async with progress tracking
  • Removed: Batch Inference tab (redundant with Viewer); Augmentation moved to Data section

v1.2.0

  • Explorer: Async loading with progress bar, double-click image preview with bbox overlay, multi-class checkbox filter, box operator filter (>=, =, <=), 5 view modes (file list, class distribution by box/image, box size distribution, aspect ratio distribution)
  • Splitter: Strategy selection (random / stratified), custom ratio inputs, 0-ratio skip, progress bar
  • Conf Optimizer: Per-class PR curve visualization, F1 display fix
  • Embedder: Multi-image cosine similarity comparison
  • Recursive folder support: Remapper, Merger, Sampler, Anomaly Detector, Quality Checker, Duplicate Detector
  • Merger: dHash threshold description and input binding
  • i18n: Korean translations for new UI elements

v1.1.0

  • Web UI overhaul with analysis tabs, class mapping, model downloads
  • Benchmark system info export
  • Rebrand to ssook

πŸ“„ License

MIT License