Computer vision edge deployment pipeline: MobileNetV3 fine-tuning, TFLite conversion, and quantization benchmarks.
Vision-Edge implements a lightweight object detection pipeline optimized for edge devices:
```
        Input Image (320x320x3)
                  |
      MobileNetV3Small (backbone)
                  |
        +---------+---------+
        |                   |
  Feature Map 1       Feature Map 2
   (mid-level)         (high-level)
        |                   |
    SSD Head 1          SSD Head 2
        |                   |
        +---------+---------+
                  |
      Concatenated Predictions
    (class_scores + box_offsets)
                  |
      Non-Maximum Suppression
                  |
          Final Detections
```
The backbone uses `tf.keras.applications.MobileNetV3Small` pretrained on ImageNet. It extracts multi-scale feature maps at two spatial resolutions, providing both fine-grained and semantic features for detection.
Key features:
- Width multiplier (`alpha`) for model size control
- Optional backbone freezing for transfer learning
- Lightweight inverted residual blocks with squeeze-and-excitation
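As a rough sketch of how a two-tap backbone like this can be assembled: the `build_backbone` helper below is illustrative only (the project's real code lives in `src/model/`), and it selects feature maps by spatial size rather than by layer name, since Keras layer names for MobileNetV3 vary across versions. For a 320x320 input, stride-16 and stride-32 taps correspond to 20x20 and 10x10 grids.

```python
import tensorflow as tf

def build_backbone(input_size=320, alpha=1.0):
    """Hypothetical helper: MobileNetV3Small trunk with two feature-map taps."""
    base = tf.keras.applications.MobileNetV3Small(
        input_shape=(input_size, input_size, 3),
        include_top=False,
        weights=None,  # use weights="imagenet" for the pretrained trunk
        alpha=alpha,
    )
    # Keep the last layer output seen at each target stride (16 and 32).
    taps = {}
    for layer in base.layers:
        shape = layer.output.shape
        if len(shape) == 4 and shape[1] in (input_size // 16, input_size // 32):
            taps[shape[1]] = layer.output
    return tf.keras.Model(
        base.input,
        [taps[input_size // 16], taps[input_size // 32]],  # mid-level, high-level
    )
```

Selecting taps by output stride keeps the sketch robust to layer-name changes between TF releases; the `alpha` argument is the same width multiplier listed above.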
Each feature map feeds into a Single Shot Detector (SSD) head that predicts:
- Class scores: `num_anchors * num_classes` per spatial location
- Box offsets: `num_anchors * 4` (dx, dy, dw, dh) per spatial location
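A minimal illustration of what each head computes (the `ssd_head` function here is a hypothetical stand-in, not the project's implementation): two parallel 3x3 convolutions produce the class and box tensors, which are then flattened so every row corresponds to one anchor at one spatial location.

```python
import tensorflow as tf

def ssd_head(feature_map, num_anchors, num_classes):
    """Illustrative SSD head: per-anchor class scores and box offsets."""
    cls = tf.keras.layers.Conv2D(num_anchors * num_classes, 3, padding="same")(feature_map)
    box = tf.keras.layers.Conv2D(num_anchors * 4, 3, padding="same")(feature_map)
    # Flatten the H x W x num_anchors grid: (batch, H*W*num_anchors, ...)
    cls = tf.reshape(cls, (tf.shape(cls)[0], -1, num_classes))
    box = tf.reshape(box, (tf.shape(box)[0], -1, 4))
    return cls, box
```

With a 20x20 feature map and 6 anchors per location, this yields 2400 anchor predictions, which are later concatenated with the other head's output before NMS.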
The SSD loss combines:
- Localization: Smooth L1 (Huber) loss on positive anchor box offsets
- Classification: Cross-entropy with hard negative mining (neg:pos ratio = 3:1)
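The two loss components can be sketched in plain NumPy; both helpers below are illustrative (the real loss is `src.model.losses.SSDLoss`). Hard negative mining keeps all positive anchors plus only the highest-loss negatives, capped at the stated 3:1 neg:pos ratio, so the easy background anchors do not dominate the classification term.

```python
import numpy as np

def smooth_l1(x, delta=1.0):
    """Smooth L1 (Huber) loss: quadratic near zero, linear beyond delta."""
    ax = np.abs(x)
    return np.where(ax < delta, 0.5 * ax ** 2 / delta, ax - 0.5 * delta)

def hard_negative_mask(cls_loss, positive_mask, neg_pos_ratio=3):
    """Select all positives plus the top-k hardest negatives per image.

    cls_loss:      (batch, num_anchors) per-anchor classification loss
    positive_mask: (batch, num_anchors) bool, True for matched anchors
    """
    num_pos = positive_mask.sum(axis=1)
    num_neg = np.minimum(neg_pos_ratio * num_pos, (~positive_mask).sum(axis=1))
    neg_loss = np.where(positive_mask, -np.inf, cls_loss)  # exclude positives
    order = np.argsort(-neg_loss, axis=1)  # hardest negatives first
    rank = np.argsort(order, axis=1)       # rank of each anchor in that order
    neg_mask = rank < num_neg[:, None]
    return positive_mask | neg_mask
```

The returned mask is then used to restrict the cross-entropy sum to the mined anchors, while the smooth L1 term is computed on positive anchors only.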
Three export variants are supported for deployment flexibility:
| Variant | Precision | Typical Size Reduction | Use Case |
|---|---|---|---|
| FP32 | 32-bit float | 1x (baseline) | Development, accuracy validation |
| FP16 | 16-bit float | ~2x | GPU-capable edge devices |
| INT8 | 8-bit integer | ~4x | Microcontrollers, mobile CPUs |
```python
from src.export.tflite_converter import TFLiteExporter
from src.export.quantization import create_representative_dataset

exporter = TFLiteExporter(model)

# Export all variants
rep_dataset = create_representative_dataset(calibration_data, input_shape=(320, 320, 3))
results = exporter.convert_all("exported_models/", representative_dataset=rep_dataset)
```

INT8 quantization uses a representative dataset (100 samples by default) to calibrate activation ranges, keeping accuracy degradation minimal.
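For reference, TFLite's converter expects the representative dataset as a generator that yields lists of batched float32 inputs. The `make_representative_dataset` helper below is a hypothetical sketch of that convention, not the project's `create_representative_dataset`:

```python
import numpy as np

def make_representative_dataset(samples, num_calibration=100):
    """Illustrative generator factory for TFLite INT8 activation calibration.

    `samples` is any array-like of preprocessed float32 images. Each yielded
    item is a list with one batched input, matching the tf.lite
    representative_dataset convention.
    """
    def gen():
        for img in samples[:num_calibration]:
            yield [np.expand_dims(np.asarray(img, dtype=np.float32), axis=0)]
    return gen
```

The resulting `gen` would be assigned to `converter.representative_dataset` before conversion; the 100-sample default mirrors the calibration size stated above.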
Example results on MobileNetV3-SSD (320x320, 10 classes):
| Variant | Size (MB) | Latency (ms) | mAP@0.5 | Size Reduction | Speedup |
|---|---|---|---|---|---|
| FP32 | 5.80 | 28.3 | 0.6820 | 1.0x | 1.0x |
| FP16 | 3.10 | 22.1 | 0.6815 | 1.9x | 1.3x |
| INT8 | 1.55 | 12.4 | 0.6680 | 3.7x | 2.3x |
Benchmarks measured on CPU with 50 inference runs and 5 warmup iterations.
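The measurement protocol (5 warmup iterations, then 50 timed runs) can be mirrored with a small timing harness; `measure_latency` below is an illustrative sketch around any single-inference callable, not the project's `measure_tflite_latency`:

```python
import time
import statistics

def measure_latency(run_fn, runs=50, warmup=5):
    """Time a single-inference callable in milliseconds.

    Warmup runs are discarded so one-time costs (op resolution, caches)
    do not skew the statistics.
    """
    for _ in range(warmup):
        run_fn()
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        run_fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(timings),
        "median_ms": statistics.median(timings),
        "std_ms": statistics.stdev(timings),
    }
```

In practice `run_fn` would wrap a TFLite interpreter's `invoke()` on a fixed input tensor so only inference time is measured.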
```bash
pip install -e .

# With development dependencies
pip install -e ".[dev]"
```

```python
from src.model.mobilenet_ssd import MobileNetSSD
from src.model.losses import SSDLoss
from src.data.dataset import create_synthetic_dataset

# Build model
model = MobileNetSSD(input_size=320, num_classes=10)

# Create dataset
dataset = create_synthetic_dataset(num_samples=1000, image_size=320, num_classes=10)

# Train
loss_fn = SSDLoss(num_classes=10)
model.compile(optimizer="adam", loss=loss_fn)
```

```python
from src.export.tflite_converter import TFLiteExporter

exporter = TFLiteExporter(model)
exporter.convert_all("exported_models/")
```

```python
from src.inference.tflite_engine import TFLiteEngine
from src.inference.nms import non_max_suppression

engine = TFLiteEngine("exported_models/model_fp32.tflite")
image = engine.preprocess(raw_image)
outputs = engine.predict(image)

# Post-process with NMS
boxes, scores, indices = non_max_suppression(
    outputs["boxes"], outputs["scores"], iou_threshold=0.45
)
```

```python
from src.benchmark.latency import measure_tflite_latency
from src.benchmark.report import generate_report

stats = measure_tflite_latency("exported_models/model_fp32.tflite")
print(f"Mean latency: {stats['mean_ms']:.1f} ms")
```

```
vision-edge/
+-- src/
|   +-- data/        # Data schemas, tf.data pipeline, augmentation
|   +-- model/       # MobileNetV3 backbone, SSD head, losses
|   +-- export/      # TFLite conversion, quantization, SavedModel
|   +-- benchmark/   # Latency, accuracy (mAP), report generation
|   +-- inference/   # TFLite engine, NMS post-processing
|   +-- deploy/      # Hugging Face Hub push utilities
|   +-- utils/       # Device config, visualization
+-- tests/           # Comprehensive test suite
+-- configs/         # YAML configuration files
+-- app.py           # Gradio demo application
+-- Dockerfile       # Container deployment
```

```bash
pytest tests/ -v
```

```bash
docker build -t vision-edge .
docker run -p 7860:7860 vision-edge
```

Apache-2.0