A production-grade example project demonstrating auto-scaling microservices on Kubernetes with Go. Features a complete multi-service architecture with HPA, monitoring, RBAC, network policies, and progressive deployment strategies.
```
┌────────────────────────────────────────────────────────────────────┐
│                         Kubernetes Cluster                         │
│                                                                    │
│  Internet ──▶ ┌─────────┐    ┌─────────┐    ┌───────────────┐      │
│               │ Ingress │───▶│   API   │───▶│    Worker     │      │
│               │         │    │ Gateway │    │ (auto-scaled) │      │
│               └────┬────┘    └────┬────┘    └───────┬───────┘      │
│                    │              │                 │              │
│                    │         ┌────┴─────┐    ┌──────┴──────┐       │
│                    │         │  Redis   │    │    Redis    │       │
│                    │         │ (cache)  │    │   (queue)   │       │
│                    │         └──────────┘    └─────────────┘       │
│                    │                                               │
│              ┌────┴─────┐    ┌────────────┐    ┌─────────┐         │
│              │ Frontend │    │ Prometheus │───▶│ Grafana │         │
│              │ (Go SPA) │    │            │    │         │         │
│              └──────────┘    └────────────┘    └─────────┘         │
└────────────────────────────────────────────────────────────────────┘
```
| Service | Description | Key K8s Features |
|---|---|---|
| API Gateway | HTTP API that accepts tasks, caches results in Redis, and queues work | HPA, rolling updates, liveness/readiness probes |
| Worker | Processes tasks from Redis queue, CPU-intensive workload simulation | HPA (CPU-based auto-scaling), PDB |
| Frontend | Lightweight Go server serving a dashboard UI | Deployment, health checks |
| Redis | Cache + message queue | StatefulSet, PVC, resource limits |
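The Worker's CPU-intensive simulation is what drives the autoscaler. A minimal, self-contained sketch of that pattern (illustrative only, not the project's actual source — the real worker pops tasks from the Redis queue, modeled here as a channel):

```go
// Sketch of a CPU-bound worker loop. The repeated hashing stands in for
// whatever expensive work raises CPU utilization and triggers the HPA.
package main

import (
	"crypto/sha256"
	"fmt"
)

// processTask simulates CPU-intensive work by repeatedly hashing the payload.
func processTask(payload string, rounds int) [32]byte {
	sum := sha256.Sum256([]byte(payload))
	for i := 1; i < rounds; i++ {
		sum = sha256.Sum256(sum[:])
	}
	return sum
}

func main() {
	// Stand-in for the Redis-backed task queue.
	queue := make(chan string, 2)
	queue <- "task-1"
	queue <- "task-2"
	close(queue)

	for task := range queue {
		digest := processTask(task, 10000)
		fmt.Printf("%s done: %x\n", task, digest[:4])
	}
}
```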
- HorizontalPodAutoscaler (HPA) — API and Worker scale 2→10 replicas based on CPU
- PodDisruptionBudget (PDB) — Ensures minimum availability during disruptions
- Rolling Updates — Zero-downtime deployments with `maxSurge`/`maxUnavailable`
- Health Probes — Liveness, readiness, and startup probes on all services
- RBAC — Least-privilege ServiceAccounts per service
- NetworkPolicies — Microsegmentation limiting pod-to-pod traffic
- Secrets — Redis auth via Kubernetes secrets
- SecurityContext — Non-root, read-only filesystem, dropped capabilities
- Prometheus — Metrics collection with ServiceMonitor CRDs
- Grafana — Pre-built dashboard for request rate, latency, queue depth
- ConfigMaps — Externalized configuration per environment
- Resource Requests/Limits — CPU and memory governance on every pod
- Namespaces — Logical isolation (`nexus` namespace)
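The CPU-based autoscaling described above (2→10 replicas at 60% CPU) is expressed as a `HorizontalPodAutoscaler`. A sketch of the shape of manifest that `k8s/worker/` would contain — names here are illustrative, so treat the files in the repo as authoritative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: worker            # illustrative name; see k8s/worker/ for the real manifest
  namespace: nexus
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: worker
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 60   # scale out when average CPU exceeds 60%
```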
- A Kubernetes cluster (minikube, kind, EKS, GKE, AKS)
- `kubectl` configured
- `docker` (to build images, or use pre-built)
- Metrics Server installed (for HPA)
```sh
# Create namespace and deploy everything
./scripts/deploy.sh

# Or step by step:
kubectl apply -f k8s/namespace.yaml
kubectl apply -f k8s/secrets/
kubectl apply -f k8s/configmaps/
kubectl apply -f k8s/rbac/
kubectl apply -f k8s/redis/
kubectl apply -f k8s/api-gateway/
kubectl apply -f k8s/worker/
kubectl apply -f k8s/frontend/
kubectl apply -f k8s/network-policies/
kubectl apply -f k8s/monitoring/
```

```sh
# Check all pods are running
kubectl -n nexus get pods

# Watch HPA in action
kubectl -n nexus get hpa --watch

# Generate load to trigger auto-scaling
./scripts/load-test.sh
```

```sh
# Build all service images
docker build -t nexus-api-gateway:latest -f services/api-gateway/Dockerfile services/api-gateway/
docker build -t nexus-worker:latest -f services/worker/Dockerfile services/worker/
docker build -t nexus-frontend:latest -f services/frontend/Dockerfile services/frontend/
```

```
.
├── k8s/                   # All Kubernetes manifests
│   ├── namespace.yaml
│   ├── api-gateway/       # API Gateway deployment, service, HPA
│   ├── worker/            # Worker deployment, service, HPA, PDB
│   ├── frontend/          # Frontend deployment, service
│   ├── redis/             # Redis StatefulSet, service, PVC
│   ├── configmaps/        # Externalized config per service
│   ├── secrets/           # Kubernetes secrets
│   ├── rbac/              # ServiceAccounts, Roles, RoleBindings
│   ├── network-policies/  # Pod-to-pod traffic rules
│   └── monitoring/        # Prometheus + Grafana configs
├── services/              # Go microservice source code
│   ├── api-gateway/
│   ├── worker/
│   └── frontend/
└── scripts/               # Deploy and load-test helpers
```
Run the load test to see HPA scale the API gateway and workers:
```sh
# Terminal 1: Watch pods scale up
kubectl -n nexus get pods --watch

# Terminal 2: Watch HPA metrics
kubectl -n nexus get hpa --watch

# Terminal 3: Generate sustained load
./scripts/load-test.sh
```

You should see pod replicas increase from 2 up to 10 as CPU utilization rises above 60%.
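The replica counts follow the standard HPA rule: desiredReplicas = ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the `minReplicas`/`maxReplicas` bounds. A quick illustration of the arithmetic (not project code) against the 60% target:

```go
// Worked example of the core HPA scaling formula.
package main

import (
	"fmt"
	"math"
)

// desiredReplicas computes ceil(current * currentUtil / targetUtil),
// before clamping to the HPA's min/max bounds.
func desiredReplicas(current int, currentUtil, targetUtil float64) int {
	return int(math.Ceil(float64(current) * currentUtil / targetUtil))
}

func main() {
	// 2 pods averaging 150% CPU against a 60% target -> scale to 5.
	fmt.Println(desiredReplicas(2, 150, 60)) // prints: 5
}
```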
MIT