## Overview Support deploying the GPU backend on AWS (p4d/p5) and GCP (a2/a3) instances. ## Tasks - [ ] AWS p4d.24xlarge (8× A100) deployment scripts - [ ] AWS p5.48xlarge (8× H100) deployment scripts - [ ] GCP a2-highgpu-8g (8× A100) deployment scripts - [ ] Terraform/Pulumi IaC templates - [ ] Auto-scaling configuration - [ ] Spot instance support for cost savings - [ ] AMI/image baking with pre-loaded model weights ## Acceptance Criteria - One-command deployment to AWS or GCP - Model weights pre-loaded or cached for fast cold start - Spot instance fallback logic
Overview
Support deploying the GPU backend on AWS (p4d/p5) and GCP (a2/a3) instances.
Tasks
Acceptance Criteria