Benchmark: MI300X multi-instance Wan 2.2 concurrency #17

## Goal
Determine the optimal number of concurrent Wan 2.2 instances per MI300X GPU.

Theoretical estimate: 4-9 instances per GPU (32-72 total on an 8-GPU node). This estimate needs validation on real hardware.
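
One way to read the 4-9 range is as a VRAM budget calculation. The sketch below assumes the MI300X's 192 GB of HBM3; the per-instance footprints (40 GB and 20 GB), the 8 GB reserve, and the helper name `max_instances` are hypothetical placeholders chosen for illustration, not measurements. The baseline run is what should supply real values.

```python
import math

MI300X_VRAM_GB = 192.0  # HBM3 capacity of a single MI300X
GPUS_PER_NODE = 8


def max_instances(per_instance_gb, reserve_gb=0.0, shared_weights_gb=0.0):
    """Upper bound on how many instances fit in one GPU's VRAM.

    per_instance_gb:   memory each instance needs on its own (only the
                       working memory if weights are shared)
    reserve_gb:        headroom kept free for runtime and fragmentation
    shared_weights_gb: one copy of the weights shared by all instances
    """
    if per_instance_gb <= 0:
        raise ValueError("per_instance_gb must be positive")
    usable = MI300X_VRAM_GB - reserve_gb - shared_weights_gb
    return max(0, math.floor(usable / per_instance_gb))


# Placeholder footprints (NOT measured): a heavy and a light configuration.
for footprint_gb in (40.0, 20.0):
    per_gpu = max_instances(footprint_gb, reserve_gb=8.0)
    print(f"{footprint_gb:.0f} GB/instance -> {per_gpu} per GPU, "
          f"{per_gpu * GPUS_PER_NODE} per node")
```

Memory capacity is only an upper bound: compute and memory bandwidth contention can make the throughput-optimal count lower, which is what the scale test is meant to find.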

## Tasks
- [ ] Get access to MI300X hardware (RunPod, Hot Aisle, or Azure ND MI300X v5)
- [ ] Run baseline: single instance peak VRAM + time per clip
- [ ] Scale test: 2, 3, 4 instances on the same GPU; measure throughput at each count
- [ ] Test weight sharing approach (shared model, independent working memory)
- [ ] Full node test: optimal per-GPU instance count × 8 GPUs
- [ ] Record all metrics (VRAM, bandwidth, time/clip, total throughput)
- [ ] Update docs/research/mi300x-benchmarking.md with results
- [ ] Set default `max_instances_per_gpu` in config based on findings
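
Once the scale test has produced numbers, the default for `max_instances_per_gpu` could be chosen with a rule like the one sketched here. Everything except the config key name is an assumption: `RunResult`, `pick_max_instances`, the 5% minimum-gain threshold, and the sample figures are hypothetical and only illustrate the selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    instances: int           # concurrent instances on one GPU
    peak_vram_gb: float      # peak VRAM summed over all instances
    seconds_per_clip: float  # mean wall time per clip, per instance

    @property
    def clips_per_hour(self) -> float:
        # Each instance produces one clip every seconds_per_clip seconds.
        return self.instances * 3600.0 / self.seconds_per_clip


def pick_max_instances(results, vram_limit_gb=192.0, min_gain=0.05):
    """Pick the largest instance count that fits in VRAM and still beats
    the previous best throughput by at least min_gain (a fraction)."""
    best = None
    for run in sorted(results, key=lambda r: r.instances):
        if run.peak_vram_gb > vram_limit_gb:
            break  # higher counts will not fit either
        if best is None or run.clips_per_hour >= best.clips_per_hour * (1 + min_gain):
            best = run
    if best is None:
        raise ValueError("no run fits within the VRAM limit")
    return best.instances


# Made-up sample data, NOT benchmark results.
sample = [
    RunResult(1, 30.0, 100.0),
    RunResult(2, 60.0, 110.0),
    RunResult(3, 90.0, 130.0),
    RunResult(4, 120.0, 170.0),
]
print("max_instances_per_gpu =", pick_max_instances(sample))
```

The minimum-gain threshold is there so that a marginal throughput improvement does not justify the extra VRAM pressure and per-clip latency of another instance; the right threshold is a judgment call to make once real data exists.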

## Details
See [docs/research/mi300x-benchmarking.md](docs/research/mi300x-benchmarking.md) for full protocol.

## Blocked By
Access to MI300X hardware.
