Context
Tracking the next architectural bet for ollamadiffuser. v2.0.14 added two new registry entries (FLUX.1-Kontext-dev, Chroma1-HD) that ride on existing strategies. The next batch needs a new strategy class — specifically for Apple Silicon (MLX) native inference.
Why MLX
ollamadiffuser currently routes everything through PyTorch (diffusers library). On Apple Silicon, PyTorch + MPS is slower than Apple's MLX framework for many ops (often 2-3× on the same hardware).
Concretely, the mlx-community org publishes MLX-quantized ports of major diffusion models, and mflux (0.17.5, April 2026) is the de-facto MLX inference library — supports FLUX.1, FLUX.2 (4B + 9B), Z-Image, FIBO, SeedVR2, Qwen-Image, Depth Pro.
Status of competition: ComfyUI, A1111, InvokeAI, diffusers-studio — none have a native MLX backend. This is a real differentiation opportunity for the "Ollama-for-X" niche.
Concrete plan
Phase 1: MLXStrategy base class
New file: ollamadiffuser/core/inference/strategies/mlx_strategy.py
Mirror the shape of FluxStrategy / GenericPipelineStrategy but route through mlx-community model packages. Registry entries opt in via model_type: "mlx" + parameters.mlx_backend: "mflux" | "mlx-community".
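A minimal sketch of how the opt-in could be validated, assuming a simplified registry entry. Only `MLXStrategy`, `model_type: "mlx"`, and the two `mlx_backend` values come from this issue. `RegistryEntry`, `EchoMLXStrategy`, and the `load` / `generate` method names are hypothetical and may not match the existing FluxStrategy / GenericPipelineStrategy interface.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict

# The two backend values named in this issue.
SUPPORTED_MLX_BACKENDS = ("mflux", "mlx-community")


@dataclass
class RegistryEntry:
    """Hypothetical, simplified view of a model registry entry."""
    name: str
    model_type: str
    parameters: Dict[str, Any] = field(default_factory=dict)


class MLXStrategy(ABC):
    """Sketch of the base class; method names are assumptions."""

    def __init__(self, entry: RegistryEntry) -> None:
        # Reject entries that did not opt in to the MLX path.
        if entry.model_type != "mlx":
            raise ValueError(f"{entry.name}: model_type must be 'mlx'")
        backend = entry.parameters.get("mlx_backend")
        if backend not in SUPPORTED_MLX_BACKENDS:
            raise ValueError(f"{entry.name}: unsupported mlx_backend {backend!r}")
        self.entry = entry
        self.backend = backend
        self.loaded = False

    @abstractmethod
    def load(self) -> None:
        """Load weights through the selected backend."""

    @abstractmethod
    def generate(self, prompt: str, **kwargs: Any) -> Any:
        """Run inference and return an image-like result."""


class EchoMLXStrategy(MLXStrategy):
    """Stand-in subclass that exercises the base class without MLX."""

    def load(self) -> None:
        self.loaded = True

    def generate(self, prompt: str, **kwargs: Any) -> Dict[str, Any]:
        if not self.loaded:
            raise RuntimeError("call load() before generate()")
        return {"backend": self.backend, "prompt": prompt}
```

Validating `mlx_backend` in the constructor means a misconfigured registry entry fails when the strategy is built, before any weights are downloaded or loaded.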
Phase 2: HiDream-O1-Image-Dev as first consumer
HiDream-ai/HiDream-O1-Image-Dev (MIT, May 8 2026):
- 8B unified transformer on Qwen3-VL backbone
- No VAE — predicts raw 32×32 RGB patches directly
- Already has an MLX port: mlx-community/HiDream-O1-Image-Dev-mlx-bf16
- 16 GB peak at 1024×1024 — fits the M4 16GB
- ~67s/image at 1024² on Apple Silicon
This is the natural first consumer of MLXStrategy because (a) it has zero diffusers integration, so the alternative is a custom HF transformers wrapper; (b) the MLX path is already published.
Phase 3: mflux integration (FLUX family)
After HiDream-O1 proves the pattern, extend MLXStrategy to support mflux:
- FLUX.1-schnell / dev (already in registry as PyTorch; would get an mlx variant)
- FLUX.2-klein-4B / 9B (gives M4 16GB users a usable FLUX.2 path)
- Z-Image-Turbo (already in registry — would get MLX acceleration)
Hardware constraints
Maintainer can only test on:
- Mac Pro M1 32GB (~24 GB UMA)
- Mac Mini M4 16GB (~12 GB UMA)
So this work is doubly good for the project: solves a real differentiation gap AND aligns with what the maintainer can actually develop on.
Effort estimate
- Phase 1 (base class): ~2-3 days
- Phase 2 (HiDream-O1): ~1-2 days on top of Phase 1
- Phase 3 (mflux): ~2-3 days
Tracking
Reply to this issue with implementation progress, or open child issues per phase if discussion warrants.
cc / community: interested in helping? PRs welcome — the test fixture pattern from v2.0.13 (tests/unit/test_*_endpoint.py uses TestClient + monkeypatched loaded_models) is the template for adding a new strategy without needing actual GPU weights.
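For contributors who have not seen that pattern, here is a rough stdlib-only illustration of the idea: swap a fake into the loaded-models table so the request path runs without weights. It is not the project's actual fixture. It uses `unittest.mock.patch.dict` in place of pytest's monkeypatch and omits TestClient; `loaded_models` here is a local stand-in dict, and `FakeStrategy` and `handle_generate` are hypothetical names.

```python
from unittest import mock

# Hypothetical stand-in for the server's table of loaded models.
loaded_models = {}


class FakeStrategy:
    """Fake that records calls instead of running inference."""

    def __init__(self):
        self.calls = []

    def generate(self, prompt, **kwargs):
        self.calls.append((prompt, kwargs))
        return b"fake-image-bytes"


def handle_generate(model_name, prompt, **kwargs):
    """Hypothetical endpoint body: look up the model and delegate to it."""
    strategy = loaded_models.get(model_name)
    if strategy is None:
        return 404, None
    return 200, strategy.generate(prompt, **kwargs)


def test_generate_uses_loaded_model():
    fake = FakeStrategy()
    # Patch the table only for the duration of the test.
    with mock.patch.dict(loaded_models, {"hidream-o1-mlx": fake}, clear=True):
        status, body = handle_generate("hidream-o1-mlx", "a red bicycle", steps=4)
        missing_status, _ = handle_generate("not-loaded", "a red bicycle")
    assert status == 200
    assert body == b"fake-image-bytes"
    assert fake.calls == [("a red bicycle", {"steps": 4})]
    assert missing_status == 404
    # The original (empty) table is restored after the context exits.
    assert loaded_models == {}
```

The same shape should carry over to a new strategy: the test asserts on what the endpoint passes to the strategy, not on any generated pixels.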