OllamaDiffuser 🎨

Project Status: Active Development

Thank you for the incredible support and over 30,000 downloads!

ollamadiffuser is in active development. v2.0 brought a major architecture overhaul (strategy pattern, MCP/OpenClaw integration, Apple Silicon support, GGUF). The May 2026 line (v2.0.13 → v2.0.17) added an MLX backend for Apple Silicon plus 7 new diffusers-pipeline models — see What's New below. Part of the LocalKinAI ecosystem.

🆕 What's New

v2.0.17 — MLX Phase 2.5: FLUX.1 family completion

MLXStrategy now covers the full FLUX.1 family on Apple Silicon: flux1-fill (inpaint/outpaint), flux1-redux (image variation), flux1-depth (depth-conditioned), flux1-controlnet (canny + upscaler). Five new registry entries — 14 MLX entries total (see the MLX section below).

v2.0.15–v2.0.16 — MLX Backend (Phases 1 + 2)

New MLXStrategy routes FLUX.1 / FLUX.2 / Z-Image / Qwen-Image / Kontext through mflux for native Apple Silicon inference. Typically 2-3× faster than the PyTorch + MPS path. Install with pip install 'ollamadiffuser[mlx]'. Tracks #7.

v2.0.14 — Diffusers Pipeline Additions

flux.1-kontext-dev (PyTorch) — 12B instruction-based image editing. Pass an input image + edit prompt; the model rewrites the image.
chroma1-hd — 8.9B Apache-2.0 base T2I (FLUX-schnell derivative). Rare commercial-friendly license at this quality tier.

v2.0.13 — Bug Fixes + Discussions

Fixed ollamadiffuser recommend crash on CUDA hosts (PyTorch attribute typo).
GitHub Discussions enabled: https://github.com/LocalKinAI/ollamadiffuser/discussions
18 broken org-name URLs fixed across PyPI metadata, README, and guides.

See CHANGELOG.md for the full history (back to v1.0.0, May 2025).


Local AI Image Generation with OllamaDiffuser

OllamaDiffuser simplifies local deployment of Stable Diffusion, FLUX, CogView4, Kolors, SANA, PixArt-Sigma, and 40+ other AI image generation models. An intuitive local SD tool inspired by Ollama's simplicity - perfect for local diffuser workflows with CLI, web UI, and LoRA support.

🌐 Website: ollamadiffuser.com | 📦 PyPI: pypi.org/project/ollamadiffuser

Upgrading from v1.x? v2.0 is a major rewrite requiring Python 3.10+. Run pip install --upgrade "ollamadiffuser[full]" and see the Migration Guide below.

🚀 Quick Start

For Mac/PC Users:

pip install "ollamadiffuser[full]"
ollamadiffuser recommend  # Find which models fit your GPU

For OpenClaw/Agent Users:

pip install "ollamadiffuser[mcp]"
ollamadiffuser mcp        # Starts the MCP server

For Low-VRAM / Budget GPU Users:

pip install "ollamadiffuser[gguf]"
ollamadiffuser pull flux.1-dev-gguf-q4ks  # Only 6GB VRAM needed
ollamadiffuser run flux.1-dev-gguf-q4ks

Most models work without any token -- just install and go. See Hugging Face Authentication when you want gated models like FLUX.1-dev or SD 3.5.

✨ Features

🏗️ Strategy Architecture: Clean per-model strategy pattern (SD1.5, SDXL, FLUX, SD3, ControlNet, Video, HiDream, GGUF, MLX, Generic)
🌐 60+ Models: FLUX.1/2, SD 3.5, SDXL Lightning, CogView4, Kolors, SANA, PixArt-Sigma, Z-Image, Qwen-Image, Chroma1, and more
🔌 Generic Pipeline: Add new diffusers models via registry config alone -- no code changes needed
🖼️ img2img & Inpainting: Image-to-image and inpainting support across SD1.5, SDXL, and the API/Web UI
⚡ Async API: Non-blocking FastAPI server using asyncio.to_thread for GPU operations
🎲 Random Seeds: Reproducible generation with explicit seeds, random by default
🎛️ ControlNet Support: Precise image generation control with 10+ control types (PyTorch + MLX)
🔄 LoRA Integration: Dynamic LoRA loading and management
🔌 MCP & OpenClaw: Model Context Protocol server for AI assistant integration (OpenClaw, Claude Code, Cursor)
🍎 Apple Silicon, two paths:
- MLX backend via mflux — 14 native MLX entries (FLUX.1 family, FLUX.2 Klein, Z-Image, Qwen-Image, Kontext, Fill, Redux, Depth, ControlNet). Typically 2-3× faster than the PyTorch + MPS path on M-series.
- PyTorch + MPS — full diffusers pipeline support with per-model dtype handling (float16/bfloat16, NaN sanitization), GGUF Metal acceleration, and ollamadiffuser recommend for hardware-aware model suggestions.
📦 Smart Downloads: ollamadiffuser pull downloads only diffusers pipeline files — skips root-level checkpoints, ONNX/Flax exports, and safety_checker. Saves 10–200 GB per model.
📦 GGUF Support: Memory-efficient quantized models (3GB VRAM minimum!) with CUDA and Metal acceleration
🌐 Multiple Interfaces: CLI, Python API, Web UI, and REST API
📦 Model Management: Easy installation and switching between models
⚡ Performance Optimized: Memory-efficient with GPU acceleration
🧪 Test Suite: 124 tests across settings, registry, engine, API, MPS, MLX, and MCP
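The Smart Downloads behaviour described in the list above can be pictured as a simple path filter. The following is a minimal sketch, not the project's actual implementation: the function name `keep_file` and the exact suffix sets are assumptions chosen for illustration.

```python
from pathlib import PurePosixPath

# Assumed suffix sets, for illustration only.
CHECKPOINT_SUFFIXES = {".safetensors", ".ckpt", ".bin", ".pt"}
EXPORT_SUFFIXES = {".onnx", ".onnx_data", ".msgpack"}  # ONNX and Flax weight exports


def keep_file(repo_path: str) -> bool:
    """Return True if a repository file looks like part of the diffusers pipeline."""
    path = PurePosixPath(repo_path)
    if not path.parts:
        return False
    if path.parts[0] == "safety_checker":
        return False  # the safety checker is skipped entirely
    if path.suffix in EXPORT_SUFFIXES:
        return False  # ONNX / Flax exports are not needed for PyTorch inference
    if len(path.parts) == 1 and path.suffix in CHECKPOINT_SUFFIXES:
        return False  # root-level single-file checkpoints duplicate the pipeline weights
    return True


files = [
    "model_index.json",
    "v1-5-pruned.safetensors",
    "unet/diffusion_pytorch_model.safetensors",
    "unet/diffusion_flax_model.msgpack",
    "safety_checker/model.safetensors",
]
kept = [f for f in files if keep_file(f)]
```

With the sample list, only `model_index.json` and the `unet/` PyTorch weights survive the filter.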

Option 1: Install from PyPI (Recommended)

# Install from PyPI
pip install ollamadiffuser

# Pull and run a model
ollamadiffuser pull flux.1-schnell
ollamadiffuser run flux.1-schnell

# Generate via API (seed is optional for reproducibility)
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "A beautiful sunset", "seed": 12345}' \
  --output image.png
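
The same request can be made from Python with only the standard library. This is a minimal sketch that mirrors the curl call above; the helper names `build_generate_request` and `generate_to_file` are hypothetical, and it assumes the server answers with raw image bytes, as the `--output image.png` flag suggests.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/api/generate"  # address used in the curl example


def build_generate_request(prompt, seed=None, url=API_URL):
    """Build (but do not send) a POST request for /api/generate."""
    payload = {"prompt": prompt}
    if seed is not None:
        payload["seed"] = seed  # leave the key out to get a random seed
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_to_file(prompt, path="image.png", seed=None):
    """Send the request and write the response body to `path`."""
    request = build_generate_request(prompt, seed=seed)
    with urllib.request.urlopen(request) as response:
        with open(path, "wb") as out:
            out.write(response.read())
    return path


# With a model running: generate_to_file("A beautiful sunset", seed=12345)
```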

🔄 Update to Latest Version

Always use the latest version for the newest features and bug fixes:

# Update to latest version
pip uninstall ollamadiffuser
pip install --no-cache-dir ollamadiffuser

This ensures you get:

🐛 Latest bug fixes
✨ New features and improvements
🚀 Performance optimizations
🔒 Security updates

GGUF Quick Start (Low VRAM)

# For systems with limited VRAM (3GB+)
pip install "ollamadiffuser[gguf]"

# Download memory-efficient GGUF model
ollamadiffuser pull flux.1-dev-gguf-q4ks

# Generate with reduced memory usage
ollamadiffuser run flux.1-dev-gguf-q4ks

Apple Silicon Quick Start (Mac Mini / MacBook)

# See which models fit your Mac
ollamadiffuser recommend

# Fast single-step model (<6GB)
ollamadiffuser pull sdxl-turbo
ollamadiffuser run sdxl-turbo

# GGUF with Metal acceleration (6GB, great quality)
pip install "ollamadiffuser[gguf]"
CMAKE_ARGS="-DSD_METAL=ON" pip install stable-diffusion-cpp-python
ollamadiffuser pull flux.1-dev-gguf-q4ks
ollamadiffuser run flux.1-dev-gguf-q4ks

Option 2: Development Installation

# Clone the repository
git clone https://github.com/LocalKinAI/ollamadiffuser.git
cd ollamadiffuser

# Install dependencies
pip install -e .

Basic Usage

# Check version
ollamadiffuser -V

# Install a model
ollamadiffuser pull stable-diffusion-1.5

# Run the model (loads and starts API server)
ollamadiffuser run stable-diffusion-1.5

# Generate an image via API
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a beautiful sunset over mountains"}' \
  --output image.png

# Start web interface
ollamadiffuser --mode ui

open http://localhost:8001

ControlNet Quick Start

# Install ControlNet model
ollamadiffuser pull controlnet-canny-sd15

# Run ControlNet model (loads and starts API server)
ollamadiffuser run controlnet-canny-sd15

# Generate with control image
curl -X POST http://localhost:8000/api/generate/controlnet \
  -F "prompt=a beautiful landscape" \
  -F "control_image=@your_image.jpg" \
  --output controlnet_result.png

🔑 Hugging Face Authentication

Do you need a Hugging Face token? It depends on which models you want to use!

Models that DON'T require a token -- ready to use right away:

FLUX.1-schnell, Stable Diffusion 1.5, DreamShaper, PixArt-Sigma, SANA 1.5, most ControlNet models

Models that DO require a token:

FLUX.1-dev, Stable Diffusion 3.5, some premium LoRAs

Setup (only needed for gated models):

# 1. Create account at https://huggingface.co and generate an access token
# 2. Accept license on the model page (e.g. FLUX.1-dev, SD 3.5)
# 3. Set your token
export HF_TOKEN=your_token_here

# 4. Now you can access gated models
ollamadiffuser pull flux.1-dev
ollamadiffuser pull stable-diffusion-3.5-medium

Tips: Use "read" permissions for the token. Your token stays local -- never shared with OllamaDiffuser servers. Add export HF_TOKEN=... to ~/.bashrc or ~/.zshrc to make it permanent.
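
A script that pulls models can check for the token up front instead of failing mid-download. This is a rough sketch: `token_problem` is a hypothetical helper, and the gated set only contains the registry names this README lists as requiring a token, so it is not exhaustive.

```python
import os

# Registry names this README lists as gated (illustrative, not exhaustive).
GATED_MODELS = {
    "flux.1-dev",
    "stable-diffusion-3.5-medium",
    "stable-diffusion-3.5-large",
    "stable-diffusion-3.5-large-turbo",
}


def token_problem(model_name, env=None):
    """Return a warning string if a gated model is requested without HF_TOKEN."""
    env = os.environ if env is None else env
    if model_name in GATED_MODELS and not env.get("HF_TOKEN"):
        return (
            f"{model_name} is gated: accept its license on Hugging Face and "
            f"export HF_TOKEN before running 'ollamadiffuser pull {model_name}'"
        )
    return None


warning = token_problem("flux.1-dev")
if warning:
    print(warning)
```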

🎯 Supported Models

Choose from 40+ models spanning every major architecture:

Core Models

Model	Type	Steps	VRAM	Commercial	License
`flux.1-schnell`	flux	4	16GB+	✅	Apache 2.0
`flux.1-dev`	flux	20	20GB+	❌	Non-commercial
`stable-diffusion-3.5-medium`	sd3	28	8GB+	⚠️	Stability AI
`stable-diffusion-3.5-large`	sd3	28	12GB+	⚠️	Stability AI
`stable-diffusion-3.5-large-turbo`	sd3	4	12GB+	⚠️	Stability AI
`stable-diffusion-xl-base`	sdxl	50	6GB+	⚠️	CreativeML
`stable-diffusion-1.5`	sd15	50	4GB+	⚠️	CreativeML

Next-Generation Models

Model	Origin	Params	Steps	VRAM	Commercial	License
`flux.2-dev`	Black Forest Labs	32B	28	14GB+	❌	Non-commercial
`flux.2-klein-4b`	Black Forest Labs	4B	28	10GB+	✅	Apache 2.0
`z-image-turbo`	Alibaba (Tongyi)	6B	8	10GB+	✅	Apache 2.0
`sana-1.5`	NVIDIA	1.6B	20	8GB+	✅	Apache 2.0
`cogview4`	Zhipu AI	6B	50	12GB+	✅	Apache 2.0
`kolors`	Kuaishou	8.6B	50	8GB+	✅	Kolors License
`hunyuan-dit`	Tencent	1.5B	50	6GB+	✅	Tencent Community
`lumina-2`	Alpha-VLLM	2B	30	8GB+	✅	Apache 2.0
`pixart-sigma`	PixArt	0.6B	20	6GB+	✅	Open
`auraflow`	Fal	6.8B	50	12GB+	✅	Apache 2.0
`omnigen`	BAAI	3.8B	50	12GB+	✅	MIT

Fast / Turbo Models

Model	Steps	VRAM	Notes
`sdxl-turbo`	1	6GB+	Single-step distilled SDXL
`sdxl-lightning-4step`	4	6GB+	ByteDance, single-file checkpoint, custom scheduler
`stable-diffusion-3.5-large-turbo`	4	12GB+	Distilled SD 3.5 Large
`z-image-turbo`	8	10GB+	Alibaba 6B turbo

Community Fine-Tunes

Model	Base	Notes
`realvisxl-v4`	SDXL	Photorealistic, very popular
`dreamshaper`	SD 1.5	Versatile artistic model
`realistic-vision-v6`	SD 1.5	Portrait specialist

FLUX Pipeline Variants

Model	Pipeline	Use Case
`flux.1-kontext-dev`	FluxKontextPipeline	Instruction-based image editing — pass an input image + edit prompt (added v2.0.14)
`flux.1-fill-dev`	FluxFillPipeline	Inpainting / outpainting
`flux.1-canny-dev`	FluxControlPipeline	Canny edge control
`flux.1-depth-dev`	FluxControlPipeline	Depth map control

Apache-2.0 Commercial-Friendly

Model	Origin	Params	Notes
`chroma1-hd`	lodestones	8.9B	FLUX-schnell derivative with custom MMDiT masking + 250M timestep FFN (added v2.0.14)
`flux.1-schnell`	Black Forest Labs	12B	4-step distilled
`flux.2-klein-4b`	Black Forest Labs	4B	FLUX.2 family, MPS-friendly
`z-image-turbo`	Alibaba (Tongyi)	6B	8-step DMD
`sana-1.5`	NVIDIA	1.6B	Fastest >1024² generation
`cogview4`	Zhipu AI	6B	Multilingual including CJK
`pixart-sigma`	PixArt	0.6B	Fits 6GB GPUs
`lumina-2`	Alpha-VLLM	2B	Open multimodal foundation
`auraflow`	Fal	6.8B	Latest open MMDiT
`omnigen`	BAAI	3.8B	Unified gen + edit

💾 GGUF Models - Reduced Memory Requirements

GGUF quantized models enable running FLUX.1-dev on budget hardware:

GGUF Variant	VRAM	Quality	Best For
`flux.1-dev-gguf-q4ks`	6GB	⭐⭐⭐⭐	Recommended - RTX 3060/4060
`flux.1-dev-gguf-q3ks`	4GB	⭐⭐⭐	Mobile GPUs, GTX 1660 Ti
`flux.1-dev-gguf-q2k`	3GB	⭐⭐	Entry-level hardware
`flux.1-dev-gguf-q6k`	10GB	⭐⭐⭐⭐⭐	RTX 3080/4070+

📖 Complete GGUF Guide - Hardware recommendations, installation, and optimization tips
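
The table can be turned into a small lookup that picks the highest-quality variant fitting a given VRAM budget. This is an illustrative sketch; `pick_gguf_variant` is a hypothetical helper, and the numbers are copied from the table rather than measured.

```python
# (registry name, VRAM in GB), ordered from smallest to largest, per the table above.
GGUF_VARIANTS = [
    ("flux.1-dev-gguf-q2k", 3),
    ("flux.1-dev-gguf-q3ks", 4),
    ("flux.1-dev-gguf-q4ks", 6),
    ("flux.1-dev-gguf-q6k", 10),
]


def pick_gguf_variant(vram_gb):
    """Return the largest listed variant whose VRAM need fits, or None."""
    fitting = [name for name, needed in GGUF_VARIANTS if needed <= vram_gb]
    return fitting[-1] if fitting else None


print(pick_gguf_variant(8))
```

For an 8 GB card this selects `flux.1-dev-gguf-q4ks`, which matches the table's recommendation for the RTX 3060/4060 class.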

🍎 MLX Models — Apple Silicon native

MLX entries run through mflux on Apple Silicon (M1/M2/M3/M4). On M-series hardware they are typically 2-3× faster than the same model on the PyTorch + MPS path. Install with pip install 'ollamadiffuser[mlx]'.

Text-to-image:

Entry	Family	Quant	Disk	Recommended VRAM	License
`flux.1-schnell-mlx`	FLUX.1	Q8	14 GB	16 GB (M1 32GB)	Apache 2.0
`flux.1-schnell-mlx-q4`	FLUX.1	Q4	8 GB	12 GB (M4 16GB)	Apache 2.0
`flux.1-dev-mlx`	FLUX.1	Q8	14 GB	16 GB	Non-Commercial
`flux.2-klein-4b-mlx`	FLUX.2 Klein	Q8	7 GB	12 GB (M4 16GB)	Apache 2.0
`flux.2-klein-9b-mlx`	FLUX.2 Klein	Q8	13 GB	20 GB	Apache 2.0
`z-image-turbo-mlx`	Z-Image (6B, 8-step DMD)	Q8	8 GB	12 GB (M4 16GB)	Apache 2.0
`qwen-image-mlx`	Qwen-Image (20B)	Q8	22 GB	24 GB	Apache 2.0

Image editing / control:

Entry	Required inputs	License
`flux.1-kontext-dev-mlx`	`image=`	Non-Commercial
`flux.1-fill-dev-mlx`	`image=`, `mask_image=`	Non-Commercial
`flux.1-redux-dev-mlx`	`redux_images=[...]`	Non-Commercial
`flux.1-depth-dev-mlx`	`image=`	Non-Commercial
`flux.1-controlnet-canny-mlx`	`control_image=` (canny edges)	Non-Commercial
`flux.1-controlnet-upscaler-mlx`	`control_image=` (low-res source)	Non-Commercial
`qwen-image-edit-mlx`	`image=`	Apache 2.0
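
Because each editing entry expects different inputs, it can help to validate arguments before starting a generation. The sketch below encodes the table above as a dictionary; `missing_inputs` is a hypothetical helper, not part of ollamadiffuser.

```python
# Required keyword inputs per MLX editing/control entry, copied from the table above.
MLX_REQUIRED_INPUTS = {
    "flux.1-kontext-dev-mlx": {"image"},
    "flux.1-fill-dev-mlx": {"image", "mask_image"},
    "flux.1-redux-dev-mlx": {"redux_images"},
    "flux.1-depth-dev-mlx": {"image"},
    "flux.1-controlnet-canny-mlx": {"control_image"},
    "flux.1-controlnet-upscaler-mlx": {"control_image"},
    "qwen-image-edit-mlx": {"image"},
}


def missing_inputs(entry, **inputs):
    """Return the sorted names of required inputs that were not supplied."""
    required = MLX_REQUIRED_INPUTS.get(entry, set())
    provided = {name for name, value in inputs.items() if value is not None}
    return sorted(required - provided)


print(missing_inputs("flux.1-fill-dev-mlx", image="photo.png"))
```

Text-to-image entries are absent from the dictionary, so they report no missing inputs.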

Hardware fit at a glance:

Mac Mini M4 16 GB can run the entries annotated "(M4 16GB)" in the text-to-image table above (Q4 FLUX.1-schnell, FLUX.2 Klein 4B, Z-Image-Turbo).
Mac Pro M1 32 GB / M2 Pro 32 GB+ can run all entries except the 20B Qwen-Image at the larger resolutions.

Quick start:

pip install 'ollamadiffuser[mlx]'
ollamadiffuser pull z-image-turbo-mlx    # compact Apache-2.0 option (8 GB on disk)
ollamadiffuser run z-image-turbo-mlx

🎛️ ControlNet Features

⚡ Lazy Loading Architecture

New in v1.1.0: ControlNet preprocessors use intelligent lazy loading:

Instant Startup: ollamadiffuser --help runs immediately without downloading models
On-Demand Loading: Preprocessors initialize only when actually needed
Automatic Initialization: Seamless loading when uploading control images
User Control: Manual initialization available for pre-loading

Available Control Types

Canny Edge Detection: Structural control with edge maps
Depth Estimation: 3D structure control with depth maps
OpenPose: Human pose and body position control
Scribble/Sketch: Artistic control with hand-drawn inputs
Advanced Types: HED, MLSD, Normal, Lineart, Anime Lineart, Content Shuffle

ControlNet Models

# SD 1.5 ControlNet Models
ollamadiffuser pull controlnet-canny-sd15
ollamadiffuser pull controlnet-depth-sd15
ollamadiffuser pull controlnet-openpose-sd15
ollamadiffuser pull controlnet-scribble-sd15

# SDXL ControlNet Models
ollamadiffuser pull controlnet-canny-sdxl
ollamadiffuser pull controlnet-depth-sdxl

🔄 LoRA Support

Dynamic LoRA Management

# Download LoRA from Hugging Face
ollamadiffuser lora pull "openfree/flux-chatgpt-ghibli-lora"

# Load LoRA with custom strength
ollamadiffuser lora load ghibli --scale 1.2

# Unload LoRA
ollamadiffuser lora unload

Web UI LoRA Integration

Easy Download: Enter Hugging Face repository ID
Strength Control: Adjust LoRA influence with sliders
Real-time Loading: Load/unload LoRAs without restarting
Alias Support: Create custom names for your LoRAs

🌐 Multiple Interfaces

Command Line Interface

# Pull and run a model
ollamadiffuser pull stable-diffusion-1.5
ollamadiffuser run stable-diffusion-1.5

# Model registry management
ollamadiffuser registry list
ollamadiffuser registry list --installed-only
ollamadiffuser registry check-gguf

# Configuration management
ollamadiffuser config                                    # show all config
ollamadiffuser config set models_dir /mnt/ssd/models     # custom model path
ollamadiffuser config set server.port 9000               # change server port

# In another terminal, generate images via API
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "a futuristic cityscape",
    "negative_prompt": "blurry, low quality",
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "width": 1024,
    "height": 1024
  }' \
  --output image.png

Web UI

# Start web interface
ollamadiffuser --mode ui
# Then open http://localhost:8001 in your browser

Features:

Responsive Design: Works on desktop and mobile
Real-time Status: Model and LoRA loading indicators
ControlNet Integration: File upload with preprocessing
Parameter Controls: Intuitive sliders and inputs

REST API

# Start API server
ollamadiffuser --mode api

# In another terminal, load a model
ollamadiffuser load stable-diffusion-1.5

# Text-to-image
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a beautiful landscape", "width": 1024, "height": 1024, "seed": 42}' \
  --output image.png

# Image-to-image
curl -X POST http://localhost:8000/api/generate/img2img \
  -F "prompt=oil painting style" \
  -F "strength=0.75" \
  -F "image=@input.png" \
  --output result.png

# Inpainting
curl -X POST http://localhost:8000/api/generate/inpaint \
  -F "prompt=a red car" \
  -F "image=@photo.png" \
  -F "mask=@mask.png" \
  --output inpainted.png

# API docs: http://localhost:8000/docs
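
The Features list says the server stays non-blocking by running GPU work through `asyncio.to_thread`. The pattern can be sketched without FastAPI or a GPU; here `blocking_generate` is a stand-in that only sleeps, and every name in the block is hypothetical.

```python
import asyncio
import time


def blocking_generate(prompt: str) -> str:
    """Stand-in for a blocking, GPU-bound generation call."""
    time.sleep(0.2)
    return f"image for {prompt!r}"


async def handle_request(prompt: str) -> str:
    # Run the blocking call in a worker thread so the event loop stays free.
    return await asyncio.to_thread(blocking_generate, prompt)


async def heartbeat(ticks: list) -> None:
    # Keeps ticking while generation runs, showing the loop is not blocked.
    for _ in range(3):
        await asyncio.sleep(0.05)
        ticks.append(time.monotonic())


async def main():
    ticks = []
    result, _ = await asyncio.gather(handle_request("a red car"), heartbeat(ticks))
    return result, len(ticks)


result, tick_count = asyncio.run(main())
```

If `handle_request` called `blocking_generate` directly, the heartbeat could not run until generation finished; with `to_thread` both make progress concurrently.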

MCP Server (AI Assistant Integration)

OllamaDiffuser includes a Model Context Protocol server for integration with AI assistants like OpenClaw, Claude Code, and Cursor.

# Install MCP support
pip install "ollamadiffuser[mcp]"

# Start MCP server (stdio transport)
ollamadiffuser mcp

MCP client configuration (e.g. claude_desktop_config.json):

{
  "mcpServers": {
    "ollamadiffuser": {
      "command": "ollamadiffuser-mcp"
    }
  }
}
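
If a client config file already lists other MCP servers, the entry can be merged in rather than pasted over the file. A minimal sketch, assuming the config is plain JSON with a top-level `mcpServers` object as shown above; `add_mcp_server` is a hypothetical helper.

```python
import json


def add_mcp_server(config: dict) -> dict:
    """Return a copy of `config` with the ollamadiffuser MCP entry added."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["ollamadiffuser"] = {"command": "ollamadiffuser-mcp"}
    merged["mcpServers"] = servers
    return merged


existing = {"mcpServers": {"other-tool": {"command": "other-tool-mcp"}}}
print(json.dumps(add_mcp_server(existing), indent=2))
```

The `other-tool` entry is a made-up placeholder; the original dictionary is left unmodified.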

Available MCP tools:

generate_image -- Generate images from text prompts (auto-loads model)
list_models -- List available and installed models
load_model -- Load a model into memory
get_status -- Check device, loaded model, and system status

OpenClaw AgentSkill

An OpenClaw skill is included at integrations/openclaw/SKILL.md. It uses the REST API with response_format=b64_json for agent-friendly base64 image responses. Copy the skill directory to your OpenClaw skills folder or publish to ClawHub.

Base64 JSON API Response

For AI agents and messaging platforms, use response_format=b64_json to get images as JSON:

curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "a sunset over mountains", "response_format": "b64_json"}'

Response: {"image": "<base64 PNG>", "format": "png", "width": 1024, "height": 1024}
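
On the client side the JSON body has to be base64-decoded before it is usable as an image file. This is a small sketch assuming the response shape shown above; `save_b64_response` is a hypothetical helper and the sample body is synthetic, not real server output.

```python
import base64
import json


def save_b64_response(body: str, path: str):
    """Decode a b64_json response body, write the image, return (width, height)."""
    data = json.loads(body)
    image_bytes = base64.b64decode(data["image"])
    with open(path, "wb") as out:
        out.write(image_bytes)
    return data["width"], data["height"]


# Synthetic body for illustration; a real one comes from POST /api/generate.
sample_body = json.dumps({
    "image": base64.b64encode(b"not-a-real-png").decode("ascii"),
    "format": "png",
    "width": 1024,
    "height": 1024,
})
```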

Python API

from ollamadiffuser.core.models.manager import model_manager

# Load model
success = model_manager.load_model("stable-diffusion-1.5")
if success:
    engine = model_manager.loaded_model

    # Text-to-image (seed is optional; omit for random)
    image = engine.generate_image(
        prompt="a beautiful sunset",
        width=512,   # SD 1.5 is trained at 512x512
        height=512,
        seed=42,
    )
    image.save("output.jpg")

    # Image-to-image
    from PIL import Image
    input_img = Image.open("photo.jpg")
    result = engine.generate_image(
        prompt="watercolor painting",
        image=input_img,
        strength=0.7,
    )
    result.save("img2img_output.jpg")
else:
    print("Failed to load model")

📦 Model Ecosystem

Base Models

Stable Diffusion 1.5: Classic, reliable, fast (img2img + inpainting)
Stable Diffusion XL: High-resolution, detailed (img2img + inpainting, scheduler overrides)
Stable Diffusion 3.5: Medium, Large, and Large Turbo variants
FLUX.1: schnell, dev, Fill, Canny, Depth pipeline variants
HiDream: Multi-prompt generation with bfloat16
AnimateDiff: Video/animation generation

Next-Generation Models

FLUX.2: 32B dev and 4B Klein variants from Black Forest Labs
Chinese Models: CogView4 (Zhipu), Kolors (Kuaishou), Hunyuan-DiT (Tencent), Z-Image (Alibaba)
Efficient Models: SANA 1.5 (1.6B), PixArt-Sigma (0.6B) -- high quality at low VRAM
Open Models: AuraFlow (6.8B, Apache 2.0), OmniGen (3.8B, MIT), Lumina 2.0 (2B, Apache 2.0)

Fast / Turbo Models

SDXL Turbo: Single-step inference from Stability AI
SDXL Lightning: 4-step single-file checkpoint from ByteDance (6.5 GB download)
Z-Image Turbo: 8-step turbo from Alibaba

Community Fine-Tunes

RealVisXL V4: Photorealistic SDXL, very popular
DreamShaper: Versatile artistic SD 1.5 model
Realistic Vision V6: Portrait specialist

GGUF Quantized Models

FLUX.1-dev GGUF: 7 quantization levels (3GB-16GB VRAM)
Memory Efficient: Run high-quality models on budget hardware
Optional Install: pip install "ollamadiffuser[gguf]"

ControlNet Models

SD 1.5 ControlNet: 4 control types (canny, depth, openpose, scribble)
SDXL ControlNet: 2 control types (canny, depth)

LoRA Support

Hugging Face Integration: Direct download from HF Hub
Local LoRA Files: Support for local .safetensors files
Dynamic Loading: Load/unload without model restart
Strength Control: Adjustable influence (0.1-2.0)

⚙️ Architecture

Strategy Pattern Engine

Each model type has a dedicated strategy class handling loading and generation:

InferenceEngine (facade)
  -> SD15Strategy            (512x512, float16 on MPS, img2img, inpainting)
  -> SDXLStrategy            (1024x1024, float16 on MPS, diffusers force_upcast, img2img, inpainting, scheduler overrides, single-file)
  -> FluxStrategy            (schnell/dev/Fill/Canny/Depth, bfloat16 on MPS, dynamic pipeline class)
  -> SD3Strategy             (1024x1024, float16 on MPS, 28 steps, guidance=3.5)
  -> ControlNetStrategy      (SD15 + SDXL, float16 on MPS, SDXL uses diffusers force_upcast)
  -> VideoStrategy           (AnimateDiff, float16 on MPS, 16 frames)
  -> HiDreamStrategy         (bfloat16 on MPS, multi-prompt)
  -> GGUFStrategy            (quantized via stable-diffusion-cpp)
  -> MLXStrategy             (FLUX.1 / FLUX.2 / Z-Image / Qwen-Image / Kontext via mflux on Apple Silicon)
  -> GenericPipelineStrategy (any diffusers pipeline via config, per-model dtype on MPS, opt-in VAE upcast)

The GenericPipelineStrategy dynamically loads any diffusers pipeline class specified in the model registry, so new models can be added with zero code changes.
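
The facade-plus-strategies layout can be illustrated with a toy version. The classes below borrow names from the diagram purely for readability; they are a sketch of the pattern, not the project's code, and the `describe` method and `STRATEGIES` mapping are invented for the example.

```python
class BaseStrategy:
    """Toy base class: each strategy carries its own defaults."""

    default_size = (512, 512)

    def describe(self, model_name: str) -> str:
        width, height = self.default_size
        return f"{type(self).__name__} handles {model_name} at {width}x{height}"


class SD15Strategy(BaseStrategy):
    default_size = (512, 512)


class SDXLStrategy(BaseStrategy):
    default_size = (1024, 1024)


class GenericPipelineStrategy(BaseStrategy):
    default_size = (1024, 1024)  # illustrative default, not taken from the project


# Model type (as in the registry's "Type" column) -> strategy class.
STRATEGIES = {"sd15": SD15Strategy, "sdxl": SDXLStrategy}


class InferenceEngine:
    """Facade: picks a strategy by model type, falling back to the generic one."""

    def __init__(self, model_type: str):
        strategy_cls = STRATEGIES.get(model_type, GenericPipelineStrategy)
        self.strategy = strategy_cls()

    def describe(self, model_name: str) -> str:
        return self.strategy.describe(model_name)


print(InferenceEngine("sd15").describe("stable-diffusion-1.5"))
```

The fallback in `InferenceEngine.__init__` is what lets an unknown model type work without a dedicated class, which is the idea behind adding models through registry config alone.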

Configuration

Models are automatically configured with optimal settings:

Memory Optimization: Attention slicing, CPU offloading
Device Detection: Automatic CUDA/MPS/CPU selection
Precision Handling: FP16/BF16 per model type
Safety Disabled: Unified SAFETY_DISABLED_KWARGS (no monkey-patching)
Smart Downloads: Pipeline-only filtering by model type — skips ONNX, Flax, root checkpoints, and safety_checker
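
Device detection and per-model precision can be sketched as follows. This is an approximation for illustration: `detect_device` and `mps_dtype_name` are hypothetical helpers, the dtype table only restates the MPS notes from the strategy diagram above, and the code falls back to CPU when PyTorch is not installed.

```python
def detect_device() -> str:
    """Pick CUDA, then MPS, then CPU, using PyTorch's availability checks."""
    try:
        import torch
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


# MPS dtypes as listed in the strategy diagram in this README.
MPS_DTYPES = {
    "sd15": "float16",
    "sdxl": "float16",
    "sd3": "float16",
    "flux": "bfloat16",
    "hidream": "bfloat16",
}


def mps_dtype_name(model_type: str):
    """Return the dtype name for a model type on MPS, or None if not listed."""
    return MPS_DTYPES.get(model_type)


print(detect_device(), mps_dtype_name("flux"))
```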

🔧 Advanced Usage

ControlNet Parameters

# Fine-tune ControlNet behavior
# Assumes `engine` is a loaded ControlNet model (see Python API above)
# and `control_img` is a preprocessed control image.
image = engine.generate_image(
    prompt="architectural masterpiece",
    control_image=control_img,
    controlnet_conditioning_scale=1.2,  # Strength (0.0-2.0)
    control_guidance_start=0.0,         # When to start (0.0-1.0)
    control_guidance_end=1.0            # When to end (0.0-1.0)
)

GGUF Model Usage

# Check GGUF support
ollamadiffuser registry check-gguf

# Download GGUF model for your hardware
ollamadiffuser pull flux.1-dev-gguf-q4ks  # 6GB VRAM
ollamadiffuser pull flux.1-dev-gguf-q3ks  # 4GB VRAM

# Use with optimized settings
ollamadiffuser run flux.1-dev-gguf-q4ks

Batch Processing

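from ollamadiffuser.core.models.manager import model_manager
from ollamadiffuser.core.utils.controlnet_preprocessors import controlnet_preprocessor

# Load a ControlNet model and get its engine
model_manager.load_model("controlnet-canny-sd15")
engine = model_manager.loaded_model

# Pre-initialize for faster processing
controlnet_preprocessor.initialize()

# Process multiple images
image_list = ["room1.jpg", "room2.jpg"]  # paths to your own control images
prompt = "beautiful landscape"
for i, image_path in enumerate(image_list):
    control_img = controlnet_preprocessor.preprocess(image_path, "canny")
    result = engine.generate_image(prompt, control_image=control_img)
    result.save(f"output_{i}.jpg")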

API Integration

import requests

# Initialize ControlNet preprocessors
response = requests.post("http://localhost:8000/api/controlnet/initialize")

# Check available preprocessors
response = requests.get("http://localhost:8000/api/controlnet/preprocessors")
print(response.json()["available_types"])

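# Generate with file upload
with open("control.jpg", "rb") as f:
    response = requests.post(
        "http://localhost:8000/api/generate/controlnet",
        data={"prompt": "beautiful landscape"},
        files={"control_image": f},
    )
response.raise_for_status()

# Save the result (assumes the endpoint returns image bytes, as /api/generate does)
with open("controlnet_result.png", "wb") as out:
    out.write(response.content)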

📚 Documentation & Guides

GGUF Models Guide: Complete guide to memory-efficient GGUF models
ControlNet Guide: Comprehensive ControlNet usage and examples
Website Documentation: Complete tutorials and guides

🚀 Performance & Hardware

Minimum Requirements

RAM: 8GB system RAM
Storage: 10GB free space
Python: 3.10+

Recommended Hardware

For Regular Models

GPU: 8GB+ VRAM (NVIDIA/AMD)
RAM: 16GB+ system RAM
Storage: SSD with 50GB+ free space

For Apple Silicon (Mac Mini / MacBook)

16GB unified memory: SANA 1.5, Lumina 2.0, DreamShaper, SD 1.5, SDXL/SDXL Turbo, GGUF q2k-q5ks
24GB+ unified memory: CogView4, Hunyuan-DiT, FLUX.1-schnell, GGUF q6k-q8
32GB unified memory: Kolors, SD 3.5 Large, all MPS-supported models
GGUF with Metal: Install with CMAKE_ARGS="-DSD_METAL=ON" for GPU acceleration
Note: CPU offload does not help on Apple Silicon (unified memory) -- the full model must fit in RAM
Run ollamadiffuser recommend to see what fits your hardware

For GGUF Models (Memory Efficient)

GPU: 3GB+ VRAM (or CPU only)
RAM: 8GB+ system RAM (16GB+ for CPU inference)
Storage: SSD with 20GB+ free space

Supported Platforms

CUDA: NVIDIA GPUs (recommended)
MPS: Apple Silicon (M1/M2/M3/M4) -- native support for 30+ models including GGUF
CPU: All platforms (slower but functional)

🔧 Troubleshooting

Installation Issues

Missing Dependencies (cv2/OpenCV Error)

If you encounter ModuleNotFoundError: No module named 'cv2', run:

# Quick fix
pip install "opencv-python>=4.8.0"

# Or use the built-in verification tool
ollamadiffuser verify-deps

# Or install with all optional dependencies
# For bash/sh:
pip install ollamadiffuser[full]

# For zsh (macOS default):
pip install "ollamadiffuser[full]"

# For fish shell:
pip install 'ollamadiffuser[full]'
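
To see which modules are missing before reinstalling anything, a quick import check works from any Python prompt. This is a generic standard-library sketch, separate from `ollamadiffuser verify-deps`; the module list is an example, and `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Example list: cv2 is provided by opencv-python; adjust to the modules you need.
print(missing_modules(["cv2", "PIL", "torch"]))
```

An empty list means every module in the list is importable from the current interpreter.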

GGUF Support Issues

# Install GGUF dependencies
pip install "ollamadiffuser[gguf]"

# Check GGUF support
ollamadiffuser registry check-gguf

# See GGUF_GUIDE.md for detailed troubleshooting

Complete Dependency Check

# Run comprehensive system diagnostics
ollamadiffuser doctor

# Verify and install missing dependencies interactively
ollamadiffuser verify-deps

Clean Installation

If you're having persistent issues:

# Uninstall and reinstall
pip uninstall ollamadiffuser

# Reinstall with all dependencies (shell-specific syntax):
# For bash/sh:
pip install --no-cache-dir ollamadiffuser[full]

# For zsh (macOS default):
pip install --no-cache-dir "ollamadiffuser[full]"

# For fish shell:
pip install --no-cache-dir 'ollamadiffuser[full]'

# Verify installation
ollamadiffuser verify-deps

Common Issues

Slow Startup

If startup is slow, make sure you are on a release that includes lazy loading (v1.1.0 or later):

# PyPI install
pip install --upgrade ollamadiffuser

# Development install
git pull origin main
pip install -e .

ControlNet Not Working

# Check preprocessor status
python -c "
from ollamadiffuser.core.utils.controlnet_preprocessors import controlnet_preprocessor
print('Available:', controlnet_preprocessor.is_available())
print('Initialized:', controlnet_preprocessor.is_initialized())
"

# Manual initialization
curl -X POST http://localhost:8000/api/controlnet/initialize

Memory Issues

# Use GGUF models for lower memory usage
ollamadiffuser pull flux.1-dev-gguf-q4ks  # 6GB VRAM
ollamadiffuser pull flux.1-dev-gguf-q3ks  # 4GB VRAM

# Use smaller image sizes via API
curl -X POST http://localhost:8000/api/generate \
  -H "Content-Type: application/json" \
  -d '{"prompt": "test", "width": 512, "height": 512}' \
  --output test.png

# CPU offloading is automatic
# Close other applications to free memory
# Use basic preprocessors instead of advanced ones

Platform-Specific Issues

macOS Apple Silicon

# If you encounter OpenCV issues on Apple Silicon
pip uninstall opencv-python
pip install "opencv-python-headless>=4.8.0"

# For GGUF Metal acceleration
CMAKE_ARGS="-DSD_METAL=ON" pip install stable-diffusion-cpp-python

Windows

# If you encounter build errors
pip install --only-binary=:all: "opencv-python>=4.8.0"

# For GGUF CUDA acceleration (PowerShell syntax)
$env:CMAKE_ARGS="-DSD_CUDA=ON"
pip install stable-diffusion-cpp-python

Linux

# If you need system dependencies
sudo apt-get update
sudo apt-get install libgl1-mesa-glx libglib2.0-0
pip install "opencv-python>=4.8.0"

Debug Mode

# Enable verbose logging
ollamadiffuser --verbose run model-name

🤝 Contributing

We welcome contributions! Please check the GitHub repository for contribution guidelines.

🤝 Community & Support

Quick Actions

🐛 Report a Bug - Found an issue? Let us know
💡 Feature Request - Have an idea? Share it with us
💬 Join Discussions - Community discussion
⭐ Star on GitHub - Show your support

Community Driven

OllamaDiffuser is an open-source project that thrives on community feedback. Every suggestion, bug report, and contribution helps make it better for everyone.

Open Source • Community Driven • Actively Maintained

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

Stability AI: For Stable Diffusion models
Black Forest Labs: For FLUX.1 and FLUX.2 models
Alibaba (Tongyi-MAI): For Z-Image Turbo
NVIDIA (Efficient-Large-Model): For SANA 1.5
Zhipu AI (THUDM): For CogView4
Kuaishou (Kwai-Kolors): For Kolors
Tencent (Hunyuan): For Hunyuan-DiT
Alpha-VLLM: For Lumina 2.0
PixArt-alpha: For PixArt-Sigma
Fal: For AuraFlow
BAAI (Shitao): For OmniGen
ByteDance: For SDXL Lightning
city96: For FLUX.1-dev GGUF quantizations
Hugging Face: For model hosting and diffusers library
Anthropic: For Model Context Protocol (MCP)
OpenClaw: For AI agent ecosystem integration
ControlNet Team: For ControlNet architecture
Community: For feedback and contributions

📞 Support

Issues: GitHub Issues
Discussions: GitHub Discussions

Ready to get started? Install from PyPI: pip install ollamadiffuser or visit ollamadiffuser.com 🎨✨
