Z-Image ComfyUI Guide
Next-gen open-source image generation model based on S3-DiT architecture.
6B Params · 8-Step Turbo · Photorealistic
Z-Image-Turbo
Recommended. Distilled version, optimized for speed and low VRAM usage.
- Only 8 steps inference
- < 16GB VRAM (Consumer)
- No negative prompt needed
Z-Image-Base
Non-distilled foundation model that serves as the base for community development.
- Ideal for LoRA training
- Community fine-tuning
- Checkpoints released
Z-Image-Edit
Fine-tuned specifically for image editing tasks.
- Strong instruction following
- Precise local editing
- Complex editing tasks
Installation Guide
Make sure you are running the latest version of ComfyUI, then download the model files from Hugging Face and place them as shown below.
# Structure
ComfyUI/
└── models/
    ├── text_encoders/
    │   └── qwen_3_4b.safetensors                           // Text Encoder
    ├── diffusion_models/
    │   └── z_image_turbo_bf16.safetensors                  // Main model (FP8/GGUF versions optional)
    ├── vae/
    │   └── ae.safetensors                                  // Flux.1 VAE
    └── model_patches/
        └── Z-Image-Turbo-Fun-Controlnet-Union.safetensors  // (Optional) ControlNet
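If you prefer to script the downloads, the sketch below uses the huggingface_hub client to place the files into the directories above. The repository IDs are placeholders invented for this example, not given in this guide; replace them with the actual Hugging Face repos that host these files.

# Download sketch (optional; repo IDs are placeholders)
from pathlib import Path
from huggingface_hub import hf_hub_download

COMFYUI_ROOT = Path("ComfyUI")  # adjust to your ComfyUI install path

# (repo_id, filename, target subdirectory under ComfyUI/models/)
FILES = [
    ("<org>/<text-encoder-repo>", "qwen_3_4b.safetensors", "text_encoders"),
    ("<org>/<z-image-repo>", "z_image_turbo_bf16.safetensors", "diffusion_models"),
    ("<org>/<vae-repo>", "ae.safetensors", "vae"),
    # The optional ControlNet patch would go into models/model_patches/ the same way.
]

for repo_id, filename, subdir in FILES:
    target_dir = COMFYUI_ROOT / "models" / subdir
    target_dir.mkdir(parents=True, exist_ok=True)
    # local_dir places a copy of the downloaded file directly in the target directory.
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=target_dir)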
Core Advantages
- Bilingual: native support for Chinese prompts and strong rendering of complex text.
- Uncensored: supports uncensored generation modes for high creative freedom.
- Ecosystem: full support for ControlNet and LoRA extensions.
Prompt Tips
- The Turbo model needs no negative prompt.
- Add lighting keywords such as "volumetric lighting" or "cinematic lighting".
- Be as specific as possible about the scene, pose, and textures.
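To illustrate these tips, here is an example prompt written for this guide (not taken from the official model card) that combines a specific scene, texture details, and lighting keywords:

# Example prompt
A street portrait of an elderly fisherman at dawn on a fog-covered harbor pier, weathered skin, coarse wool sweater, water droplets on the railing, volumetric lighting, cinematic lighting, shallow depth of field, photorealistic
(Leave the negative prompt empty when using the Turbo model.)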