## TL;DR — Quick Summary
Run Stable Diffusion locally for AI image generation. Covers AUTOMATIC1111 WebUI, ComfyUI, model selection (SDXL, SD 1.5), LoRA fine-tuning, ControlNet, and GPU optimization.
## What Is Stable Diffusion?
Stable Diffusion is an open-source AI model that generates photorealistic images, illustrations, concept art, and designs from text descriptions. Running it locally gives you:
- Free — no subscription, no per-image cost
- Private — prompts and images stay on your machine
- Unrestricted — no content filters (you control the rules)
- Customizable — LoRA models, ControlNet, custom training
- Fast — 2-10 seconds per image on modern GPUs
## Web UI Options
| Interface | Best For | VRAM Usage | Difficulty |
|---|---|---|---|
| AUTOMATIC1111 | Beginners, extensions | Normal | Easy |
| Forge | Low VRAM, speed | 30-50% less | Easy |
| ComfyUI | Advanced workflows | Most efficient | Medium |
| InvokeAI | Creative professionals | Normal | Easy |
## Installation — AUTOMATIC1111

### Linux / macOS

```bash
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
./webui.sh --xformers
```
### Docker

```bash
docker run -d --gpus all \
  -p 7860:7860 \
  -v sd-models:/app/models \
  -v sd-outputs:/app/outputs \
  --name sd-webui \
  universonic/stable-diffusion-webui:latest
```
### Forge (Optimized Fork)

```bash
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge.git
cd stable-diffusion-webui-forge
./webui.sh --xformers
```
## Model Types

### Checkpoints (Base Models)
| Model | Size | Resolution | Best For |
|---|---|---|---|
| SD 1.5 | 2-4 GB | 512×512 | General purpose, largest ecosystem |
| SDXL | 6-7 GB | 1024×1024 | High quality, detailed images |
| SDXL Turbo | 6 GB | 512×512 | Fast (1-4 steps), real-time |
| SD 3 | 4-8 GB | 1024×1024 | Best text rendering |
| Flux | 12-24 GB | Variable | Latest architecture |
Where to download: Civitai and Hugging Face are the main model repositories.
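As a command-line sketch of fetching a checkpoint: the script below assumes the AUTOMATIC1111 clone from the install section above, and the Hugging Face URL is just an example (the official SDXL base checkpoint); any `.safetensors` checkpoint goes in the same folder. The download line is commented out because checkpoints are several gigabytes.

```bash
# Sketch: place a downloaded checkpoint in AUTOMATIC1111's model folder.
# MODEL_URL is an example -- substitute the model you actually want.
MODEL_URL="https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"
DEST_DIR="stable-diffusion-webui/models/Stable-diffusion"

mkdir -p "$DEST_DIR"
echo "would download: $MODEL_URL"
echo "into: $DEST_DIR/$(basename "$MODEL_URL")"

# Uncomment to actually download (resumable with -c):
# wget -c -P "$DEST_DIR" "$MODEL_URL"
```

The WebUI scans this folder at startup; new checkpoints appear in the model dropdown after a restart or a refresh.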
### LoRA (Fine-Tuning)
LoRA files add new styles or concepts without replacing the base model:
Prompt: `a portrait photo of a woman, <lora:add_detail:0.8>, cinematic lighting`
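A minimal sketch of installing and referencing a LoRA, assuming the standard AUTOMATIC1111 layout (the `add_detail` filename is a placeholder). The tag names the file's base name, and the last field is the strength, typically between 0.5 and 1.0:

```bash
# Sketch: LoRA files go in the WebUI's models/Lora folder (AUTOMATIC1111 layout).
LORA_DIR="stable-diffusion-webui/models/Lora"
mkdir -p "$LORA_DIR"
# cp ~/Downloads/add_detail.safetensors "$LORA_DIR"   # placeholder filename

# The tag's last field is the strength; 0.8 applies the LoRA at 80%:
WEIGHT=0.8
PROMPT="a portrait photo of a woman, <lora:add_detail:$WEIGHT>, cinematic lighting"
echo "$PROMPT"
```

Lowering the weight blends the LoRA's style more subtly with the base model; several LoRA tags can appear in one prompt.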
### ControlNet
ControlNet adds structural control to generation — pose from a reference image, edge detection, depth maps, or segmentation:
- OpenPose — copy human poses from a reference
- Canny — preserve edges / line art
- Depth — maintain 3D structure
- Scribble — generate from rough sketches
## Optimization Flags
| Flag | Effect | When to Use |
|---|---|---|
| `--xformers` | ~30% faster, lower VRAM use | Always (NVIDIA) |
| `--medvram` | Moves model components between RAM and VRAM | 6-8 GB VRAM |
| `--lowvram` | Aggressive VRAM offloading (much slower) | 4 GB VRAM |
| `--opt-sdp-no-mem-attention` | Alternative to xformers | AMD GPUs |
| `--listen` | Accept connections from the network | Remote access |
| `--api` | Enable the REST API | Automation |
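The flags combine freely. As a sketch, a small helper that picks a VRAM flag from the card's memory in GB, with thresholds taken from the table above (`vram_flag` is a hypothetical name, not part of the WebUI):

```bash
# Sketch: choose a VRAM flag based on GPU memory in GB (thresholds from the table above).
vram_flag() {
    if [ "$1" -le 4 ]; then
        echo "--lowvram"
    elif [ "$1" -le 8 ]; then
        echo "--medvram"
    else
        echo ""    # 10 GB+ cards usually need no VRAM flag
    fi
}

# Example: an 8 GB NVIDIA card, launched with xformers plus the chosen flag:
FLAGS="--xformers $(vram_flag 8)"
echo "$FLAGS"
# ./webui.sh $FLAGS    # placeholder launch line
```

Start with the fewest flags that let generation complete; `--lowvram` in particular trades a large amount of speed for memory headroom.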
## Stable Diffusion vs. Cloud AI Image Services
| Aspect | Stable Diffusion (Local) | Midjourney | DALL-E 3 |
|---|---|---|---|
| Cost | Free (GPU electricity) | $10-60/mo | Pay per image |
| Privacy | ✅ 100% local | ❌ Discord | ❌ OpenAI |
| Speed | 2-10s per image | 30-60s | 10-30s |
| Customization | ✅ LoRA, ControlNet | ❌ Limited | ❌ None |
| NSFW control | You decide | Restricted | Restricted |
| Quality | Excellent (with tuning) | Excellent | Very good |
| Offline | ✅ | ❌ | ❌ |
## Summary
Stable Diffusion lets you generate unlimited AI images locally, for free, with complete creative control. Use AUTOMATIC1111 or Forge for the easiest setup, ComfyUI for advanced workflows, and customize with LoRA models and ControlNet for professional results.