TL;DR — Quick Summary

Run Stable Diffusion locally for AI image generation. This guide covers the AUTOMATIC1111 WebUI, Forge, ComfyUI, model selection (SD 1.5, SDXL), LoRA models, ControlNet, and GPU optimization flags.

What Is Stable Diffusion?

Stable Diffusion is an open-source AI model that generates photorealistic images, illustrations, concept art, and designs from text descriptions. Running it locally gives you:

  • Free — no subscription, no per-image cost
  • Private — prompts and images stay on your machine
  • Unrestricted — no content filters (you control the rules)
  • Customizable — LoRA models, ControlNet, custom training
  • Fast — 2-10 seconds per image on modern GPUs

Web UI Options

Interface     | Best For               | VRAM Usage     | Difficulty
AUTOMATIC1111 | Beginners, extensions  | Normal         | Easy
Forge         | Low VRAM, speed        | 30-50% less    | Easy
ComfyUI       | Advanced workflows     | Most efficient | Medium
InvokeAI      | Creative professionals | Normal         | Easy

Installation — AUTOMATIC1111

Linux / macOS

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
# --xformers targets NVIDIA GPUs; omit the flag on macOS
./webui.sh --xformers

Docker

docker run -d --gpus all \
  -p 7860:7860 \
  -v sd-models:/app/models \
  -v sd-outputs:/app/outputs \
  --name sd-webui \
  universonic/stable-diffusion-webui:latest

Forge (Optimized Fork)

git clone https://github.com/lllyasviel/stable-diffusion-webui-forge.git
cd stable-diffusion-webui-forge
./webui.sh --xformers

Model Types

Checkpoints (Base Models)

Model      | Size     | Resolution | Best For
SD 1.5     | 2-4 GB   | 512×512    | General purpose, largest ecosystem
SDXL       | 6-7 GB   | 1024×1024  | High quality, detailed images
SDXL Turbo | 6 GB     | 512×512    | Fast (1-4 steps), real-time
SD 3       | 4-8 GB   | 1024×1024  | Best text rendering
Flux       | 12-24 GB | Variable   | Latest architecture

Where to download: Civitai and Hugging Face are the main model repositories.
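
Each checkpoint tends to work best near the resolution listed in the table above. The sketch below shows one way to pick a width and height for a non-square image while keeping the pixel count close to the model's native size. The helper name size_for and the rounding to multiples of 8 are assumptions made for illustration, not part of any WebUI; Flux is left out because the table lists its resolution as variable.

```python
# Native resolutions copied from the checkpoint table above.
NATIVE_RESOLUTION = {
    "SD 1.5": (512, 512),
    "SDXL": (1024, 1024),
    "SDXL Turbo": (512, 512),
    "SD 3": (1024, 1024),
}


def size_for(model: str, aspect: float = 1.0, multiple: int = 8) -> tuple[int, int]:
    """Return (width, height) with roughly the model's native pixel count.

    aspect is width / height. Both sides are snapped to `multiple`
    (assumed to be 8 here) so the values stay on the usual slider steps.
    """
    base_w, base_h = NATIVE_RESOLUTION[model]
    area = base_w * base_h

    # Solve width * height = area with width / height = aspect.
    width = (area * aspect) ** 0.5
    height = width / aspect

    def snap(value: float) -> int:
        return max(multiple, int(round(value / multiple)) * multiple)

    return snap(width), snap(height)


if __name__ == "__main__":
    print(size_for("SDXL", aspect=16 / 9))
    print(size_for("SD 1.5", aspect=3 / 2))
```

Treat the result as a starting point; some fine-tuned checkpoints document their own preferred sizes.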

LoRA (Fine-Tuning)

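LoRA (Low-Rank Adaptation) files are small add-on weights that apply new styles or concepts on top of a base checkpoint without replacing it. In AUTOMATIC1111 and Forge, a LoRA is activated from the prompt with <lora:filename:weight>, where the weight scales its strength: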

Prompt: a portrait photo of a woman, <lora:add_detail:0.8>, cinematic lighting
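
When prompts are generated by a script, the LoRA tag can be assembled programmatically. This is a minimal sketch; lora_tag and build_prompt are hypothetical helper names, and the only thing taken from the WebUI is the <lora:name:weight> prompt syntax shown above.

```python
def lora_tag(name: str, weight: float = 1.0) -> str:
    """Format one LoRA activation tag in the <lora:name:weight> prompt syntax."""
    # ":g" drops trailing zeros, so 0.8 -> "0.8" and 1.0 -> "1".
    return f"<lora:{name}:{weight:g}>"


def build_prompt(subject: str, loras: dict[str, float], extras: tuple[str, ...] = ()) -> str:
    """Join the subject, LoRA tags, and extra keywords into one prompt string."""
    parts = [subject]
    parts += [lora_tag(name, weight) for name, weight in loras.items()]
    parts += list(extras)
    return ", ".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "a portrait photo of a woman",
        {"add_detail": 0.8},
        ("cinematic lighting",),
    ))
```

The name must match the LoRA file name (without extension) in the WebUI's LoRA folder; add_detail here is simply the name used in the example prompt.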

ControlNet

ControlNet adds structural control to generation — pose from a reference image, edge detection, depth maps, or segmentation:

  • OpenPose — copy human poses from a reference
  • Canny — preserve edges / line art
  • Depth — maintain 3D structure
  • Scribble — generate from rough sketches

Optimization Flags

Flag                       | Effect                          | When to Use
--xformers                 | 30% faster, less VRAM           | Always (NVIDIA)
--medvram                  | Splits model across VRAM stages | 6-8 GB VRAM
--lowvram                  | Extreme VRAM optimization       | 4 GB VRAM
--opt-sdp-no-mem-attention | Alternative to xformers         | AMD GPUs
--listen                   | Access from network             | Remote access
--api                      | Enable REST API                 | Automation
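
With --api enabled, the WebUI can be driven from a script. The following is a rough sketch of a text-to-image request using only the Python standard library. It assumes the WebUI is running locally on port 7860 and exposes the /sdapi/v1/txt2img endpoint; the helper names, default values, and output file names are illustrative choices, so check the interactive API docs served by your own instance for the exact fields it accepts.

```python
import base64
import json
import urllib.request

# Assumed local address; change it if the WebUI runs on another host or port.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt: str, negative_prompt: str = "", steps: int = 20,
                  width: int = 512, height: int = 512,
                  cfg_scale: float = 7.0, seed: int = -1) -> dict:
    """Assemble the JSON body for a txt2img request (seed -1 means random)."""
    return {
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "steps": steps,
        "width": width,
        "height": height,
        "cfg_scale": cfg_scale,
        "seed": seed,
    }


def decode_images(response: dict) -> list[bytes]:
    """Decode the base64-encoded images in a response into raw bytes."""
    return [base64.b64decode(img) for img in response.get("images", [])]


def txt2img(payload: dict, url: str = API_URL, timeout: int = 300) -> list[bytes]:
    """POST the payload to the WebUI and return the generated images as bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return decode_images(json.load(resp))


if __name__ == "__main__":
    # Requires a running WebUI started with --api.
    images = txt2img(build_payload("a lighthouse at sunset, cinematic lighting"))
    for index, data in enumerate(images):
        with open(f"output_{index}.png", "wb") as handle:
            handle.write(data)
```

Keeping payload construction separate from the network call makes it easy to inspect or log a request before sending it.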

Stable Diffusion vs. Cloud AI Image Services

Aspect        | Stable Diffusion (Local) | Midjourney | DALL-E 3
Cost          | Free (GPU electricity)   | $10-60/mo  | Pay per image
Privacy       | ✅ 100% local            | ❌ Discord | ❌ OpenAI
Speed         | 2-10s per image          | 30-60s     | 10-30s
Customization | ✅ LoRA, ControlNet      | ❌ Limited | ❌ None
NSFW control  | You decide               | Restricted | Restricted
Quality       | Excellent (with tuning)  | Excellent  | Very good
Offline       | ✅ Yes                   | ❌ No      | ❌ No
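
To put the "GPU electricity" cost in perspective, the arithmetic is power draw times generation time times the electricity price. The sketch below works through it. The figures in the usage example (300 W, 5 seconds per image, 0.15 per kWh) are hypothetical inputs, not measurements, and hardware cost is ignored.

```python
def cost_per_image(gpu_watts: float, seconds_per_image: float, price_per_kwh: float) -> float:
    """Estimate the electricity cost of one image.

    Energy in kWh = watts * seconds / 3,600,000 (watt-seconds per kWh).
    """
    kwh = gpu_watts * seconds_per_image / 3_600_000
    return kwh * price_per_kwh


if __name__ == "__main__":
    # Hypothetical inputs: substitute your own GPU draw, timing, and tariff.
    cost = cost_per_image(gpu_watts=300, seconds_per_image=5, price_per_kwh=0.15)
    print(f"Estimated electricity cost per image: {cost:.7f}")
```

Under these assumed inputs the cost per image is a small fraction of a cent, which is why the table rounds it to free.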

Summary

Stable Diffusion lets you generate unlimited AI images locally, for free, with complete creative control. Use AUTOMATIC1111 or Forge for the easiest setup, ComfyUI for advanced workflows, and customize with LoRA models and ControlNet for professional results.