Self Hosting 101: Deploying Stable Diffusion Models
Overview: Stable Diffusion is a family of open-source generative models that produce high-quality images from text prompts, optionally conditioned on additional inputs such as reference images. While you can use hosted interfaces like DreamStudio or Hugging Face Spaces, many users prefer self-hosting for greater control, customization, and privacy. Note: For commercial reuse of model outputs, please visit our Licensing Page, Terms of Service, and Privacy Policy.
This guide covers three common deployment approaches:
- Running Stable Diffusion locally
- Deploying on a cloud virtual machine (e.g., AWS EC2, GCP, Azure)
- Using hosted inference services (e.g., Replicate, RunPod)
1. Local Deployment
Why Run Locally?
Running Stable Diffusion locally gives you:
- Full control over your environment, settings, and models
- Offline image generation (no reliance on cloud services)
- Freedom to experiment with custom workflows and fine-tuning
- No recurring usage costs
System Requirements
- GPU: NVIDIA GPU with at least 6 GB VRAM (RTX 3060 or higher recommended)
- OS: Windows, macOS (with M-series chip), or Linux
- Software: Python 3.10+, Git, and Conda (or venv)
Setup Steps
1. Create a Python environment
conda create -n sd-env python=3.10
conda activate sd-env
2. Choose and install a Stable Diffusion interface
There are multiple open-source interfaces for running Stable Diffusion.
ComfyUI is currently the most flexible and modular choice, ideal for both beginners and advanced users.
| Interface | Description | Link |
| --- | --- | --- |
| ComfyUI | A node-based, modular interface that lets you visually design custom workflows. Ideal for experimentation, automation, and advanced setups. | https://github.com/comfyanonymous/ComfyUI |
| Automatic1111 WebUI | A widely used, traditional web interface with strong community support and a vast plugin ecosystem. | https://github.com/AUTOMATIC1111/stable-diffusion-webui |
| InvokeAI | A modern dashboard with a focus on workflow clarity and post-processing tools like upscaling and inpainting. | https://github.com/invoke-ai/InvokeAI |
3. Example: Installing ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
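If pip resolves a CPU-only PyTorch build on your machine, install a CUDA-enabled build explicitly before launching. The cu121 index below is just an example; match the URL to your installed CUDA version (see pytorch.org for the current options):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121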
4. Download model weights
Model checkpoint files (.ckpt or .safetensors) can be downloaded from:
- Hugging Face for official model releases
- Civitai for community-trained models, LoRAs, and embeddings
Place model weights in:
ComfyUI/models/checkpoints/
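As a sketch, you can also fetch a checkpoint from the command line with the Hugging Face CLI; the SDXL base repository and filename below are one example, so substitute the model you actually want:
pip install "huggingface_hub[cli]"
huggingface-cli download stabilityai/stable-diffusion-xl-base-1.0 sd_xl_base_1.0.safetensors --local-dir ComfyUI/models/checkpoints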
5. Launch ComfyUI
python main.py
Then open your browser to:
http://127.0.0.1:8188
You’ll see a graph-based interface where you can build custom generation pipelines by connecting visual nodes.
Alternative: Installing Automatic1111 WebUI
git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git
cd stable-diffusion-webui
pip install -r requirements.txt
python launch.py
Access the interface at http://127.0.0.1:7860
Performance Tips
- Enable xformers for faster generation; both ComfyUI and Automatic1111 support it (see the example below).
- Use half-precision (fp16) models to reduce VRAM usage.
- Keep repositories and dependencies updated to benefit from optimization improvements.
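As a quick sketch (flag names can vary between WebUI versions, so check --help for yours):
# ComfyUI uses xformers automatically once it is installed
pip install xformers
# Automatic1111: enable xformers and reduce VRAM pressure via launch flags
python launch.py --xformers --medvram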
2. Cloud Deployment (e.g., AWS EC2)
Why Use the Cloud?
Cloud GPU instances provide:
- Access to high-end hardware (A100, L4, RTX 4090, etc.)
- Scalability for large projects or shared teams
- No need to own expensive GPUs
Example: AWS EC2 Setup
1. Select an instance type
- Recommended: g4dn.xlarge, g5.xlarge, or higher
- Use an AWS Deep Learning AMI (CUDA and PyTorch come preinstalled)
2. Launch the instance
- Open inbound ports 8188 (for ComfyUI) or 7860 (for Automatic1111), ideally restricted to trusted IPs (see the CLI example below)
- Allocate at least 50 GB of storage for models and outputs
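If you use the AWS CLI, a rule like the following opens the ComfyUI port to a single IP; the security group ID and CIDR below are placeholders for your own values:
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp --port 8188 \
  --cidr 203.0.113.10/32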
3. Connect via SSH
ssh -i your-key.pem ubuntu@ec2-XX-XX-XX-XX.compute.amazonaws.com
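Alternatively, instead of opening ports to the internet, you can tunnel the UI over SSH and browse to http://127.0.0.1:8188 on your local machine:
ssh -i your-key.pem -L 8188:localhost:8188 ubuntu@ec2-XX-XX-XX-XX.compute.amazonaws.com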
4. Deploy ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
python main.py --listen 0.0.0.0 --port 8188
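To keep the server running after your SSH session ends, one simple option is nohup (tmux or a systemd service also works):
nohup python main.py --listen 0.0.0.0 --port 8188 > comfyui.log 2>&1 &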
5. Access the interface
Visit:
http://<EC2-public-IP>:8188
Optional: Dockerized Deployment
docker build -t comfyui .
docker run --gpus all -p 8188:8188 comfyui
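The ComfyUI repository does not ship an official Dockerfile, so you will need to provide one. A minimal sketch follows; the base image tag is an assumption, so match it to your GPU drivers, and note that --gpus all requires the NVIDIA Container Toolkit on the host:
FROM pytorch/pytorch:2.3.1-cuda12.1-cudnn8-runtime
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
WORKDIR /app
RUN git clone https://github.com/comfyanonymous/ComfyUI.git . && pip install -r requirements.txt
EXPOSE 8188
CMD ["python", "main.py", "--listen", "0.0.0.0", "--port", "8188"]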
3. Using Hosted Inference Platforms
Why Use Hosted Inference?
Hosted inference platforms handle infrastructure, GPU management, and scaling for you.
They’re ideal for users who want fast deployment without managing servers.
Replicate
- Lets you run Stable Diffusion models through an API.
- Ideal for integrating image generation into web apps or backends.
Example (Python):
import replicate  # pip install replicate; reads REPLICATE_API_TOKEN from your environment

# Referencing the model by name runs its latest version; pin a specific
# "model:version" hash for reproducible results.
output = replicate.run(
    "stability-ai/stable-diffusion",
    input={"prompt": "a futuristic city skyline at sunset"}
)
print(output)  # typically a list of URLs to the generated image(s)
RunPod
- Provides on-demand GPU “Pods” for AI workloads.
- Offers prebuilt templates for ComfyUI and Automatic1111.
- Includes web access and optional public endpoints.
Steps:
- Sign up at runpod.io
- Launch a GPU pod (A100, 4090, etc.)
- Choose the “ComfyUI + Stable Diffusion” template
- Access your workspace via the provided web URL
Other Options
- Modal: serverless GPU compute with Python SDKs
- Vast.ai: marketplace for affordable GPU rentals
- Paperspace Gradient: cloud GPU notebooks for quick experimentation
Conclusion
Self-hosting Stable Diffusion gives you complete creative control and flexibility:
- For experimentation and customization, run ComfyUI locally.
- For scalability, deploy on cloud GPUs like EC2 or GCP.
- For ease of use, rely on hosted inference providers like Replicate or RunPod.
Best Practices:
- Secure your endpoints (use firewalls or VPNs)
- Monitor GPU usage to control costs
- Follow model license terms and ethical use policies
Quick Comparison Table
| Deployment Type | Example Tools | Best For | Pros | Cons |
| --- | --- | --- | --- | --- |
| Local | ComfyUI, Automatic1111, InvokeAI | Hobbyists, artists, developers | Offline control, full customization, no recurring cost | Requires a GPU and manual setup |
| Cloud VM | AWS EC2, GCP, Azure | Teams, scalable workloads | Access to powerful GPUs, scalable, reproducible | Hourly cost, setup complexity |
| Hosted Service | Replicate, RunPod, Modal | Developers, integrations | Instant deployment, managed infrastructure | Limited customization, usage fees |
Next Steps
Now that you’ve deployed Stable Diffusion, here are recommended next topics to explore:
- Fine-Tuning & LoRA Training – Train custom aesthetic or style-specific models.
- Workflow Automation in ComfyUI – Use nodes and batch processing for large-scale generation.
- Optimizing for Speed – Explore GPU acceleration, quantization, and TensorRT.
- Integrating via API – Use REST or WebSocket APIs to trigger generations from your own apps.
- Model Management – Organize checkpoints, LoRAs, and embeddings efficiently across multiple environments.