The Right GPT

How To, Unbiased Guides, Prompts & Reviews for AI Tools


How to Run Stable Diffusion Locally on Mac

Here are the most practical and up-to-date ways (as of 2025–2026) to run Stable Diffusion locally on a Mac, especially on Apple Silicon (M1 / M2 / M3 / M4 chips). These methods rely on Apple's unified memory and GPU acceleration via Metal (MPS) or MLX.

Getting Started with Stable Diffusion on Mac

The easiest and fastest options for most users in 2025–2026 are the native Mac apps. Below we’ll cover both beginner-friendly apps and advanced terminal-based installations.

🍏 Easiest & Recommended Options (Best for Beginners / Fast Setup)

Draw Things (App Store – free, very well-optimized)

  • Native Mac/iOS app, excellent Metal performance
  • Often considered the best balance of speed + features on Apple Silicon right now
  • Supports SD 1.5, SDXL, many Flux variants, ControlNet, LoRAs, upscaling, inpainting, etc.
  • Completely offline after model download

How to get started:

  • Open the Mac App Store → search for “Draw Things: AI Art Generator”
  • Install (free)
  • Open the app → it will offer to download models (start with Realistic Vision, Juggernaut XL, or Flux.1-dev if your Mac has ≥16 GB RAM)
  • Type a prompt and generate

Diffusion Bee (still good, one-click installer)

  • Download from: https://diffusionbee.com
  • Very simple, no terminal needed
  • Good speed on M1–M4 (≈20–45 seconds per image depending on model/resolution/RAM)
  • Supports recent models (SDXL, some Flux versions via updates)
  • Great choice if you want the absolute minimum setup

⚡ More Powerful / Advanced Options

If you want extensions, ControlNet, IP-Adapter, many LoRAs at once, or Flux.1 full power:

Automatic1111 WebUI (A1111 / Forge variant)

Still very popular, huge ecosystem, but requires terminal setup.

Requirements:

  • macOS 13.5+ (Ventura or later)
  • Apple Silicon Mac (M1 or newer)
  • Preferably ≥16 GB RAM

Installation Steps:

Install Homebrew (if you don’t have it)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Install dependencies
brew install cmake protobuf rust python@3.11 git wget
Clone the repo (use Forge fork for better Mac performance)
git clone https://github.com/lllyasviel/stable-diffusion-webui-forge.git
cd stable-diffusion-webui-forge
Launch with Apple Silicon flags (first launch downloads ~6–10 GB)
./webui.sh --skip-torch-cuda-test --no-half-vae --opt-sdp-attention
Later launches can use
./webui.sh --opt-sdp-no-mem-attention --no-half-vae

Wait → browser opens at http://127.0.0.1:7860. Download models into models/Stable-diffusion/ (e.g., from Civitai).

Performance notes: 512×512 SD 1.5 images usually take 8–30 seconds on M2/M3 16–32 GB machines.
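
Once checkpoints are in place, a short stdlib sketch can confirm which model files the WebUI will actually see (the folder path assumes the Forge clone from the steps above):

```python
from pathlib import Path

def list_checkpoints(models_dir):
    """Return sorted names of the model files the WebUI scans for."""
    exts = {".safetensors", ".ckpt"}
    return sorted(p.name for p in Path(models_dir).glob("*") if p.suffix in exts)

# Example (run from the directory containing the Forge clone):
# list_checkpoints("stable-diffusion-webui-forge/models/Stable-diffusion")
```

If a freshly downloaded model doesn't appear in the UI's dropdown, this is a quick way to check whether it landed in the right folder.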

ComfyUI (node-based, very powerful, increasingly popular)

  • Excellent for complex workflows (Flux.1, ControlNet++, etc.)
  • Often fastest for Flux on Mac via MLX backend
  • Install guide: Search “ComfyUI Mac MLX” or use Stability Matrix (one-click launcher that handles Comfy + A1111 + InvokeAI)

📊 Quick Recommendation Table (2025–2026)

| Goal | Best Choice | Setup Difficulty | Speed on M3/M4 (16–32 GB) | Features |
| --- | --- | --- | --- | --- |
| Easiest & good performance | Draw Things | Very easy | Excellent | Very good |
| One-click, classic | Diffusion Bee | Easiest | Very good | Good |
| Maximum extensions & community | A1111 / Forge | Medium | Good | Excellent |
| Complex workflows, Flux | ComfyUI (+ MLX backend) | Medium–Hard | Excellent (Flux) | Best for advanced |
| Pretty UI + offline | InvokeAI or Diffusion Bee | Easy–Medium | Good | Good |

💡 Quick Tips for Best Results on Mac

  • Use 16 GB RAM or more (8 GB works but is slow / limited to small models)
  • Prefer SDXL / Flux.1-dev-fp8 quantized models for newer Macs
  • Generate at 512–768 resolution first → use hires fix / upscaler
  • Close other heavy apps (Chrome with 30 tabs kills performance)
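
The reasoning behind "generate small, then upscale": diffusion time scales roughly with pixel count. A rough sketch of the arithmetic (an approximation — real timings also depend on model, sampler, and steps):

```python
def relative_cost(w, h, base_w=512, base_h=512):
    """Approximate generation cost relative to a 512x512 image,
    assuming time scales linearly with pixel count (a simplification)."""
    return (w * h) / (base_w * base_h)

# 1024x1024 does ~4x the work of 512x512, so generating at 512-768
# and then upscaling is usually much faster than generating large directly.
```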

If you’re just starting, install Draw Things from the App Store right now — it’s free, looks native, and you’ll have your first image in <5 minutes.

❓ Frequently Asked Questions (2026 Edition)

Q1: What’s the easiest way to run Stable Diffusion on a Mac right now?

A: For most people, Draw Things (free on the Mac App Store) is the easiest and best-optimized option. It’s a native Mac app that uses Apple’s Metal and Core ML frameworks, so it runs fast and smoothly on M-series chips with minimal setup. Just install, let it download a model (like Juggernaut XL or Flux fp8), and start generating. If you want something even simpler with a one-click installer, try Diffusion Bee (download from diffusionbee.com). Both work offline after the initial model download.

Q2: Which Mac is good enough to run Stable Diffusion well?

A:

  • 8 GB RAM (base M1/M2 Air): Works, but slow and limited to small models like SD 1.5 or LCM variants (~30–60+ sec/image).
  • 16 GB RAM (most common recommendation): Solid sweet spot — good speed with SD 1.5, SDXL, and lighter Flux models (~10–40 sec/image depending on resolution).
  • 24–32 GB+ RAM (M2/M3/M4 Pro/Max): Excellent — handles Flux.1-dev, high-res, multiple LoRAs/ControlNets comfortably (~5–25 sec/image). More GPU cores = faster generation (e.g., M3 Max > M3 > M2). Avoid Intel Macs — they don’t get good acceleration.

Q3: Why is generation so slow on a Mac compared to Windows PCs with NVIDIA?

A: Macs use Apple’s unified memory + Metal acceleration (no CUDA). NVIDIA GPUs are still faster for raw diffusion workloads, especially at high resolutions or with complex extensions. On a modern M3/M4 with 16–32 GB, expect 5–40 seconds per image (vs. 1–5 sec on a good RTX card). Use quantized/fp8 models, lower steps (20–30), and resolutions like 768×768 + hires fix for best speed.

Q4: Should I use Draw Things, Diffusion Bee, A1111/Forge, or ComfyUI?

A:

  • Draw Things → Best for beginners/fast native performance/simple UI. Great ControlNet, LoRA, inpainting support.
  • Diffusion Bee → Easiest one-click setup, good for quick tests. Fewer advanced features.
  • A1111 / Forge → Huge ecosystem (extensions, scripts), but slower and more error-prone on Mac unless you use an optimized fork (e.g., Forge launched with flags like --no-half-vae).
  • ComfyUI → Most powerful/flexible (especially for Flux + workflows), often fastest for newer models via MLX backend, but steeper learning curve (node-based). Start with Draw Things → move to ComfyUI if you want maximum control.

Q5: I get errors/glitches/black images when switching models in A1111 or Forge — how to fix?

A: Common on Mac due to precision/memory issues. Add these launch flags: --no-half-vae --opt-sdp-no-mem-attention --skip-torch-cuda-test (or --no-half for full fp32 if still unstable, though slower). Restart the UI after changing models. Use Forge fork instead of classic A1111 — it’s usually more stable on Apple Silicon.

Q6: Can I run the latest models like Flux.1 on Mac?

A: Yes! Flux.1-dev (fp8/quantized versions) runs well on 16 GB+ Macs via:

  • Draw Things (native support, very fast)
  • ComfyUI with MLX backend (often the fastest)

Avoid full unquantized Flux on Macs with less than 32 GB — it will be very slow or crash. Download fp8/GGUF variants from Hugging Face or Civitai instead.
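
A back-of-the-envelope calculation shows why quantization matters. Flux.1-dev is commonly cited at roughly 12B parameters; the bytes-per-weight values below are the usual fp16/fp8/4-bit sizes (overhead, text encoders, and activations ignored):

```python
def model_ram_gb(params, bytes_per_weight):
    """Rough weight footprint in decimal GB. Ignores activations,
    text encoders, and runtime overhead, so treat it as a lower bound."""
    return params * bytes_per_weight / 1e9

FLUX_PARAMS = 12e9  # ~12B parameters (approximate)
# fp16 ~ 24 GB, fp8 ~ 12 GB, 4-bit GGUF ~ 6 GB of weights alone -
# which is why fp8/GGUF variants are the ones that fit in 16 GB Macs.
```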

Q7: How do I get more speed / better quality on my Mac?

A:

  • Use lower steps (20–30) + good samplers (Euler a, DPM++ 2M Karras).
  • Start at 512–768 resolution → use hires fix/upscaler.
  • Close other apps (Chrome eats unified memory fast).
  • Prefer Core ML / MLX / Metal-optimized tools (Draw Things, ComfyUI-MLX).
  • For A1111/Forge: always use --opt-sdp-attention or similar Mac-friendly flags.

Q8: Is it worth buying a high-end Mac (M4 Max, 64 GB+) just for Stable Diffusion?

A: Not really — unless you do heavy Flux/ControlNet workflows or plan to train models. A mid-range M3/M4 with 16–32 GB gives great value. For ultimate speed, many Mac users still remote into a cheap Windows PC with an RTX 3060/4070/4090 via Parsec or similar for generation, while keeping the Mac for editing/prompting.

📚 Additional Resources

Master every aspect of Stable Diffusion with our complete guide collection:

Stable Diffusion Negative Prompts Guide

Master negative prompting to eliminate artifacts, fix anatomy issues, and generate perfect AI images. Get categorized prompt collections and advanced techniques.

Master Negative Prompts →

Stable Diffusion Models Guide

Learn where to download safe models (SD 3.5, Flux.1, custom checkpoints) and how to manage them. Includes direct download links for SD 1.5, SDXL, and realistic models.

Learn About Models →

Troubleshooting Common Issues

Fix black images, out of memory errors, slow generation, and other common problems. Step-by-step solutions for Windows, Mac, and Linux.

Troubleshooting Guide →

Prompting & Advanced Techniques

Master prompt engineering, ControlNet, LoRAs, upscaling, and other advanced features. Take your AI art to the next level with professional workflows.

Advanced Techniques →

⚡ Advanced ComfyUI Setup with MLX Acceleration for Mac

For maximum performance on Apple Silicon (M1/M2/M3/M4), especially with Flux models, use MLX acceleration. Tested on macOS Ventura/Sonoma and later; aim for at least 16 GB unified memory for smooth Flux runs (e.g., on an M1 Max, expect 20–60 seconds per image at 768×1024).

Step 1: Install Dependencies

Install Homebrew (if needed)
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
Install Python 3.10 and tools
brew install python@3.10 git cmake protobuf wget

Step 2: Install PyTorch for Apple Silicon

Install PyTorch (the default macOS arm64 wheels include MPS/Metal support)
pip install torch torchvision torchaudio

Pro tip: Use Python 3.10.x for compatibility. If needed, manage versions with pyenv:

brew install pyenv pyenv-virtualenv
pyenv install 3.10.14
pyenv virtualenv 3.10.14 comfyui
pyenv activate comfyui

Step 3: Clone ComfyUI

git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI

Step 4: Install Python Dependencies

pip install -r requirements.txt

Step 5: Set Up MLX for Flux Acceleration

pip install mlx

Note: ComfyUI itself runs on PyTorch’s Metal (MPS) backend; MLX acceleration for Flux comes from community custom nodes (search “ComfyUI MLX” or install them via ComfyUI Manager).

For Flux models, download quantized GGUF/FP8 versions from Hugging Face (lower RAM usage). Place them in:

ComfyUI/models/unet/

Step 6: Run ComfyUI

python main.py # Metal acceleration auto-enabled

Access at http://127.0.0.1:8188. Test on M1 Max or higher for Flux—lower chips may need FP8 quantization to avoid OOM errors.

⚠️ A1111/Forge Mac-Specific Notes (2026)

Sampler Compatibility

PLMS sampler does NOT work with Stable Diffusion 2.0+ models on Apple Silicon. Stick to these reliable samplers:

  • Euler a (fastest, good quality)
  • DPM++ 2M Karras (best quality)
  • UniPC (good balance)

Prompt Syntax on Mac

Attention/emphasis works as usual: (word:1.1) boosts a term, [word] reduces it, and (word:0.9) sets an explicit weight below 1. The syntax behaves the same on Mac as on other platforms.
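
To illustrate how the (word:weight) emphasis syntax is read, here is a toy parser — for illustration only, not A1111's actual implementation, which handles nesting and escapes:

```python
import re

def parse_emphasis(prompt):
    """Split a prompt into (text, weight) pairs: (word:1.3) spans get the
    given weight, everything else defaults to 1.0. Toy sketch only."""
    out, pos = [], 0
    for m in re.finditer(r"\(([^():]+):([\d.]+)\)", prompt):
        if m.start() > pos:
            out.append((prompt[pos:m.start()].strip(), 1.0))
        out.append((m.group(1), float(m.group(2))))
        pos = m.end()
    if pos < len(prompt):
        out.append((prompt[pos:].strip(), 1.0))
    return [(text, w) for text, w in out if text]

# parse_emphasis("a cat, (neon:1.3)") -> [("a cat,", 1.0), ("neon", 1.3)]
```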

Extension Compatibility

Most extensions work, but some (especially ControlNet versions) may require --medvram flag. Update regularly via git pull—current versions (as of 2026) have improved M4 support.

Official wiki: A1111 Apple Silicon Installation Guide

🆕 Stable Diffusion 3 (SD3) on Mac

SD3 Medium/Large excels in coherence and prompt adherence—gaining popularity in 2024–2026. It runs well on Mac via quantized versions (FP8 or 8-bit) to fit in 16–32 GB RAM.

In ComfyUI:

  • Use official SD3 workflow from ComfyUI examples
  • Download SD3 Medium from Hugging Face
  • Place in: ComfyUI/models/checkpoints/
  • Quantized FP8 versions reduce memory by ~50% (10–30 sec/image on M3/M4)
  • Recent ComfyUI versions support SD3 natively — update via git pull if a checkpoint fails to load

In Draw Things:

  • Native support for quantized SD3 via app’s model manager
  • Uses Core ML for “lossless” quantization—minimal quality drop
  • Great for iOS/Mac cross-use (15–45 sec on M2+)

Tips: SD3 responds well to both positive and negative prompts; test with 20–50 steps at 512×512, then upscale.

🔒 Security Best Practices for Model Downloads

⚠️ Critical: Avoid Malware

Downloading models unsafely can expose your Mac to exploits.

Always Use .safetensors Format

Prefer .safetensors over .ckpt (pickled checkpoints)—the latter can contain executable code risks. Civitai and Hugging Face scan files with ClamAV and picklescan.
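
A small stdlib sketch (the folder path is whatever your app uses) that flags pickled checkpoints in a models folder so you can replace them with .safetensors versions:

```python
from pathlib import Path

def risky_checkpoints(models_dir):
    """List .ckpt files anywhere under models_dir. These use Python pickle,
    which can embed executable code; prefer the .safetensors equivalents."""
    return sorted(p.name for p in Path(models_dir).rglob("*.ckpt"))

# risky_checkpoints("ComfyUI/models")  # path is an example, adjust to your setup
```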

Trusted Sources Only

  • Civitai: Verified models with ratings and previews
  • Hugging Face: Official model cards from Stability AI
  • Avoid unverified sites: Check model hashes if provided
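
When a model page publishes a SHA-256 hash, comparing it locally takes only the standard library:

```python
import hashlib

def sha256_of(path, chunk=1 << 20):
    """Stream a (possibly multi-GB) model file through SHA-256
    in 1 MB chunks so it never has to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while block := f.read(chunk):
            h.update(block)
    return h.hexdigest()

# Compare against the hash shown on the Civitai / Hugging Face page:
# sha256_of("model.safetensors") == "<published hash>"
```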

Scan Downloads

macOS built-in XProtect scans automatically. For extra safety, use VirusTotal for suspicious files.

Best Practice

Download directly via app managers (e.g., in Draw Things or ComfyUI Manager) to minimize exposure.

🧹 Uninstallation & Cleanup (Free Up Space)

Models can eat 10–50 GB of storage. Here’s how to clean up:
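
Before deleting anything, it helps to see where the space actually went. A stdlib sketch (the commented paths are the default install locations from the sections above; adjust to your setup):

```python
import os

def dir_size_gb(path):
    """Total size of a directory tree in decimal GB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = os.path.join(root, name)
            if os.path.isfile(fp):
                total += os.path.getsize(fp)
    return total / 1e9

# for d in ["~/ComfyUI/models", "~/stable-diffusion-webui-forge/models"]:
#     p = os.path.expanduser(d)
#     if os.path.isdir(p):
#         print(d, round(dir_size_gb(p), 1), "GB")
```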

Remove Models Only

rm -rf ~/stable-diffusion-webui-forge/models/Stable-diffusion/*
rm -rf ~/ComfyUI/models/checkpoints/*

Full Uninstall

rm -rf ~/stable-diffusion-webui-forge/   # Delete A1111/Forge
rm -rf ~/ComfyUI/                        # Delete ComfyUI
pip cache purge                          # Clear Python cache

App Cleanup

  • Draw Things: Delete app, then check ~/Library/Containers/com.draw-things/
  • Diffusion Bee: Drag to Trash, check ~/Library/Application Support/diffusion-bee/

Tip: Use App Cleaner for GUI apps; monitor storage via About This Mac > Storage.

📊 Mac Performance Benchmarks (2026)

Based on community tests with ComfyUI/Draw Things at 768×768, 20–30 steps. Times are approximate and vary by RAM/cores; use quantized models for speed.

| Mac Chip (RAM) | Flux FP8 (sec/image) | SDXL (sec/image) | SD 1.5 (sec/image) | Notes |
| --- | --- | --- | --- | --- |
| M1 (16 GB) | 40–60 | 20–40 | 10–20 | Base; slower on high-res |
| M2/M3 (16–24 GB) | 20–40 | 10–25 | 5–15 | Good for SD3/Flux with quantization |
| M3/M4 Max (32+ GB) | 10–20 | 5–15 | 2–10 | Handles Flux.dev well; M4 edges out M3 by 20–30% |
| M4 Pro (48 GB) | 5–15 | 3–10 | 1–5 | Fastest; ideal for batches/SD3 Large |

Source: Tests on ComfyUI/Draw Things; higher cores/RAM reduce times.
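
The per-image numbers above convert directly into rough batch estimates — a trivial sketch that ignores model load time and thermal throttling:

```python
def batch_minutes(sec_per_image, n_images):
    """Naive batch-time estimate: ignores model load time
    and thermal throttling, so real runs will take longer."""
    return sec_per_image * n_images / 60

# e.g. 100 SDXL images at ~15 s each: 25 minutes of pure generation time
```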

🔄 Integration with Other Tools

Post-Editing with Photoshop/Affinity Photo

  • Export images and apply AI tools like Content-Aware Fill
  • Try plugins like InvokeAI for direct exports

Local LLMs for Prompt Generation

Combine with Ollama to run Llama 3 locally for prompt refinement:

ollama run llama3 'Improve this SD prompt: cyberpunk cat with neon lights, detailed'

Keeps everything offline and Mac-optimized.
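
Ollama also exposes a local HTTP API (default http://localhost:11434), so prompt refinement can be scripted. A sketch using only the standard library — it assumes `ollama serve` is running and the llama3 model has been pulled:

```python
import json
from urllib import request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(raw_prompt, model="llama3"):
    """JSON body for Ollama's /api/generate endpoint. stream=False asks
    for one complete response instead of chunked output."""
    return json.dumps({
        "model": model,
        "prompt": f"Improve this Stable Diffusion prompt: {raw_prompt}",
        "stream": False,
    }).encode()

def refine_prompt(raw_prompt):
    """Send the request; requires a running local Ollama server."""
    req = request.Request(OLLAMA_URL, data=build_payload(raw_prompt),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```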

Additional Easy-to-Use Apps

Diffusers (from Hugging Face):

A free Mac App Store app that’s straightforward to install and uses Core ML for acceleration. It’s great for basic text-to-image generation but lacks advanced features like ControlNet or extensive model support compared to Draw Things. Installation is just a download from the App Store, making it ideal for absolute beginners testing the waters.

Mochi Diffusion:

Another free, open-source app optimized for Apple Silicon via Core ML. It supports text-to-image and some basic editing, with a clean interface. It’s lighter on features than ComfyUI but could appeal to users who want something between Diffusion Bee and A1111 in complexity.

Amazing AI:

A simple App Store app focused on SD 1.5 models for quick text-to-image generation. It’s not as feature-rich, but it leverages M-series chips well for casual use and is a good option for artists experimenting with prompts without setup hassle.