What is the Local Image Generation (ComfyUI + FLUX) stack for?

ComfyUI + FLUX.1 Dev gives you professional-grade image generation on your own GPU: a node-based workflow, photorealistic output, and full privacy. It is purpose-built for generating photorealistic images locally and runs entirely on your own hardware, on a GPU with as little as 12GB of VRAM (using a quantized model), for $0/mo.

How much does the Local Image Generation (ComfyUI + FLUX) stack cost?

Local Image Generation (ComfyUI + FLUX) costs around $1,600 in hardware up front and $0/month to run. Everything is self-hosted, so there are no per-image or subscription fees as there would be with a cloud equivalent.

How long does it take to set up Local Image Generation (ComfyUI + FLUX)?

Plan for roughly 20 minutes. The stack is rated intermediate.

What do I need to run Local Image Generation (ComfyUI + FLUX)?

Local Image Generation (ComfyUI + FLUX) is built from one tool, one model, and two hardware items. Each is listed below with a link.

Local Image Generation (ComfyUI + FLUX)

A professional-grade image generation pipeline that runs entirely on your own hardware. ComfyUI is a powerful node-based workflow engine for AI image, video, and 3D generation. Paired with FLUX.1 Dev - Black Forest Labs' 12B parameter text-to-image model - you get output that rivals Midjourney and DALL-E, with complete creative control and zero data leaving your machine.

What you get

Node-based workflow editor - visually construct complex image generation pipelines
Photorealistic output - FLUX.1 Dev produces stunning 1024x1024 images from text prompts
5,000+ community nodes - LoRA, ControlNet, IP-Adapter, AnimateDiff, and more
Queue-based generation - batch multiple prompts with per-node caching
Workflow sharing - export/import workflows as JSON from the community
API mode - integrate with n8n, custom apps, or automation pipelines
$0/month - after the GPU, every image is free

Architecture

Component	Role
ComfyUI	Node-based workflow engine and UI
FLUX.1 Dev	12B text-to-image model, photorealistic output
Custom nodes (as needed)	LoRA, ControlNet, upscalers, video extensions

FLUX.1 Dev is a 12B parameter model that needs significant VRAM. Recommended: an RTX 4090 (24GB) for comfortable use, or an RTX 5090 for faster generation. It can also run on 12GB cards with quantized versions of the model.
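
As a rough illustration of why precision matters here, weight memory scales with parameter count times bytes per parameter. The sketch below is a weights-only estimate: it assumes 2 bytes per parameter for 16-bit weights and 1 byte for 8-bit weights, and it ignores the text encoders, the VAE, and activation memory, so real usage will be higher. The helper name is illustrative.

```python
def weight_memory_gib(params_billion: float, bytes_per_param: float) -> float:
    """Weights-only memory in GiB; ignores text encoders, VAE, and activations."""
    return params_billion * 1e9 * bytes_per_param / 2**30


# Compare 16-bit and 8-bit storage for a 12B parameter model.
for label, nbytes in [("16-bit weights", 2), ("8-bit weights", 1)]:
    print(f"{label}: {weight_memory_gib(12, nbytes):.1f} GiB")
```

Under these assumptions the 16-bit weights alone come to roughly 22 GiB and the 8-bit weights to roughly 11 GiB, which is consistent with recommending a 24GB card for full quality and a quantized model for 12GB cards.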

Prerequisites

A GPU with ≥12 GB VRAM (24GB recommended for FLUX at full quality)
20 GB free disk for the FLUX model files
Python 3.10+ or the ComfyUI desktop app
Git (for custom nodes)

Setup

Option A: Desktop App (Easiest)

Download the ComfyUI desktop app from comfy.org
Install and launch it
Use the built-in model manager to download FLUX.1 Dev

Option B: Manual Install

git clone https://github.com/Comfy-Org/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

Download the FLUX model files:

# Create model directories
mkdir -p models/unet models/clip models/vae
 
# Download FLUX.1 Dev (requires HuggingFace login)
# Models go in:
# - models/unet/flux1-dev.safetensors
# - models/clip/clip_l.safetensors
# - models/clip/t5xxl_fp16.safetensors
# - models/vae/ae.safetensors
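
Before launching, it can help to confirm that the files landed where the layout above expects them. The following is a minimal sketch, not part of ComfyUI itself: the function name is hypothetical, and the file list simply mirrors the four paths listed above.

```python
from pathlib import Path

# Relative paths copied from the model layout described above.
EXPECTED_FILES = [
    "models/unet/flux1-dev.safetensors",
    "models/clip/clip_l.safetensors",
    "models/clip/t5xxl_fp16.safetensors",
    "models/vae/ae.safetensors",
]


def missing_model_files(comfy_root: str) -> list[str]:
    """Return the expected model files that are absent under comfy_root."""
    root = Path(comfy_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_model_files(".")  # run from inside the ComfyUI folder
    if missing:
        print("Missing:", *missing, sep="\n  ")
    else:
        print("All FLUX model files found.")
```

Run it from the ComfyUI directory; any path it prints still needs to be downloaded or moved into place.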

Launch ComfyUI:

python main.py

Open http://localhost:8188 to access the ComfyUI interface.

Option C: Docker

docker run -d --gpus all -p 8188:8188 \
  --name comfyui \
  -v comfyui_models:/app/models \
  comfyui/comfyui:latest

Using the Basic FLUX Workflow

Open http://localhost:8188
Load the default workflow or create a new one
Add these nodes:
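- Load Diffusion Model → load flux1-dev.safetensors from models/unet, together with a DualCLIPLoader (clip_l.safetensors and t5xxl_fp16.safetensors, type flux) and a Load VAE node (ae.safetensors). A Checkpoint Loader only applies if you use an all-in-one checkpoint file placed in models/checkpoints.
- CLIP Text Encode → enter your prompt (e.g., "a photorealistic cat sitting on a vintage leather chair, warm lighting, depth of field")
- KSampler → connect the model, the conditioning from the text encoder, and an Empty Latent Image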
- VAE Decode → decode the latent to an image
- Save Image → save the result
Click Queue Prompt to generate
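
The same graph can also be written in ComfyUI's API (JSON) format and queued over HTTP. The sketch below is a minimal example under stated assumptions: the model files sit in models/unet, models/clip, and models/vae as in the manual install, so it uses separate diffusion-model, dual-CLIP, and VAE loader nodes. The node class names and input names should match the stock ComfyUI nodes but are worth checking against a workflow exported in API format from your own install. The sampler settings (20 steps, cfg 1.0, euler) and the helper names are illustrative, not recommendations from this guide.

```python
import json
import urllib.request


def build_flux_workflow(prompt: str, seed: int = 0) -> dict:
    """Build a text-to-image graph in ComfyUI API format.

    Each link is a [source_node_id, output_index] pair.
    """
    return {
        "1": {"class_type": "UNETLoader",
              "inputs": {"unet_name": "flux1-dev.safetensors",
                         "weight_dtype": "default"}},
        "2": {"class_type": "DualCLIPLoader",
              "inputs": {"clip_name1": "t5xxl_fp16.safetensors",
                         "clip_name2": "clip_l.safetensors",
                         "type": "flux"}},
        "3": {"class_type": "VAELoader",
              "inputs": {"vae_name": "ae.safetensors"}},
        "4": {"class_type": "CLIPTextEncode",  # positive prompt
              "inputs": {"text": prompt, "clip": ["2", 0]}},
        "5": {"class_type": "CLIPTextEncode",  # empty negative prompt
              "inputs": {"text": "", "clip": ["2", 0]}},
        "6": {"class_type": "EmptyLatentImage",
              "inputs": {"width": 1024, "height": 1024, "batch_size": 1}},
        "7": {"class_type": "KSampler",
              "inputs": {"model": ["1", 0], "positive": ["4", 0],
                         "negative": ["5", 0], "latent_image": ["6", 0],
                         "seed": seed, "steps": 20, "cfg": 1.0,
                         "sampler_name": "euler", "scheduler": "simple",
                         "denoise": 1.0}},
        "8": {"class_type": "VAEDecode",
              "inputs": {"samples": ["7", 0], "vae": ["3", 0]}},
        "9": {"class_type": "SaveImage",
              "inputs": {"images": ["8", 0], "filename_prefix": "flux"}},
    }


def queue_prompt(workflow: dict, host: str = "http://localhost:8188") -> dict:
    """POST the graph to the /prompt endpoint and return the decoded JSON reply."""
    body = json.dumps({"prompt": workflow}).encode("utf-8")
    req = urllib.request.Request(f"{host}/prompt", data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With ComfyUI running, calling queue_prompt(build_flux_workflow("a photorealistic cat")) should queue one generation and return a JSON reply that includes a prompt_id.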

Prompt Tips

FLUX responds well to natural language descriptions
Add style cues: "photorealistic", "cinematic lighting", "8K", "macro photography"
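Negative prompts work differently in FLUX - FLUX.1 Dev is usually run at a CFG of 1.0, where the negative prompt has little or no effect, so put what you want in the positive prompt rather than relying on the long negative prompts common with SD models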

Advanced Workflows

LoRA Loading

Add a LoRA Loader node between the checkpoint and the model input. LoRA files go in models/loras/.
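
In API-format terms, "between the checkpoint and the model input" means re-pointing every consumer of the loader's model and CLIP outputs at a new LoRA node. The helper below is a hypothetical sketch of that rewiring on a workflow dictionary. It assumes numeric node ids and the stock LoraLoader node, whose outputs are taken to be MODEL at index 0 and CLIP at index 1.

```python
import copy


def add_lora(workflow: dict, model_link: list, clip_link: list,
             lora_name: str, strength: float = 1.0) -> dict:
    """Return a copy of workflow with a LoraLoader spliced in.

    model_link and clip_link are the [node_id, output_index] pairs that
    currently feed the sampler and the text encoders.
    """
    wf = copy.deepcopy(workflow)
    lora_id = str(max(int(k) for k in wf) + 1)  # assumes numeric node ids
    # Re-point every consumer of the old model/clip outputs at the LoRA node.
    for node in wf.values():
        for name, value in node["inputs"].items():
            if value == model_link:
                node["inputs"][name] = [lora_id, 0]  # output 0: MODEL
            elif value == clip_link:
                node["inputs"][name] = [lora_id, 1]  # output 1: CLIP
    # The LoRA node itself reads from the original loader outputs.
    wf[lora_id] = {"class_type": "LoraLoader",
                   "inputs": {"model": model_link, "clip": clip_link,
                              "lora_name": lora_name,
                              "strength_model": strength,
                              "strength_clip": strength}}
    return wf
```

The lora_name value is the file name as it appears under models/loras/; the original workflow dictionary is left unchanged.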

Image-to-Image

Replace the Empty Latent Image with a VAE Encode node connected to your input image.
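
Expressed on an API-format workflow dictionary, the swap might look like the sketch below. The helper is hypothetical and assumes numeric node ids plus the stock LoadImage and VAEEncode nodes. It also lowers the sampler's denoise value, an assumption not stated above but usually needed so that the input image is only partly re-noised; the 0.6 default is illustrative.

```python
import copy


def to_img2img(workflow: dict, sampler_id: str, vae_link: list,
               image_name: str, denoise: float = 0.6) -> dict:
    """Return a copy of workflow whose sampler starts from an encoded image.

    image_name is a file in ComfyUI's input folder; vae_link is the
    [node_id, output_index] pair of the VAE loader.
    """
    wf = copy.deepcopy(workflow)
    next_id = max(int(k) for k in wf) + 1  # assumes numeric node ids
    load_id, encode_id = str(next_id), str(next_id + 1)
    wf[load_id] = {"class_type": "LoadImage",
                   "inputs": {"image": image_name}}
    wf[encode_id] = {"class_type": "VAEEncode",
                     "inputs": {"pixels": [load_id, 0], "vae": vae_link}}
    # Feed the encoded image to the sampler instead of the empty latent.
    wf[sampler_id]["inputs"]["latent_image"] = [encode_id, 0]
    wf[sampler_id]["inputs"]["denoise"] = denoise
    return wf
```

Lower denoise values stay closer to the input image; higher values give the prompt more influence. The old empty-latent node is left in the dictionary but is no longer connected to the sampler.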

ControlNet

Add a ControlNet Loader and Apply ControlNet node for pose/edge guidance.

Video Generation

Install video custom nodes (via ComfyUI Manager) and pair them with models like Wan2.1 for AI video.

Cost vs cloud

	Local ComfyUI + FLUX	Midjourney / DALL-E
Monthly	$0	$10-60
Per image	$0	$0.04-0.12
Hardware	~$1600 once (4090)	$0
Data privacy	Stays on your GPU	Sent to cloud
Control	Full node-level	Limited
Batch gen	Unlimited, free	Rate-limited
Resolution	Any (VRAM permitting)	Fixed sizes

If you generate 100+ images/month, a 4090 pays for itself in about 2 years versus Midjourney Pro. For power users generating 1000+/month, it pays off in months.
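
The break-even arithmetic behind these estimates is simply hardware cost divided by the monthly cloud spend it replaces. The sketch below uses figures from the table above ($1,600 hardware, $60/month at the top of the subscription range, $0.12 per image at the top of the per-image range) and ignores electricity, which is a simplifying assumption.

```python
def breakeven_months(hardware_cost: float, monthly_cloud_cost: float) -> float:
    """Months until a one-time hardware cost equals cumulative cloud spend."""
    return hardware_cost / monthly_cloud_cost


# Versus a $60/month subscription.
print(f"{breakeven_months(1600, 60):.1f} months")

# Versus per-image pricing: 1,000 images/month at $0.12 each.
print(f"{breakeven_months(1600, 1000 * 0.12):.1f} months")
```

Under these assumptions the subscription case breaks even in a little over two years and the per-image case in a little over one year; a higher monthly volume shortens the second figure proportionally.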

Troubleshooting

Out of memory → FLUX needs ~20GB VRAM at full precision. Use the FP8 quantized version to fit in 12GB. Lower resolution to 768x768.
No images showing → Check the console output for errors. The VAE decoder step is often the bottleneck.
Slow generation → FLUX.1 Dev takes 30-60s per image on a 4090. For faster results, use FLUX.1 Schnell (4-step distilled version).
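Missing model files → Use the ComfyUI Manager extension to download models from within the interface.
CORS errors in API mode → Start ComfyUI with --enable-cors-header to allow cross-origin requests, and add --listen 0.0.0.0 if the client runs on another machine. (extra_model_paths.yaml only configures model folders, not the API.)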

Swap components

Faster generation → Use FLUX.1 Schnell (4-step distilled, 4x faster)
Video generation → Add Wan2.1 or HunyuanVideo nodes
Alternative UI → Try SwarmUI for a simpler interface on top of ComfyUI
Lower VRAM → Use SDXL or SD3.5 for 8GB cards
n8n integration → Use ComfyUI's API mode with n8n for automated generation pipelines
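
For automation through API mode, a client typically queues a prompt and then reads the finished images from the server's history. The sketch below only covers the second half and rests on assumptions: it expects the JSON returned by GET /history/<prompt_id> to map the prompt id to an "outputs" dictionary whose entries carry an "images" list with filename, subfolder, and type fields, and it builds download links for the /view endpoint. The function name is hypothetical; check the response shape against your own ComfyUI version.

```python
from urllib.parse import urlencode


def image_urls(history: dict, prompt_id: str,
               host: str = "http://localhost:8188") -> list[str]:
    """Build /view download URLs for every image recorded under prompt_id."""
    outputs = history.get(prompt_id, {}).get("outputs", {})
    urls = []
    for node_output in outputs.values():
        for img in node_output.get("images", []):
            query = urlencode({"filename": img["filename"],
                               "subfolder": img.get("subfolder", ""),
                               "type": img.get("type", "output")})
            urls.append(f"{host}/view?{query}")
    return urls
```

An n8n HTTP Request node or a custom app could fetch the history JSON, pass it through logic like this, and then download each URL. An empty list means the prompt has not finished or produced no images.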
