Every Local AI for Creative Professionals - a curated directory of image generation, video, music, and audio AI tools that run on your own hardware.
Local AI for Creative
Image generation, video synthesis, music composition, and speech - all running on your own hardware. No cloud, no subscriptions, no limits.
Image generation
ComfyUI
Node-based UI for Stable Diffusion, Flux, video generation, and the entire open-source image/video AI ecosystem.
Stable Diffusion WebUI (AUTOMATIC1111)
Feature-rich web interface for Stable Diffusion with extensive extensions and fine-tuning controls.
Fooocus
Image generator that rebuilds the Stable Diffusion workflow around a minimal, Midjourney-like interface.
InvokeAI
Professional-grade Stable Diffusion toolkit with unified canvas, node workflow, and model management.
Stable Diffusion Forge
Enhanced fork of AUTOMATIC1111's WebUI with up to 2x faster inference, lower VRAM usage, and Flux support.
Video generation
Music & audio
MusicGen Small
0.6B params
Stable Audio Open 1.0
1B params
Kokoro TTS
High-quality, lightweight neural TTS with 10+ voices, running 2x faster than real time on CPU.
Kokoro 82M
0.08B params
Piper TTS
Fast, local neural text-to-speech that runs in real time on CPU, with 100+ voices across 20+ languages.
Speech & transcription
Whisper.cpp
High-performance, real-time speech-to-text using OpenAI's Whisper models in pure C/C++.
Whisper Large V3 Turbo
0.8B params
Omni Voice
0.6B params
Sherpa Onnx
Cross-platform speech processing toolkit for STT, TTS, VAD, speaker recognition, and language ID.