Local AI for Creative Work

Image generation, video synthesis, music composition, and speech, all running locally on your own GPU (or, for the lighter speech tools, your CPU). No cloud, no subscriptions, no usage limits.

Image generation

ComfyUI

Node-based UI for Stable Diffusion, Flux, video generation, and the entire open-source image/video AI ecosystem.

121.9k

Stable Diffusion WebUI (AUTOMATIC1111)

Feature-rich web interface for Stable Diffusion with extensive extensions and fine-tuning controls.

164.2k

Fooocus

Image generator that rethinks Stable Diffusion around a minimal, Midjourney-like interface.

51.2k

InvokeAI

Professional-grade Stable Diffusion toolkit with unified canvas, node workflow, and model management.

27.6k

Stable Diffusion Forge

Enhanced fork of AUTOMATIC1111's WebUI with up to 2x faster inference than the original, lower VRAM usage, and Flux support.

12.9k

Video generation

HunyuanVideo 1.5

13B params

Wan2.1 T2V 14B

14B params

LTX 2.3

21B params

Music & audio

MusicGen Small

0.6B params

Stable Audio Open 1.0

1B params

Kokoro TTS

High-quality, lightweight neural TTS with 10+ voices, 2x faster than real-time on CPU.

8.1k

Kokoro 82M

82M params

Piper TTS

Fast, local neural text-to-speech that runs in real-time on CPU with 100+ voices across 20+ languages.

11.3k

Speech & transcription

Whisper.cpp

High-performance, real-time speech-to-text using OpenAI's Whisper models in pure C/C++.

52.2k

Whisper Large V3 Turbo

0.8B params

Omni Voice

0.6B params

Sherpa Onnx

Cross-platform speech processing toolkit for STT, TTS, VAD, speaker recognition, and language ID.

13.7k