Private ChatGPT

Run a ChatGPT-style assistant entirely on your own machine - Ollama + Open WebUI, no data leaves your network, $0/mo. Works on a $300 used GPU.

Cost
~$300
$0/mo vs cloud
Difficulty
beginner
Setup time
~15 min
Use case
A private ChatGPT on your own GPU
HardwareRtx 3060 12gb

~$300 hardware · $0/mo vs cloud

Private ChatGPT

A ChatGPT-style chat assistant that runs entirely on your own hardware. Your conversations never leave your machine, there's no monthly fee, and it works offline. Two open-source components - Ollama to serve the model and Open WebUI for the familiar chat interface - get you there in about 15 minutes.

What you get

  • A polished, ChatGPT-like web UI with chat history, system prompts, and multi-model switching
  • 100% local inference - no data sent to any cloud, works on an air-gapped network
  • Multiple users / accounts on your LAN
  • $0/month - the only cost is the GPU you already own (or a ~$300 used one)

Architecture

ComponentRole
OllamaPulls and serves the model over a local API (port 11434)
Open WebUIBrowser chat front-end, talks to Ollama (port 3000)
Qwen3 14BThe default model - strong general chat, fits 12GB at Q4

For an 8GB card, swap in Llama 3.1 8B. Recommended GPU: RTX 3060 12GB - the cheapest card that runs a 14B model comfortably.

Prerequisites

  • A GPU with ≥12 GB VRAM (RTX 3060 12GB or better) - or run smaller on 8GB
  • Docker + Docker Compose, with the NVIDIA Container Toolkit for GPU passthrough
  • ~10 GB free disk for the model

Setup

Save this as docker-compose.yml:

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama:/root/.ollama
    ports:
      - "11434:11434"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
 
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    ports:
      - "3000:8080"
    restart: unless-stopped
 
volumes:
  ollama:
  open-webui:

Bring it up and pull the model:

docker compose up -d
docker exec ollama ollama pull qwen3:14b

Open http://localhost:3000, create the first account (it becomes admin), pick qwen3:14b, and chat.

Use it

  • Daily assistant - drafting, summarizing, brainstorming, with full history
  • Private document Q&A - paste sensitive text you'd never send to a cloud API
  • Team chat - host it on a homelab box; everyone on the LAN gets an account

Cost vs cloud

Private ChatGPTChatGPT Plus
Monthly$0$20
Hardware~$300 once (RTX 3060 12GB)$0
Data privacyStays on your machineSent to OpenAI
Break-even~15 months, then free forever-

After ~15 months the GPU has paid for itself versus a single ChatGPT Plus seat - and it serves your whole household, runs offline, and keeps every conversation private. See the cost-vs-cloud calculator for your usage.

Troubleshooting

  • Ollama can't see the GPU → install the NVIDIA Container Toolkit and confirm docker run --rm --gpus all nvidia/cuda:12.4.0-base nvidia-smi works.
  • Open WebUI shows no models → the model pull (ollama pull) must finish first; refresh the model list.
  • Slow / CPU-only → check docker exec ollama ollama ps shows the GPU; without it you'll get 2-5 tok/s instead of 40+.

Swap components