Local Cursor (AI coding)

A Cursor/Copilot-style coding assistant that runs locally - Continue in VS Code + a Qwen Coder model on Ollama. Tab-complete and chat, $0/mo, your code never leaves your machine.

Cost: ~$800 ($0/mo vs cloud)
Difficulty: beginner
Setup time: ~20 min
Use case: A private AI coding assistant in your editor

A Copilot/Cursor-style coding assistant - inline autocomplete plus an in-editor chat that knows your codebase - running entirely on your own GPU. Your proprietary code never touches a third-party server, and there's no per-seat subscription. Built from Continue (the open-source VS Code / JetBrains extension) and a Qwen Coder model served by Ollama.

What you get

  • Tab autocomplete in your editor, like Copilot
  • Chat with your code - ask about files, generate functions, write tests
  • Zero code exfiltration - everything runs on localhost
  • $0/month in subscription fees vs roughly $20/mo per seat for Cursor or Copilot

Architecture

Component | Role
Continue | VS Code / JetBrains extension - autocomplete + chat UI
Ollama | Serves the coding model locally (port 11434)
Qwen2.5 Coder 14B | Strong code model that fits a 24GB GPU at Q4

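For a sharper model, use Qwen3 Coder 30B A3B, a mixture-of-experts (MoE) model that activates only about 3B parameters per token, so it stays fast. Its weights are larger than the 14B's at the same quantization, so it leaves less free VRAM on a 24GB card. Recommended GPU: RTX 3090 (best value, 24GB) or RTX 4090.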
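To see Ollama's role as a local HTTP server, you can ask it which models it has available. The sketch below assumes Ollama is running on its default port and uses its `/api/tags` endpoint; the helper names `model_names` and `list_local_models` are made up for this example.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama port


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_local_models(base_url: str = OLLAMA_URL, timeout: float = 5.0) -> list[str]:
    """Ask the local Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return model_names(json.load(resp))


if __name__ == "__main__":
    try:
        print(list_local_models())
    except (OSError, ValueError) as exc:  # connection refused, timeout, bad JSON
        print(f"Ollama not reachable at {OLLAMA_URL}: {exc}")
```

If the coding model has been pulled, its tag (for example `qwen2.5-coder:14b`) should appear in the printed list.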

Prerequisites

  • A GPU with ≥24 GB VRAM (RTX 3090 / RTX 4090) for the 14B at good speed
  • Ollama installed (native or Docker)
  • VS Code (or a JetBrains IDE)
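As a rough way to reason about the VRAM requirement, the sketch below estimates the size of quantized weights from parameter count and bits per weight. The figures are assumptions for illustration, not measurements: about 4.5 bits per weight for a Q4-style quant, and a flat 2 GB allowance for context cache and runtime overhead. Real usage depends on the exact quant and context length.

```python
def estimate_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """True if weights plus an assumed overhead allowance fit in VRAM."""
    return estimate_weights_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Assumed values: 14B parameters, ~4.5 bits/weight, 24 GB card.
    print(estimate_weights_gb(14, 4.5))  # weights only, in GB
    print(fits(14, 4.5, vram_gb=24))     # 14B on a 24 GB card
    print(fits(14, 4.5, vram_gb=8))      # 14B on an 8 GB card
    print(fits(7, 4.5, vram_gb=8))       # 7B on an 8 GB card
```

Under these assumptions the 14B fits a 24 GB card with room left for a long context, while an 8 GB card would need the 7B.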

Setup

  1. Pull the coding model with Ollama:

     ollama pull qwen2.5-coder:14b

  2. Install the Continue extension in VS Code (Extensions → search "Continue").

  3. Point Continue at your local Ollama. Open ~/.continue/config.yaml:

     models:
       - name: Qwen2.5 Coder 14B
         provider: ollama
         model: qwen2.5-coder:14b
         roles:
           - chat
           - edit
       - name: Qwen2.5 Coder (autocomplete)
         provider: ollama
         model: qwen2.5-coder:14b
         roles:
           - autocomplete

  4. Reload VS Code. You now have inline completions and a Continue chat panel - both fully local.
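If the editor shows no completions, it can help to confirm that the model answers outside Continue. This is a minimal sketch that sends one prompt to Ollama's `/api/generate` endpoint with streaming turned off; the function names and the sample prompt are illustrative, and the model tag is the one pulled in the steps above.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default Ollama port


def build_generate_request(model: str, prompt: str,
                           base_url: str = OLLAMA_URL) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        f"{base_url}/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["response"]


if __name__ == "__main__":
    try:
        print(generate("qwen2.5-coder:14b", "Write a Python function that reverses a string."))
    except (OSError, ValueError, KeyError) as exc:
        print(f"Smoke test failed: {exc}")
```

A text reply here means Ollama and the model are working, so any remaining problem is likely in the Continue configuration.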

Use it

  • Autocomplete - write a comment, get the implementation; Tab to accept
  • Refactor - select code, press ⌘L (Ctrl+L on Windows/Linux) to send it to the chat, then ask Continue to rewrite or optimize it
  • Tests + docs - "write unit tests for this file", "document these functions"

Cost vs cloud

 | Local Cursor | Cursor / Copilot
Monthly | $0 | $20/seat
Hardware | ~$800 once (used RTX 3090) | $0
Code privacy | Never leaves your machine | Uploaded to vendor
Break-even | ~40 months per seat - or instant for teams that can't upload code | -

For solo devs it's a long payback, but for any team under a no-cloud-code policy (finance, defense, health) it's the only option that works at all. See the cost-vs-cloud calculator.
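The break-even figure is simple division: one-time hardware cost over the monthly fee avoided. A small sketch using the numbers from the comparison follows; it ignores electricity and resale value, and `break_even_months` is a name chosen for this example.

```python
def break_even_months(hardware_cost: float, monthly_fee: float) -> float:
    """Months until a one-time hardware cost equals the subscription fees avoided."""
    if monthly_fee <= 0:
        raise ValueError("monthly_fee must be positive")
    return hardware_cost / monthly_fee


if __name__ == "__main__":
    # ~$800 used RTX 3090 vs a $20/month seat.
    print(break_even_months(800, 20))
```

Plugging in a different card price or subscription fee shows how sensitive the payback period is to either number.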

Troubleshooting

  • Autocomplete feels slow → a 14B at Q4 wants a 24GB card; on less VRAM use Qwen2.5 Coder 7B or a smaller quant.
  • Completions are off-topic → make sure the autocomplete role uses a coder model trained for fill-in-the-middle completion (such as Qwen2.5 Coder), not a general-purpose chat model.
  • Want repo-wide context → enable Continue's @codebase and index the workspace.
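To put a number on "slow", Ollama's non-streaming `/api/generate` response reports `eval_count` (tokens generated) and `eval_duration` (in nanoseconds), from which tokens per second follows. The sketch below computes it from such a response; the sample values are invented to show the arithmetic and are not a benchmark.

```python
def tokens_per_second(response: dict) -> float:
    """Generation speed from an Ollama /api/generate response.

    eval_count is the number of generated tokens;
    eval_duration is the generation time in nanoseconds.
    """
    seconds = response["eval_duration"] / 1e9
    if seconds <= 0:
        raise ValueError("eval_duration must be positive")
    return response["eval_count"] / seconds


if __name__ == "__main__":
    # Invented sample: 100 tokens generated in 2 seconds.
    sample = {"eval_count": 100, "eval_duration": 2_000_000_000}
    print(tokens_per_second(sample))
```

Comparing this figure before and after switching to a smaller model or quant shows whether the change actually helped.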

Swap components