claude-code-local social preview
tts-stt2,730MIT

claude-code-local

Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen 3.5 122B, Llama 3.3 70B, Gemma 4 31B. Private, of…

Updated Jun 8, 2026
Platforms
macos, web
Pricing
free-open-source
Status
active
License
MIT

What it does

Core capabilities at a glance

  • Abliterated
  • AI Privacy
  • Airgap
  • Ambient Computing
  • Anthropic
  • Apple Silicon
  • Browser Agent
  • Claude Code

Deep dive

The full breakdown - performance, comparisons, and setup

claude-code-local

claude-code-local is a speech (TTS/STT) tool - Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen 3.5 122B, Llama 3.3 70B, Gemma 4 31B. Private, offline, airgap-ready. Built for NDA / legal / healthcare workflows.

Overview

Three local AI brains. Four modes. One MacBook. Zero cloud. Pick your fighter and run Claude Code 100% on-device. 📍 Now with DeepSeek V4 Flash · 1M-token context · via Antirez's ds4 engine .

Built by Matt Macosko in Arcata, CA. Started with a chicken problem. Still figuring it out.

A real NDA. Llama 3.3 70B. Wi-Fi physically OFF. lsof running live. Watch a 70-billion-parameter model audit a confidential legal document, on-device, with the receipts on screen.

Built for lawyers, accountants, doctors, therapists, contractors — anyone handling other people's private stuff.

🖥️ Don't want to DIY? Get this stack on a Mac mini, ready to plug in.

The AirGap Box ships a pre-configured Mac mini to your office with this stack, a 31B-parameter language model, and three working agents already installed. One-time price. No subscription. Founding-customer pricing for the first 5 buyers.

Same prompt. Four engines. One MacBook. The new local challenger — Qwen3.6 27B — painted the best aurora, and never touched the internet.

Three AIs. One laptop. Same prompt. Live counters. Watch Gemma 31B local, Llama 70B local, and Claude cloud race the same HTML physics prompt on a MacBook.

Speak to Claude Code, hear replies in a cloned voice — 100% on-device. 2:31.

claude-code-local is open-source, written primarily in Python, with 2,730 GitHub stars under the MIT license. The latest release is v0.1.0 (2026-05-08).

Key capabilities

From the project's documentation:

  • 🐍 Python 3.12+ (for MLX)
  • 🤖 Claude Code (npm install -g @anthropic-ai/claude-code)
  • NarrativeGemma/CLAUDE.md — the narration persona itself (opt-in, sanitized, generic)
  • An AppleScript injector that writes transcribed utterances straight into the bound Terminal tab by window ID
  • 🟡 MLX_MODEL= — point at any HuggingFace repo and have the lineup auto-register a new fighter
  • 🟡 More fighters — open to PRs adding launchers for DeepSeek, Mistral, Phi, anything MLX-compatible

Install

A quick way to get started (always check the official docs for the latest):

pip install mlx-lm

How it fits a local-AI stack

claude-code-local runs on your own hardware, so pair it with a model and a GPU sized to your needs. Use the VRAM calculator to pick a model that fits your card, and see what you can run for hardware guidance. Related speech (TTS/STT) tools in the directory:

Sources

Stats from GitHub, 2026-06-08.

Frequently asked

Quick answers to common questions

What is claude-code-local?

claude-code-local is a tts-stt tool for local AI workloads. Run Claude Code 100% on-device with local AI on Apple Silicon. MLX-native Anthropic-API server, 65 tok/s Qwen 3.5 122B, Llama 3.3 70B, Gemma 4 31B. Private, of…

Is claude-code-local free and open source?

Yes, claude-code-local has 2,730 GitHub stars and is licensed under MIT. You can self-host it for free on macos, web.

What platforms does claude-code-local support?

claude-code-local runs on macos, web.

What hardware do I need for claude-code-local?

The hardware requirements depend on which models you run. Check our hardware directory for compatible GPUs and systems. claude-code-local has 2,730 GitHub stars and an active community.

Does claude-code-local support GPU acceleration?

claude-code-local supports GPU acceleration via CUDA, Metal, or Vulkan depending on your platform. For the best performance, pair it with an NVIDIA RTX 4090 or 5090.

What are the best alternatives to claude-code-local?

Popular alternatives include other tts-stt tools in our directory. Browse our full collection at /tool for comparisons, community reviews, and benchmark data to find the right fit for your workflow.

How much does claude-code-local cost?

claude-code-local is free-open-source. It is completely free and open source to self-host.

Pairs well with

Complementary tools, models, and hardware

Comments coming soon

Configure NEXT_PUBLIC_GISCUS_REPO_ID and NEXT_PUBLIC_GISCUS_CATEGORY_ID at giscus.app to enable.