llamafile social preview
tts-stt24,689Other

llamafile

Distribute and run LLMs with a single file.

Updated Jun 8, 2026
Platforms
macos, linux
Pricing
free-open-source
Status
active
License
Other

What it does

Core capabilities at a glance

  • Cross Platform
  • Gguf
  • Llama CPP
  • Local AI
  • Local Inference
  • Local LLM
  • Open Source AI
  • Single File Executable

Deep dive

The full breakdown - performance, comparisons, and setup

llamafile

llamafile is a speech (TTS/STT) tool - Distribute and run LLMs with a single file.

Overview

llamafile lets you distribute and run LLMs with a single file.

llamafile is a Mozilla Builders project (see its announcement blog post), now revamped by Mozilla.ai.

Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most operating systems and CPU archiectures, with no installation.

llamafile also includes whisperfile, a single-file speech-to-text tool built on whisper.cpp and the same Cosmopolitan packaging. It supports transcription and translation of audio files across all the same platforms, with no installation required.

llamafile is open-source, written primarily in C++, with 24,689 GitHub stars under the Other license. The latest release is 0.10.3 (2026-06-02).

How it fits a local-AI stack

llamafile runs on your own hardware, so pair it with a model and a GPU sized to your needs. Use the VRAM calculator to pick a model that fits your card, and see what you can run for hardware guidance. Related speech (TTS/STT) tools in the directory:

Sources

Stats from GitHub, 2026-06-08.

Frequently asked

Quick answers to common questions

What is llamafile?

llamafile is a tts-stt tool for local AI workloads. Distribute and run LLMs with a single file.

Is llamafile free and open source?

Yes, llamafile has 24,689 GitHub stars and is licensed under Other. You can self-host it for free on macos, linux.

What platforms does llamafile support?

llamafile runs on macos, linux.

What hardware do I need for llamafile?

The hardware requirements depend on which models you run. Check our hardware directory for compatible GPUs and systems. llamafile has 24,689 GitHub stars and an active community.

Does llamafile support GPU acceleration?

llamafile supports GPU acceleration via CUDA, Metal, or Vulkan depending on your platform. For the best performance, pair it with an NVIDIA RTX 4090 or 5090.

What are the best alternatives to llamafile?

Popular alternatives include other tts-stt tools in our directory. Browse our full collection at /tool for comparisons, community reviews, and benchmark data to find the right fit for your workflow.

How much does llamafile cost?

llamafile is free-open-source. It is completely free and open source to self-host.

Pairs well with

Complementary tools, models, and hardware

Comments coming soon

Configure NEXT_PUBLIC_GISCUS_REPO_ID and NEXT_PUBLIC_GISCUS_CATEGORY_ID at giscus.app to enable.