What it does
Core capabilities at a glance
- Apple Silicon
- Inference Server
- Macos
- MLX
- Openai API
Deep dive
The full breakdown - performance, comparisons, and setup
omlx
omlx is a local inference server - LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar.
Overview
oMLX LLM inference, optimized for your Mac Continuous batching and tiered KV caching, managed directly from your menu bar.
Install · Quickstart · Features · Models · CLI Configuration · Benchmarks · oMLX.ai
Download the '.dmg' from Releases, drag to Applications, done. The app includes in-app auto-update, so future upgrades are just one click. The macOS app also installs a lightweight '~/.omlx/bin/omlx' CLI shim so terminal commands and Apple Shortcuts can control the app-managed server.
Launch oMLX from your Applications folder. The Welcome screen guides you through three steps - model directory, server start, and first model download. That's it. To connect OpenClaw, OpenCode, Codex, Hermes Agent, or Copilot, see Integrations.
The server discovers LLMs, VLMs, embedding models, and rerankers from subdirectories automatically. Any OpenAI-compatible client can connect to 'http://localhost:8000/v1'. A built-in chat UI is also available at 'http://localhost:8000/admin/chat'.
The service runs 'omlx serve' with zero-config defaults ('/.omlx/models', port 8000). 'omlx start', 'omlx stop', and 'omlx restart' are the portable lifecycle commands; Homebrew installs delegate them to 'brew services'. To customize, either set environment variables ('OMLX_MODEL_DIR', 'OMLX_PORT', etc.) or run 'omlx serve --model-dir /your/path' once to persist settings to '/.omlx/settings.json'.
Logs are written to two locations: - Service log: '$(brew --prefix)/var/log/omlx.log' (stdout/stderr) - Server log: '~/.omlx/logs/server.log' (structured application log)
omlx is open-source, written primarily in Python, with 16,209 GitHub stars under the Apache 2.0 license. The latest release is v0.4.2rc1 (2026-06-06).
Key capabilities
From the project's documentation:
- Service log: $(brew --prefix)/var/log/omlx.log (stdout/stderr)
- Server log: ~/.omlx/logs/server.log (structured application log)
- Hot tier (RAM): Frequently accessed blocks stay in memory for fast access.
- LRU eviction: Least-recently-used models are evicted automatically when memory runs low.
- Manual load/unload: Interactive status badges in the admin panel let you load or unload models on demand.
- Model pinning: Pin frequently used models to keep them always loaded.
Install
A quick way to get started (always check the official docs for the latest):
brew install omlxHow it fits a local-AI stack
omlx runs on your own hardware, so pair it with a model and a GPU sized to your needs. Use the VRAM calculator to pick a model that fits your card, and see what you can run for hardware guidance. Related local inference servers in the directory:
Sources
- Source code & docs: jundot/omlx
- Official website: https://omlx.ai
Stats from GitHub, 2026-06-08.
Frequently asked
Quick answers to common questions
What is omlx?
omlx is a inference-server tool for local AI workloads. LLM inference server with continuous batching & SSD caching for Apple Silicon — managed from the macOS menu bar
Is omlx free and open source?
Yes, omlx has 16,214 GitHub stars and is licensed under Apache 2.0. You can self-host it for free on macos.
What platforms does omlx support?
omlx runs on macos.
What hardware do I need for omlx?
The hardware requirements depend on which models you run. Check our hardware directory for compatible GPUs and systems. omlx has 16,214 GitHub stars and an active community.
Does omlx support GPU acceleration?
omlx supports GPU acceleration via CUDA, Metal, or Vulkan depending on your platform. For the best performance, pair it with an NVIDIA RTX 4090 or 5090.
What are the best alternatives to omlx?
Popular alternatives include other inference-server tools in our directory. Browse our full collection at /tool for comparisons, community reviews, and benchmark data to find the right fit for your workflow.
How much does omlx cost?
omlx is free-open-source. It is completely free and open source to self-host.
Pairs well with
Complementary tools, models, and hardware
Comments coming soon
Configure NEXT_PUBLIC_GISCUS_REPO_ID and NEXT_PUBLIC_GISCUS_CATEGORY_ID at giscus.app to enable.