What it does

Core capabilities at a glance

AI Gateway
Anthropic
Azure Openai
Bedrock
Gateway
Langchain
Litellm
LLM Gateway

Deep dive

The full breakdown - performance, comparisons, and setup

litellm

litellm is a model router/gateway - Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, Sagemaker, HuggingFace, VLLM, NVIDIA NIM].

Overview

Open Source AI Gateway for 100+ LLMs. Self-hosted. Enterprise-ready. Call any LLM in OpenAI format.

LiteLLM is an open source AI Gateway that gives you a single, unified interface to call 100+ LLM providers — OpenAI, Anthropic, Gemini, Bedrock, Azure, and more — using the OpenAI format.

Use it as a Python SDK for direct library integration, or deploy the AI Gateway (Proxy Server) as a centralized service for your team or organization.

Managing LLM calls across providers gets complicated fast — different SDKs, auth patterns, request formats, and error types for every model. LiteLLM removes that friction:

All Supported Endpoints - '/chat/completions', '/responses', '/embeddings', '/images', '/audio', '/batches', '/rerank', '/a2a', '/messages' and more.

You can use LiteLLM through either the Proxy Server or Python SDK. Both give you a unified interface to access multiple LLMs (100+ LLMs). Choose the option that best fits your needs:

Who Uses It? Gen AI Enablement / ML Platform Teams Developers building LLM projects

litellm is open-source, written primarily in Python, with 49,618 GitHub stars under the Other license. The latest release is v1.88.0 (2026-06-06).

Key capabilities

From the project's documentation:

Unified API — one interface for 100+ LLMs, no provider-specific SDK juggling
Drop-in OpenAI compatibility — swap providers without rewriting your code
8ms P95 latency at 1k RPS (benchmarks)
✅ Features under the LiteLLM Commercial License:
✅ Professional Support - Dedicated discord + slack
✅ Secure access with Single Sign-On

How it fits a local-AI stack

litellm runs on your own hardware, so pair it with a model and a GPU sized to your needs. Use the VRAM calculator to pick a model that fits your card, and see what you can run for hardware guidance. Browse the full tools directory for alternatives.

Sources

Source code & docs: BerriAI/litellm
Official website: https://docs.litellm.ai/docs/

Stats from GitHub, 2026-06-08.

Frequently asked

Quick answers to common questions

What is litellm?

litellm is a model-router tool for local AI workloads. Python SDK, Proxy Server (AI Gateway) to call 100+ LLM APIs in OpenAI (or native) format, with cost tracking, guardrails, loadbalancing and logging. [Bedrock,…

Is litellm free and open source?

Yes, litellm has 54,411 GitHub stars and is licensed under Other. You can self-host it for free on .

What hardware do I need for litellm?

The hardware requirements depend on which models you run. Check our hardware directory for compatible GPUs and systems. litellm has 54,411 GitHub stars and an active community.

Does litellm support GPU acceleration?

litellm's GPU support depends on your specific setup. Check the documentation for details. For the best performance, pair it with an NVIDIA RTX 4090 or 5090.

What are the best alternatives to litellm?

Popular alternatives include other model-router tools in our directory. Browse our full collection at /tool for comparisons, community reviews, and benchmark data to find the right fit for your workflow.

How much does litellm cost?

litellm is free-open-source. It is completely free and open source to self-host.

Pairs well with

Complementary tools, models, and hardware

Models

qwen3-30b-a3b llama3-3-70b mistral-7b-instruct-v0-2 gpt-oss-20b

Hardware

rtx-4090 rtx-5090 rtx-3090 mac-studio-m4-ultra

Similar tools

More tools like this one

AIClient2API - Same category: model-router ClawRouter - Same category: model-router PhoneClaw - Same category: model-router shimmy - inference-server RagaAI-Catalyst - observability