What it does

Core capabilities at a glance

Data Science
Deployment
Distributed
Hyperparameter Optimization
Hyperparameter Search
Large Language Models
LLM Inference
LLM Serving

Deep dive

The full breakdown - performance, comparisons, and setup

ray

Ray is an AI compute engine. It consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads.

Overview

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/ray_header_logo.png

.. image:: https://readthedocs.org/projects/ray/badge/?version=master
    :target: http://docs.ray.io/en/master/?badge=master

.. image:: https://img.shields.io/badge/Ray-Join%20Slack-blue
    :target: https://www.ray.io/join-slack

.. image:: https://img.shields.io/badge/Discuss-Ask%20Questions-blue
    :target: https://discuss.ray.io/

.. image:: https://img.shields.io/twitter/follow/raydistributed.svg?style=social&logo=twitter
    :target: https://x.com/raydistributed

Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI libraries for simplifying ML compute:

.. image:: https://github.com/ray-project/ray/raw/master/doc/source/images/what-is-ray-padded.svg

.. https://docs.google.com/drawings/d/1Pl8aCYOsZCo61cmp57c7Sja6HhIygGCvSZLi_AuBuqo/edit

Ray Core provides three key abstractions:

- Tasks: Stateless functions executed in the cluster.
- Actors: Stateful worker processes created in the cluster.
- Objects: Immutable values accessible across the cluster.

For monitoring and debugging:

- Monitor Ray apps and clusters with the Ray Dashboard.
- Debug Ray apps with the Ray Distributed Debugger.

Ray runs on any machine, cluster, cloud provider, and Kubernetes, and features a growing `ecosystem of community integrations`_.

Install Ray with: ``pip install ray``. For nightly wheels, see the Installation page.

.. _`Serve`: https://docs.ray.io/en/latest/serve/index.html
.. _`Data`: https://docs.ray.io/en/latest/data/data.html
.. _`Workflow`: https://docs.ray.io/en/latest/workflows/
.. _`Train`: https://docs.ray.io/en/latest/train/train.html
.. _`Tune`: https://docs.ray.io/en/latest/tune/index.html
.. _`RLlib`: https://docs.ray.io/en/latest/rllib/index.html
.. _`ecosystem of community integrations`: https://docs.ray.io/en/latest/ray-overview/ray-libraries.html

Today's ML workloads are increasingly compute-intensive. As convenient as they are, single-node development environments such as your laptop cannot scale to meet these demands.

Ray is a unified way to scale Python and AI applications from a laptop to a cluster.

Ray is open source, written primarily in Python, and licensed under Apache 2.0, with 42,807 GitHub stars. The latest release is ray-2.55.1 (2026-04-22).

Key capabilities

From the project's documentation:

- Data_: Scalable Datasets for ML
- Tune_: Scalable Hyperparameter Tuning
- RLlib_: Scalable Reinforcement Learning
- Serve_: Scalable and Programmable Serving
- Tasks: Stateless functions executed in the cluster.
- Actors: Stateful worker processes created in the cluster.

Install

A quick way to get started (always check the official docs for the latest)::

    pip install ray

How it fits a local-AI stack

Ray can run on your own hardware, from a single laptop to a cluster, so pair it with a model and a GPU sized to your needs. Use the VRAM calculator to pick a model that fits your card, and see what you can run for hardware guidance.

Sources

Source code & docs: ray-project/ray
Official website: https://ray.io

Stats from GitHub, 2026-06-08.

Frequently asked

Quick answers to common questions

What is ray?

Ray is an AI compute engine, listed in this directory under inference servers. It consists of a core distributed runtime and a set of AI libraries for accelerating ML workloads.

Is ray free and open source?

Yes. Ray is licensed under Apache 2.0 and has 43,318 GitHub stars. You can self-host it for free, for example with Docker.

What platforms does ray support?

Ray runs on any machine, cluster, cloud provider, and Kubernetes, and it can also be run in Docker.

What hardware do I need for ray?

The hardware requirements depend on the workloads and models you run. Check our hardware directory for compatible GPUs and systems. Ray has 43,318 GitHub stars and an active community.

Does ray support GPU acceleration?

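Yes. Ray can schedule tasks and actors onto GPUs, but which GPUs are usable depends on your setup and on the libraries you run on top of Ray. Check the documentation for details. For local workloads, this directory suggests a high-end card such as an NVIDIA RTX 4090 or 5090.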

What are the best alternatives to ray?

Popular alternatives include other inference-server tools in our directory. Browse our full collection at /tool for comparisons, community reviews, and benchmark data to find the right fit for your workflow.

How much does ray cost?

Ray is free and open source. Self-hosting it costs nothing in licensing fees.
