
NVIDIA Tesla P40
The Tesla P40 is a datacenter GPU from 2016 that has found new life in the local AI community. With 24 GB of GDDR5 memory and used prices around $150, it's one of the cheapest ways to get 24 GB of VRAM. The caveats are significant: it has no display outputs, it is a passively cooled server card that needs an active cooling mod in a desktop case, and its Pascal architecture has very slow FP16 and no BF16 support.
Quick verdict
The Tesla P40 is the ultimate budget card for local AI experimenters. At 24 GB for about $150, its VRAM-per-dollar is hard to beat. But it's slow (~5-8 tok/s on 30B models), lacks fast FP16 and BF16 support, and needs a cooling mod. It suits learning or batch inference where speed isn't critical.
Spec breakdown
- VRAM: 24 GB GDDR5
- Memory bandwidth: 347 GB/s
- TDP: 250 W (passive - needs active cooling)
- PCIe: 3.0 ×16
- Architecture: Pascal GP102
- CUDA cores: 3,840
- Tensor cores: None (no Tensor Core acceleration for matrix math)
Real-world AI inference
| Model | Tokens/sec | Source |
|---|---|---|
| Qwen3-30B Q4_K_M | ~5 tok/s | r/LocalLLaMA |
| Mistral Small 3 Q4_K_M | ~10 tok/s | Community |
| Llama 3.1 8B Q4_K_M | ~20 tok/s | Community |
| Dual P40 - 70B Q3_K_M | ~4 tok/s | Community |
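
Single-stream token generation is usually limited by memory bandwidth, because producing each token means reading roughly every weight once. A rough ceiling on speed is therefore bandwidth divided by model size. The sketch below assumes about 4.85 bits per weight for Q4_K_M and round parameter counts. Both are illustrative assumptions, not measurements, and the function name is hypothetical.

```python
def decode_ceiling_tok_s(params_billion: float,
                         bits_per_weight: float,
                         bandwidth_gb_s: float) -> float:
    """Rough upper bound on tokens/sec if all weights are read once per token."""
    # 1e9 params * bits per weight / 8 bits per byte = size in GB
    model_gb = params_billion * bits_per_weight / 8
    return bandwidth_gb_s / model_gb


P40_BANDWIDTH_GB_S = 347  # from the spec list above
Q4_K_M_BPW = 4.85         # assumption: approximate bits per weight for Q4_K_M

for label, params_b in [("8B", 8.0), ("30B", 30.0)]:
    ceiling = decode_ceiling_tok_s(params_b, Q4_K_M_BPW, P40_BANDWIDTH_GB_S)
    print(f"{label} at Q4_K_M: at most ~{ceiling:.0f} tok/s")
```

Under these assumptions the ceilings come out near 72 tok/s for an 8B model and 19 tok/s for a 30B model. The community figures in the table sit well below both, which is expected: the bound ignores compute limits, KV-cache reads, and multi-GPU overhead.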
Where to buy
- eBay: ~$150-200
- Requires: Active cooling fan mod (3D printed shroud)
Honest alternatives
- Used RTX 3060 12GB (~$200): Faster, easier to use, less VRAM
- Used RTX 3090 (~$700): Much faster, same VRAM, higher cost
- Intel Arc A770 16GB (~$200): Modern, easier setup
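
To make the VRAM-per-dollar comparison concrete, here is a small sketch that ranks these cards by dollars per gigabyte. It uses the approximate used prices quoted on this page (the low end of the P40 range), so treat the output as indicative only.

```python
# (VRAM in GB, approximate used price in USD) as quoted on this page
cards = {
    "Tesla P40": (24, 150),
    "RTX 3060 12GB": (12, 200),
    "RTX 3090": (24, 700),
    "Arc A770 16GB": (16, 200),
}


def dollars_per_gb(vram_gb: float, price_usd: float) -> float:
    """Price paid for each GB of VRAM."""
    return price_usd / vram_gb


# Cheapest VRAM first
ranked = sorted(cards.items(), key=lambda item: dollars_per_gb(*item[1]))

for name, (vram_gb, price_usd) in ranked:
    print(f"{name}: ${dollars_per_gb(vram_gb, price_usd):.2f} per GB")
```

This ranks capacity only. It says nothing about speed or ease of setup, which is where the alternatives win.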
What the community says
"Bought two P40s for $120 each. Slapped 3D-printed fans on them. 48 GB total for $240. Runs 70B at Q3_K_M at 4 tok/s. It's janky but it works!"
- u/jank-ai on r/LocalLLaMA, 278 upvotes
Frequently asked
Quick answers to common questions
How much VRAM does the NVIDIA Tesla P40 have?
The NVIDIA Tesla P40 has 24 GB of VRAM with 347 GB/s memory bandwidth. MSRP was $5,999.
What local AI models can run on the NVIDIA Tesla P40?
The NVIDIA Tesla P40 with 24 GB VRAM can run many models depending on quantization. Models up to ~36B params may fit at Q4_K_M. Use our VRAM calculator to check specific models.
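
As a rough sanity check on the ~36B figure, the sketch below estimates whether a quantized model fits in 24 GB. It is not the site's VRAM calculator. It assumes about 4.85 bits per weight for Q4_K_M and a flat 2 GB allowance for KV cache and runtime buffers. Both values are assumptions, and the real overhead grows with context length.

```python
def fits_in_vram(params_billion: float,
                 bits_per_weight: float = 4.85,  # assumption: approx. Q4_K_M
                 vram_gb: float = 24.0,
                 overhead_gb: float = 2.0):      # assumption: KV cache + buffers
    """Return (fits, weights_gb) for a quantized model on a single card."""
    weights_gb = params_billion * bits_per_weight / 8
    return weights_gb + overhead_gb <= vram_gb, weights_gb


for params_b in (8, 36, 70):
    fits, weights_gb = fits_in_vram(params_b)
    verdict = "fits" if fits else "does not fit"
    print(f"{params_b}B: ~{weights_gb:.1f} GB of weights, {verdict} in 24 GB")
```

With these defaults a 36B model lands just under the limit, while a 70B model needs a second card or a lower-bit quantization.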
Is the NVIDIA Tesla P40 good for local AI inference?
The NVIDIA Tesla P40 is best suited to entry-level LLM inference, multi-GPU setups, and budget builds. Its 24 GB of VRAM fits most open models at 4-bit quantization, though generation is slow compared with newer cards.
Where can I buy the NVIDIA Tesla P40?
The P40 is mainly available used. eBay listings typically run about $150-200, and prices vary by seller and availability.
How does the NVIDIA Tesla P40 compare to other GPUs?
The NVIDIA Tesla P40 has 24 GB of VRAM and 347 GB/s of memory bandwidth. Its capacity matches a used RTX 3090, but it is much slower. Browse our hardware directory for side-by-side comparisons.
Is the NVIDIA Tesla P40 worth buying right now?
The current price is $159, a small fraction of the $5,999 MSRP. At that price it is good value for 24 GB of VRAM, provided you can accept the slow speed and the cooling mod.
What power supply do I need for the NVIDIA Tesla P40?
The NVIDIA Tesla P40 has a TDP of 250 W. A good-quality PSU of 650 W or more should suffice for a single card. Always check the manufacturer's recommendations for your specific build.
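
One way to arrive at a figure like 650 W is to add up the expected load and apply headroom. The sketch below does that. The CPU draw (125 W), the allowance for the rest of the system (75 W), and the 40% headroom are assumptions chosen for illustration, not manufacturer guidance.

```python
import math


def recommended_psu_watts(gpu_tdp_w: int,
                          gpu_count: int = 1,
                          cpu_w: int = 125,       # assumption: typical desktop CPU
                          other_w: int = 75,      # assumption: board, drives, fans
                          headroom: float = 1.4,  # assumption: 40% margin
                          step: int = 50) -> int:
    """Estimate a PSU rating, rounded up to the next `step` watts."""
    load_w = gpu_tdp_w * gpu_count + cpu_w + other_w
    return math.ceil(load_w * headroom / step) * step


print(recommended_psu_watts(250))               # single P40
print(recommended_psu_watts(250, gpu_count=2))  # dual P40
```

With these assumptions a single card comes out at 650 W, matching the guidance above, while a dual-P40 build calls for a considerably larger supply.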