
NVIDIA RTX 5090

Updated Jun 2, 2026
VRAM: 32 GB
Bandwidth: 1792 GB/s
TDP: 575 W
MSRP: $1,999
Category: gpu

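The first prosumer GPU where Llama 3.3 70B runs at usable speed on a single card, with one caveat: at Q4_K_M the weights alone are roughly 42 GB, so on 32 GB it needs a lower-bit quant or some layers offloaded to system RAM. 32 GB GDDR7 at 1792 GB/s is 33% more VRAM and about 78% more bandwidth than the 4090 (24 GB at 1008 GB/s) - that's a category change, not an incremental bump. Models that previously demanded dual-GPU setups to get past 24 GB now run on a single card.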

Quick verdict

If you... | Then...
...have a 4090 already | Wait, unless the extra 8 GB lets you run a model or quant the 4090 can't
...don't have a discrete GPU | Skip the 4080 / 4090. Buy this. Price per GB of VRAM is best in class.
...are running 30B models like Qwen3-30B | A used 4090 is fine and saves $400-600. Don't overbuy.
...need 70B+ models in a single workstation | This is the answer.

Real-world AI inference

Tested by the community on common models (Q4_K_M, 2k context, single user):

Model | Speed | Source
Qwen3-30B | ~38 tok/s | r/LocalLLaMA bench
Llama 3.3 70B | ~14 tok/s | r/LocalLLaMA bench
Mistral Small 3 (24B) | ~52 tok/s | community
ComfyUI SDXL (1024x1024) | ~7.5 s/image | r/StableDiffusion
ComfyUI Flux Dev | ~22 s/image | community
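
A quick way to sanity-check token rates like these: single-user decoding is usually limited by memory bandwidth, because each generated token has to read the active weights roughly once. The sketch below turns that rule of thumb into a ceiling. It ignores KV cache reads, kernel overhead and CPU offload, so real numbers land well below it, and mixture-of-experts models only read their active parameters per token. The function name and the 14 GB example footprint are made up for illustration.

```python
def decode_ceiling_tok_s(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Rough upper bound on decode speed: each token reads the active weights once."""
    if active_weights_gb <= 0:
        raise ValueError("active_weights_gb must be positive")
    return bandwidth_gb_s / active_weights_gb


# Bandwidth figures quoted on this page.
RTX_5090_GB_S = 1792.0
RTX_4090_GB_S = 1008.0

# Hypothetical model footprint, chosen only to keep the arithmetic easy to follow.
EXAMPLE_WEIGHTS_GB = 14.0

for name, bandwidth in [("RTX 5090", RTX_5090_GB_S), ("RTX 4090", RTX_4090_GB_S)]:
    ceiling = decode_ceiling_tok_s(bandwidth, EXAMPLE_WEIGHTS_GB)
    print(f"{name}: at most ~{ceiling:.0f} tok/s for {EXAMPLE_WEIGHTS_GB:.0f} GB of weights")
```

Treat the result as a ceiling, not a prediction. If a reported benchmark is above it, the model was probably smaller or sparser than assumed.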

Spec breakdown

  • VRAM: 32 GB GDDR7
  • Memory bandwidth: 1792 GB/s
  • TDP: 575 W (recommend 1000W+ PSU, ideally 1200W)
  • PCIe: 5.0 ×16
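  • Slot count: 2-slot for the Founders Edition; partner cards can run to 3.5 slots (check case clearance before buying)
  • Power connector: 1× 12V-2×6 (16-pin), either native from the PSU or via an adapter fed by 4× 8-pin PCIe cables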

Best models that fit

At various quants:

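  • FP16: only ~14B models (about 28 GB of weights). Use Q4/Q5 for everything else.
  • Q8_0: Qwen3-30B at ~32 GB is too tight - the weights fill the card before any KV cache, so expect to offload a few layers
  • Q5_K_M: Qwen3-30B at ~22 GB - comfortable
  • Q4_K_M: Qwen3-30B at ~18 GB (plenty of context room); with 32k context it still fits
  • 70B-72B models (Llama 3.3 70B and the like): roughly 42 GB at Q4_K_M and 50 GB at Q5_K_M, so they do not fit fully in 32 GB. Use a ~3-bit quant or partial CPU offload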
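
Sizes like the ones in this list can be checked with simple arithmetic: weights take roughly parameters × bits per weight ÷ 8, and the KV cache grows linearly with context. Here is a minimal sketch, not a replacement for a proper VRAM calculator. The bits-per-weight values are approximate averages for llama.cpp quant types. The 1.5 GB overhead and the architecture numbers in the usage example are assumptions picked for illustration, not taken from a model card.

```python
# Approximate average bits per weight. Real GGUF files vary by architecture,
# so treat these as assumptions rather than exact values.
BITS_PER_WEIGHT = {"FP16": 16.0, "Q8_0": 8.5, "Q5_K_M": 5.7, "Q4_K_M": 4.85}


def weights_gb(params_billion: float, quant: str) -> float:
    """Estimated size of the weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8.0


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """Estimated KV cache size in GB: K and V per layer, per KV head, per token."""
    values = 2 * n_layers * n_kv_heads * head_dim * context_tokens
    return values * bytes_per_value / 1e9


def fits(params_billion: float, quant: str, vram_gb: float = 32.0,
         kv_gb: float = 0.0, overhead_gb: float = 1.5) -> bool:
    """True if weights + KV cache + an assumed fixed overhead fit in VRAM."""
    return weights_gb(params_billion, quant) + kv_gb + overhead_gb <= vram_gb


# Illustrative architecture values (assumed): 48 layers, 4 KV heads, head_dim 128.
example_kv = kv_cache_gb(n_layers=48, n_kv_heads=4, head_dim=128, context_tokens=32768)

for params, quant in [(14, "FP16"), (30, "Q8_0"), (30, "Q4_K_M"), (70, "Q4_K_M")]:
    size = weights_gb(params, quant)
    verdict = "fits" if fits(params, quant, kv_gb=example_kv) else "does not fit"
    print(f"{params}B at {quant}: ~{size:.1f} GB of weights, {verdict} in 32 GB")
```

The estimate leans conservative on purpose: it is better to learn a model is borderline here than after a multi-gigabyte download.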

Where to buy

Affiliate disclosure: links below earn us a small commission at no cost to you.

  • Amazon (often quickest delivery): see button above
  • Newegg (sometimes has better stock during shortage): see button above
  • B&H Photo (best for workstation builds, no tax in most states): linked above
  • Used market: don't bother yet - 5090 launched Jan 2025, used supply is thin and not discounted enough to justify

Cost vs cloud

If you currently spend $200/month on cloud LLM APIs, the 5090 pays back in ~10-12 months. After that it's pure savings. See our cost-vs-cloud calculator for your specific spend.
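
The payback figure is just hardware cost divided by monthly savings. A small sketch, assuming your cloud spend would drop to zero and that electricity is the only ongoing cost. The function names, the 4 hours/day usage and the $0.15/kWh rate are placeholder assumptions, so plug in your own.

```python
import math


def monthly_power_cost(watts: float, hours_per_day: float, usd_per_kwh: float) -> float:
    """Electricity cost per 30-day month for a device drawing `watts` under load."""
    return watts / 1000.0 * hours_per_day * 30 * usd_per_kwh


def payback_months(hardware_cost: float, monthly_cloud_spend: float,
                   monthly_power: float = 0.0) -> float:
    """Months until the card has paid for itself; infinite if there is no net saving."""
    monthly_saving = monthly_cloud_spend - monthly_power
    if monthly_saving <= 0:
        return math.inf
    return hardware_cost / monthly_saving


# Assumed usage pattern: 575 W for 4 hours a day at $0.15/kWh.
power = monthly_power_cost(watts=575, hours_per_day=4, usd_per_kwh=0.15)

for price in (1999, 2199):
    months = payback_months(price, monthly_cloud_spend=200, monthly_power=power)
    print(f"${price} card vs $200/month cloud: ~{months:.1f} months to break even")
```

With these placeholder inputs the result lands in the same 10-12 month range quoted above. Heavier daily use or pricier electricity pushes it out.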

Honest alternatives

  • RTX 4090 ($1500-1800 used): if 70B isn't critical, save $400-600
  • Mac Studio M4 Ultra ($4000-7000): unified memory up to 192 GB → can run 70B at Q8 or 70B+ at Q4. Slower per-token but unbeatable for very large models
  • Dual RTX 3090 used build (~$1400-1800 total): 48 GB VRAM combined (2× 24 GB), more than the 5090's 32 GB. Harder to set up, worse efficiency, but cheap

What the community says

"Coming from a 3090 the speedup on Llama 3.3 70B is night and day. Q4_K_M at 14 tok/s is finally usable for real work."

Considerations before buying

  • PSU: 1000W is minimum, 1200W safer. Older 850W builds will trip.
  • Case clearance: 3.5 slots is no joke. Measure first.
  • Power connector: ensure your PSU is ATX 3.0 compliant if using native 12V-2×6.
  • Driver maturity: as of mid-2026, drivers are stable; early 2025 had some inference quirks now resolved.
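
To see where the PSU advice comes from, add up the sustained loads and leave headroom for transient spikes. This is a rough sketch only: the 150 W CPU, 100 W for everything else, the 30% headroom and the list of PSU sizes are assumptions for illustration, not vendor guidance.

```python
# Common retail PSU sizes in watts (illustrative list, not exhaustive).
STANDARD_PSU_SIZES_W = [750, 850, 1000, 1200, 1600]


def recommended_psu_watts(gpu_tdp_w: float, cpu_tdp_w: float,
                          other_w: float = 100.0, headroom: float = 0.3) -> float:
    """Sustained system load plus a fractional headroom for power spikes."""
    sustained = gpu_tdp_w + cpu_tdp_w + other_w
    return sustained * (1.0 + headroom)


def smallest_psu_for(target_w: float):
    """Smallest listed PSU size that meets the target, or None if none is big enough."""
    for size in STANDARD_PSU_SIZES_W:
        if size >= target_w:
            return size
    return None


# 575 W GPU (from the spec list) with an assumed 150 W CPU.
target = recommended_psu_watts(gpu_tdp_w=575, cpu_tdp_w=150)
print(f"Target capacity: ~{target:.0f} W, so pick a {smallest_psu_for(target)} W unit")
```

Under these assumptions the sustained load alone is 825 W, which is why an 850 W unit has almost no margin left.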

Frequently asked

Quick answers to common questions

How much VRAM does the NVIDIA RTX 5090 have?

The NVIDIA RTX 5090 has 32 GB of VRAM with 1792 GB/s memory bandwidth. MSRP was $1,999.

What local AI models can run on the NVIDIA RTX 5090?

The NVIDIA RTX 5090 with 32 GB VRAM can run many models depending on quantization. Models up to ~49B params may fit at Q4_K_M. Use our VRAM calculator to check specific models.

Is the NVIDIA RTX 5090 good for local AI inference?

The NVIDIA RTX 5090 is well suited to LLM inference, image generation, video generation, and training. With 32 GB of VRAM it handles most open models well.

Where can I buy the NVIDIA RTX 5090?

Check our buy links above for the best current prices on Amazon, Newegg, and B&H. Prices vary by retailer and availability.

How does the NVIDIA RTX 5090 compare to other GPUs?

NVIDIA RTX 5090 has 32 GB VRAM and 1792 GB/s bandwidth. This puts it in the high-end category, suitable for most open models. Browse our hardware directory for side-by-side comparisons.

Is the NVIDIA RTX 5090 worth buying right now?

The current price is $2,199, about 10% above the $1,999 MSRP. Consider waiting for sales events like Prime Day or Black Friday.

What power supply do I need for the NVIDIA RTX 5090?

The NVIDIA RTX 5090 has a TDP of 575W. This requires a high-wattage PSU (1000W+ recommended, ideally 1200W). Always check the manufacturer's recommendations for your specific build.
