The NVIDIA L40S has been a popular universal data center GPU for inference, fine-tuning, and media work since 2023. OpenMetal does not carry the L40S; it carries the card's effective successor, the NVIDIA RTX Pro 6000 Blackwell Server Edition (abbreviated RP6000 on this page), which doubles GPU memory (96GB vs 48GB), adds Blackwell-native FP4, and raises memory bandwidth. This comparison covers what changes between the Ada-generation L40S and the Blackwell-generation RP6000 for training and inference, and why teams cross-shopping the L40S should look at the RP6000. See the RP6000 spec page for full details.

Key Takeaways

  • 2x GPU memory: 96GB GDDR7 on the RP6000 vs 48GB GDDR6 on the L40S — larger models and batches fit on a single card for both training and inference.
  • Newer generation: Blackwell (RP6000) vs Ada Lovelace (L40S) — the RP6000 adds native FP4 (NVFP4) on top of FP8, for higher low-precision inference throughput.
  • Higher memory bandwidth: the RP6000’s GDDR7 delivers 1.79 TB/s vs the L40S’s 864 GB/s — relevant for memory-bound inference and training.
  • L40S advantages: lower board power (~350W vs 600W) and typically lower cost and broader availability — a fit when 48GB is sufficient and power/density is the priority.
  • Same OpenMetal model: whichever GPU, OpenMetal delivers it as single-tenant bare metal with fixed monthly pricing and included egress — no per-GPU-hour “idle silicon tax” on sustained training or always-on serving.

Ready to Compare GPUs for Your Workload?

Tell us your model sizes and throughput targets and we’ll help you choose between the RP6000, the larger-memory H200, or a multi-GPU cluster.

Get an RP6000 Quote   Schedule a Consultation

Spec Comparison

Specification          | NVIDIA RTX Pro 6000 Blackwell SE (OpenMetal) | NVIDIA L40S
Architecture           | Blackwell                                    | Ada Lovelace
GPU Memory             | 96GB GDDR7                                   | 48GB GDDR6
Memory Bandwidth       | 1.79 TB/s                                    | 864 GB/s
Lowest-Precision Tensor| FP4 (NVFP4)                                  | FP8
Max Board Power        | 600W                                         | 350W
NVLink                 | Limited / bridge per config                  | No NVLink
Carried by OpenMetal   | Yes, available now (Ashburn)                 | No

*Both GPUs train and infer; the RP6000 is the newer, larger-memory option, while the L40S is a lower-power Ada card. 

GPU Generation: Blackwell vs Ada

The RP6000 is a Blackwell-generation GPU; the L40S is Ada Lovelace (the prior generation). The headline functional difference is FP4 (NVFP4), a Blackwell-native 4-bit format that roughly doubles low-precision inference throughput over FP8 on supported stacks. For mixed-precision training (BF16/FP8) both are capable, but the RP6000’s newer tensor cores and larger memory give it more headroom per card. Where the L40S still appeals: it draws less power (~350W) and is widely available at a lower price point, which can matter for dense, power-constrained inference fleets where 48GB per card is enough.
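To make the precision difference concrete, here is a minimal sketch of how weight storage scales with the number format. The 70B parameter count is a hypothetical example, and the calculation uses nominal bits per weight only: it ignores the per-block scale factors that real FP4 checkpoints carry, as well as activations and KV cache.

```python
# Nominal bits per weight for each format. Real NVFP4 checkpoints also store
# per-block scale factors, which this sketch deliberately ignores.
BITS_PER_WEIGHT = {"BF16": 16, "FP8": 8, "FP4": 4}


def weight_gb(params_billion: float, fmt: str) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes) for a given format."""
    bytes_total = params_billion * 1e9 * BITS_PER_WEIGHT[fmt] / 8
    return bytes_total / 1e9


# Hypothetical 70B-parameter model, weights only.
for fmt in BITS_PER_WEIGHT:
    print(f"70B parameters in {fmt}: {weight_gb(70, fmt):.0f} GB")
```

Under these assumptions, the FP8 weights of such a model would exceed 48GB but fit within 96GB, while the FP4 weights would be roughly half the FP8 size. Actual footprints depend on the quantization scheme and serving stack.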

GPU Memory

Memory is the clearest separator:

  • Capacity: 96GB vs 48GB. The RP6000 holds roughly double the model and batch size per card. For training and fine-tuning, that means fewer GPUs to hold the same model state; for inference, larger KV caches and batch sizes.
  • Bandwidth: GDDR7 vs GDDR6. The RP6000’s GDDR7 delivers 1.79 TB/s vs the L40S’s 864 GB/s, which helps both memory-bound inference and training throughput.
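The capacity point above can be sketched as a rough fit check: weights plus KV cache compared against each card's memory. The model shape (64 layers, 8 KV heads of dimension 128), the 32 GB weight figure, the batch and sequence length, and the 90% usable-memory headroom are all illustrative assumptions, not measurements of any specific model or server.

```python
def kv_cache_gb(layers, kv_heads, head_dim, seq_len, batch, bytes_per_value=2):
    """KV cache size in GB: one key and one value vector per layer, KV head, and token."""
    values = 2 * layers * kv_heads * head_dim * seq_len * batch
    return values * bytes_per_value / 1e9


def fits(weights_gb, cache_gb, card_gb, headroom=0.9):
    """True if weights plus KV cache fit within an assumed usable fraction of card memory."""
    return weights_gb + cache_gb <= card_gb * headroom


# Hypothetical model: about 32 GB of weights (for example 32B parameters in FP8),
# 64 layers, 8 KV heads of dimension 128, 16-bit KV cache.
weights = 32.0
cache = kv_cache_gb(layers=64, kv_heads=8, head_dim=128, seq_len=8192, batch=16)

for name, card_gb in [("L40S", 48), ("RP6000", 96)]:
    total = weights + cache
    print(f"{name}: need {total:.1f} GB, fits={fits(weights, cache, card_gb)}")
```

With these placeholder numbers the KV cache alone is about 34 GB, so the combined footprint fits on a 96GB card but not on a 48GB one. Shrinking the batch or sequence length changes the answer, which is the trade-off to test with your own model.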

Neither card uses HBM — for the highest memory bandwidth (largest-scale training), OpenMetal’s H200 with 141GB HBM3e is the step up. The RP6000 sits between the L40S and the H200: more memory and a newer architecture than the L40S, at a lower cost than HBM-class cards.
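For memory-bound decoding, a common back-of-envelope ceiling is memory bandwidth divided by the bytes read per generated token. The sketch below applies that simplification to the bandwidth figures quoted above. The 16 GB weight size is a hypothetical example, and the estimate assumes every token reads all weights once while ignoring KV cache reads, batching, and compute limits, so real throughput will differ.

```python
# Memory bandwidth in GB/s, taken from the figures quoted in this comparison.
BANDWIDTH_GBPS = {"L40S": 864, "RP6000": 1790}


def decode_ceiling_tokens_per_s(weights_gb: float, bandwidth_gbps: float) -> float:
    """Upper bound on single-stream decode rate if each token reads all weights once."""
    return bandwidth_gbps / weights_gb


weights_gb = 16.0  # hypothetical: 8B parameters stored in BF16

for name, bw in BANDWIDTH_GBPS.items():
    ceiling = decode_ceiling_tokens_per_s(weights_gb, bw)
    print(f"{name}: at most ~{ceiling:.0f} tokens/s per stream")
```

The ratio between the two ceilings is simply the bandwidth ratio, a little over 2x, regardless of the model size chosen.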

Host Platform and Networking

On OpenMetal, the RP6000 runs on a single-tenant bare metal host with dual Intel Xeon 6530P (64C/128T), 1TB DDR5-6400, and a 6.4TB Micron 7500 MAX NVMe data drive, with 40 Gbps private and 10 Gbps public bandwidth. Full root access, IPMI, no hypervisor overhead, and included east-west traffic apply to every OpenMetal GPU server.

Security and Confidential Computing

Both GPUs run as single-tenant bare metal devices on OpenMetal — physical isolation, no shared hypervisor on the accelerator. Intel SGX is available on the host CPU. As on all OpenMetal GPU servers, Intel TDX and GPU passthrough cannot be combined in a single trust boundary. The RP6000 host is deployed in the HIPAA-compliant Ashburn (NTT DATA VA1) facility; OpenMetal offers BAAs at the organizational level.

When the RP6000 Wins — and When an L40S-Class Card Suffices

When the RP6000 is the right choice

  • Training, fine-tuning, or serving models that exceed 48GB on a single card
  • Inference where Blackwell FP4 throughput is a meaningful gain
  • Workloads benefiting from higher GDDR7 memory bandwidth
  • Consolidating multi-L40S deployments onto fewer, larger-memory cards

When an L40S-class card suffices

  • Inference and fine-tuning that fit comfortably in 48GB
  • Power- or density-constrained fleets where ~350W per card matters
  • Cost-sensitive deployments where the lower-priced Ada card is sufficient
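The power and density trade-off can be sketched by fitting each card into a fixed GPU power budget. The 4,200W budget is a hypothetical figure, and the calculation counts board power only, leaving out CPUs, cooling, and other host overhead.

```python
# Per-card figures from the spec comparison in this article.
CARDS = {
    "L40S": {"memory_gb": 48, "board_w": 350},
    "RP6000": {"memory_gb": 96, "board_w": 600},
}


def cards_in_budget(gpu_power_budget_w: int, board_w: int) -> int:
    """Whole cards that fit in a GPU-only power budget (host overhead ignored)."""
    return gpu_power_budget_w // board_w


budget_w = 4200  # hypothetical GPU power budget in watts

for name, spec in CARDS.items():
    n = cards_in_budget(budget_w, spec["board_w"])
    print(f"{name}: {n} cards, {n * spec['memory_gb']} GB total GPU memory")
```

Under this assumed budget, the lower-power card yields more independent GPUs, while the larger card yields more memory per GPU. Which matters more depends on whether the workload is many small models or fewer large ones.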

Cost and Value

OpenMetal prices the RP6000 on a fixed monthly model with included egress and no per-GPU-hour metering. Because OpenMetal carries the RP6000 and not the L40S, the practical decision is whether the RP6000’s extra memory, bandwidth, and FP4 support justify it over a smaller Ada card for your workload. For high-utilization workloads such as sustained training and always-on inference, the fixed-cost dedicated model avoids the metered-cloud “idle silicon tax,” where every GPU-hour includes an elasticity premium you may not need. OpenMetal does not publish RP6000 pricing; contact OpenMetal for a custom quote.
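One way to frame the fixed-versus-metered question is a break-even utilization: the fraction of the month a metered GPU would have to run before it costs as much as a fixed monthly server. The prices below are placeholders chosen purely for illustration. They are not OpenMetal quotes or any cloud provider's rates, and the sketch leaves out egress and storage charges.

```python
HOURS_PER_MONTH = 730  # average number of hours in a month


def breakeven_utilization(fixed_monthly: float, metered_hourly: float) -> float:
    """Fraction of the month a metered GPU must run to match the fixed monthly cost."""
    return fixed_monthly / (metered_hourly * HOURS_PER_MONTH)


# Placeholder prices for illustration only; substitute real quotes.
fixed_monthly = 2000.0
metered_hourly = 4.0

u = breakeven_utilization(fixed_monthly, metered_hourly)
print(f"Fixed pricing costs less above {u:.0%} utilization")
```

Plugging in an actual quote and an actual metered rate gives the utilization level above which a dedicated server would be the cheaper option under this simple model.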

Deployment Options

  • Dedicated GPU server — a single RP6000 (or dual-GPU) bare metal server with full root access and IPMI.
  • Dedicated GPU cluster — multiple GPU nodes (all-RP6000 or mixed with H200) on a private 40 Gbps mesh.
  • Attached to existing infrastructure — add RP6000 nodes to an existing OpenMetal Hosted Private Cloud or bare metal deployment.

Where to deploy

The RP6000 is available now in Ashburn, Virginia (US-East), with advance reservations available for Los Angeles, Amsterdam, and Singapore. Proof of Concept clusters are available for testing; ramp pricing is available for migrations from other providers.

Get an RP6000 Quote

Ready to deploy? Tell us about your AI/ML training or inference needs and we’ll provide a custom quote for the NVIDIA RTX Pro 6000 — as a single GPU server, a dedicated GPU cluster, or GPU nodes attached to an existing OpenMetal deployment.

  • Single GPU server: One or two RP6000 cards with full root access and IPMI
  • GPU cluster: Multi-node deployments (all-RP6000 or mixed with H200) on a private 40 Gbps mesh
  • Attached GPU: Add RP6000 capacity to your existing Hosted Private Cloud or bare metal footprint

All deployments include fixed monthly pricing, included egress, a 99.96%+ network SLA, and DDoS protection.



Product specifications, pricing, and availability may change due to market conditions and other factors. For the most current information, please contact the OpenMetal team directly.