Private GPU Servers and Clusters

Dedicated NVIDIA GPU acceleration on the OpenMetal v5 platform. Built for AI training, inference, and HPC, with transparent monthly pricing and no metered hours.

  • Fully dedicated, single-tenant GPUs with no shared hypervisor
  • Same Intel Xeon 6 and DDR5-6400 foundation as the rest of the v5 catalog
  • Delivered as single-tenant bare metal for full control
  • Deploy as standalone servers or interconnected multi-node clusters
  • Predictable monthly billing with fair, transparent egress

GPU Servers

Choose your GPU platform

Two dedicated GPU server lines on the same modern v5 foundation. One or two GPUs per server, scalable into multi-node clusters.

RP6000

Best for: inference, fine-tuning, rendering, and mixed AI and visualization pipelines. High VRAM at favorable cost/GB.

1 or 2x NVIDIA RTX PRO 6000

  • GPU memory: 96 GB GDDR7 per GPU
  • GPU memory bandwidth: 1.79 TB/s per GPU
  • CUDA cores: 24,064 per GPU
  • CPU: 2x Intel Xeon 6530P (64C / 128T)
  • Memory: 1 TB DDR5-6400 (to 2 TB)
  • Storage: Up to 24x NVMe (Micron 7500 MAX)
  • Private network: 20 Gbps standard, upgradeable to 40 Gbps
  • Public network: 10 Gbps

Contact Sales

H200

Best for: large-model training and memory-bound inference where bandwidth and capacity are the bottleneck.

1 or 2x NVIDIA H200 NVL (PCIe)

  • GPU memory: 141 GB HBM3e per GPU
  • GPU memory bandwidth: 4.8 TB/s per GPU
  • CPU: 2x Intel Xeon 6530P (64C / 128T)
  • Memory: 1 TB DDR5-6400 (to 2 TB)
  • Storage: Up to 24x NVMe (Micron 7500 MAX)
  • Private network: 20 Gbps standard, upgradeable to 40 Gbps
  • Public network: 10 Gbps

Contact Sales

Pricing, features, and availability are subject to change without notice. For GPU servers, all final prices must be confirmed with the OpenMetal sales team. However, unlike many providers, OpenMetal honors written quotes for 30 days from the date issued. Because market conditions and hardware costs can fluctuate, any new or revised quotes will reflect current market pricing.

With v5 we modernized the foundation of our bare metal and private cloud catalog. Adding the RP6000 and H200 was the natural next step. Customers running AI and HPC workloads get fully dedicated GPUs on the same modern Xeon 6000 platform, with transparent monthly billing and infrastructure they actually control, not throttled, metered slices of someone else’s cluster.

Jamie Tischart, CTO of OpenMetal

How is Private AI on OpenMetal Infrastructure Different?

It’s private, it’s customizable, and our engineers are on your team.

Fully dedicated

Single-tenant bare metal with direct access to the GPU, CPU, memory, and storage. Nothing is virtualized or shared, so performance stays consistent and the hardware is entirely yours.

Built to order

The listed configurations are a starting point. Work with our team to design the deployment your workload needs, and we handle ordering, setup, and reliable operation.

Engineers on your team

Real infrastructure engineers help you size, deploy, and tune. For organizations in healthcare, finance, research, and SaaS that need data locality and compliance control, that support matters.

From a single server to a multi-node cluster

Deploy one GPU server or interconnect many over the v5 private network. Common AI and ML frameworks are supported out of the box.

Single server

One or two dedicated GPUs per server on the full v5 platform. Ideal for inference, fine-tuning, and focused training runs.

Multi-node clusters

Interconnect multiple GPU servers over the 20 Gbps private network (upgradeable to 40 Gbps) to build training and inference clusters sized to your workload.

Bare metal, no layers

Every server is delivered as single-tenant bare metal, with direct access to the GPU, CPU, memory, and storage. No hypervisor sits between your workload and the hardware.

Contact Us for GPU Servers Pricing and Availability

Fill out the form below to discuss your requirements, delivery timelines, capabilities, and agreement pricing with our team, or email us at sales@openmetal.io.


FAQs

What GPU servers are available?

Two lines, both on the v5 platform. The RP6000 pairs dual Intel Xeon 6530P processors with one or two NVIDIA RTX PRO 6000 Blackwell Server Edition GPUs (96 GB GDDR7 each), suited for inference, fine-tuning, rendering, and mixed AI and visualization workloads. The H200 uses one or two NVIDIA H200 NVL GPUs (141 GB HBM3e each) and can be bundled with a five-year NVIDIA AI Enterprise subscription, targeting large-model training and memory-bound inference. Both are available now in Ashburn, VA.

What’s the difference between the RP6000 and H200?

The RP6000 is built for high-VRAM workloads at a favorable cost per GB: inference serving, fine-tuning, rendering, and pipelines that mix AI and visualization. The H200 is built for workloads where memory bandwidth and capacity are the bottleneck, such as large-model training, large-context inference, and memory-bound HPC. The H200 can also be bundled with a five-year NVIDIA AI Enterprise subscription. Talk to an account manager if you are unsure which fits your workload.
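As a rough way to compare the two lines for a given model, the sketch below estimates whether a model's weights fit in GPU memory and what ceiling memory bandwidth places on single-stream token generation. The capacity and bandwidth figures are the per-GPU numbers listed above. Everything else is an illustrative assumption rather than OpenMetal sizing guidance: the names `GpuSpec`, `fits`, and `decode_ceiling_tokens_per_s`, the 20% headroom reserved for KV cache and activations, the idea that two GPUs split the weights evenly with perfect scaling, and the 70B-parameter example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuSpec:
    name: str
    memory_gb: float       # per-GPU memory, from the spec lists
    bandwidth_tb_s: float  # per-GPU memory bandwidth, from the spec lists


RTX_PRO_6000 = GpuSpec("RTX PRO 6000", 96.0, 1.79)
H200_NVL = GpuSpec("H200 NVL", 141.0, 4.8)


def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Weight footprint in GB: billions of parameters times bytes per parameter."""
    return params_billions * bytes_per_param


def fits(gpu: GpuSpec, gpu_count: int, params_billions: float,
         bytes_per_param: float, headroom: float = 0.20) -> bool:
    """True if the weights fit after reserving `headroom` (assumed 20%)
    of total GPU memory for KV cache, activations, and runtime overhead."""
    usable_gb = gpu.memory_gb * gpu_count * (1.0 - headroom)
    return weights_gb(params_billions, bytes_per_param) <= usable_gb


def decode_ceiling_tokens_per_s(gpu: GpuSpec, gpu_count: int,
                                params_billions: float,
                                bytes_per_param: float) -> float:
    """Bandwidth-bound upper limit for one decode stream, assuming every
    generated token reads all weights once and GPUs scale perfectly."""
    total_bandwidth_gb_s = gpu.bandwidth_tb_s * 1000.0 * gpu_count
    return total_bandwidth_gb_s / weights_gb(params_billions, bytes_per_param)


if __name__ == "__main__":
    # Hypothetical example: a 70B-parameter model in 16-bit precision.
    for gpu in (RTX_PRO_6000, H200_NVL):
        for count in (1, 2):
            ok = fits(gpu, count, 70, 2)
            ceiling = decode_ceiling_tokens_per_s(gpu, count, 70, 2)
            print(f"{count}x {gpu.name}: fits={ok}, ceiling={ceiling:.1f} tok/s")
```

Real throughput depends on batch size, KV cache size, kernel efficiency, and the interconnect between GPUs, so treat these numbers as an upper bound for a first comparison, not a benchmark.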

How are the GPU servers priced?

Fixed, transparent monthly pricing with no metered hours and no surprise egress charges. Configurations are built to order, so the final quote reflects your exact setup. OpenMetal honors written quotes for 30 days from the date issued. Request a quote to get current pricing.

Can I build a multi-node GPU cluster?

Yes. Multiple GPU servers can be interconnected over OpenMetal’s 20 Gbps private network to build multi-node training or inference clusters. Talk to an account manager about cluster sizing and lead times.
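For multi-node training, the private link speed bounds how quickly gradients can be synchronized between servers. The sketch below applies the standard ring all-reduce cost model, in which each node sends and receives 2(N-1)/N times the payload, to the 20 Gbps and 40 Gbps private network speeds listed on this page. It is a back-of-the-envelope estimate that ignores latency, protocol overhead, gradient compression, and overlap with compute. The function name and the 7B-parameter, 16-bit example are assumptions for illustration.

```python
def ring_allreduce_seconds(payload_gb: float, nodes: int, link_gbps: float) -> float:
    """Estimate the time for one ring all-reduce of `payload_gb` gigabytes.

    Each node transfers 2 * (nodes - 1) / nodes times the payload over its
    link. `link_gbps` is the per-node link rate in gigabits per second.
    """
    if nodes < 2:
        return 0.0  # a single node has nothing to synchronize
    traffic_gb = 2.0 * (nodes - 1) / nodes * payload_gb
    link_gb_per_s = link_gbps / 8.0  # gigabits/s to gigabytes/s
    return traffic_gb / link_gb_per_s


if __name__ == "__main__":
    # Hypothetical example: 7B parameters at 2 bytes each = 14 GB of gradients.
    gradients_gb = 7 * 2
    for gbps in (20.0, 40.0):
        seconds = ring_allreduce_seconds(gradients_gb, nodes=4, link_gbps=gbps)
        print(f"4 nodes at {gbps:.0f} Gbps: about {seconds:.1f} s per full sync")
```

An estimate like this can help decide whether the 40 Gbps upgrade, gradient accumulation, or a smaller synchronized payload is worth discussing during cluster sizing.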

Which frameworks are supported?

The servers support common AI and ML frameworks including PyTorch, TensorFlow, JAX, and Hugging Face Transformers, running directly on dedicated hardware with no virtualization layer in the way.
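Once a server is provisioned, a quick check like the following can confirm that a framework sees the GPUs. This is a minimal sketch that assumes PyTorch with CUDA support has already been installed on the server; the helper name `list_visible_gpus` is hypothetical, and it returns an empty list when PyTorch or a CUDA device is not present.

```python
def list_visible_gpus():
    """Return (name, total memory in GB) for each CUDA device PyTorch can see."""
    try:
        import torch  # assumed to be installed; not part of the standard library
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    gpus = []
    for index in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(index)
        # total_memory is reported in bytes
        gpus.append((props.name, props.total_memory / 1e9))
    return gpus


if __name__ == "__main__":
    visible = list_visible_gpus()
    if not visible:
        print("No CUDA devices visible to PyTorch")
    for name, memory_gb in visible:
        print(f"{name}: {memory_gb:.0f} GB")
```

On a one- or two-GPU server, the list should contain one entry per installed GPU; the reported memory may differ slightly from the nominal capacity.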

Is the hardware really dedicated?

Yes. Every GPU server is single-tenant bare metal. Your workload has direct access to the GPU, CPU, memory, and storage. No shared hypervisor, no noisy neighbors, and no metered slices of someone else’s cluster.

Can I start with one server and scale later?

Yes. Deploy a single GPU server to start, then add servers and interconnect them over the 20 Gbps private network as your workload grows. Talk to an account manager about scaling paths and lead times.

Is a proof of concept available?

Yes. Proof-of-concept (PoC) deployments let your team validate workloads on dedicated hardware before committing.

Where are the GPU servers available?

The RP6000 and H200 are available now from OpenMetal US East in Ashburn, Virginia. Contact us about availability in other regions.

Design your GPU deployment

Tell us about your workload and we’ll build a quote around it, from a single server to a multi-node cluster. Proof-of-concept deployments are available on request.

Schedule a Meeting

Built on the OpenMetal v5 platform. See the full v5 hardware catalog.