OpenMetal GPU Clusters -- Dedicated Multi-GPU Infrastructure for AI Training and Inference

An OpenMetal GPU cluster is a set of dedicated, single-tenant GPU servers on a private high-speed network, built to order for distributed AI training and large-scale inference. Clusters can be all-NVIDIA RTX PRO 6000 (cost-efficient training and inference), all-H200 (largest-memory, bandwidth-bound work), or mixed to match different workloads to the right GPU. Every node is bare metal with full root access, fixed monthly pricing, and included egress — no per-GPU-hour metering across the fleet.

Key Takeaways

Single-tenant, dedicated — every GPU in the cluster is yours, with full PCIe bandwidth and no hypervisor overhead or shared-tenancy contention.
Same or mixed GPUs — build an all-NVIDIA RTX PRO 6000 inference fleet, an all-H200 training cluster, or a mixed cluster that routes big-model and bandwidth-bound jobs to H200 nodes and cost-efficient inference/training to NVIDIA RTX PRO 6000 nodes.
Private mesh — nodes connect over LACP-bonded private networking for distributed training, parameter exchange, and pulling datasets from OpenMetal storage nodes; east-west traffic is not metered.
Fixed monthly pricing across the fleet — no per-GPU-hour “idle silicon tax.” Sustained, high-utilization training runs the whole cluster hot at no incremental cost.
Built to order — cluster size, GPU mix, RAM (up to 2TB/node), and storage are configured to the workload.
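The fixed-versus-metered point above comes down to a break-even utilization, which can be sketched as a one-line calculation. The prices below are placeholders chosen to make the arithmetic easy to follow; they are not OpenMetal's or any other provider's actual rates, and the function name is hypothetical.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8760 / 12)


def break_even_utilization(fixed_monthly_usd: float, hourly_rate_usd: float) -> float:
    """Fraction of the month a metered GPU must run to cost as much as a fixed-price one."""
    if fixed_monthly_usd < 0 or hourly_rate_usd <= 0:
        raise ValueError("prices must be positive")
    return fixed_monthly_usd / (hourly_rate_usd * HOURS_PER_MONTH)


# Placeholder prices for illustration only (assumptions, not real quotes).
utilization = break_even_utilization(fixed_monthly_usd=1460.0, hourly_rate_usd=4.0)
print(f"Break-even utilization: {utilization:.0%}")
```

Above the break-even fraction, every additional hour on the fixed-price node is effectively free; below it, metered pricing would have been cheaper for that GPU.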

Ready to Build a GPU Cluster?

Tell us your cluster size, GPU mix, and workload, and we’ll design a dedicated GPU cluster — all-NVIDIA RTX PRO 6000, all-H200, or mixed.

Get a GPU Cluster Quote | Schedule a Consultation

Cluster Composition

Option	GPUs	Best for
All-NVIDIA RTX PRO 6000 cluster	NVIDIA RTX PRO 6000 (96GB GDDR7, FP4)	Cost-efficient distributed training, fine-tuning, and high-throughput inference serving fleets
All-H200 cluster	NVIDIA H200 NVL (141GB HBM3e)	Largest models and bandwidth-bound large-scale training/inference
Mixed cluster	RTX PRO 6000 + H200	Route big-model/bandwidth work to H200 nodes, cost-efficient inference/training to RTX PRO 6000 nodes

[Figure: OpenMetal GPU cluster component architecture]

Each node is a single-tenant bare metal server with dual Intel Xeon 6530P (64C/128T), 1TB DDR5-6400 (upgradeable to 2TB), and a 6.4TB Micron 7500 MAX NVMe data drive. See the RTX PRO 6000 and H200 spec pages.

Cluster Networking

Cluster nodes connect over a private mesh (2x 10 Gbps LACP-bonded per node by default, up to 4x 10 Gbps optional), carrying distributed-training gradients, parameter exchange, and dataset traffic from OpenMetal storage nodes. East-west traffic between your nodes is not metered.
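To get a feel for what the bonded link speeds mean for gradient exchange, here is a rough estimate of one ring all-reduce over the private mesh. This is a sketch under stated assumptions: it uses the textbook ring all-reduce traffic volume of 2(N-1)/N times the payload per node, treats the bonded link as fully usable by the job, and ignores latency, protocol overhead, and overlap with compute. LACP typically balances per flow, so a single stream may see only one member link. The function name and the example model size are illustrative, not measured OpenMetal figures.

```python
def ring_allreduce_seconds(payload_bytes: float, nodes: int, link_gbps: float) -> float:
    """Bandwidth-only lower bound for one ring all-reduce across `nodes` nodes."""
    if nodes < 2:
        return 0.0  # nothing to exchange with a single node
    bytes_per_second = link_gbps * 1e9 / 8
    # Each node sends (and receives) 2 * (N - 1) / N of the payload in a ring all-reduce.
    traffic_per_node = 2 * (nodes - 1) / nodes * payload_bytes
    return traffic_per_node / bytes_per_second


# Illustrative case: 7e9 gradients in 2-byte precision, exchanged across 4 nodes.
gradient_bytes = 7e9 * 2
for gbps in (20, 40):  # 2x 10 Gbps default bond, 4x 10 Gbps optional bond
    seconds = ring_allreduce_seconds(gradient_bytes, nodes=4, link_gbps=gbps)
    print(f"{gbps} Gbps: about {seconds:.1f} s per full gradient exchange")
```

Estimates like this help decide how often to synchronize (for example, by using gradient accumulation) and whether the optional 4x 10 Gbps bond is worth configuring for a given model size.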

How cluster memory works: In a two-card server the GPUs are two discrete accelerators, each with its own memory (not pooled). Across nodes, GPUs communicate over the private mesh rather than a shared GPU-memory fabric — so distributed jobs use data and pipeline parallelism across nodes, and workloads needing tight GPU-to-GPU memory coupling should be sized to the per-server GPU configuration.
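The sizing rule described above can be sketched as a small helper that classifies a job by its memory footprint. The per-GPU memory figures come from the Cluster Composition table; the two-GPU server default, the function name, and the wording of the results are assumptions for illustration. The footprint passed in should already include headroom for activations, optimizer state, or KV cache, which this sketch does not estimate.

```python
import math

# Per-GPU memory in GB, taken from the Cluster Composition table.
GPU_MEMORY_GB = {"RTX PRO 6000": 96, "H200 NVL": 141}


def placement(required_gb: float, gpu: str, gpus_per_server: int = 2) -> str:
    """Suggest a parallelism layout for a job needing `required_gb` of GPU memory."""
    per_gpu = GPU_MEMORY_GB[gpu]
    per_server = per_gpu * gpus_per_server
    if required_gb <= per_gpu:
        return "single GPU; scale out with data parallelism"
    if required_gb <= per_server:
        # Memory is not pooled, so the framework must shard the model explicitly.
        return "shard across the GPUs in one server"
    stages = math.ceil(required_gb / per_server)
    return f"pipeline across {stages} servers"


print(placement(80, "RTX PRO 6000"))
print(placement(150, "RTX PRO 6000"))
print(placement(400, "H200 NVL"))
```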

GPU Sharing and Partitioning

For multi-tenant or multi-workload sharing within a node, NVIDIA offers MIG (Multi-Instance GPU) partitioning and time-slicing, each with different isolation and utilization trade-offs. OpenMetal covers the distinction in detail in the article "MIG vs time-slicing GPU sharing."

Recommended Cluster Workloads

Distributed Training

Multi-node training with PyTorch FSDP, DeepSpeed, or Megatron across an all-H200 or mixed cluster, with the private mesh carrying gradient and parameter traffic. Fixed-cost pricing makes long, high-utilization training runs predictable.
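As a rough illustration of how a multi-node PyTorch job is typically launched, the sketch below builds the `torchrun` command that each node would run, using the c10d rendezvous backend. The private IP address, port, job ID, script name `train.py`, and the `bond0` interface name are all hypothetical values; substitute the ones from your own deployment.

```python
import shlex

# Assumption: the LACP-bonded private interface is named bond0 on every node.
# NCCL_SOCKET_IFNAME tells NCCL which network interface to use.
NCCL_ENV = {"NCCL_SOCKET_IFNAME": "bond0"}


def torchrun_command(nnodes: int, gpus_per_node: int, rdzv_host: str, script: str,
                     rdzv_port: int = 29500, job_id: str = "train-job") -> list[str]:
    """Build the torchrun invocation; the same command is run on every node."""
    return [
        "torchrun",
        f"--nnodes={nnodes}",
        f"--nproc_per_node={gpus_per_node}",
        f"--rdzv_id={job_id}",
        "--rdzv_backend=c10d",
        f"--rdzv_endpoint={rdzv_host}:{rdzv_port}",
        script,
    ]


# Hypothetical 4-node cluster with 2 GPUs per node; 10.0.0.10 is a placeholder private IP.
cmd = torchrun_command(nnodes=4, gpus_per_node=2, rdzv_host="10.0.0.10", script="train.py")
env_prefix = " ".join(f"{key}={value}" for key, value in NCCL_ENV.items())
print(env_prefix, shlex.join(cmd))
```

Pointing the rendezvous endpoint at a private-mesh address keeps the job's coordination and gradient traffic on the unmetered east-west network.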

Large-Scale Inference Serving

Horizontally scaled inference fleets (NVIDIA NIM, vLLM, TensorRT-LLM) across RTX PRO 6000 nodes for cost-efficient, high-throughput serving with consistent per-node performance.

Mixed Training + Serving Platforms

One cluster that trains on H200 nodes and serves on RTX PRO 6000 nodes, keeping both on the same private network and billing model.
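In a mixed cluster, the routing decision could be as simple as the sketch below: jobs that are bandwidth-bound or exceed the RTX PRO 6000's 96GB per GPU go to the H200 pool, and everything else goes to the RTX PRO 6000 pool. The `Job` class, pool names, and the rule itself are illustrative assumptions, not an OpenMetal scheduler or API; a real platform would usually express this through its own scheduler (for example, node labels in Kubernetes or partitions in Slurm).

```python
from dataclasses import dataclass

RTX_PRO_6000_GB = 96  # per-GPU memory from the Cluster Composition table
H200_GB = 141


@dataclass(frozen=True)
class Job:
    name: str
    per_gpu_memory_gb: float  # footprint each GPU must hold after any sharding
    bandwidth_bound: bool = False


def choose_pool(job: Job) -> str:
    """Pick a node pool for a job in a mixed RTX PRO 6000 + H200 cluster."""
    if job.per_gpu_memory_gb > H200_GB:
        raise ValueError(f"{job.name}: does not fit one GPU; shard the job first")
    if job.bandwidth_bound or job.per_gpu_memory_gb > RTX_PRO_6000_GB:
        return "h200"
    return "rtx-pro-6000"


jobs = [
    Job("chat-serving", per_gpu_memory_gb=40),
    Job("large-finetune", per_gpu_memory_gb=120),
    Job("long-context-decode", per_gpu_memory_gb=70, bandwidth_bound=True),
]
for job in jobs:
    print(job.name, "->", choose_pool(job))
```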

HPC and Scientific Computing

Multi-node MPI workloads (computational chemistry, CFD, genomics) using the dual-Xeon hosts and private mesh.

Security and Compliance

Every cluster node is single-tenant bare metal, with physical isolation and no shared hypervisor in front of the accelerator. Intel SGX is available on the host CPUs. On the RTX PRO 6000, Intel TDX with single-GPU confidential passthrough (one GPU per VM) is substantiated by NVIDIA’s confidential-computing documentation, with OpenMetal validation pending. On the H200, the same configuration sits outside NVIDIA’s documented validated pairing and requires OpenMetal validation. These confidential-computing configurations are delivered as an engineered build, not a self-serve toggle. Clusters are deployed in the HIPAA-compliant Ashburn (NTT DATA VA1) facility; OpenMetal offers BAAs at the organizational level.

“OpenMetal Cloud provides on-demand private infrastructure, which brings cloud fundamentals like elasticity and usage billing to the cloud deployment itself. It’s awesome to see OpenMetal’s latest product use OpenStack to combine the benefits of public cloud and managed private cloud, powered by open infrastructure.”

Thierry Carrez, VP of Engineering, Open Infrastructure Foundation

Deployment Options

Dedicated GPU cluster — multiple GPU nodes on a private mesh, built to order.
Attached to existing infrastructure — extend an existing OpenMetal Hosted Private Cloud or bare metal footprint with GPU nodes. See adding GPU servers to an existing OpenMetal deployment.

Where to deploy

GPU clusters are available now in Ashburn, Virginia (US-East), with advance reservations for Los Angeles, Amsterdam, and Singapore. Proof of Concept clusters are available for testing; ramp pricing is available for migrations.

Get a GPU Cluster Quote

Ready to deploy? Tell us about your distributed training or large-scale inference workload and we’ll design a dedicated GPU cluster.

All-RTX PRO 6000: cost-efficient training and high-throughput inference fleets
All-H200: largest-memory, bandwidth-bound training and inference
Mixed: route workloads to the right GPU on one private mesh

All deployments include fixed monthly pricing, included egress, a 99.96%+ network SLA, and DDoS protection.
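For planning purposes, an availability percentage can be converted into a downtime budget. The sketch below assumes the SLA is measured over a 30-day window, which is an assumption for illustration; the actual measurement terms are defined in the SLA itself.

```python
def allowed_downtime_minutes(sla_percent: float, days: int = 30) -> float:
    """Maximum downtime in minutes that still meets `sla_percent` over `days` days."""
    total_minutes = days * 24 * 60
    return total_minutes * (100.0 - sla_percent) / 100.0


# 99.96% over an assumed 30-day window
print(f"{allowed_downtime_minutes(99.96):.2f} minutes")
```

At 99.96%, the budget works out to roughly 17 minutes of network downtime in a 30-day month.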

Product specifications, pricing, and availability may change due to market conditions and other factors. For the most current information, please contact the OpenMetal team directly.
