An OpenMetal GPU cluster is a set of dedicated, single-tenant GPU servers on a private high-speed network, built to order for distributed AI training and large-scale inference. Clusters can be all-RP6000 (cost-efficient training and inference), all-H200 (largest-memory, bandwidth-bound work), or mixed to match different workloads to the right GPU. Every node is bare metal with full root access, fixed monthly pricing, and included egress — no per-GPU-hour metering across the fleet.

Key Takeaways

  • Single-tenant, dedicated — every GPU in the cluster is yours, with full PCIe bandwidth and no hypervisor overhead or shared-tenancy contention.
  • Same or mixed GPUs — build an all-RP6000 inference fleet, an all-H200 training cluster, or a mixed cluster that routes big-model and bandwidth-bound jobs to H200 nodes and cost-efficient inference/training to RP6000 nodes.
  • Private 40 Gbps mesh — nodes connect over LACP-bonded private networking for distributed training, parameter exchange, and pulling datasets from OpenMetal storage nodes; east-west traffic is not metered.
  • Fixed monthly pricing across the fleet — no per-GPU-hour “idle silicon tax.” Sustained, high-utilization training runs the whole cluster hot at no incremental cost.
  • Built to order — cluster size, GPU mix, RAM (up to 2TB/node), and storage are configured to the workload.

Ready to Build a GPU Cluster?

Tell us your cluster size, GPU mix, and workload, and we’ll design a dedicated GPU cluster — all-RP6000, all-H200, or mixed.

Get a GPU Cluster Quote   Schedule a Consultation

Cluster Composition

Option | GPUs | Best for
All-RP6000 cluster | NVIDIA RTX Pro 6000 (96GB GDDR7, FP4) | Cost-efficient distributed training, fine-tuning, and high-throughput inference serving fleets
All-H200 cluster | NVIDIA H200 NVL (141GB HBM3e) | Largest models and bandwidth-bound large-scale training/inference
Mixed cluster | RP6000 + H200 | Route big-model/bandwidth work to H200 nodes, cost-efficient inference/training to RP6000 nodes

[Figure: OpenMetal GPU cluster architecture diagram]

[Figure: OpenMetal GPU cluster component architecture]

Each node is a single-tenant bare metal server with dual Intel Xeon 6530P (64C/128T), 1TB DDR5-6400 (upgradeable to 2TB), and a 6.4TB Micron 7500 MAX NVMe data drive. See the RP6000 and H200 spec pages.

Cluster Networking

Cluster nodes connect over a private 40 Gbps mesh (4x 10 Gbps LACP-bonded per node), carrying distributed-training gradients, parameter exchange, and dataset traffic from OpenMetal storage nodes. East-west traffic between your nodes is not metered.
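To get a feel for what the 40 Gbps figure means for gradient traffic, here is a rough, bandwidth-only estimate of one ring all-reduce. It is a sketch under stated assumptions, not a benchmark: the function name and the `efficiency` parameter are hypothetical, latency and compute overlap are ignored, and the ring algorithm is assumed only for illustration. LACP bonds generally balance traffic per flow, so a single stream may see closer to one 10 Gbps member link; lower `efficiency` to model that.

```python
def ring_allreduce_seconds(payload_bytes: float, nodes: int,
                           link_gbps: float = 40.0,
                           efficiency: float = 1.0) -> float:
    """Bandwidth-only lower bound for one ring all-reduce across `nodes` nodes.

    Assumptions (not from the product page): a ring algorithm, one
    participant per node, and `efficiency` as the usable fraction of the link.
    """
    if nodes < 2:
        return 0.0  # nothing to exchange on a single node
    bytes_per_second = link_gbps * 1e9 / 8 * efficiency
    # In a ring all-reduce each node sends 2 * (N - 1) / N of the payload.
    traffic_bytes = 2 * (nodes - 1) / nodes * payload_bytes
    return traffic_bytes / bytes_per_second


# Example: 7 billion bf16 gradients (2 bytes each) synchronized across 4 nodes.
gradient_bytes = 7e9 * 2
print(f"{ring_allreduce_seconds(gradient_bytes, nodes=4):.1f} s per full sync")
```

At full link rate this works out to roughly 4.2 seconds per full gradient synchronization, which is why sharding strategy, gradient accumulation, and communication overlap matter when planning multi-node jobs on this network.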

How cluster memory works: GPU-memory pooling is available between GPUs within the same server (1–2 cards). Across nodes, GPUs communicate over the private 40 Gbps mesh rather than a shared GPU-memory fabric — so distributed jobs use data and pipeline parallelism across nodes, and workloads needing tight GPU-to-GPU memory coupling should be sized to the per-server GPU configuration.
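The per-server sizing rule above can be sketched as a quick check. This is a minimal illustration, assuming the per-card memory figures from the composition table; the function name and the 10% `headroom` reserve are hypothetical, and the check counts model weights only (KV cache, activations, and optimizer state need additional room).

```python
# Per-card GPU memory in GB, taken from the cluster composition table.
GPU_MEMORY_GB = {"RP6000": 96, "H200": 141}


def fits_in_one_server(model_gb: float, gpu: str, cards: int = 2,
                       headroom: float = 0.9) -> bool:
    """Return True if `model_gb` fits in the pooled memory of one server.

    `headroom` is an assumed usable fraction, not a vendor figure.
    """
    if cards not in (1, 2):
        raise ValueError("servers are described as carrying 1 or 2 cards")
    usable_gb = GPU_MEMORY_GB[gpu] * cards * headroom
    return model_gb <= usable_gb


# A 70B-parameter model in bf16 is about 140 GB of weights.
weights_gb = 70e9 * 2 / 1e9
for gpu, cards in [("RP6000", 2), ("H200", 1), ("H200", 2)]:
    print(gpu, cards, fits_in_one_server(weights_gb, gpu, cards))
```

A model that fails this check on every per-server configuration has to be split across nodes, where the exchange runs over the 40 Gbps mesh rather than within a server.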

GPU Sharing and Partitioning

For multi-tenant or multi-workload sharing within a node, NVIDIA offers MIG (Multi-Instance GPU) partitioning and time-slicing — each with different isolation and utilization trade-offs. OpenMetal covers the distinction in detail in MIG vs time-slicing GPU sharing.

Recommended Cluster Workloads

Distributed Training

Multi-node training with PyTorch FSDP, DeepSpeed, or Megatron across an all-H200 or mixed cluster, with the 40 Gbps private mesh carrying gradient and parameter traffic. Fixed-cost pricing makes long, high-utilization training runs predictable.
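As a planning aid for multi-node launches, the sketch below enumerates how worker processes would map onto nodes and GPUs. It mirrors the common convention behind the `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` environment variables used by PyTorch launchers, with one process per GPU and node ranks assigned in order; the function name is hypothetical, and launchers that assign node ranks dynamically may order nodes differently.

```python
def rank_layout(nodes: int, gpus_per_node: int) -> list[dict]:
    """List one entry per worker process, assuming one process per GPU."""
    world_size = nodes * gpus_per_node
    return [
        {
            "node_rank": node,
            "local_rank": local,
            # Global rank: workers on a node are numbered contiguously.
            "rank": node * gpus_per_node + local,
            "world_size": world_size,
        }
        for node in range(nodes)
        for local in range(gpus_per_node)
    ]


# Example: four nodes with two GPUs each gives eight workers.
for worker in rank_layout(nodes=4, gpus_per_node=2):
    print(worker)
```

With two cards per server, ranks 0 and 1 share a node and can exchange data locally, while any exchange between rank 1 and rank 2 crosses the private mesh.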

Large-Scale Inference Serving

Horizontally scaled inference fleets (NVIDIA NIM, vLLM, TensorRT-LLM) across RP6000 nodes for cost-efficient, high-throughput serving with consistent per-node performance.

Mixed Training + Serving Platforms

One cluster that trains on H200 nodes and serves on RP6000 nodes, keeping both on the same private network and billing model.
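The routing idea behind a mixed cluster can be sketched as a simple placement rule. This is an illustrative policy only: the `Job` class, the pool names, and the threshold (the combined memory of a two-card RP6000 server) are assumptions, and a real scheduler would also weigh queue depth, throughput targets, and cost.

```python
from dataclasses import dataclass

# Assumed threshold: combined GPU memory of a two-card RP6000 server (2 x 96 GB).
RP6000_SERVER_GB = 96 * 2


@dataclass
class Job:
    name: str
    memory_gb: float            # estimated GPU memory footprint
    bandwidth_bound: bool = False


def choose_pool(job: Job) -> str:
    """Send big-model or bandwidth-bound jobs to H200 nodes, the rest to RP6000."""
    if job.bandwidth_bound or job.memory_gb > RP6000_SERVER_GB:
        return "h200"
    return "rp6000"


jobs = [
    Job("chat-serving", memory_gb=40),
    Job("large-pretrain", memory_gb=250),
    Job("long-context-decode", memory_gb=80, bandwidth_bound=True),
]
for job in jobs:
    print(job.name, "->", choose_pool(job))
```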

HPC and Scientific Computing

Multi-node MPI workloads (computational chemistry, CFD, genomics) using the dual-Xeon hosts and private mesh.

Security and Compliance

Every cluster node is single-tenant bare metal: the hardware is physically isolated, and no hypervisor sits between your workload and the GPUs. Intel SGX is available on the host CPUs. Intel TDX and GPU passthrough cannot be combined in a single trust boundary. Clusters are deployed in the HIPAA-compliant Ashburn (NTT DATA VA1) facility; OpenMetal offers Business Associate Agreements (BAAs) at the organizational level.

“OpenMetal Cloud provides on-demand private infrastructure, which brings cloud fundamentals like elasticity and usage billing to the cloud deployment itself. It’s awesome to see OpenMetal’s latest product use OpenStack to combine the benefits of public cloud and managed private cloud, powered by open infrastructure.”

Thierry Carrez, VP of Engineering, Open Infrastructure Foundation

Deployment Options

  • Dedicated GPU cluster — multiple GPU nodes on a private 40 Gbps mesh, built to order.
  • Attached to existing infrastructure — extend an existing OpenMetal Hosted Private Cloud or bare metal footprint with GPU nodes. See adding GPU servers to an existing OpenMetal deployment.

Where to deploy

GPU clusters are available now in Ashburn, Virginia (US-East), with advance reservations for Los Angeles, Amsterdam, and Singapore. Proof of Concept clusters are available for testing; ramp pricing is available for migrations.

Get a GPU Cluster Quote

Ready to deploy? Tell us about your distributed training or large-scale inference workload and we’ll design a dedicated GPU cluster.

  • All-RP6000: cost-efficient training and high-throughput inference fleets
  • All-H200: largest-memory, bandwidth-bound training and inference
  • Mixed: route workloads to the right GPU on one private mesh

All deployments include fixed monthly pricing, included egress, a 99.96%+ network SLA, and DDoS protection.


Product specifications, pricing, and availability may change due to market conditions and other factors. For the most current information, please contact the OpenMetal team directly.