Hosting a Powerful ClickHouse Deployment with a Mix of Bare Metal, OpenStack, and Ceph

This is the real-world architecture of a high-performance ClickHouse cluster, showcasing a combination of bare metal servers, an OpenStack private cloud, and Ceph object storage.

The challenge? Ingesting and analyzing huge streams of “hot” real-time security event data while controlling the costs of an ever-growing historical set of “cool”, but critical, data.

The Architecture: A Hybrid Approach

Cybersecurity ClickHouse Cluster Diagram

The solution is built on a hybrid architecture that combines the strengths of bare metal servers, large-scale S3-compatible object storage, and OpenStack-powered private cloud infrastructure. This allowed the architect of this ClickHouse cluster to leverage ClickHouse’s native tiered storage: the ultra-high I/O of the bare metal NVMe serves hot data, while ClickHouse connects directly to the S3-compatible object gateway, on the same VLANs, for cool data. Let’s explore this more below.
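
As a rough illustration of that hot/cool split, here is a minimal sketch using the clickhouse-connect Python client. It assumes a storage policy named hot_to_s3 has already been defined in the ClickHouse server configuration, with a local NVMe “hot” volume and an S3-backed “cool” volume pointing at the Ceph gateway; the table, columns, and 30-day TTL are illustrative, not the customer’s actual schema.

```python
# Minimal sketch of ClickHouse's native hot/cool tiering, issued through the
# clickhouse-connect Python client. Assumes a storage policy named
# 'hot_to_s3' already exists in the server config with a local NVMe 'hot'
# volume and an S3-backed 'cool' volume pointing at the Ceph RGW endpoint.
# Table name, columns, and the 30-day TTL are illustrative only.
import clickhouse_connect

client = clickhouse_connect.get_client(host='clickhouse-01', username='default', password='...')

client.command("""
CREATE TABLE IF NOT EXISTS security_events
(
    event_time DateTime,
    source_ip  IPv4,
    event_type LowCardinality(String),
    payload    String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(event_time)
ORDER BY (event_type, event_time)
TTL event_time + INTERVAL 30 DAY TO VOLUME 'cool'  -- older parts roll to S3
SETTINGS storage_policy = 'hot_to_s3'
""")

# Watch parts migrate off the NVMe tier as they age.
print(client.query("""
SELECT disk_name, count() AS parts, formatReadableSize(sum(bytes_on_disk)) AS size
FROM system.parts
WHERE table = 'security_events' AND active
GROUP BY disk_name
""").result_rows)
```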

Cluster 1: Bare Metal ClickHouse Cluster

The servers are interconnected with a 20Gbps private network for fast data exchange and replication within the ClickHouse cluster. Optionally, this can be increased to 40Gbps.

Each of the six “XL v4” servers contains:

  • CPU: 2x Intel Xeon Gold 6530 (64 cores, 128 threads, 2.1/4.0 GHz)
  • Memory: 1024GB DDR5 4800MHz
  • OS Drives: 2x 960GB Micron Pro, RAID 1
  • Working Storage: 7x 6.4TB high-performance Micron 7450 MAX NVMe – see specs
  • Operating System: Ubuntu 24.04

This cluster yields almost 6TB of usable RAM, 768 threads, and roughly 268TB of ultra-high I/O NVMe.
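
A quick back-of-the-envelope check of those totals:

```python
# Back-of-the-envelope totals for the six "XL v4" servers.
servers = 6
print("RAM:    ", servers * 1024, "GB")               # 6,144 GB, roughly 6TB usable
print("Threads:", servers * 128)                      # 2x Xeon Gold 6530 per server = 768
print("NVMe:   ", round(servers * 7 * 6.4, 1), "TB")  # 7x 6.4TB drives per server = 268.8TB
```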

Cluster 1: The Engine – Bare Metal for Raw Power

The heavy lifting of data processing is handled by a cluster of six bare metal servers. These servers hold both the Kafka ingestion system and ClickHouse. The local “working drives” are directly managed by ClickHouse as the “hot” layer.
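
We can’t show the customer’s pipeline, but the common ClickHouse pattern for this kind of ingestion is a Kafka-engine table that consumes the topic plus a materialized view that writes into the MergeTree target. The sketch below reuses the security_events table from the earlier example; the broker addresses, topic, and consumer group are placeholders.

```python
# Illustrative Kafka-to-ClickHouse wiring: a Kafka-engine table consumes the
# topic and a materialized view writes rows into the MergeTree target
# (security_events from the earlier sketch). Brokers, topic, and consumer
# group are placeholders, not the customer's actual pipeline.
import clickhouse_connect

client = clickhouse_connect.get_client(host='clickhouse-01', username='default', password='...')

client.command("""
CREATE TABLE IF NOT EXISTS security_events_queue
(
    event_time DateTime,
    source_ip  IPv4,
    event_type String,
    payload    String
)
ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka-01:9092,kafka-02:9092',
         kafka_topic_list = 'security-events',
         kafka_group_name = 'clickhouse-ingest',
         kafka_format = 'JSONEachRow'
""")

client.command("""
CREATE MATERIALIZED VIEW IF NOT EXISTS security_events_mv
TO security_events AS
SELECT event_time, source_ip, event_type, payload
FROM security_events_queue
""")
```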

The configuration of sharding and replication is part of this customer’s secret sauce, so we won’t cover it specifically; instead we offer our general guidance. What we can provide for customers is direct access to our big data database engineers and introductions to other architects, including the CTO who designed this deployment. We hope you consider joining OpenMetal.

The guidance below is based on our experience and on recommendations from the ClickHouse team – special thanks and credit to this video, which is a great introduction to replication and sharding. Get your favorite note-taking method ready and look out for how to easily confirm you have configured it correctly.

With six servers, the following is a simple possible setup (a verification sketch follows the list):

  • Host 1 and Host 2 each carry a replica of Shard A
  • Host 3 and Host 4 each carry a replica of Shard B
  • Host 5 and Host 6 each carry a replica of Shard C
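
That topology is normally declared under remote_servers in the ClickHouse server configuration. An easy way to see you have configured it correctly is to query system.clusters; the sketch below assumes the cluster was named events_cluster, which is a placeholder.

```python
# Sanity-check the shard/replica layout. Assumes the topology above was
# declared under <remote_servers> as a cluster named 'events_cluster'
# (the name is a placeholder for this sketch).
import clickhouse_connect

client = clickhouse_connect.get_client(host='clickhouse-01', username='default', password='...')

rows = client.query("""
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
WHERE cluster = 'events_cluster'
ORDER BY shard_num, replica_num
""").result_rows

for cluster, shard, replica, host in rows:
    print(f"{cluster}: shard {shard}, replica {replica} -> {host}")

# Configured correctly, this prints shards 1-3, each with replicas 1 and 2
# on different hosts.
```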


Cluster 2: The Cool Vault – Ceph for Cost-Effective Object Storage

For long-term data retention and cost-effective storage of “cool” data, the solution incorporates a Ceph-based object storage cluster. The OpenMetal Storage Cluster offers out-of-the-box compatibility with ClickHouse. In addition, specific tuning, including the choice of erasure coding redundancy, is supported to align with your performance-versus-budget requirements.
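
To give a feel for that out-of-the-box compatibility, ClickHouse can query the gateway directly with its built-in s3 table function. The endpoint, bucket, credentials, and file layout in this sketch are placeholders.

```python
# ClickHouse can read from the Ceph RGW directly via its built-in s3 table
# function. Endpoint, bucket, credentials, and file layout are placeholders.
import clickhouse_connect

client = clickhouse_connect.get_client(host='clickhouse-01', username='default', password='...')

result = client.query("""
SELECT count()
FROM s3(
    'https://rgw.example.internal/archive-bucket/events/*.parquet',
    'ACCESS_KEY', 'SECRET_KEY',
    'Parquet'
)
""")
print(result.result_rows[0][0])
```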

This cluster consists of three “Storage Large v3 18T” servers:

  • CPU: 2x Intel Xeon Silver 4314 (32 cores, 64 threads, 2.4/3.4 GHz)
  • Memory: 256GB DDR4 3200MHz
  • Storage: 12x 18TB HDD (216TB total) + 4x 4TB NVMe SSD (16TB total)

It can provide up to:

  • 330 TiB available (at 85% hardware utilization) at 2/1 erasure coding (66.67% efficient)
  • 275 TiB available (at 85% hardware utilization) at Replica 2 (50% efficient)
  • 184 TiB available (at 85% hardware utilization) at Replica 3 (33.33% efficient)

Larger deployments are available and offer different levels of efficiency. Check our erasure coding guide and our base (and configurable) options for 330 TiB, 1.325 PiB, and 2.81 PiB on our object storage pricing page.

Cluster 2: Ceph Object Storage Cluster

Ceph provides an S3-compatible object storage gateway called RGW (RADOS Gateway). The gateway is a horizontally scaling service that is independent of the storage layer, which makes it easy to add Storage Large v3 servers to grow cluster capacity. If an additional RGW instance is needed beyond the default two, it is added and assigned to one of the new servers.
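
Because RGW exposes a standard S3 API, ordinary S3 tooling works against it unchanged. A minimal boto3 sketch, with a placeholder endpoint and credentials:

```python
# RGW speaks the standard S3 API, so ordinary S3 tooling works unchanged.
# Endpoint and credentials are placeholders.
import boto3

s3 = boto3.client(
    's3',
    endpoint_url='https://rgw.example.internal',  # Ceph RGW on the private VLAN
    aws_access_key_id='ACCESS_KEY',
    aws_secret_access_key='SECRET_KEY',
)

s3.create_bucket(Bucket='clickhouse-cool-data')
for bucket in s3.list_buckets()['Buckets']:
    print(bucket['Name'])
```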

The servers are also connected with a 20Gbps uplink to support the replication needed between cluster members. Optionally, this can be increased to 40Gbps.

Cluster 3: OpenStack Private Cloud

NOTE: We recommend running three ZooKeeper instances, which is an appropriate design for an HA ensemble. We also recommend using “anti-affinity” in your OpenStack cloud to force the VMs onto separate hardware nodes – a level of placement control typically not available on public clouds and a benefit of running a private cloud.
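
Here is a hedged sketch of what that looks like with the openstacksdk Python client. The cloud name, flavor, image, and network are placeholders, and exact parameter names can vary with SDK version and Nova microversion, so treat this as an outline rather than a drop-in script.

```python
# Sketch only: create an anti-affinity server group so the three ZooKeeper
# VMs land on different hypervisors, then boot the VMs with the group as a
# scheduler hint. Cloud name, flavor, image, and network are placeholders.
import openstack

conn = openstack.connect(cloud='openmetal-private-cloud')

group = conn.compute.create_server_group(
    name='zookeeper-anti-affinity',
    policies=['anti-affinity'],  # 'soft-anti-affinity' is the best-effort variant
)

for i in range(1, 4):
    conn.compute.create_server(
        name=f'zookeeper-{i}',
        flavor_id=conn.compute.find_flavor('m1.large').id,
        image_id=conn.compute.find_image('ubuntu-24.04').id,
        networks=[{'uuid': conn.network.find_network('private-vlan').id}],
        scheduler_hints={'group': group.id},
    )
```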

Cluster 3: The Foundation – OpenStack for Management and Coordination

At the core of the deployment is a small (but mighty!) OpenStack private cloud. Our “Small Cloud Core” is built on three hyperconverged servers. Each server is equipped with:

  • CPU: 1x Intel Xeon D-2141I (8 cores, 16 threads, 2.20/3.00 GHz)
  • Memory: 128GB DDR4 2933MHz
  • Storage: 3.2TB NVMe SSD

This OpenStack cluster (with its own separate Ceph) hosts a set of VMs running the HA ZooKeeper service that supports the bare metal ClickHouse cluster. This is a general-purpose private cloud, so it carries many other workloads, and it was efficient to simply keep ZooKeeper here.
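
A simple way to confirm the ensemble is healthy from one of the bare metal ClickHouse nodes is the kazoo Python client (the hostnames below are placeholders); on the ClickHouse side, querying the system.zookeeper table is another quick check that the server itself can reach ZooKeeper.

```python
# Quick health check of the ZooKeeper ensemble using the kazoo client.
# Hostnames are placeholders for the three VMs on the OpenStack cloud.
from kazoo.client import KazooClient

zk = KazooClient(hosts='zookeeper-1:2181,zookeeper-2:2181,zookeeper-3:2181')
zk.start(timeout=10)

# ClickHouse keeps replication metadata under its configured root (often
# /clickhouse); listing the root confirms the ensemble is reachable.
print(zk.get_children('/'))
zk.stop()
```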

It is connected to the bare metal ClickHouse servers in Cluster 1 through a private VLAN for secure communication at 20Gbps server-to-server over 100Gbps core switching.

Why This Architecture Works

This three-tiered architecture is designed for ideal ClickHouse performance and scalability:

  • Bare Metal for ClickHouse: Running ClickHouse directly on bare metal removes the overhead of virtualization, so ClickHouse can fully tap into all available hardware resources. This is important for achieving the extremely low latency required for real-time security analysis.
  • OpenStack for Control and Flexibility: The OpenStack private cloud provides a flexible and manageable environment for supporting services like ZooKeeper. It also hosts additional services like load balancers and supporting applications, all while remaining connected to the bare metal servers.
  • Ceph for Cost-Effective Scalability: Ceph’s object storage provides a scalable and cost-effective way to store large volumes of historical data. This lets the cybersecurity firm meet compliance requirements and perform long-term trend analysis.
  • Hybrid Network: Merging VLANs allows the virtual servers to talk directly to the bare metal servers for high-bandwidth, low-latency communication.

Real-World Results


This deployment is not a theoretical exercise! It’s a live production system currently supporting a major cybersecurity firm. With this hybrid approach of bare metal, OpenStack, and Ceph, OpenMetal is a great fit to power demanding big data solutions like ClickHouse. 

This architecture delivers a powerful platform for real-time analytics, providing insights for the client’s foundational security operations. Being able to mix, match, and connect the right infrastructure for each area ensures that our customer can easily deploy this powerful solution while keeping costs relatively low and performance high.

Interested in ClickHouse on OpenMetal but not sure where to start? Check out our quick start installation guide for ClickHouse on OpenMetal.

Does This Resonate With Your Business Needs?

Contact our cloud team to find out how OpenMetal can support your company’s goals and become your partner in success.