
Evaluating landing zones for your VMware workloads?
With OpenMetal, you get a Hosted Private Cloud built on Ceph and OpenStack, plus the predictable cost model that makes migration planning feasible.
Moving workloads from VMware to OpenStack isn’t primarily a compute challenge—it’s a storage challenge. Your VMs can be re-instantiated quickly. Your networking can be reconfigured in an afternoon. But your storage layer—your persistent data, your stateful workloads, your multi-terabyte databases—that’s where migrations stall, fail, or drag on for months.
If you’re a storage architect or platform engineer tasked with migrating off VMware ESXi, vSAN, or VMFS to an OpenStack environment backed by Ceph, you’re facing a fundamentally different storage architecture. This isn’t a lift-and-shift. It’s a re-architecture of how block storage, shared filesystems, and object storage are provisioned, accessed, and managed. This guide walks through the migration methods, tooling, validation steps, and pitfalls that matter when you’re moving production storage workloads—not lab environments.
VMware Storage Model vs OpenStack + Ceph Storage Model
VMware’s storage stack is tightly integrated with ESXi’s hypervisor layer. Whether you’re using local VMFS datastores, shared NFS mounts, or vSAN’s distributed object store, the storage abstraction lives inside VMware’s control plane. Virtual disks are VMDK files. Storage policies are enforced by vCenter. Snapshots, clones, and thin provisioning are all managed through VMware’s APIs.
OpenStack with Ceph operates differently. Ceph is a software-defined storage system that provides block storage (RBD), shared filesystems (CephFS), and object storage (RADOS Gateway) through a unified cluster. OpenStack’s Cinder (block), Manila (file shares), and Swift/S3 (object) services interface with Ceph, but Ceph itself is hypervisor-agnostic. Virtual disks are stored as RADOS Block Device (RBD) images, not VMDK files. Snapshots are COW (copy-on-write) operations at the Ceph layer. Storage policies are defined in CRUSH maps and Ceph pools, not vCenter.
| Aspect | VMware (ESXi/vSAN/VMFS) | OpenStack + Ceph (RBD/CephFS/Object) |
| Virtual disk format | VMDK (monolithic or split) | RBD image (object-striped across OSDs) |
| Storage provisioning | vCenter datastores | Cinder volumes backed by Ceph pools |
| Shared file storage | NFS/vSAN file services | CephFS mounted via kernel or FUSE |
| Object storage | vSAN object store (limited) | RADOS Gateway (S3/Swift-compatible) |
| Snapshot mechanism | VMDK delta files | Ceph COW snapshots at RBD layer |
| Thin provisioning | VMDK thin disks | RBD thin provisioning (default) |
| Replication/HA | vSAN erasure coding or mirroring | Ceph replica pools or erasure coding |
| CLI tooling | vmkfstools, esxcli | rbd, ceph, rados |
Ceph is not VMFS with different branding. It’s a distributed object store that exposes block and file interfaces on top of RADOS. You’ll need to adjust your mental model for how storage is allocated, how data is replicated, and how failure domains are defined. VMware admins expect storage to be “attached” to a cluster. Ceph storage is distributed across nodes, and failure domains are defined by CRUSH topologies—not vSphere clusters.
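To make that mental model concrete, here is a minimal sketch of how a replicated RBD pool and its failure domain are defined on the Ceph side. The pool name openstack-volumes matches the examples later in this guide; the host-level failure domain, PG count, and three-way replication are assumptions you would tune to your cluster size and CRUSH topology:
# Create a CRUSH rule that places each replica on a different host
ceph osd crush rule create-replicated replicated-by-host default host
# Create a 3x replicated pool that uses the rule, then prepare it for RBD
ceph osd pool create openstack-volumes 128 128 replicated replicated-by-host
ceph osd pool set openstack-volumes size 3
ceph osd pool application enable openstack-volumes rbd
rbd pool init openstack-volumes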
Migration Method Options
You have four primary approaches when migrating storage from VMware to OpenStack. Each has different downtime requirements, tooling complexity, and risk profiles.
- Cold migration involves shutting down the VM, exporting the VMDK, converting it to a raw or qcow2 image, and importing it into Ceph as an RBD volume. This is the simplest method, but it requires full downtime for the workload. Acceptable for dev/test environments or workloads with scheduled maintenance windows.
- Live migration uses tools like virt-v2v or commercial platforms (Hystax, Trilio) to sync block-level changes while the VM remains running in VMware. A cutover window is still required, but it’s measured in minutes rather than hours. This method requires network bandwidth, intermediate storage, and careful handling of I/O consistency.
- Block streaming involves attaching the source VMDK as a backing file to the destination RBD volume and streaming blocks on-demand as they’re accessed. This minimizes initial downtime but can cause performance degradation during the migration window. Rarely used in production due to complexity.
- Rebuild means standing up a new VM in OpenStack, installing the OS, and migrating application data separately (rsync, database replication, object sync). This is the cleanest method for stateless workloads or when you’re modernizing the stack during migration. It’s also the most time-consuming.
| Method | Downtime | Complexity | Best For |
| Cold migration | Hours to days | Low | Non-critical workloads, scheduled maintenance windows |
| Live migration | Minutes | Medium | Production databases, stateful apps |
| Block streaming | Seconds (initial) | High | Experimental; rarely used |
| Rebuild | Variable | Medium | Stateless apps, modernization efforts |
- Choose cold migration when downtime is acceptable and tooling simplicity matters.
- Choose live migration when uptime SLAs are strict and you have the bandwidth to sync deltas.
- Choose rebuild when you’re refactoring the application stack or when the workload doesn’t justify VMDK conversion.
Tools
The tooling landscape for VMware-to-OpenStack storage migration ranges from open-source CLI utilities to commercial platforms. Your choice depends on scale, automation requirements, and tolerance for manual intervention.
- qemu-img is the workhorse for VMDK-to-raw or VMDK-to-qcow2 conversion. It’s free, well-documented, and handles most disk formats. It doesn’t migrate metadata (VM config, network settings), so you’ll need to recreate those in OpenStack manually or via scripting.
- virt-v2v (part of libguestfs) automates more of the process. It converts VMDKs, injects virtio drivers (critical for performance on KVM), and can push images directly to OpenStack via Glance. It’s purpose-built for VMware-to-KVM migrations, but it requires access to the VMware API or exported OVF files.
- rbd is Ceph’s native block device CLI. You’ll use it to import raw disk images into Ceph pools, create snapshots, clone volumes, and manage RBD mappings. It’s fast, but you need to ensure your disk images are in raw format before importing. On modern Ceph deployments managed by cephadm (standard on OpenMetal v3.0.0+ environments), Ceph services run in Docker containers. You can execute rbd commands either from the host if ceph-common is installed, or inside the containerized environment with cephadm shell -- rbd <command>.
- ovftool (VMware’s OVF export utility) packages VMDKs and VM metadata into OVF/OVA archives. Useful when you need to export VMs from vCenter in a structured format before conversion. It doesn’t handle the Ceph import step—just the export from VMware.
- Hystax, Trilio, Storware are commercial migration and disaster recovery platforms. They offer live migration capabilities, automated cutover, and delta sync. They’re expensive, but they reduce manual labor for large-scale migrations (50+ VMs). Hystax specifically supports VMware-to-OpenStack workflows.
| Tool | Use Case | License | Live Migration? | Ceph Integration |
| qemu-img | VMDK conversion | Open source | No | Manual rbd import |
| virt-v2v | Automated V2V conversion | Open source | No | Via Glance/Cinder |
| rbd | Ceph block device mgmt | Open source | No | Native |
| ovftool | VMware VM export | Free (VMware) | No | None |
| Hystax | Enterprise migration | Commercial | Yes | Via OpenStack APIs |
| Trilio | Backup and migration | Commercial | Yes | Native Ceph support |
| Storware | Backup and DR | Commercial | Yes | Ceph plugin available |
- For small migrations (under 20 VMs), stick with qemu-img and rbd.
- For mid-size migrations (20–100 VMs), virt-v2v will save time; see the sketch after this list.
- For large migrations (100+ VMs) or when you need live cutover, evaluate Hystax or Trilio. Don’t assume a commercial tool will solve architectural mismatches—they won’t convert VMFS-specific features (like Storage DRS policies) into Ceph equivalents.
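If virt-v2v is the right fit, the invocation below is a minimal sketch of a cold conversion from vCenter into local raw disks with virtio drivers injected; the resulting raw files can then be imported into Ceph exactly as shown in the next section. The vCenter hostname, inventory path, guest name, and password file are placeholders, and the vpx:// path in particular depends on your vCenter inventory layout:
# Cold-convert a powered-off VM from vCenter into raw disks under /mnt/staging
virt-v2v \
  -ic 'vpx://vmware-admin@vcenter.example.com/Datacenter/cluster01/esxi01?no_verify=1' \
  -ip /root/vcenter-password-file \
  "production-db01" \
  -o local -os /mnt/staging -of raw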
Example Command Workflows
Here’s a typical cold migration workflow using open-source tools. This assumes you’ve already exported the VM from VMware and have SSH access to a machine with Ceph client tools installed.
Note for OpenMetal v3.0.0+ deployments: Ceph services run in containers managed by cephadm. Execute rbd commands via cephadm shell -- <command> or install ceph-common on the host for direct CLI access. The examples below show direct CLI usage for clarity; prefix them with cephadm shell -- if you’re working in a containerized environment.
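For example, on a containerized deployment the verification step later in this workflow would look like this (a sketch assuming the openstack-volumes pool name used throughout):
# Run a one-off rbd command inside the cephadm shell container
cephadm shell -- rbd ls openstack-volumes
# Or install the client tools on the host for direct access (Debian/Ubuntu shown; use dnf on RHEL-family hosts)
sudo apt install ceph-common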
Export VM from VMware with ovftool
ovftool vi://vcenter.example.com/Datacenter/vm/production-db01 \
  /mnt/staging/production-db01.ova
This exports the VM as an OVA file. Extract the VMDK from the OVA:
tar -xvf /mnt/staging/production-db01.ova
Convert VMDK to raw format with qemu-img
qemu-img convert -f vmdk -O raw \
  production-db01-disk1.vmdk \
  production-db01-disk1.raw
Check the converted image size and format:
qemu-img info production-db01-disk1.raw
Import raw image into Ceph RBD
rbd import --pool openstack-volumes \
  production-db01-disk1.raw \
  production-db01-disk1
Verify the RBD image exists:
rbd ls openstack-volumes
rbd info openstack-volumes/production-db01-disk1
Create a Cinder volume from the RBD image
openstack volume create \
  --size 100 \
  --image production-db01-disk1 \
  production-db01-volume
Attach the volume to a new OpenStack instance or boot directly from the volume using Nova.
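One caveat: the --image flag above expects a Glance image name, and an RBD image imported straight into the Cinder pool is not automatically visible to Glance or Cinder. If that is the path you took, either adopt the image through Cinder’s manage-existing workflow or, as sketched below under the assumption that Glance is backed by the same Ceph cluster, register the converted raw file as a Glance image and let Cinder create (and, with raw images, efficiently clone) the volume from it:
# Register the converted raw disk as a Glance image
openstack image create \
  --disk-format raw \
  --container-format bare \
  --file production-db01-disk1.raw \
  production-db01-disk1
# Then create the Cinder volume from that image
openstack volume create --size 100 --image production-db01-disk1 production-db01-volume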
Benchmarking
Before you migrate production workloads, benchmark your Ceph cluster to confirm it meets performance expectations. VMware admins are accustomed to vSAN’s predictable latency profiles. Ceph performance depends on OSD count, network topology, disk types (NVMe vs SSD vs HDD), and CRUSH map configuration.
Use fio to test block-level I/O performance on an RBD volume:
fio --name=rbd-randwrite \
  --ioengine=rbd \
  --pool=openstack-volumes \
  --rbdname=test-volume \
  --rw=randwrite \
  --bs=4k \
  --iodepth=32 \
  --numjobs=4 \
  --runtime=60 \
  --group_reporting
This tests random 4K writes with 32 outstanding I/Os. Compare the IOPS and latency results to your VMware baseline. If you’re seeing >10ms p99 latency on NVMe-backed Ceph, investigate network bottlenecks or OSD configuration.
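Note that the rbd ioengine requires an fio build compiled with RBD support and, depending on your cluster’s auth setup, an explicit --clientname. If your fio build lacks the engine, a roughly equivalent test against a mapped RBD device is sketched below; /dev/rbd0 is assumed to be the device created by rbd map, and the run overwrites whatever is on that volume:
# Map a scratch volume and run the same 4K random-write profile through libaio (destructive to the volume's contents)
rbd map openstack-volumes/test-volume
fio --name=rbd-randwrite-mapped \
  --filename=/dev/rbd0 \
  --ioengine=libaio --direct=1 \
  --rw=randwrite --bs=4k \
  --iodepth=32 --numjobs=4 \
  --runtime=60 --time_based \
  --group_reporting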
Use rados bench to test raw Ceph cluster performance (bypassing RBD):
rados bench -p openstack-volumes 60 write --no-cleanup
rados bench -p openstack-volumes 60 seq
This writes objects directly to the pool for 60 seconds, then reads them back sequentially. It helps isolate whether performance issues are in Ceph itself or in the RBD/Cinder layer.
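Because --no-cleanup leaves the benchmark objects in place for the sequential read pass, remove them when you’re finished so they don’t inflate pool usage:
rados -p openstack-volumes cleanup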
Data Integrity Validation
Migrating storage without validating data integrity is asking for corruption issues weeks after cutover. Always checksum your data before and after migration.
Generate a SHA256 hash of the converted raw image
Hash the converted raw image rather than the VMDK itself: the VMDK container format means a file-level checksum of the VMDK will never match the raw data you import into Ceph.
sha256sum production-db01-disk1.raw > raw-hash.txt
After importing to Ceph, map the RBD volume and hash the block device:
rbd map openstack-volumes/production-db01-disk1
sha256sum /dev/rbd0 > rbd-hash.txt
Compare the hashes:
diff raw-hash.txt rbd-hash.txt
If the hashes don’t match, you have a corruption or conversion issue. Don’t proceed to cutover until you’ve identified the cause. Common culprits include incomplete VMDK exports, qemu-img version mismatches, or network interruptions during rbd import.
For large volumes, consider block-level validation tools like virt-diff (part of libguestfs) or filesystem-level checksums (e.g., ZFS checksums if your source datastore supports it).
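As a hedged example, assuming the source VMDK is still on the staging mount and the migrated volume is mapped at /dev/rbd0, virt-diff compares the two at the guest filesystem level rather than the block level:
# Lists files that differ between the source disk image and the migrated RBD device
virt-diff -a /mnt/staging/production-db01-disk1.vmdk -A /dev/rbd0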
Rollback Planning
No rollback = no migration. You need a tested rollback path before you cut over production workloads. Ceph snapshots make this straightforward, but you need to plan the workflow in advance.
Before cutover, take a snapshot of the original VMDK in VMware. Keep the VM powered off but don’t delete it. In OpenStack, create a Ceph snapshot of the newly imported RBD volume immediately after import:
rbd snap create openstack-volumes/production-db01-disk1@pre-cutover
If the cutover fails (application doesn’t start, data corruption discovered, performance unacceptable), you have two rollback options:
- Roll back to VMware: Power on the original VM in vCenter. You’re back to the pre-migration state within minutes.
- Roll back the Ceph volume: Revert the RBD image to the snapshot, detach it from the OpenStack instance, and troubleshoot offline.
rbd snap rollback openstack-volumes/production-db01-disk1@pre-cutover
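Before rolling back, confirm the snapshot exists and detach or unmap the volume so nothing is writing to it; rbd snap rollback overwrites the image in place and takes longer on larger images (cloning from the snapshot is the faster option if you only need a copy to troubleshoot):
# Verify the pre-cutover snapshot is present before rolling back
rbd snap ls openstack-volumes/production-db01-disk1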
Define your rollback SLA before migration. For Tier 1 workloads, you should be able to roll back within 15 minutes. Test the rollback procedure in a dev environment before attempting it in production. Keep the source VMDKs and VMware VMs intact for at least 30 days post-migration.
Common Pitfalls
| Pitfall | Symptom | Solution |
| Missing virtio drivers | VM boots slowly or not at all | Inject virtio drivers via virt-v2v or install manually |
| Thin VMDK converted to thick | Ceph volume consumes full allocated size | Keep the conversion sparse (qemu-img writes sparse raw by default; avoid -S 0) and verify usage with rbd du after import |
| Network MTU mismatch | High packet loss during migration | Set jumbo frames (MTU 9000) on migration network |
| Ceph replication lag | RBD import stalls or times out | Check OSD health; reduce concurrent migrations |
| Incorrect CRUSH map | Data on wrong failure domain (e.g., all on one rack) | Review CRUSH rules before migration; reweight OSDs |
| No I/O scheduler tuning | Poor performance post-migration | Set mq-deadline or none scheduler on Ceph OSD nodes |
| Cinder volume type mismatch | Volume created in wrong pool or replication tier | Define Cinder volume types that map to correct Ceph pools |
| Incomplete VM metadata | VM boots but network/hostname wrong | Export and parse VMX file; recreate metadata in OpenStack |
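For the volume type mismatch above, make the mapping explicit with Cinder volume types. A minimal sketch, assuming a backend whose volume_backend_name is set to ceph-nvme in cinder.conf (a hypothetical name for illustration):
# Create a volume type pinned to the intended Ceph-backed Cinder backend
openstack volume type create ceph-nvme-replicated
openstack volume type set --property volume_backend_name=ceph-nvme ceph-nvme-replicated
# Volumes created with this type land in the pool that backend is configured for
openstack volume create --type ceph-nvme-replicated --size 100 volume-type-test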
The most common failure mode isn’t corruption—it’s performance degradation. Your workload boots, runs, but responds 2x slower than it did in VMware. This usually points to missing virtio drivers, suboptimal Ceph pool configuration (e.g., too few placement groups, or an erasure-coded pool where replicated would serve small random I/O better), or network bottlenecks (1Gbps instead of 10Gbps+). Benchmark early, benchmark often, and compare against your VMware baselines before declaring success.
Another frequent issue: migrating VMDKs with snapshots or linked clones. qemu-img and virt-v2v don’t handle VMDK snapshots gracefully. Consolidate all snapshots in VMware before exporting the VM. If you have linked clones, convert them to full clones first.
Migration Timeline
A realistic storage migration timeline for a 50-VM production environment looks like this:
- Weeks 1–2: Inventory and discovery. Identify VMDK sizes, snapshot dependencies, application dependencies, and downtime windows.
- Weeks 3–4: Pilot migration of 5 non-critical VMs. Test tooling, validate performance, document workflows.
- Weeks 5–8: Migrate dev/test workloads (20 VMs). Refine scripts, train team, identify performance gaps.
- Weeks 9–12: Migrate Tier 2 production workloads (15 VMs). Schedule downtime windows, execute cold migrations, validate data integrity.
- Weeks 13–16: Migrate Tier 1 production workloads (10 VMs). Use live migration tools if available, or schedule extended maintenance windows.
- Weeks 17–20: Decommission VMware infrastructure. Archive VMDKs, power off ESXi hosts, reclaim licenses.
This timeline assumes you have a functioning Ceph cluster, competent OpenStack operators, and no major architectural surprises. If you’re also deploying Ceph and OpenStack from scratch, add 8–12 weeks to the front end. If you’re migrating 500+ VMs, scale the timeline linearly but add buffer for coordination overhead and troubleshooting.
Don’t rush the pilot phase. A poorly executed pilot will cascade into production failures. Use the pilot to identify gaps in your tooling, networking, or Ceph configuration—not to declare victory and accelerate the timeline.
Example Storage Migration Checklist
| Task | Owner | Status |
| ☐ Inventory all VMs, VMDK sizes, snapshot dependencies | Platform team | |
| ☐ Benchmark Ceph cluster (fio, rados bench) | Storage architect | |
| ☐ Test qemu-img/virt-v2v tooling on dev VM | Migration engineer | |
| ☐ Define rollback procedure and test in dev | Operations team | |
| ☐ Export VMDKs from vCenter with ovftool | VMware admin | |
| ☐ Convert VMDKs to raw format | Migration engineer | |
| ☐ Import raw images to Ceph RBD | Storage engineer | |
| ☐ Create Cinder volumes from RBD images | OpenStack operator | |
| ☐ Take pre-cutover snapshots (VMware + Ceph) | Operations team | |
| ☐ Boot OpenStack instance from migrated volume | Platform team | |
| ☐ Validate data integrity (checksums) | Storage engineer | |
| ☐ Run application smoke tests | Application owner | |
| ☐ Monitor performance for 48 hours post-cutover | Operations team | |
| ☐ Archive source VMDKs for 30 days | VMware admin | |
| ☐ Decommission VMware hosts after 30-day retention | Platform team | |
Why OpenMetal’s Hosted Private Cloud Works for VMware Migrations
If you’re planning a VMware-to-OpenStack migration, you need a stable, performant Ceph-backed landing zone. OpenMetal’s Hosted Private Cloud provides exactly that—without the operational burden of deploying and managing Ceph yourself.
OpenMetal’s infrastructure is built on NVMe storage, 25–100Gbps networking, and Ceph pools configured for production workloads. Starting with OpenMetal v3.0.0, deployments use cephadm for simplified cluster lifecycle management—making it easier to add OSDs, replace disks, or enable CephFS during or after your migration. You’re not inheriting someone else’s underprovisioned cluster. You get dedicated hardware with predictable, fixed-cost pricing—no surprise egress fees or noisy-neighbor performance drops.
For storage architects migrating off VMware, this means you can focus on the migration process itself—VMDK conversion, data validation, application cutover—rather than tuning CRUSH maps or troubleshooting OSD failures at 2 AM. You still get root access to the OpenStack control plane and Ceph cluster, so you maintain full operational control when you need it.
If you’re evaluating landing zones for your VMware workloads, consider OpenMetal as an alternative to hyperscaler cloud, DIY OpenStack, or proprietary converged infrastructure. You get Ceph, OpenStack, and the predictable cost model that makes storage migration planning feasible.
Works Cited
- Ceph Documentation. “RADOS Block Device (RBD).” Ceph.io, https://docs.ceph.com/en/latest/rbd/. Accessed 6 Nov. 2025.
- Ceph Documentation. “Cephadm — Ceph Orchestrator.” Ceph.io, https://docs.ceph.com/en/latest/cephadm/. Accessed 6 Nov. 2025.
- Red Hat. “Converting Virtual Machines from Other Hypervisors to KVM with virt-v2v.” Red Hat Customer Portal, https://access.redhat.com/articles/1351473. Accessed 6 Nov. 2025.
- QEMU Project. “QEMU Disk Image Utility.” QEMU Documentation, https://www.qemu.org/docs/master/tools/qemu-img.html. Accessed 6 Nov. 2025.
- OpenStack Foundation. “Cinder Volume Drivers: Ceph RBD.” OpenStack Docs, https://docs.openstack.org/cinder/latest/configuration/block-storage/drivers/ceph-rbd-volume-driver.html. Accessed 6 Nov. 2025.
- VMware. “OVF Tool User’s Guide.” VMware Technical Documentation, https://developer.vmware.com/web/tool/ovf/. Accessed 6 Nov. 2025.