How to Handle a Ceph Cluster Going Read-Only Due to Full OSDs
Author: Nemanja Ilic
In a Ceph cluster, storage capacity reaching critical thresholds can cause the cluster to enter a read-only (RO) state. This typically occurs when OSDs exceed the full-ratio limit, which is set to 0.95 (95%) by default. Once this threshold is crossed, Ceph blocks write operations to prevent potential data corruption or inconsistency. While this is a protective measure, it can bring production workloads to a halt if it happens unexpectedly.
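Before changing anything, it helps to confirm which thresholds your cluster is actually enforcing and which OSDs are affected. A minimal check, assuming the ceph CLI is available on a monitor or admin node:

# Show the currently configured ratios (defaults: nearfull 0.85, backfillfull 0.90, full 0.95)
ceph osd dump | grep -i ratio

# List any OSDs currently flagged as nearfull or full
ceph health detail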
Prerequisites
Because critical operations will take place on your physical servers, make sure you have SSH access to them.
What to Do When Ceph Goes RO
When this happens, you have two immediate options (run these from your physical nodes):
1. Temporarily increase the full-ratio (safer option)
ceph osd set-full-ratio 0.97
This gives you a buffer to delete unnecessary data or snapshots and restore normal operations.
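As a sketch of the full sequence, assuming the cluster started from the default ratios and that you intend to return to them once space has been freed:

# Raise the hard limit slightly to unblock writes
ceph osd set-full-ratio 0.97

# Optionally raise the backfill limit as well so recovery I/O is not blocked
ceph osd set-backfillfull-ratio 0.95

# After cleanup brings usage back down, restore the defaults
ceph osd set-full-ratio 0.95
ceph osd set-backfillfull-ratio 0.90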
2. Reduce the replication factor
ceph osd pool set images size 2
This changes the replication factor only for the images pool, reducing it from 3 replicas to 2. It can free up space for that specific pool.
Important: the number of replicas includes the object itself; see the docs.
Only use this in non-production or emergency scenarios, as it compromises data durability.
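A rough sketch of what this looks like in practice, assuming the images pool currently uses 3 replicas and you restore that setting once the cluster has headroom again:

# Check the current replication settings for the pool
ceph osd pool get images size
ceph osd pool get images min_size

# Drop to 2 replicas to free space (emergency only)
ceph osd pool set images size 2

# Once usage is back under control, restore the original replica count
ceph osd pool set images size 3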
After applying either change, Ceph will lift the write block and move out of RO mode. In some cases, you may need to restart affected instances if operating systems or applications don't handle the transition well. This guide can help if that happens.
Post-Recovery Cleanup Steps
Once the cluster is writable again, it's crucial to act quickly. The primary goal is to reduce OSD usage below critical thresholds. Ideally, bring OSD usage below 80%, or at least under 85%, to prevent another full state.
Focus on:
- Deleting unused volumes and snapshots
- Cleaning up abandoned or temporary data
- Working with application teams to offload unnecessary storage
Use ceph osd df to monitor usage and ceph -s to check for any remaining health warnings during cleanup.
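If your data lives in RBD pools, the following can help identify what is worth deleting. The image name below is a placeholder used only to show the commands; if these pools back an OpenStack deployment (as the pool names volumes, images, and vms suggest), do the actual deletion through Cinder, Glance, or Nova rather than removing RBD images directly.

# Show per-image provisioned vs. actually used space in a pool
# (can be slow on large pools without the fast-diff feature)
rbd du -p volumes

# List snapshots on a specific image (image name is a placeholder)
rbd snap ls volumes/volume-example

# Remove all snapshots on that image once confirmed unneeded
rbd snap purge volumes/volume-example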
This article provides additional information.
Real-World Example: A Ceph Cluster Hitting RO State
In this example, a Ceph cluster entered a read-only (RO) state due to one or more OSDs crossing the full-ratio threshold (default: 0.95). The issue was identified and mitigated by temporarily increasing the full-ratio, but not without immediate cleanup requirements.
1. OSDs Approaching or Exceeding 95% Usage
[root@shiny-jacana ~]# ceph osd df
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS
1 ssd 2.91100 1.00000 2.9 TiB 2.8 TiB 2.8 TiB 721 KiB 5.5 GiB 149 GiB 95.01 1.01 135 up
4 ssd 2.91100 1.00000 2.9 TiB 2.7 TiB 2.7 TiB 333 KiB 5.2 GiB 166 GiB 94.43 1.01 134 up
9 ssd 2.91089 1.00000 2.9 TiB 2.8 TiB 2.7 TiB 572 KiB 5.6 GiB 163 GiB 94.53 1.01 129 up
12 ssd 2.91089 1.00000 2.9 TiB 2.8 TiB 2.8 TiB 729 KiB 5.7 GiB 157 GiB 94.72 1.01 130 up
15 ssd 2.91089 1.00000 2.9 TiB 2.7 TiB 2.7 TiB 998 KiB 5.6 GiB 169 GiB 94.35 1.01 126 up
18 ssd 2.91089 1.00000 2.9 TiB 2.7 TiB 2.7 TiB 474 KiB 5.7 GiB 198 GiB 93.35 1.00 126 up
The first OSD (ID 1) has crossed the critical 95% usage threshold, which caused Ceph to enter a read-only state to prevent data loss or corruption.
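To confirm which OSDs the cluster itself considers full or nearfull, rather than eyeballing the percentages, you can check the health output; the exact wording varies slightly between Ceph releases:

# Health detail lists flagged OSDs, e.g. "osd.1 is full"
ceph health detail | grep -i full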
2. Inspecting Cluster-Wide Usage
Next, we ran ceph df to assess space usage across pools:
[root@shiny-jacana ~]# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
ssd 140 TiB 8.7 TiB 131 TiB 131 TiB 93.74
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
volumes 3 2048 65 TiB 17.09M 129 TiB 97.90 1.4 TiB
images 2 512 193 GiB 24.79k 385 GiB 11.92 1.4 TiB
vms 4 64 60 GiB 15.46k 120 GiB 4.05 1.4 TiB
The volumes pool is consuming almost 98% of the space available to it, indicating that it's the main driver of OSD saturation.
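Note that the USED column already accounts for replication (roughly STORED multiplied by the pool's replica count). If you want to confirm the multiplier and get a more detailed per-pool breakdown, a couple of read-only commands help; the pool name matches the output above:

# Show the replica count behind the USED figure
ceph osd pool get volumes size

# Per-pool breakdown with additional columns (quotas, compression, etc.)
ceph df detail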
3. Mitigating the RO State
[root@shiny-jacana ~]# ceph osd set-full-ratio 0.97
osd set-full-ratio 0.97
This allowed the cluster to resume write operations. Keep in mind that this is only a temporary solution; the underlying issue, lack of space, still needs to be resolved.
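To verify the change took effect and watch the cluster come out of the full state while cleanup proceeds, a quick check with read-only commands:

# Confirm the new ratio is active
ceph osd dump | grep full_ratio

# Watch overall health and recovery status
ceph -s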
How to Prevent This in the Future
- Set up proactive monitoring and use tools like Datadog to alert when OSD usage hits 80% or more (a minimal standalone check is sketched after this list). See this article to enable Datadog Cloud Monitoring in OpenMetal Central.
- Automate snapshot cleanup by implementing periodic audits for unused volumes and snapshots.
- Plan capacity with headroom and avoid running clusters above 80–85% usage for extended periods.
- Scale out before you're at risk. If usage trends consistently high, add more OSDs or expand storage.
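As a starting point for the alerting idea above, here is a minimal sketch of a standalone usage check that could run from cron on an admin node. It assumes the ceph CLI and jq are installed; the JSON field names (.name, .utilization) should be verified against your Ceph release, and the echo should be replaced with whatever notification mechanism you actually use.

#!/bin/bash
# Warn when any OSD crosses a utilization threshold (percentage).
THRESHOLD=80

ceph osd df -f json | jq -r '.nodes[] | "\(.name) \(.utilization)"' |
while read -r osd util; do
  # utilization is a float, so compare with awk
  if awk -v u="$util" -v t="$THRESHOLD" 'BEGIN { exit !(u >= t) }'; then
    echo "WARNING: ${osd} at ${util}% used (threshold ${THRESHOLD}%)"
  fi
done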
Need to remove stuck images? Check out this Ceph cleanup guide to break parent-child dependencies and clear them out.
In Summary
When Ceph enters a read-only state due to full OSDs, quick and informed action is essential. Temporarily adjusting the full-ratio buys time, but it's not a long-term fix. Clean up storage immediately and make sure you have monitoring and capacity planning in place to avoid recurrence. The earlier you respond to rising usage, the less disruptive the recovery will be.