Alert rule: CephOSDBackfillFull
Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert.
Overview
An OSD has reached the BACKFILL FULL threshold.
This will prevent rebalance operations from completing.
Use ceph health detail and ceph osd df to identify the problem.
To resolve, add capacity to the affected OSD’s failure domain, restore down/out OSDs, or delete unwanted data.
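The two commands named above can be run through the Rook toolbox. The following is a minimal sketch, assuming the same namespace and rook-ceph-tools deployment that the debugging steps in this runbook use. The ceph_tools helper is a hypothetical convenience wrapper and not part of Rook or Ceph. It omits -it so that the output can be piped.

```shell
# Assumption: namespace and toolbox deployment match the rest of this runbook.
ceph_cluster_ns=syn-rook-ceph-cluster

# Hypothetical helper (not shipped with Rook or Ceph):
# run a ceph subcommand inside the toolbox pod, non-interactively.
ceph_tools() {
  kubectl -n "${ceph_cluster_ns}" exec deploy/rook-ceph-tools -- ceph "$@"
}

# Example invocations:
#   ceph_tools health detail            # lists the OSDs reported as backfill full
#   ceph_tools osd df                   # per-OSD utilisation, see the %USE column
#   ceph_tools osd dump | grep ratio    # configured full/backfillfull/nearfull ratios
```

Comparing the %USE column of ceph osd df against the backfillfull ratio from ceph osd dump should show which OSDs are over the threshold and how unevenly data is spread across them.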
Steps for debugging
Check current capacity utilisation
$ ceph_cluster_ns=syn-rook-ceph-cluster
$ kubectl -n ${ceph_cluster_ns} exec -it deploy/rook-ceph-tools -- ceph df
--- RAW STORAGE ---
[...]
--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    8  937 KiB        3  2.7 MiB      0     93 GiB
fspool-metadata         2   32   23 MiB       29   69 MiB   0.02     93 GiB
fspool-data0            3   32      0 B        0      0 B      0     93 GiB
storagepool             4   32   20 KiB        5   71 KiB      0     93 GiB