Alert rule: TopoLVMVolumeGroupNearFull
Please consider opening a PR to improve this runbook if you gain new information about causes of the alert, or how to debug or resolve the alert. Click "Edit this Page" in the top right corner to create a PR directly on GitHub. |
Overview
This alert fires when the utilization of a TopoLVM volume group is higher than 95% of its capacity. To resolve this alert, unused pv should be deleted or the volume group size must be increased.
Otherwise, investigate if this node has logical volumes not associated with a persistent volume.
Steps for debugging
Check for lingering volumes
Check if there are logicalvolumes
that aren’t associated with a persistentvolume
.
$ diff -y <(k get logicalvolumes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}') <(k get pv -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}')
> local-pv-14024b59 (1)
> local-pv-23eec265 (1)
> local-pv-9ae3e5a (1)
pvc-379faad7-edf7-4cbd-81e0-08d0f2207c36 pvc-379faad7-edf7-4cbd-81e0-08d0f2207c36
pvc-3a4e6c1b-b2c3-4cbe-a2a9-e6418ad8bcef pvc-3a4e6c1b-b2c3-4cbe-a2a9-e6418ad8bcef
pvc-3fc91fd8-aeff-4adb-8135-327302dbc777 pvc-3fc91fd8-aeff-4adb-8135-327302dbc777
> pvc-05089699-18eb-4959-83ee-f35bd9f2d7e2 (2)
pvc-46411d9f-9bac-43fb-8bd5-583c2a7c5703 pvc-46411d9f-9bac-43fb-8bd5-583c2a7c5703
pvc-481d306b-5681-4e59-904b-6956192a4622 pvc-481d306b-5681-4e59-904b-6956192a4622
> pvc-d93cd532-2c12-4864-a826-df8a5359daf9 (3)
pvc-4c44f64a-287b-4123-abb9-39e76e8eb25e pvc-4c44f64a-287b-4123-abb9-39e76e8eb25e
[ ... remaining output omitted ... ]
1 | local-pv-* are from the Openshift Local Storage Operator (usually created by rook-ceph cluster) |
2 | PV only listed on the "right" side are from non-TopoLVM storage classes. |
3 | PV only listed on the "left" side are logical volumes that aren’t associated with a persistent volume anymore. |
Unassociated logicalvolumes
<3> might be deleted, best to check with customer first.