Monitor PVC Disk Usage
To get notified about firing alerts you need to setup an alert receiver. See Configure Alert Receivers for more information. |
Overview
APPUiO Cloud exposes kubelet
metrics that can be used to monitor PVC disk usage.
The following volume metrics are available:
kubelet_volume_stats_available_bytes
kubelet_volume_stats_capacity_bytes
kubelet_volume_stats_used_bytes
kubelet_volume_stats_inodes
kubelet_volume_stats_inodes_free
kubelet_volume_stats_inodes_used
Create a Disk Usage Alert
The following instructions create two alerts that trigger when the disk usage or used inodes of any PVC in the given namespace exceed 90%.
-
Set the namespace your application is running in
APP_NAMESPACE="my-app"
-
Create the alert
kubectl -n "${APP_NAMESPACE}" apply -f - <<YAML apiVersion: monitoring.coreos.com/v1 kind: PrometheusRule metadata: name: my-app-volume-alerts spec: groups: - name: my-app-volume.rules rules: - alert: VolumeAvailableSpaceLow expr: | ( sum(kubelet_volume_stats_available_bytes{persistentvolumeclaim!=""}) by (persistentvolumeclaim) / sum(kubelet_volume_stats_capacity_bytes{persistentvolumeclaim!=""}) by (persistentvolumeclaim) ) < 0.1 (1) for: 5m labels: severity: critical (2) annotations: summary: "PVC available space low" description: | PVC {{ \$labels.persistentvolumeclaim }} available space {{ \$value | humanizePercentage }} is low. Your application may become unable to write data to disk. Either increase the PVC size or delete files to free up space. - alert: VolumeFreeInodesLow expr: | ( sum( kubelet_volume_stats_inodes_free{persistentvolumeclaim!=""} * on(persistentvolumeclaim) kube_persistentvolumeclaim_info{storageclass!="cephfs-fspool-cluster"} (3) ) by (persistentvolumeclaim) / sum( kubelet_volume_stats_inodes{persistentvolumeclaim!=""} * on(persistentvolumeclaim) kube_persistentvolumeclaim_info{storageclass!="cephfs-fspool-cluster"} ) by (persistentvolumeclaim) ) < 0.1 (4) for: 5m labels: severity: critical (2) annotations: summary: "PVC free inodes low" description: | PVC {{ \$labels.persistentvolumeclaim }} free inodes {{ \$value | humanizePercentage }} is low. Your application may become unable to create new files and directories. Inodes are used to track files and directories. When the number of inodes is low, you may not be able to create new files or directories. A high inode usage may stem from many small files. Increasing the PVC size will scale up inodes proportionally. Either increase the PVC size or delete files to free up inodes. YAML
1 Alert if available space is less than 10%. 2 You might want to add a higher threshold with a warning
severity.3 CephFS doesn’t store metadata in inode structures and always reports zero free inodes. This filter excludes CephFS PVCs from the alert. 4 Alert if free inodes are less than 10%. It’s possible to alert on absolute values, for example if the available space is less than 10GiB.
expr: | max by(persistentvolumeclaim) (kubelet_volume_stats_available_bytes{persistentvolumeclaim!=""}) < 10 * 1024 * 1024 * 1024 annotations: description: | PVC {{ $labels.persistentvolumeclaim }} available space {{ $value | humanize1024 }} is low. ...
Predicting Disk Usage
Prometheus can be used to predict disk usage based on the current rate of change.
The following alert triggers when the available space is predicted to be full in less than 4 hours.
It can be difficult to predict disk usage for a given application. The following alert is only a starting point and should be adjusted to your specific use case. Both a shorter historical time frame and a longer prediction time can lead to false positives.
False positives for short spikes can be minimized using the A longer historical time frame or a shorter prediction time can delay alerting until the volume is full or almost full if spikes or sudden, unusual fill patterns occur. |
kubectl -n "${APP_NAMESPACE}" apply -f - <<YAML
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
name: my-app-volume-prediction-alerts
spec:
groups:
- name: my-app-volume-prediction.rules
rules:
- alert: VolumeAvailableSpaceFullIn4Hours
expr: |
min by(persistentvolumeclaim) (
predict_linear(kubelet_volume_stats_inodes_free{persistentvolumeclaim!=""}[1h], 4 * 3600) (1)
)
< 0
for: 5m
labels:
severity: warning
annotations:
summary: "PVC available space predicted to run out in 4 hours"
description: |
PVC {{ \$labels.persistentvolumeclaim }} available space is predicted to run out within the next 4 hours.
Your application may become unable to write data to disk.
Either increase the PVC size or delete files to free up space.
- alert: VolumeFreeInodesFullIn4Hours
expr: |
min by(persistentvolumeclaim) (
predict_linear(kubelet_volume_stats_inodes_free{persistentvolumeclaim!=""}[1h], 4 * 3600) (1)
)
* on(persistentvolumeclaim) (2)
min by(persistentvolumeclaim) (kube_persistentvolumeclaim_info{storageclass!="cephfs-fspool-cluster"})
< 0
for: 5m
labels:
severity: warning
annotations:
summary: "PVC free inodes predicted to run out in 4 hours"
description: |
PVC {{ \$labels.persistentvolumeclaim }} free inodes are predicted to run out within the next 4 hours.
Your application may become unable to create new files and directories.
Inodes are used to track files and directories. When the number of inodes is low, you may not be able to create new files or directories.
A high inode usage may stem from many small files. Increasing the PVC size will scale up inodes proportionally.
Either increase the PVC size or delete files to free up inodes.
YAML
1 | Use the last 1 hour of data to predict the next 4 hours. |
2 | CephFS doesn’t store metadata in inode structures and always reports zero free inodes. This filter excludes CephFS PVCs from the alert. |