2024-08-09 · Mira Lee
Drills for Disk Alerts That Cry Wolf
Disk alerts rot because growth is slow until it is not. Run drills that include gradual fill and sudden spike scenarios. Measure how long detection to mitigation takes with real humans, not scripts pretending to be humans.
Second, chart growth weekly even when quiet. Linear projections lie on bursty workloads, but they still expose teams that never look until pager noise returns.
Third, document when an alert should wake someone versus wait for business hours. Not every partition is equal; treat caches differently from durable data volumes.
Limitation: cloud object storage classes are mentioned only briefly. This piece focuses on host-local filesystem behavior.
Tags: storage, alerting