When an alert fires (Critical or Warning), this guide tells you what to do so that **any team member** can react quickly, even if the admin is not available.
| [Critical Service Failure](#critical-service-failure-) | 🔚 | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. | A key service (like Mastodon, MinIO) is **down** | SSH into the server, try `systemctl restart <service>`. |
| [Disk Filling Up](#disk-filling-up-) | 📈 | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. | Disk space critically low (under 10%) | SSH in and delete old logs/backups. Free up space **immediately**. |
| [Rclone Mount Error](#rclone-mount-error-) | 🐢 | Cache failed, mount not healthy | Restart the rclone mount process. (Usually a `systemctl restart rclone@<mount>`, or remount manually.) | Cache failed, mount not healthy | Restart the rclone mount process. (Usually a `systemctl restart rclone@<mount>`, or remount manually.) |
| [PostgreSQL Replication Lag](#postgresql-replication-lag-) | 💥 | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. | Database replicas are falling behind | Check database health. Restart replication if needed. Alert admin if lag is >5 minutes. |
| [RAID Degraded](#raid-degraded-) | 🧸 | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. | RAID array is degraded (missing a disk) | Open server console. Identify failed drive. Replace drive if possible. Otherwise escalate immediately. |
| [Log File Warnings](#log-file-warnings-) | ⚠️ | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. If errors worsen, escalate. | Error patterns found in logs | Investigate. If system is healthy, **log it for later**. If errors worsen, escalate. |