Triage a fleet of VPS servers from chat
When to use: You manage 5-20 VPS boxes and need to eyeball them all quickly.
Prerequisites
- SSH key in agent: ssh-add ~/.ssh/id_ed25519
- Named host list: TOML config with name → host/user/key path
Steps
- Run health checks in parallel
  Prompt: For each server in my fleet, run: uptime, df -h, free -h, last error in journalctl. Summarize anything concerning.
  → Per-host summary + flagged issues
- Drill into problems
  Prompt: On server X where disk is 95% full, find the top 10 largest directories under /var.
  → du output
- Fix or escalate
  Prompt: Is it safe to delete /var/log/old-*.gz? Confirm with me before running.
  → Plan + waits for confirm
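For a sense of what the parallel health check step amounts to, here is a rough sketch of the same fan-out done by hand with the system `ssh` client. It is not how any particular chat tool implements it. The `(name, user, host, key)` tuple layout, the function names, and the `journalctl -p err -n 1` reading of "last error" are all assumptions; `BatchMode` and `ConnectTimeout` are standard OpenSSH options.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

# The four checks named in the first step. The journalctl flags are one
# assumed interpretation of "last error in journalctl".
HEALTH_CHECKS = [
    "uptime",
    "df -h",
    "free -h",
    "journalctl -p err -n 1 --no-pager",
]


def build_ssh_command(user, host, key, remote_cmd):
    """Build an ssh argv list. BatchMode makes ssh fail instead of prompting."""
    return [
        "ssh",
        "-i", os.path.expanduser(key),
        "-o", "BatchMode=yes",
        "-o", "ConnectTimeout=10",
        f"{user}@{host}",
        remote_cmd,
    ]


def check_host(server, run=subprocess.run):
    """Run all checks on one server and return (name, ok, output)."""
    name, user, host, key = server  # hypothetical tuple layout
    # ';' rather than '&&' so one failing check does not skip the rest.
    argv = build_ssh_command(user, host, key, "; ".join(HEALTH_CHECKS))
    try:
        proc = run(argv, capture_output=True, text=True, timeout=60)
    except subprocess.TimeoutExpired:
        return name, False, "ssh did not finish within 60 s"
    except OSError as exc:  # e.g. the ssh binary is missing
        return name, False, f"could not start ssh: {exc}"
    return name, proc.returncode == 0, proc.stdout + proc.stderr


def run_fleet(servers, max_workers=8, run=subprocess.run):
    """Check every server concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda s: check_host(s, run=run), servers))
```

Threads are sufficient here because each worker spends its time waiting on an `ssh` subprocess. The `run` parameter exists only so the fan-out can be exercised with a stub instead of real connections.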
Result: Fleet-wide triage in 5 minutes.
Pitfalls
- Command timeouts are advisory: a hung command can leave processes running. Use timeout 30 <cmd> explicitly for anything that could hang.
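One way to apply that advice consistently is to wrap every remote command before it is sent. The helper below is a hypothetical sketch, not part of any tool described here. It assumes the remote host has the coreutils `timeout` utility, and the `-k` grace period (a follow-up SIGKILL) is an optional extra beyond the plain `timeout 30 <cmd>` form.

```python
import shlex


def with_timeout(remote_cmd: str, seconds: int = 30, kill_after: int = 5) -> str:
    """Prefix a remote shell command with `timeout`.

    The command is wrapped in `sh -c` so a pipeline is limited as a whole,
    not just its first stage. `-k` sends SIGKILL if the command is still
    running `kill_after` seconds after the initial signal.
    """
    if seconds <= 0 or kill_after < 0:
        raise ValueError("seconds must be positive and kill_after non-negative")
    return f"timeout -k {kill_after} {seconds} sh -c {shlex.quote(remote_cmd)}"


if __name__ == "__main__":
    # Example: the disk drill-down from the steps, limited to 30 seconds.
    print(with_timeout("du -x /var | sort -rn | head -n 10"))
```

Quoting with `shlex.quote` keeps pipes and globs inside the wrapped command from being interpreted before `timeout` starts.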