Triage a fleet of VPS servers from chat
When to use it: you manage 5-20 VPS boxes and need to check them all quickly.
Prerequisites
- SSH key in agent: ssh-add ~/.ssh/id_ed25519
- Named host list: TOML config with name → host/user/key path
Workflow
- Run health checks in parallel
  Prompt: "For each server in my fleet, run: uptime, df -h, free -h, last error in journalctl. Summarize anything concerning."
  → Per-host summary + flagged issues
- Drill into problems
  Prompt: "On server X where disk is 95% full, find the top 10 largest directories under /var."
  → du output
- Fix or escalate
  Prompt: "Is it safe to delete /var/log/old-*.gz? Confirm with me before running."
  → Plan + waits for confirm
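For readers who want to see roughly what the first two steps amount to under the hood, here is a minimal sketch, not the tool's actual implementation. It assumes plain OpenSSH on the local machine and a fleet dictionary of the form {name: {"host": ..., "user": ...}}. The function names, the 90% threshold, and the reading of "last error in journalctl" as `journalctl -p err -n 1` are all assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# The four checks from the first step, joined into one remote command.
# "journalctl -p err -n 1" is one assumed reading of "last error".
CHECKS = "uptime; df -h; free -h; journalctl -p err -n 1 --no-pager"

# One possible drill-down for the second step: ten largest directories under /var.
DRILL = "du -x /var 2>/dev/null | sort -rn | head -n 10"

def ssh_argv(host: str, user: str, remote_cmd: str = CHECKS) -> list[str]:
    """Build the local ssh command line for one server."""
    return ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=10",
            f"{user}@{host}", remote_cmd]

def check_host(name: str, host: str, user: str) -> tuple[str, str]:
    """Run the checks on one server and return (name, combined output)."""
    try:
        proc = subprocess.run(ssh_argv(host, user), capture_output=True,
                              text=True, timeout=60)
        return name, proc.stdout + proc.stderr
    except subprocess.TimeoutExpired:
        return name, "TIMEOUT"

def check_fleet(fleet: dict[str, dict[str, str]]) -> dict[str, str]:
    """Check every server concurrently; threads suit this I/O-bound work."""
    with ThreadPoolExecutor(max_workers=max(1, len(fleet))) as pool:
        futures = [pool.submit(check_host, n, e["host"], e["user"])
                   for n, e in fleet.items()]
        return dict(f.result() for f in futures)

def full_disks(df_output: str, threshold: int = 90) -> list[str]:
    """Return mount points whose Use% in `df -h` output meets the threshold."""
    flagged = []
    for line in df_output.splitlines()[1:]:  # skip the header row
        cols = line.split()
        if len(cols) >= 6 and cols[4].endswith("%") and int(cols[4][:-1]) >= threshold:
            flagged.append(cols[5])
    return flagged
```

Calling `check_fleet(fleet)` returns one block of raw output per server. `full_disks` expects only the `df -h` portion of that output, so a real script would run `df -h` as its own command or split the combined output first.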
Result: Fleet-wide triage in 5 minutes.
Common mistakes
- Command timeouts are advisory: a hung command can leave processes running. Use timeout 30 <cmd> explicitly for anything that could hang.
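If commands are assembled by a script rather than typed into chat, the same advice can be applied mechanically. The helper below is a hypothetical sketch: it assumes GNU coreutils `timeout` is installed on the remote host, and the name `with_timeout` and the 5-second grace period are illustrative choices. Wrapping the command in `sh -c` keeps pipelines and redirections inside the time limit.

```python
import shlex

def with_timeout(remote_cmd: str, seconds: int = 30, grace: int = 5) -> str:
    """Wrap a remote shell command in coreutils `timeout`.

    After `seconds`, timeout sends SIGTERM; `-k` follows up with SIGKILL
    after `grace` more seconds if the command is still running.
    """
    return f"timeout -k {grace} {seconds} sh -c {shlex.quote(remote_cmd)}"

print(with_timeout("du -x /var | sort -rn | head -n 10"))
```

The returned string is what would be passed to ssh as the remote command. Enforcing the limit on the server side matters because killing the local ssh client does not necessarily stop a process that is already running remotely.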