VPS Not Responding? A 15-Minute Triage Checklist Before You Reboot

A practical first-response checklist to diagnose an unresponsive VPS without making recovery harder.

When a VPS appears down, the fastest action often feels like a reboot. That works sometimes, but it can also destroy useful evidence and extend downtime if the root cause is still present after boot.

This checklist is built for the first 15 minutes of response, when clarity matters more than speed theater.

Minute 0-2: Confirm scope, not panic

Start by answering two questions:

  1. Is this host-level downtime or only one service path?
  2. Is impact global or region/user-segment specific?

Check an external probe first, then confirm from a second network. A local office DNS issue has fooled many teams into restarting healthy servers.
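The two-vantage-point check can be sketched as a few shell helpers. This is a minimal sketch, assuming `curl` and `dig` are installed where you run it; the function names, the example hostname, and the resolver address are illustrative placeholders, not part of any standard tool.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for comparing what two vantage points see.

# Probe an HTTP endpoint. Any HTTP response within 5 s counts as "up",
# even a 5xx, because the question here is reachability, not app health.
probe_http() {
  if curl --silent --output /dev/null --max-time 5 "$1"; then
    echo up
  else
    echo down
  fi
}

# Resolve a name through a specific resolver to spot local DNS trouble.
resolve_via() {
  dig +short +time=3 +tries=1 A "$1" "@$2"
}

# Turn two probe results (local network, second network) into a verdict.
classify_scope() {
  case "$1/$2" in
    up/up)     echo "reachable from both: check the specific service path" ;;
    down/down) echo "down from both: treat as host or provider issue" ;;
    down/up)   echo "local-only failure: suspect your own network or DNS" ;;
    up/down)   echo "partial outage: suspect region or routing" ;;
    *)         echo "unknown probe result" ;;
  esac
}

# Example (placeholder values):
#   local_result="$(probe_http https://app.example.com/)"
#   remote_result="up"   # as reported by your external probe
#   classify_scope "$local_result" "$remote_result"
#   resolve_via app.example.com 1.1.1.1
```

The verdict strings are only prompts for the next question; they do not replace checking the external probe's own dashboard.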

Minute 2-5: Test control plane access

Try access methods in this order:

  • Provider console / serial console
  • SSH from a known-good bastion
  • Internal service checks from adjacent hosts (if available)

If SSH fails but the provider console works, the issue is likely in the network, firewall, SSH daemon, or auth path. If even the console is unstable, suspect host pressure or hypervisor-side events.
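One way to narrow down an SSH failure from the bastion is to separate "port unreachable" from "port open but login fails". The sketch below assumes an OpenSSH client and a netcat variant that supports `-z` and `-w`; the helper names and the finer-grained split between port and login failures are illustrative assumptions layered on the rule of thumb above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for testing SSH reachability from a bastion.

# Is TCP port 22 reachable at all? (-z: connect only, -w: timeout in seconds)
check_ssh_port() {
  nc -z -w 5 "$1" 22
}

# Can we log in non-interactively? BatchMode disables password prompts,
# so a broken key or auth setup fails fast instead of hanging.
check_ssh_login() {
  ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Map console / port / login results ("ok" or "fail") to a likely fault area.
interpret_access() {
  local console="$1" port="$2" login="$3"
  if [ "$console" != ok ]; then
    echo "console unstable: suspect host pressure or hypervisor-side events"
  elif [ "$port" != ok ]; then
    echo "port 22 unreachable: suspect network, firewall, or a stopped SSH daemon"
  elif [ "$login" != ok ]; then
    echo "port open but login fails: suspect the auth path"
  else
    echo "control plane access looks healthy"
  fi
}

# Example (placeholder host):
#   check_ssh_port 203.0.113.10 && port=ok || port=fail
#   check_ssh_login admin@203.0.113.10 && login=ok || login=fail
#   interpret_access ok "$port" "$login"
```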

Minute 5-8: Capture quick host state

If you can reach the shell, record:

  • uptime
  • free -m
  • df -h
  • top -b -n1 | head -40
  • dmesg | tail -100

You are looking for obvious pressure signatures: full disk, swap thrash, runaway CPU, OOM kills, or filesystem errors.

Do not spend this window on deep forensics. Gather enough to decide the next safe move.
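To keep the capture quick and repeatable, the commands listed above can be wrapped in one function that writes everything to a timestamped file. This is a minimal sketch, assuming bash on a Linux host; the function name and the default output directory are placeholders. If the disk is full, point it at a memory-backed path instead, for example `/dev/shm` where that exists.

```shell
#!/usr/bin/env bash
# Hypothetical one-shot capture of the quick host-state commands.

# Run each command, label its output, and keep going if one fails
# (for example, dmesg may need elevated privileges on some systems).
capture_state() {
  local out_dir="${1:-/tmp}"
  local out_file cmd
  out_file="$out_dir/triage-$(date -u +%Y%m%dT%H%M%SZ).log"
  for cmd in "uptime" "free -m" "df -h" "top -b -n1 | head -40" "dmesg | tail -100"; do
    printf '== %s ==\n' "$cmd"
    bash -c "$cmd" 2>&1 || true
    echo
  done > "$out_file"
  # Print the path so the caller can copy the file off the host.
  echo "$out_file"
}

# Example:
#   log="$(capture_state /tmp)"
#   echo "saved to $log"
```

Copying the resulting file off the box before any restart preserves it for the postmortem even if the host does not come back cleanly.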

Minute 8-11: Check edge and routing basics

A surprising number of “dead server” incidents are path issues:

  • Stale DNS records after migration (still pointing at the old IP)
  • Broken security group / ACL updates
  • MTU mismatch after tunnel or provider changes
  • Origin firewall rules that accidentally block traffic from the edge

Validate the path from client edge to origin, not just process health on the box.
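The path checks above can be sketched with standard tools. The sketch assumes Linux iputils `ping` (for the `-M do` don't-fragment option), `dig`, and `curl`; all function names, the default resolver, and the example values are hypothetical. The payload arithmetic is for IPv4: a 20-byte IP header plus an 8-byte ICMP header leaves MTU minus 28 bytes of payload.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for DNS, MTU, and origin path checks.

# Largest ICMP payload that fits a given IPv4 MTU without fragmentation.
icmp_payload_for_mtu() {
  echo $(( $1 - 28 ))
}

# Send non-fragmentable pings sized for the given MTU.
# If these fail while small pings succeed, suspect an MTU mismatch on the path.
check_mtu() {
  local host="$1" mtu="${2:-1500}"
  ping -M do -c 3 -W 2 -s "$(icmp_payload_for_mtu "$mtu")" "$host"
}

# Does a public resolver return the IP you expect after a migration?
dns_matches() {
  local name="$1" expected_ip="$2" resolver="${3:-1.1.1.1}"
  dig +short A "$name" "@$resolver" | grep -qxF "$expected_ip"
}

# Bypass DNS and the edge: hit the origin IP directly with the right hostname.
check_origin() {
  local name="$1" origin_ip="$2"
  curl --silent --show-error --output /dev/null --max-time 5 \
       --resolve "$name:443:$origin_ip" "https://$name/"
}

# Example (placeholder values):
#   dns_matches app.example.com 203.0.113.10 || echo "DNS still points elsewhere"
#   check_mtu 203.0.113.10 1500
#   check_origin app.example.com 203.0.113.10
```

If `check_origin` succeeds from an allowed address while requests through the edge fail, the origin process is healthy and the fault sits somewhere on the path in front of it.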

Minute 11-13: Decide stabilize vs reboot

Reboot only once at least one of these conditions holds:

  • Kernel/host is clearly wedged and non-recoverable in place
  • Service restart cannot proceed due to stuck system state
  • You captured enough diagnostic context for postmortem

If you can safely restart only the impacted service, do that first.
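A service-only restart with verification might look like the following. This is a sketch that assumes a systemd-based host; the function name and the unit name in the example are placeholders, and the two-second pause is an arbitrary choice, not a recommended value.

```shell
#!/usr/bin/env bash
# Hypothetical restart-and-verify wrapper for a single systemd unit.

# Restart one unit, then confirm it is active; show recent logs if it is not.
restart_and_verify() {
  local unit="$1"
  systemctl restart "$unit" || return 1
  sleep 2   # give the unit a moment to settle or crash
  if systemctl is-active --quiet "$unit"; then
    echo "$unit is active"
  else
    echo "$unit failed to come back; recent logs:" >&2
    journalctl -u "$unit" -n 50 --no-pager >&2
    return 1
  fi
}

# Example (placeholder unit name):
#   restart_and_verify nginx.service || echo "escalate: consider the reboot criteria"
```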

Minute 13-15: Communicate and assign ownership

Before the next action, publish:

  • Current impact statement
  • What has been verified
  • Next technical action
  • Next update time

This keeps incident comms aligned and prevents duplicated blind actions.
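A fixed template makes the four items above hard to skip under pressure. The helper below is a hypothetical sketch; the field labels and the function name are illustrative, and the output is plain text meant to be pasted into whatever channel your team uses.

```shell
#!/usr/bin/env bash
# Hypothetical status-update formatter for incident comms.

# Print the four fields in a fixed order so every update looks the same.
status_update() {
  local impact="$1" verified="$2" next_action="$3" next_update="$4"
  printf 'Impact: %s\nVerified: %s\nNext action: %s\nNext update: %s\n' \
    "$impact" "$verified" "$next_action" "$next_update"
}

# Example (placeholder content):
#   status_update "API unreachable for all users" \
#                 "console works, port 22 closed" \
#                 "review recent firewall changes" \
#                 "14:30 UTC"
```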

Final note

The goal of early triage is not to be heroic. It is to avoid making a bad event worse. A structured 15-minute routine will outperform ad-hoc intuition almost every time.
