How Do We Monitor Servers?

Reclaim Hosting currently has three methods:

  1. Uptime Robot checks servers for the keyword “hello world” and reports any possible downtime in the #monitoring channel in Slack.

  2. We run Monit on the server, which notifies us in Slack when the load average is elevated. This isn’t always indicative of downtime, but a rising load on a server is a good early warning that something is up.

  3. New Relic provides infrastructure management, monitoring, and reporting. It lets you see response times and disk usage, diagnose errors, and more.
    • Within the Entity Explorer tab you will find CPU, memory, and disk usage. Servers using a critical amount of resources are listed at the top and marked with a red box rather than a green one.
    • The APM tab tracks specific applications, with additional information for each, including transaction history and a breakdown of theme and plugin response times.
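
To make the first two checks more concrete, here is a rough Python sketch of the logic behind a keyword check and a load-average warning. It is only an illustration and not how Uptime Robot or Monit are actually configured: the function names, the case-insensitive match, the 10-second timeout, and the per-core threshold of 1.0 are all assumptions made for the example.

```python
import os
import urllib.request

KEYWORD = "hello world"  # the keyword named in method 1


def contains_keyword(body: str, keyword: str = KEYWORD) -> bool:
    """Return True if the page body contains the keyword.

    Case-insensitive matching is an assumption of this sketch.
    """
    return keyword.lower() in body.lower()


def check_url(url: str, keyword: str = KEYWORD, timeout: float = 10.0) -> bool:
    """Fetch a page and report whether it responded and contained the keyword."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return False
    return contains_keyword(body, keyword)


def load_is_elevated(load_1min: float, cpu_count: int,
                     per_core_threshold: float = 1.0) -> bool:
    """Return True when the 1-minute load per CPU core exceeds the threshold.

    The threshold is a hypothetical value, not our Monit setting.
    """
    return load_1min / max(cpu_count, 1) > per_core_threshold


def local_load_is_elevated(per_core_threshold: float = 1.0) -> bool:
    """Apply the same test to the machine this script runs on (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return load_is_elevated(load_1min, os.cpu_count() or 1, per_core_threshold)
```

As with the real tools, a failed keyword check suggests possible downtime, while an elevated load is only an early warning and still needs a look at the server itself.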

For a good example conversation, see Ticket #31151.