It’s a common problem we see here at HostCube: a customer’s instance becomes overloaded and stops responding. The instance hasn’t crashed, but it is too busy to answer requests, and in most cases we have to reboot it to make it available again. The question is: what caused the overload?
There are a number of possible causes, such as:
- Apache took up too much memory and was thrashing to swap
- A hacker’s botnet made the site unresponsive
- Some other rogue process
We already monitor over 20 critical services on every customer instance. The problem is that our existing monitoring couldn’t capture what was happening once an instance reached that critical state. If the instance is unresponsive, we have no way to log in to it, which means we can’t determine the cause of the overload.
We’ve just deployed local monitoring to every customer instance that should solve this problem. When the instance’s load reaches a specific threshold, the monitor records a snapshot of what is running on the instance and E-mails us the details.
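To give a rough idea of how a check like this can work, here is a minimal Python sketch. It is not our production monitor: the threshold value, the E-mail addresses, the snapshot commands and all function names are made up for illustration, and it assumes a Linux instance where `ps` and `free` are available.

```python
import os
import socket
import subprocess
from email.message import EmailMessage

# All values below are illustrative assumptions, not HostCube's real settings.
LOAD_THRESHOLD = 8.0                   # hypothetical 1-minute load average limit
REPORT_FROM = "monitor@example.com"    # placeholder sender
REPORT_TO = "support@example.com"      # placeholder recipient

# Commands whose output makes up the snapshot (Linux procps syntax assumed).
SNAPSHOT_COMMANDS = {
    "Top processes by memory": ["ps", "aux", "--sort=-%mem"],
    "Top processes by CPU": ["ps", "aux", "--sort=-%cpu"],
    "Memory and swap (MB)": ["free", "-m"],
}


def is_overloaded(load_1min, threshold=LOAD_THRESHOLD):
    """Return True when the 1-minute load average reaches the threshold."""
    return load_1min >= threshold


def take_snapshot(commands=SNAPSHOT_COMMANDS, max_lines=15):
    """Run each command and keep the first few lines of its output."""
    sections = []
    for title, cmd in commands.items():
        try:
            result = subprocess.run(cmd, capture_output=True, text=True, timeout=10)
            output = result.stdout
        except (OSError, subprocess.SubprocessError) as exc:
            # A missing or hung command should not prevent the report.
            output = f"(could not run {cmd[0]}: {exc})"
        lines = output.splitlines()[:max_lines]
        sections.append(f"== {title} ==\n" + "\n".join(lines))
    return "\n\n".join(sections)


def build_report(hostname, load_1min, snapshot):
    """Wrap the snapshot in an E-mail message ready to be sent."""
    msg = EmailMessage()
    msg["Subject"] = f"[overload] {hostname} load {load_1min:.2f}"
    msg["From"] = REPORT_FROM
    msg["To"] = REPORT_TO
    msg.set_content(snapshot)
    return msg


def check_once(send):
    """Check the load once; if it is too high, snapshot and hand off to send()."""
    load_1min = os.getloadavg()[0]
    if not is_overloaded(load_1min):
        return None
    msg = build_report(socket.gethostname(), load_1min, take_snapshot())
    send(msg)
    return msg
```

A script like this could be run every minute from cron, with `send` set to something that delivers the message, for example the `send_message` method of an `smtplib.SMTP` connection to a local mail server. The key design point is that the check runs on the instance itself, so the snapshot is taken while the overload is still in progress rather than after a reboot.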
We’ll be able to share this information with you and, hopefully, resolve your server’s overload more quickly. We may eventually add an option to E-mail these details directly to customers. For now, just contact customer support, and if we have a report we’ll send you the details.