Sunday, January 23, 2011

How to troubleshoot web server lock-up (Debian Squeeze)

Every once in a while, my web server slows down so much that it seems locked up: I can't SSH in, and no sites are being served. It's a VPS that started out as Debian 5, which I upgraded to testing (squeeze). It's a typical LAMP setup whose sole purpose is running a couple of WordPress sites. One time when it locked up, I managed to reach one of the sites, but WordPress complained that it couldn't establish a database connection. So it seemed as if something was really chewing up the CPU and mysqld either timed out or failed and couldn't restart. Since I couldn't SSH in either, I'm more inclined to attribute it to CPU. But these are the only processes running now, aside from OS and kernel stuff:

  • apache
  • mysqld
  • python (for fail2ban)
  • sshd
  • exim4

It has 512 MB of RAM and 1.5 GB of swap. Every time I check on it, it has plenty of free memory and is using virtually no swap (usually 2-3 MB). And since I am running fail2ban, I don't think I'm getting DDoSed.

I did find this in my logwatch email this morning (it locked up late last night, when there would have been very little traffic):

6 Time(s):  [<ffffffff810a0ebc>] ? oom_kill_process+0x7e/0x23d
6 Time(s):  [<ffffffff810a1505>] ? __out_of_memory+0x12a/0x141
6 Time(s):  [<ffffffff810a1586>] ? out_of_memory+0x6a/0x94

I didn't find anything else suspicious. It can't be my provider's host because I can SSH in and restart the VM, and everything seems fine.

Anybody know which logs I should start poring through to find the core of my problem?

Thanks guys.

  • The messages are pretty clear. The system is running out of memory and swap space, so the kernel is killing processes to free memory. You should see OOM messages in your log files in /var/log.

    The next step is to find the process(es) that are using up the memory. I've found that the process(es) getting killed are not usually the ones taking up the memory. For this you need to set up some form of monitoring.

    An alternative is to install something along the lines of ps-watcher to kill processes that take up too much RAM. You can then check the log for which process ps-watcher is killing most often to identify the culprit.

    From David
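
As a rough illustration of the monitoring step suggested in the answer above, here is a minimal Python sketch. It is not ps-watcher and not anything the answerer posted. It assumes a Linux /proc filesystem, and the function names (rss_kib, process_name, top_memory_processes, oom_lines) are made up for this example. It ranks processes by resident memory and picks OOM-related lines out of log text.

```python
import os
import re


def rss_kib(status_text):
    """Return VmRSS in KiB from the text of /proc/<pid>/status, or 0 if absent."""
    match = re.search(r"^VmRSS:[ \t]+(\d+)[ \t]+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else 0


def process_name(status_text):
    """Return the Name field from the text of /proc/<pid>/status, or '?'."""
    match = re.search(r"^Name:[ \t]+(.*)$", status_text, re.MULTILINE)
    return match.group(1).strip() if match else "?"


def top_memory_processes(proc_root="/proc", limit=5):
    """Return up to `limit` (rss_kib, pid, name) tuples, largest RSS first."""
    usage = []
    if not os.path.isdir(proc_root):
        return usage  # not a Linux system, or /proc is not mounted
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "status")) as handle:
                text = handle.read()
        except OSError:
            continue  # the process exited while we were scanning
        usage.append((rss_kib(text), int(entry), process_name(text)))
    usage.sort(reverse=True)
    return usage[:limit]


def oom_lines(log_lines):
    """Return the lines that look like kernel OOM-killer messages."""
    pattern = re.compile(r"out[ _]of[ _]memory|oom[-_]kill", re.IGNORECASE)
    return [line.rstrip("\n") for line in log_lines if pattern.search(line)]


if __name__ == "__main__":
    for rss, pid, name in top_memory_processes():
        print(f"{rss:>10} KiB  pid {pid:<7} {name}")
```

Run from cron every few minutes with the output appended to a file, something like this would leave a record of which processes were growing before the next lock-up. On a default Debian install the kernel messages that oom_lines looks for usually end up in /var/log/kern.log or /var/log/syslog, though that depends on the syslog configuration.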
