Wednesday, January 19, 2011

Linux / Xen Server freezes

Hi,

An Ubuntu LTS server runs Xen with dom0 and one virtual machine. The server freezes permanently during a number of seemingly unrelated operations, such as:

  • Creating a new file system with mkfs.ext3 on an LVM device (this one is consistent).
  • Restarting xend via /etc/init.d/xend restart.
  • Running apt-get dist-upgrade, during the configuration phase of some fairly innocent packages.

Also, yesterday I noticed that the virtualized image had lost time sync and complained about a backwards clock in dmesg.

Unfortunately, I don't have screenshots of what actually happens on the server's console (it is co-located).

I want to blame RAM, but do you have other suggestions?

UPDATE: After further investigation, it appears that all those actions only kill the network. When I visited the server in the data center and logged on at the console, I wasn't able to reach my router/gateway. How bizarre.
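
One way to pin down exactly when the network drops is to log gateway reachability from the console while repeating one of the operations above. The following is only a sketch under stated assumptions: it assumes a Linux ping that accepts -c and -W, and the function names (gateway_reachable, outage_windows, watch) are made up for illustration rather than taken from this setup.

```python
import subprocess
import time


def gateway_reachable(gateway, timeout_s=1):
    """Return True if a single ping to the gateway gets a reply."""
    # Assumes Linux iputils ping: -c 1 sends one packet,
    # -W sets the reply timeout in seconds.
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), gateway],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (start, end) outage windows.

    An outage still open at the last sample is reported with end=None.
    """
    windows = []
    start = None
    for timestamp, ok in samples:
        if not ok and start is None:
            start = timestamp          # first failed probe opens a window
        elif ok and start is not None:
            windows.append((start, timestamp))  # first good probe closes it
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def watch(gateway, interval_s=1.0, count=60):
    """Probe the gateway `count` times and return the outage windows seen."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), gateway_reachable(gateway)))
        time.sleep(interval_s)
    return outage_windows(samples)
```

Calling watch() with the real gateway address in one console session, then restarting xend in another, should show whether the outage begins at the moment the bridge is reconfigured.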

  • Yeah, I'd be running a lengthy memtest first up, but there's a decent possibility it's something else hardwareish -- run a complete burnin of all system components, and monitor all your temperatures.

    The clock thing is unrelated, Xen just can't keep good time. NTP forever.

    Konrads : What would you recommend for a complete burn-in?
    womble : We use breakin from http://www.advancedclustering.com/view-document-details/23-bootimage-includes-breakin-v3.0-for-x86_64-kernel-and-initrd-only.html and it works pretty well.
    From womble
  • That does look like hardware failure. Test the ram and also check the hdd for bad sectors. Also check the log files for any warnings.

    Konrads : Logs are always clean :(
    The_cobra666 : And the system completely freezes? I still recommend checking for bad ram and bad hdd sectors. If it's possible, run something like Prime95. When it's a CPU or RAM problem, it will show. Also, what are the temps?
  • I've had issues with xen's bridge support messing up my dom0's eth device, knocking the dom0 offline from net access.

  • For the network issue, Xen works better if you don't let it set up the bridge yourself; define the bridge in the OS network configuration instead.

    In /etc/network/interfaces:

    auto xen-br0
    iface xen-br0 inet static
            bridge_ports eth0
            bridge_stp off
            bridge_fd 0
            address 10.2.2.2
            gateway 10.2.2.1
            netmask 255.255.255.0
    

    /etc/xen/xend-config.sxp:

    (vif-script vif-bridge bridge=xen-br0)
    

    This way, starting and stopping Xen won't mess with your network interface.

    From Justin
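
A quick sanity check of a hand-written bridge stanza like the one above can be sketched as follows. This is a minimal, hedged example: the parser handles only the simple iface/option layout shown in this thread, not the full ifupdown syntax, and the helper names (parse_iface_stanzas, bridge_problems) are invented for illustration. The rule that the physical NIC should not carry its own address is an assumption about typical bridged setups.

```python
def parse_iface_stanzas(text):
    """Parse simple ifupdown-style text into {iface_name: {option: value}}."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        words = line.split()
        if words[0] == "iface" and len(words) >= 4:
            # e.g. "iface xen-br0 inet static" -> method is "static"
            current = {"method": words[3]}
            stanzas[words[1]] = current
        elif words[0] in ("auto", "allow-hotplug", "mapping", "source"):
            current = None  # these lines end the previous stanza
        elif current is not None:
            current[words[0]] = " ".join(words[1:])
    return stanzas


def bridge_problems(stanzas, bridge, nic):
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    stanza = stanzas.get(bridge)
    if stanza is None:
        return ["no iface stanza for bridge " + bridge]
    if nic not in stanza.get("bridge_ports", "").split():
        problems.append(nic + " is not listed in bridge_ports of " + bridge)
    # Assumption: the IP address lives on the bridge, so the NIC itself
    # should be absent from the file or configured as "manual".
    nic_stanza = stanzas.get(nic)
    if nic_stanza is not None and nic_stanza["method"] != "manual":
        problems.append(nic + " has its own '" + nic_stanza["method"] + "' configuration")
    return problems
```

Feeding it the contents of /etc/network/interfaces with bridge="xen-br0" and nic="eth0" should return an empty list for the configuration shown above.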
  • This exact issue plagued me a while back, running in both bridged and routed mode IIRC. Hardware was fine; I tried multiple NICs and memory. Nothing would ever be in syslog. Unfortunately I don't have that system anymore and have since moved to KVM. I'm curious what others will say.

    From
