in Linux, UNIX

64-bit Xen considered harmful

Recently at work, we tried to implement Xen on Intel Xeon, running a 64-bit dom0/domU. I have to say that this failed horribly, so I’m writing this post to warn others off it. My colleague Gabriel worked hard to migrate everything back to a 32-bit environment, so kudos to him.

The specific symptoms we experienced while running 64-bit Xen is that the domU’s would crash and reboot randomly under (or after) high load. One of our domU’s is a development server, which also runs a CruiseControl, a continuous integration system. This means that every minute, CruiseControl wakes up, does a cvs update to see if there are any changes, and then recompiles the project(s) if needed. Periodically we started to see error messages like

Bad pte = 32971e067, process = cvs, vm_flags = 100077, vaddr = b7f34000
[] vm_normal_page+0xb7/0xd3
[] unmap_vmas+0x3d1/0x761
[] unmap_region+0x8a/0xf0
[] do_munmap+0x148/0x19b
[] sys_munmap+0x33/0x41
[] syscall_call+0x7/0xb
=======================

After a few of these, domU would reboot. It seems like others are having the same problem on 64-bit Xen. This user was running CentOS 5.1, which is basically what we’re running (we have the real deal Red Hat Enterprise LInux 5.1).

As I said, migrating the domU back to a 32-bit dom0 seemed to fix this, so let this be a fair warning to others thinking of running a 64-bit dom0.