CentOS 7 Xen PV guests failing to boot with kernel 3.10.0-693.17.1.el7

Submitted by kevin on Mon, 02/12/2018 - 20:57

After updating one of my VPSs I experienced the adrenaline rush of rebooting, waiting, panicking and rolling back a kernel update. Normally updates on CentOS are stable and nothing to worry about. However, after all the quick patches to the kernel for the new Intel/CPU-based vulnerabilities, I was rightly cautious and ran the updates on a non-vital server first.

Kernel update 3.10.0-693.17.1.el7 was the culprit, and as of today (2018-02-12) the CentOS repos, including Plus, still haven't pushed a fixed version.

UPDATE: RHEL 7.5 is out (kernel-3.10.0-862.el7). The patch in this bug report is now in this kernel, so it will be removed from the centosplus kernel.

After a good dose of research and beating the server with a stick I've finally got a running server again. Here is my process for updating our servers to avoid this issue:

1. Enable the CentOS Plus repo - this should provide a fixed version of the kernel sooner than the potential 9-month wait from the normal Updates repo.

sudo yum install yum-utils
sudo yum-config-manager --enable centosplus

2. Update the system config to use the kernel-plus package

sudo vim /etc/sysconfig/kernel
DEFAULTKERNEL=kernel-plus
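
If you are doing this on several servers, the same edit can be scripted instead of opening vim each time. The following is only a sketch: the function name set_default_kernel is made up for illustration, and it assumes GNU sed (as shipped with CentOS 7) and a value with no "/" or "&" in it.

```shell
# set_default_kernel FILE VALUE
# Rewrites the DEFAULTKERNEL= line in FILE, or appends one if none exists.
# Hypothetical helper; FILE would normally be /etc/sysconfig/kernel.
set_default_kernel() {
    sdk_file=$1
    sdk_value=$2
    if grep -q '^DEFAULTKERNEL=' "$sdk_file"; then
        # Replace the existing setting in place (GNU sed -i).
        sed -i "s/^DEFAULTKERNEL=.*/DEFAULTKERNEL=$sdk_value/" "$sdk_file"
    else
        # No setting yet, so add one at the end of the file.
        printf 'DEFAULTKERNEL=%s\n' "$sdk_value" >> "$sdk_file"
    fi
}

# Example (as root, after taking a backup):
#   cp /etc/sysconfig/kernel /etc/sysconfig/kernel.bak
#   set_default_kernel /etc/sysconfig/kernel kernel-plus
```

Other lines in the file, such as UPDATEDEFAULT, are left untouched.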

3. Normally we would next install the kernel-plus package using sudo yum install kernel-plus. However, this would currently lead to a crashed server on reboot, so we will install the patched version from here:

wget https://people.centos.org/toracat/kernel/7/plus/bug14347/kernel-plus-3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64.rpm
sudo rpm -ivh kernel-plus-3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64.rpm

For more information on this patched version please see https://bugs.centos.org/view.php?id=14347 and make your own decision on whether to install from this source.

4. To stop another run of yum update wiping this patched version with the currently bugged version, open and edit /etc/yum.repos.d/CentOS-Base.repo and alter the Plus repo section:

#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus
mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=centosplus&infra=$infra
#baseurl=http://mirror.centos.org/centos/$releasever/centosplus/$basearch/
gpgcheck=1
enabled=1
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-7

Add the following two lines:

includepkgs=kernel*
exclude=kernel-plus-3.10.0-693.17.1.el7.centos.plus.x86_64

This ensures that (1) only the kernel packages are pulled from the Plus repo and (2) the installed patched version will not be replaced by the latest broken version.
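
CentOS-Base.repo holds several sections, so it is easy to paste the lines under the wrong one. As a rough sanity check, something like the following could confirm that a key sits inside [centosplus]. The helper name section_has_key is hypothetical, and it only does a simple prefix match on "key=".

```shell
# section_has_key FILE SECTION KEY
# Succeeds if a line starting with KEY= appears inside [SECTION] of an
# ini-style FILE. Hypothetical helper for illustration.
section_has_key() {
    awk -v section="[$2]" -v key="$3" '
        /^\[/ { in_section = ($0 == section) }   # new section header seen
        in_section && index($0, key "=") == 1 { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example:
#   section_has_key /etc/yum.repos.d/CentOS-Base.repo centosplus includepkgs \
#       && echo "includepkgs is set for centosplus"
```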

Run a yum update to make sure the kernel package doesn't try to update!
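
For a check that does not risk installing anything, yum check-update can be used instead: it exits with status 100 when updates are pending and 0 when there are none. A minimal sketch follows. The function name and the YUM override variable are my own additions, there only so the logic can be tried without touching real repos.

```shell
# kernel_update_pending
# Succeeds if yum reports a pending update matching 'kernel*'.
# Relies on yum check-update exiting 100 when updates are available.
# YUM is a hypothetical override for the command to run (defaults to yum).
kernel_update_pending() {
    kup_rc=0
    ${YUM:-yum} -q check-update 'kernel*' >/dev/null 2>&1 || kup_rc=$?
    [ "$kup_rc" -eq 100 ]
}

# Example:
#   if kernel_update_pending; then
#       echo "yum still wants to change a kernel package, check the exclude line"
#   fi
```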

5. Lastly, check /boot/grub2/grub.cfg to make sure the grub config has picked up the correct patched kernel version BEFORE rebooting.

menuentry 'CentOS Linux (3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64) 7 (Core)' --class rhel fedora --class gnu-linux --class gnu --class os --unrestricted $menuentry_id_option 'gnulinux-3.10.0-229.el7.x86_64-advanced-140c4ca1-0cba-4c8f-b487-0ae41f264a23' {
        load_video
        set gfxpayload=keep
        insmod gzio
        insmod part_msdos
        insmod ext2
        if [ x$feature_platform_search_hint = xy ]; then
          search --no-floppy --fs-uuid --set=root  140c4ca1-0cba-4c8f-b487-0ae41f264a23
        else
          search --no-floppy --fs-uuid --set=root 140c4ca1-0cba-4c8f-b487-0ae41f264a23
        fi
        linux16 /boot/vmlinuz-3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64 root=/dev/xvda1 ro crashkernel=auto rhgb quiet LANG=en_GB.UTF-8
        initrd16 /boot/initramfs-3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64.img
}
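
Rather than scrolling through the whole file, the entries can be pulled out with a couple of one-liners. This is a sketch only: both function names are hypothetical, and the kernel check is a plain substring match, so pass the full version string including the architecture.

```shell
# list_menuentries CFG
# Prints the title of each top-level menuentry in CFG, in file order.
list_menuentries() {
    awk -F"'" '/^menuentry /{ print $2 }' "$1"
}

# grub_has_kernel CFG VERSION
# Succeeds if CFG mentions vmlinuz-VERSION anywhere (fixed-string match).
grub_has_kernel() {
    grep -Fq "vmlinuz-$2" "$1"
}

# Example (as root, since grub.cfg is usually not world-readable):
#   list_menuentries /boot/grub2/grub.cfg
#   grub_has_kernel /boot/grub2/grub.cfg \
#       3.10.0-693.17.1.el7.bug14347.centos.plus.x86_64 && echo "patched kernel present"
```

The first title listed is normally the entry that boots by default, unless a saved default has been set.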

If the rpm installation hasn't updated the grub2 configuration file, you may need to run this manually:

sudo grub2-mkconfig -o /boot/grub2/grub.cfg


Reboot and Good Luck!