NVMe Base Duo, 6.18.29 & Trixie 13.5

I’ve been using the Base Duo successfully for about a year without any problems. The boot and root partitions are on one drive and an LVM volume spans both drives.

Last week I upgraded the kernel to 6.18.29 and rebooted. The boot sequence failed when attempting to mount the LVM volume. The update had pulled the new kernel and a new bootloader (rpi-eeprom) but I did not update the bootloader prior to restarting the system.

Recovery involved rebooting via SD card and updating the bootloader.

This morning I updated the system to the latest release (13.5). This automatically updated the bootloader. The system hung again on reboot.

I have since reverted to the 6.12.75 kernel.

In the absence of a fix, I would strongly advise anyone to avoid updating to 6.18.29 until the root cause is identified and a fix made available.

I have raised a ticket on the raspberrypi/linux GitHub: Kernel 6.18.29 - NVMe boot failure (Issue #7367, raspberrypi/linux).

The bootloader is one of the weakest components during boot IMHO. This is the reason I always keep the boot-partition (FAT) on the SD, while booting the system from the NVMe. You have to edit the cmdline.txt for this to work.
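For anyone wanting to check which root device their cmdline.txt points at, here is a minimal sketch. The helper name root_arg is made up for illustration, and the /boot/firmware/cmdline.txt path and the NVMe device names in the comments are assumptions, not details from this thread.

```shell
# Print the root= argument found in a kernel command line file.
# root_arg is a hypothetical helper name, not part of any package.
root_arg() {
    # Split the single-line file into one token per line, then keep
    # only the token that starts with "root=" and strip that prefix.
    tr ' ' '\n' < "$1" | sed -n 's/^root=//p'
}

# Assumed usage on Raspberry Pi OS (the path may differ on your system):
#   root_arg /boot/firmware/cmdline.txt
# With the boot partition on SD and the system on NVMe, the result would
# be expected to name an NVMe partition, e.g. /dev/nvme0n1p2 or a
# PARTUUID=... value (both illustrative; check yours with lsblk or blkid).
```

Checking the value before and after editing cmdline.txt is a cheap way to confirm the SD card really hands over to the NVMe drive.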

It seems like a waste to have an additional SD-card, but I also keep a backup of the system-partition there.

It’s beginning to look more like a problem relating to 6.18 and LVM.

I disabled the LVM volume in /etc/fstab, added ‘nvme.max_host_mem_size=0’ to config.txt, and booted into the 6.18 kernel from the NVMe drive. Success.

Unfortunately, there’s no trace of the LVM volume when you run ‘lsblk -f’

It shows up when you run…

sudo pvs
sudo vgs
sudo lvs

sudo lvchange -ay <vg>/<lv>

I sense a fix will be available soon.

Have you checked the device-mapper rules? Maybe something was messed up there. Or the lvm.conf has changed (unlikely, updates don’t do that). Since they show up when you manually execute various commands, there seems to be a problem with autoactivation which is driven by udev+device-mapper.
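A rough way to start that check is to list the udev rule files that mention device-mapper or LVM in their names. This is only a sketch: list_dm_rules is a made-up helper name, and the directories in the usage comment are the usual Debian locations, which is an assumption about this system.

```shell
# List udev rule files whose names contain "dm" or "lvm" in the given
# directories. list_dm_rules is a hypothetical helper name.
list_dm_rules() {
    for dir in "$@"; do
        # Skip directories that do not exist on this system.
        [ -d "$dir" ] || continue
        find "$dir" -maxdepth 1 -type f \
            \( -name '*dm*.rules' -o -name '*lvm*.rules' \)
    done | sort
}

# Assumed usage (typical Debian udev rule locations):
#   list_dm_rules /usr/lib/udev/rules.d /etc/udev/rules.d
```

If the list is empty or differs from a known-good system, that points at the packaging or the update rather than at the volume itself.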

Regenerating the initramfs, having first ensured that lvm2-monitor.service was up and running, has fixed everything. My real concern is that things went pear-shaped when an OS update automatically generated a new initramfs.
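For reference, the sequence described above could be sketched roughly as follows. The function name regen_initramfs is made up, and rebuilding only for the running kernel is my assumption; the exact commands used in this case were not posted.

```shell
# Regenerate the initramfs for the running kernel, but only if
# lvm2-monitor.service is active. regen_initramfs is a hypothetical name.
regen_initramfs() {
    if ! systemctl is-active --quiet lvm2-monitor.service; then
        echo "lvm2-monitor.service is not active; start it first" >&2
        return 1
    fi
    # -u updates an existing initramfs, -k selects the kernel version.
    sudo update-initramfs -u -k "$(uname -r)"
}

# Assumed usage:
#   regen_initramfs
```

Note that this rebuilds the image for the kernel you are currently running. If you booted an older kernel to recover, update-initramfs also accepts -k all to rebuild for every installed version.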