Pi 5 NVMe Base Challenges

Hi there,

First off thank you for the info in this forum it’s incredibly useful! The support is amazing and as such I’m hoping someone can chime in here as I’m at a bit of a loss now.

TLDR: I cannot get my NVMe Base to see the NVMe drive in Raspberry Pi OS 64-bit (Lite or full)

I originally got a unit a week or so ago. It worked for about an hour and then the drive disappeared: I literally rebooted and the OS could not see the drive again (this was the ADATA drive that came with the base).

I’ve tried countless approaches to resolving this including:

  • A new drive - purchased a Samsung 980, but still no dice
  • Tried all the config options in /boot/firmware/config.txt - each and every one. At one point I wasn’t convinced they were being read, so I enabled PCIe Gen 3; sure enough the log reports Gen 3, but the link still fails
  • Bootloader is up to date (on both lite and full)
  • Tried rpi-update to get the LATEST bootloader but no luck
  • Added PCI_PROBE into my rpi-eeprom config but no luck
  • reseated the cable / drives countless times thinking it’s user error
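
For anyone retracing the config step above, one quick way to see which PCIe settings are actually active is to list only the uncommented lines that matter. This is a minimal sketch, assuming the usual Bookworm path /boot/firmware/config.txt and the dtparam names from the Raspberry Pi documentation (pciex1, pciex1_gen, and the nvme alias); the function name show_pcie_config is made up for illustration.

```shell
# List the uncommented PCIe-related dtparam lines in a config.txt file.
# show_pcie_config is a hypothetical helper name; the path defaults to the
# Bookworm location and can be overridden with an argument.
show_pcie_config() {
    config_file="${1:-/boot/firmware/config.txt}"
    # Matches e.g. "dtparam=pciex1", "dtparam=pciex1_gen=3", "dtparam=nvme".
    grep -E '^[[:space:]]*dtparam=(pciex1(_gen=[0-9]+)?|nvme)[[:space:]]*$' "$config_file"
}

# On the Pi:  show_pcie_config
```

If nothing is printed, none of those settings are active in the file that was checked.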

Each and every time I get a PCIe “link down” message when I check either dmesg or the journal to debug what’s going on. The link is never up.

Now for the fun bit - I bought another base and I’m having the exact same issues with that one. I wanted to rule out the hardware (either the base or the cable) and seem to have done just that: it looks very unlikely that the base is the problem!

Setup is a standard RPi 5 with the official heatsink/fan and the official power supply (5V/5A)

Interestingly (or not), I’ve noticed that even on a clean SD install the journal says the link is down, and that’s before I’ve even tried to enable it… I might be going slightly mad here, but I’m sure that message didn’t appear in the logs until the setting was in the config, yet now it appears on a completely clean install. Not sure if this is a false steer or not but thought I’d mention it.

I’m very tempted to get another Pi now, thinking that’s the last logical piece that could be wrong.

It’s frustrating as I’ve seen it work for a bit and then disappear for good. I’m certainly not blaming Pimoroni or the hardware here - on the contrary I think it’s something else. Just hoping this wonderful hive mind has some ideas I haven’t thought of yet. I’m starting to scrape the barrel on those now…

Anyways thanks in advance for any advice here.

Thanks

Andy

This is the only part I don’t understand. Updating the bootloader is not something you need luck for. Please run

sudo apt-get update
sudo apt-get upgrade
sudo reboot
sudo rpi-eeprom-update

and then post the result of the last command.

Maybe my Mancunian slang doesn’t translate so well, I didn’t literally mean needing luck - was saying ‘it didn’t work’. The reason I went down the rpi-update route was a last ditch effort to see if the cutting edge bootloader / kernel would help but no it unfortunately didn’t.

I’ve done all the above but for clarity’s sake:

This is my current boot loader:

BOOTLOADER: up to date
   CURRENT: Wed 24 Jan 12:16:01 UTC 2024 (1706098561)
    LATEST: Wed 24 Jan 12:16:01 UTC 2024 (1706098561)
   RELEASE: latest (/lib/firmware/raspberrypi/bootloader-2712/latest)
            Use raspi-config to change the release.

sudo apt-get update and sudo apt-get upgrade yield:

user@raspberrypi:~ $ sudo apt-get update
Hit:1 http://deb.debian.org/debian bookworm InRelease
Hit:2 http://deb.debian.org/debian-security bookworm-security InRelease
Hit:3 http://deb.debian.org/debian bookworm-updates InRelease
Hit:4 http://archive.raspberrypi.com/debian bookworm InRelease
Reading package lists... Done
user@raspberrypi:~ $ sudo apt-get upgrade
Reading package lists... Done
Building dependency tree... Done
Reading state information... Done
Calculating upgrade... Done
0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded.

If I revert the bleeding edge kernel that rpi-update installed by running:

sudo apt-get install --reinstall raspberrypi-bootloader raspberrypi-kernel

That gets us nicely back in line, after which the output of sudo rpi-eeprom-update is:

BOOTLOADER: up to date
   CURRENT: Fri  5 Jan 15:57:40 UTC 2024 (1704470260)
    LATEST: Fri  5 Jan 15:57:40 UTC 2024 (1704470260)
   RELEASE: default (/lib/firmware/raspberrypi/bootloader-2712/default)
            Use raspi-config to change the release.
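
The comparison being done by eye in the two bootloader reports above can be scripted. A rough sketch, assuming the output keeps the CURRENT:/LATEST: layout shown here with the epoch value in parentheses as the last field; eeprom_is_current is a hypothetical helper name, not part of rpi-eeprom.

```shell
# Read `rpi-eeprom-update` output on stdin and report whether the CURRENT
# and LATEST bootloader timestamps (the epoch values in parentheses) match.
# eeprom_is_current is a hypothetical helper name.
eeprom_is_current() {
    awk '
        /CURRENT:/ { current = $NF }   # last field, e.g. "(1706098561)"
        /LATEST:/  { latest  = $NF }
        END {
            if (current != "" && current == latest) { print "up to date"; exit 0 }
            print "update available or output not recognised"; exit 1
        }
    '
}

# On the Pi:  sudo rpi-eeprom-update | eeprom_is_current
```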

Output of lspci:

0001:00:00.0 PCI bridge: Broadcom Inc. and subsidiaries Device 2712 (rev 21)
0001:01:00.0 Ethernet controller: Device 1de4:0001

output of lsblk:

NAME        MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
mmcblk0     179:0    0 14.8G  0 disk
├─mmcblk0p1 179:1    0  512M  0 part /boot/firmware
└─mmcblk0p2 179:2    0 14.3G  0 part /

and output of sudo rpi-eeprom-config

[all]
BOOT_UART=1
POWER_OFF_ON_HALT=0
BOOT_ORDER=0xf461
PCI_PROBE=1
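
For readers unsure what that BOOT_ORDER value means: the bootloader reads it one hex digit at a time, starting from the rightmost digit. Below is a small decoding sketch; the mode table is my reading of the Raspberry Pi bootloader documentation, and decode_boot_order is a hypothetical helper name.

```shell
# Decode a BOOT_ORDER value into the sequence of boot modes that are tried.
# Digits are consumed from the least significant (rightmost) one.
# decode_boot_order is a hypothetical helper name.
decode_boot_order() {
    digits="${1#0x}"                     # strip the 0x prefix if present
    while [ -n "$digits" ]; do
        last="${digits#"${digits%?}"}"   # rightmost hex digit
        digits="${digits%?}"             # drop it for the next round
        case "$last" in
            1) echo "SD CARD" ;;
            2) echo "NETWORK" ;;
            3) echo "RPIBOOT" ;;
            4) echo "USB-MSD" ;;
            5) echo "BCM-USB-MSD" ;;
            6) echo "NVME" ;;
            7) echo "HTTP" ;;
            e|E) echo "STOP" ;;
            f|F) echo "RESTART" ;;
            *) echo "UNKNOWN($last)" ;;
        esac
    done
}

# Example:  decode_boot_order 0xf461
```

For 0xf461 this prints SD CARD, NVME, USB-MSD, RESTART: try the SD card first, then NVMe, then USB mass storage, then start over.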

And just for completeness the output of dmesg | grep pcie

[    0.393759] brcm-pcie 1000110000.pcie: host bridge /axi/pcie@110000 ranges:
[    0.393766] brcm-pcie 1000110000.pcie:   No bus range found for /axi/pcie@110000, using [bus 00-ff]
[    0.393778] brcm-pcie 1000110000.pcie:      MEM 0x1b00000000..0x1bfffffffb -> 0x0000000000
[    0.393784] brcm-pcie 1000110000.pcie:      MEM 0x1800000000..0x1affffffff -> 0x0400000000
[    0.393790] brcm-pcie 1000110000.pcie:   IB MEM 0x0000000000..0x0fffffffff -> 0x1000000000
[    0.394971] brcm-pcie 1000110000.pcie: setting SCB_ACCESS_EN, READ_UR_MODE, MAX_BURST_SIZE
[    0.394978] brcm-pcie 1000110000.pcie: Forcing gen 2
[    0.395014] brcm-pcie 1000110000.pcie: PCI host bridge to bus 0000:00
[    0.823757] brcm-pcie 1000110000.pcie: link down
[    0.828549] pcieport 0000:00:00.0: PME: Signaling with IRQ 37
[    0.828609] pcieport 0000:00:00.0: AER: enabled with IRQ 37
[    0.828891] brcm-pcie 1000120000.pcie: host bridge /axi/pcie@120000 ranges:
[    0.828896] brcm-pcie 1000120000.pcie:   No bus range found for /axi/pcie@120000, using [bus 00-ff]
[    0.828905] brcm-pcie 1000120000.pcie:      MEM 0x1f00000000..0x1ffffffffb -> 0x0000000000
[    0.828910] brcm-pcie 1000120000.pcie:      MEM 0x1c00000000..0x1effffffff -> 0x0400000000
[    0.828918] brcm-pcie 1000120000.pcie:   IB MEM 0x1f00000000..0x1f003fffff -> 0x0000000000
[    0.828923] brcm-pcie 1000120000.pcie:   IB MEM 0x0000000000..0x0fffffffff -> 0x1000000000
[    0.830092] brcm-pcie 1000120000.pcie: setting SCB_ACCESS_EN, READ_UR_MODE, MAX_BURST_SIZE
[    0.830100] brcm-pcie 1000120000.pcie: Forcing gen 2
[    0.830129] brcm-pcie 1000120000.pcie: PCI host bridge to bus 0001:00
[    0.935760] brcm-pcie 1000120000.pcie: link up, 5.0 GT/s PCIe x4 (!SSC)
[    0.947864] pcieport 0001:00:00.0: enabling device (0000 -> 0002)
[    0.947896] pcieport 0001:00:00.0: PME: Signaling with IRQ 38
[    0.947951] pcieport 0001:00:00.0: AER: enabled with IRQ 38

Apologies for the wall of text but hopefully it’ll give context to where I’m at with this!

Thanks

Andrew

or maybe I had problems understanding this because I am not a native speaker…

Anyhow: the “link down” followed by “link up” for PCIe is just normal, you can see this even if nothing is attached. And that is what your Pi 5 thinks is the case: nothing attached. Since you did everything possible on the software side and documented it here, I would suggest that you contact support and link to this thread.

Yeah forgive me - I do appreciate your responses so thanks for the help so far

So those PCIe dumps are for different controllers I think - the first one is the external PCIe lane and the second is for the Ethernet port. To prove this I changed the PCIe version to Gen 3 for the NVMe adapter and hey presto, the log now reports Gen 3 for it, but still no dice on the link - still showing down

[    1.503841] brcm-pcie 1000110000.pcie: host bridge /axi/pcie@110000 ranges:
[    1.510852] brcm-pcie 1000110000.pcie:   No bus range found for /axi/pcie@110000, using [bus 00-ff]
[    1.519962] brcm-pcie 1000110000.pcie:      MEM 0x1b00000000..0x1bfffffffb -> 0x0000000000
[    1.528274] brcm-pcie 1000110000.pcie:      MEM 0x1800000000..0x1affffffff -> 0x0400000000
[    1.536587] brcm-pcie 1000110000.pcie:   IB MEM 0x0000000000..0x0fffffffff -> 0x1000000000
[    1.546078] brcm-pcie 1000110000.pcie: setting SCB_ACCESS_EN, READ_UR_MODE, MAX_BURST_SIZE
[    1.554402] brcm-pcie 1000110000.pcie: Forcing gen 3
[    1.559437] brcm-pcie 1000110000.pcie: PCI host bridge to bus 0000:00
[    2.039819] brcm-pcie 1000110000.pcie: link down

Whilst the Ethernet one still shows Gen 2 and brings up the 4-lane PCIe link

[    2.088055] brcm-pcie 1000120000.pcie: host bridge /axi/pcie@120000 ranges:
[    2.095050] brcm-pcie 1000120000.pcie:   No bus range found for /axi/pcie@120000, using [bus 00-ff]
[    2.104156] brcm-pcie 1000120000.pcie:      MEM 0x1f00000000..0x1ffffffffb -> 0x0000000000
[    2.112466] brcm-pcie 1000120000.pcie:      MEM 0x1c00000000..0x1effffffff -> 0x0400000000
[    2.120787] brcm-pcie 1000120000.pcie:   IB MEM 0x1f00000000..0x1f003fffff -> 0x0000000000
[    2.129097] brcm-pcie 1000120000.pcie:   IB MEM 0x0000000000..0x0fffffffff -> 0x1000000000
[    2.138591] brcm-pcie 1000120000.pcie: setting SCB_ACCESS_EN, READ_UR_MODE, MAX_BURST_SIZE
[    2.146902] brcm-pcie 1000120000.pcie: Forcing gen 2
[    2.151919] brcm-pcie 1000120000.pcie: PCI host bridge to bus 0001:00
[    2.311825] brcm-pcie 1000120000.pcie: link up, 5.0 GT/s PCIe x4 (!SSC)
[    2.428376] pcieport 0001:00:00.0: enabling device (0000 -> 0002)
[    2.434538] pcieport 0001:00:00.0: PME: Signaling with IRQ 40
[    2.440378] pcieport 0001:00:00.0: AER: enabled with IRQ 40
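
Since the two controllers are easy to mix up in a long dmesg dump, here is a small filter that reduces the log to one line per controller with its last reported link state. A sketch only, assuming the brcm-pcie message format seen in the logs above; pcie_link_summary is a hypothetical name.

```shell
# Read kernel log text on stdin and print one line per PCIe controller
# with its last reported link state. pcie_link_summary is a hypothetical name.
pcie_link_summary() {
    awk '
        /brcm-pcie [0-9a-f]+\.pcie: link (up|down)/ {
            for (i = 1; i <= NF; i++) {
                if ($i == "brcm-pcie") {
                    ctrl = $(i + 1)            # e.g. "1000110000.pcie:"
                    sub(/:$/, "", ctrl)
                    st = $(i + 3)              # "up," or "down"
                    sub(/,$/, "", st)
                    if (!(ctrl in state)) order[++n] = ctrl
                    state[ctrl] = st
                }
            }
        }
        END { for (k = 1; k <= n; k++) print order[k], state[order[k]] }
    '
}

# On the Pi:  dmesg | pcie_link_summary
```

With the logs in this thread it would report 1000110000.pcie as down and 1000120000.pcie (the one with the Ethernet device behind it) as up.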

Given this has happened on two separate NVME Base boards I’m reluctant to think it’s the boards themselves - there is a chance but what are the odds of both not working?

The only other two things it could be are

  • An incompatible NVMe drive (even though the one I got with the base did initially show up)

  • The Pi having a faulty PCIe connector

In a last ditch attempt to at least rule one of these out I’ve just bought a Crucial P3 which is on the good list on the product page so going to give that a whirl now and see what happens…

Failing that I’m starting to run out of ideas with this one

So, just like that the drive appears!

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
mmcblk0     179:0    0  14.8G  0 disk
├─mmcblk0p1 179:1    0   512M  0 part /boot/firmware
└─mmcblk0p2 179:2    0  14.3G  0 part /
nvme0n1     259:0    0 465.8G  0 disk

Looks like this version of the Samsung 980 I have is being returned then.
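
When swapping drives repeatedly, a scripted check saves some squinting at lsblk. A minimal sketch, assuming the default lsblk column layout shown above (NAME first, TYPE sixth); has_nvme_disk is a hypothetical helper name.

```shell
# Read `lsblk` output on stdin and succeed if an NVMe disk is listed.
# has_nvme_disk is a hypothetical helper name.
has_nvme_disk() {
    awk '$1 ~ /^nvme[0-9]+n[0-9]+$/ && $6 == "disk" { found = 1 } END { exit !found }'
}

# On the Pi:  lsblk | has_nvme_disk && echo "NVMe drive visible"
```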

The real head scratcher is why the ADATA drive I got with the first NVME Base worked for a short period of time but not since then? I’ll get on to support about that one and try to get a replacement.

Thanks again for your help.

Hi,

I have a similar situation with my Samsung 980 500GB on the NVMe Base. I’ve tried all the possible solutions found here in the forum, but with no luck. The drive is not listed in lsblk and I see the same message:

...[ 2.026329] brcm-pcie 1000110000.pcie: link down...

How did you get it to work?