My Pimoroni PIM700 came with an NVMe Base and a 512GB Netac NV2000. With the default PCIe Gen2 configuration, there were no Advanced Error Reporting (AER) messages in the output of journalctl -b. With the Gen3 configuration, the output contained the following four lines multiple times:
pcieport 0000:00:00.0: AER: Corrected error received: 0000:00:00.0
pcieport 0000:00:00.0: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Receiver ID)
pcieport 0000:00:00.0: device [14e4:2712] error status/mask=00000040/00002000
pcieport 0000:00:00.0: [ 6] BadTLP
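The status/mask pair in the third line can be read bit by bit. The sketch below is a minimal decoder, assuming the standard PCIe AER correctable-error bit positions and the short names the Linux kernel prints in these messages; the function name decode_aer_correctable is made up for illustration.

```shell
# Decode an AER correctable-error status or mask value (hex, no 0x prefix)
# into the names of the bits that are set.
# Assumption: standard AER Correctable Error Status bit layout.
decode_aer_correctable() {
    value=$((0x$1))
    for bit in 0 6 7 8 12 13 14 15; do
        if [ $(( (value >> bit) & 1 )) -eq 1 ]; then
            case $bit in
                0)  name=RxErr ;;
                6)  name=BadTLP ;;
                7)  name=BadDLLP ;;
                8)  name=Rollover ;;
                12) name=Timeout ;;
                13) name=NonFatalErr ;;  # advisory non-fatal error
                14) name=CorrIntErr ;;
                15) name=HeaderOF ;;
            esac
            echo "bit $bit: $name"
        fi
    done
}

decode_aer_correctable 00000040   # status value from the log
decode_aer_correctable 00002000   # mask value from the log
```

For the values in the log this prints bit 6: BadTLP for the status and bit 13: NonFatalErr for the mask.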
The four AER lines were repeated more frequently with increased SSD activity. These correctable errors did not appear to degrade performance. I turned them off by adding pcie_aspm=off to /boot/firmware/cmdline.txt
before rootwait. Call this the Gen3x configuration. Reboot and note:
$ journalctl -b | grep ASPM
Feb 18 15:04:06 rpi5 kernel: PCIe ASPM is disabled
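The cmdline.txt edit can be sketched as below, assuming the file is a single line that contains the word rootwait. The function name add_pcie_aspm_off and the sample command line are illustrative only, not copied from my system.

```shell
# Insert pcie_aspm=off immediately before rootwait in a kernel command
# line file. A backup is kept as FILE.bak. Does nothing if the option is
# already present.
add_pcie_aspm_off() {
    file=$1
    if grep -q 'pcie_aspm=off' "$file"; then
        return 0
    fi
    cp "$file" "$file.bak" || return 1
    sed 's/rootwait/pcie_aspm=off rootwait/' "$file.bak" > "$file"
}

# Demonstration on a scratch file with a made-up command line.
sample=$(mktemp)
echo 'console=tty1 root=/dev/mmcblk0p2 rootwait quiet' > "$sample"
add_pcie_aspm_off "$sample"
cat "$sample"
```

To apply it for real, the same function would be called on /boot/firmware/cmdline.txt from a root shell, followed by a reboot.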
After rebooting, the errors were gone. With Gen2, the pibenchmarks score was 38389. With Gen3x, the score was 44516, about 16% higher. Pimoroni benchmarked this Netac model and got a score of 44058.
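To put a number on "the errors were gone", the corrected-error reports in the kernel log can be counted per boot. This is a minimal sketch; count_aer_corrected is a hypothetical helper, and the pattern assumes the message wording shown in the log lines earlier, so it may need adjusting on a kernel that words the message differently.

```shell
# Count AER corrected-error reports read from standard input.
count_aer_corrected() {
    grep -c 'AER: Corrected error received'
}

# Usage on a running system (kernel messages from the current boot):
#   journalctl -b -k | count_aer_corrected
```

A count of 0 under Gen2 and Gen3x, and a count that grows with SSD activity under Gen3, would match what I saw.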
Boot each configuration in turn, run the command
$ sudo lspci -vvv > gen2.txt
(using gen3.txt and gen3x.txt for the other two configurations), and examine line 99 in the three output files.
- Gen2: DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
- Gen3: DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
- Gen3x: DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
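Line 99 is specific to my lspci output, so a more portable way to make the same comparison is to pull the CorrErr flag out of every DevSta line in the three files. This is a minimal sketch, assuming the files were captured as described; corr_err_flags is a hypothetical helper name.

```shell
# For each DevSta line in the given lspci -vvv captures, print
# "file:line: CorrErr+" or "file:line: CorrErr-".
corr_err_flags() {
    awk '/DevSta:/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /^CorrErr[+-]$/) print FILENAME ":" FNR ": " $i
    }' "$@"
}

# Usage:
#   corr_err_flags gen2.txt gen3.txt gen3x.txt
```

lspci -vvv prints a DevSta line for each PCIe device, so expect more than one result per file.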
These DevSta lines report the status of the Netac NVMe controller. DevSta stands for device status and CorrErr stands for correctable error. A plus means the status bit is set, that is, a correctable error has been detected, and a minus means it is clear.
Active State Power Management (ASPM) can be used to lower the PCIe lane power consumption when the NVMe drive is idle, but it requires support from the kernel, the bridge, and the NVMe controller. The Broadcom 2712 PCI bridge supports ASPM, but it is disabled. The Netac NV2000 NVMe drive I am currently using does not support ASPM. Yet the command line option pcie_aspm=off has flipped the Gen3 plus to the minus seen with Gen2 and Gen3x, suppressing the Gen3 correctable errors.
The mask value in the AER output, 00002000, sets bit 13 (advisory non-fatal error) rather than bit 6 (BadTLP), so the BadTLP errors themselves do not appear to be masked. Tracking down correctable errors may only make sense if they are impacting NVMe performance. Consequently, it would be helpful if users experimenting with PCIe Gen3 would report the NVMe drives they are using, the errors they are seeing, and any impact on performance.