Here’s an issue I’ve had with a variety of Hetzner dedicated servers. I first discovered it when I was alerted that a server with a 1TB NVMe drive had reached 100% capacity. I found that /var/log/syslog was being flooded with these errors, many times per second.
less /var/log/syslog
Dec 15 20:16:37 Ubuntu-2204-jammy-amd64-base kernel: [18698.029674] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:37 Ubuntu-2204-jammy-amd64-base kernel: [18698.029677] nvme 0000:01:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:37 Ubuntu-2204-jammy-amd64-base kernel: [18698.029680] nvme 0000:01:00.0: [ 0] RxErr (First)
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.752410] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.752420] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.752422] nvme 0000:01:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.752425] nvme 0000:01:00.0: [ 0] RxErr (First)
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.796264] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.796273] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.796276] nvme 0000:01:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:42 Ubuntu-2204-jammy-amd64-base kernel: [18702.796279] nvme 0000:01:00.0: [ 0] RxErr (First)
Dec 15 20:16:45 Ubuntu-2204-jammy-amd64-base kernel: [18706.493606] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Dec 15 20:16:45 Ubuntu-2204-jammy-amd64-base kernel: [18706.493614] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:45 Ubuntu-2204-jammy-amd64-base kernel: [18706.493617] nvme 0000:01:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:45 Ubuntu-2204-jammy-amd64-base kernel: [18706.493620] nvme 0000:01:00.0: [ 0] RxErr (First)
Dec 15 20:16:46 Ubuntu-2204-jammy-amd64-base kernel: [18707.444456] pcieport 0000:00:01.3: AER: Corrected error received: 0000:02:00.0
Dec 15 20:16:46 Ubuntu-2204-jammy-amd64-base kernel: [18707.444464] nvme 0000:02:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:46 Ubuntu-2204-jammy-amd64-base kernel: [18707.444467] nvme 0000:02:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:46 Ubuntu-2204-jammy-amd64-base kernel: [18707.444470] nvme 0000:02:00.0: [ 0] RxErr (First)
Dec 15 20:16:48 Ubuntu-2204-jammy-amd64-base kernel: [18709.500278] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
Dec 15 20:16:48 Ubuntu-2204-jammy-amd64-base kernel: [18709.500286] nvme 0000:01:00.0: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
Dec 15 20:16:48 Ubuntu-2204-jammy-amd64-base kernel: [18709.500289] nvme 0000:01:00.0: device [144d:a80a] error status/mask=00000001/0000e000
Dec 15 20:16:48 Ubuntu-2204-jammy-amd64-base kernel: [18709.500291] nvme 0000:01:00.0: [ 0] RxErr (First)
Dec 15 20:16:50 Ubuntu-2204-jammy-amd64-base kernel: [18710.562333] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
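To see which drive is responsible for most of the noise, the report lines can be tallied per PCI address. The following is a minimal sketch, not a definitive tool: the function name count_aer_by_device is made up for illustration, and it assumes the kernel words its messages exactly as in the log above (other kernel versions may phrase them differently).

```shell
#!/bin/sh
# count_aer_by_device (hypothetical helper): reads kernel log lines on stdin
# and prints how many "Corrected error received" reports name each PCI
# address, busiest device first.
count_aer_by_device() {
    # The PCI address of the reporting device is the last field of the line.
    awk '/Corrected error received:/ { count[$NF]++ }
         END { for (dev in count) print count[dev], dev }' | sort -rn
}

# Example usage (reading syslog may require root):
#   count_aer_by_device < /var/log/syslog
```

In the excerpt above, nearly all reports point at 0000:01:00.0, with only an occasional one for 0000:02:00.0.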
If you open a technical support ticket with Hetzner (this requires scheduling a 30-minute window in Central European Time, UTC+1), they’ll either clean the drive connectors or swap a part. Sometimes the errors go away completely; sometimes they drop to once every few seconds. In the latter case, Hetzner will say that’s within their acceptable limits.
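To put a number on "once every few seconds" before and after a support visit, the kernel timestamps in square brackets can be used to estimate the average gap between reports. This is a rough sketch under stated assumptions: the function name aer_mean_interval is hypothetical, the input is assumed to come from a single boot (the timestamps restart at zero on reboot), and the message wording is assumed to match the log shown earlier.

```shell
#!/bin/sh
# aer_mean_interval (hypothetical helper): reads kernel log lines on stdin
# and prints the number of corrected-error reports and the mean time
# between them, based on the [seconds.microseconds] kernel timestamp.
aer_mean_interval() {
    awk '/Corrected error received:/ {
             # The first bracketed number on the line is the kernel timestamp.
             if (match($0, /\[ *[0-9]+\.[0-9]+\]/)) {
                 t = substr($0, RSTART + 1, RLENGTH - 2) + 0
                 if (n == 0) first = t
                 last = t
                 n++
             }
         }
         END {
             if (n < 2) { print "not enough data"; exit }
             printf "%d reports, one every %.2f s on average\n", n, (last - first) / (n - 1)
         }'
}

# Example usage:
#   dmesg | aer_mean_interval
```

The kernel ring buffer only holds recent messages, so the result describes the recent rate rather than the whole uptime.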
You can run dmesg | grep -i aer to check that all of the errors are reported as corrected, meaning the hardware recovered from them on its own:
[123710.515154] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123710.675379] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123710.749724] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123711.700107] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123712.160531] pcieport 0000:00:01.3: AER: Corrected error received: 0000:02:00.0
[123712.467374] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123712.502386] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123713.109711] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123713.933761] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123714.493832] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123714.951897] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123716.512362] pcieport 0000:00:01.3: AER: Corrected error received: 0000:02:00.0
[123716.844129] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123716.991703] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123717.622934] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123718.658627] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123719.577563] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123720.710414] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123721.017129] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
[123721.065247] pcieport 0000:00:01.1: AER: Corrected error received: 0000:01:00.0
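Scanning that output by eye gets tedious when there are thousands of lines. A small filter can do the check instead, printing only AER reports that are not corrected ones. This is a sketch, not an authoritative check: the function name aer_all_corrected is invented here, and it relies on the "... error received" wording seen in the output above, which may differ on other kernel versions.

```shell
#!/bin/sh
# aer_all_corrected (hypothetical helper): reads kernel log lines on stdin.
# Returns 0 if every AER "error received" report is a corrected one;
# otherwise prints the remaining reports and returns 1.
aer_all_corrected() {
    # Keep AER report lines, then drop the corrected ones. "|| true" keeps
    # grep's "no match" exit status from being treated as a failure.
    others=$(grep 'AER: .*error received' | grep -v 'Corrected error received' || true)
    if [ -n "$others" ]; then
        printf '%s\n' "$others"
        return 1
    fi
    return 0
}

# Example usage:
#   dmesg | aer_all_corrected && echo "only corrected AER errors found"
```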