Disk read errors

Started by Greelan, August 13, 2024, 01:49:07 PM

Getting the following repeatedly in the log after the update to 24.7:

2024-08-13T21:45:01 Notice kernel (nda0:nvme0:0:0:1): Error 5, Retries exhausted
2024-08-13T21:45:01 Notice kernel (nda0:nvme0:0:0:1): CAM status: Unknown (0x420)
2024-08-13T21:45:01 Notice kernel (nda0:nvme0:0:0:1): READ. NCB: opc=2 fuse=0 nsid=1 prp1=0 prp2=0 cdw=11e0c7d0 0 27 0 0 0
2024-08-13T21:45:01 Notice kernel nvme0: UNRECOVERED READ ERROR (02/81) crd:0 m:0 dnr:0 p:1 sqid:2 cid:118 cdw0:0
2024-08-13T21:45:01 Notice kernel nvme0: READ sqid:2 cid:118 nsid:1 lba:299943888 len:40


Would welcome suggestions for troubleshooting.

The install is on ZFS.

That is a very specific message about a read error at LBA 299943888, so you could calculate the offset and do a direct read from /dev/nda0 at that location to see whether the error is reproducible there, and then at another offset to verify whether it is really a hardware error. Try something like:


dd if=/dev/nda0 of=/dev/null bs=512 skip=299943888 count=40


Look at smartctl -a /dev/nvme0 to find the block size; I don't know offhand whether it is 512 bytes or 4096.
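
For example, assuming the 512-byte block size (the second offset is just an arbitrary nearby range as a control), something like:

echo $((299943888 * 512))                                     # byte offset of the reported LBA
dd if=/dev/nda0 of=/dev/null bs=512 skip=299943928 count=40   # control read just past the bad range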

With ZFS this can surface at any time because of copy-on-write: sooner or later you will hit any bad spot if there is one. It does not have to be caused by the 24.7 upgrade.
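
Since the install is on ZFS, a scrub will also read every allocated block and report which files are affected, if any (assuming the default OPNsense pool name zroot):

zpool scrub zroot
zpool status -v zroot     # READ/CKSUM error counters plus any files with unrecoverable errors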

You can also try a "smartctl --test=long /dev/nvme0".
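The NVMe self-test runs in the background; one way to check on the result afterwards is via the self-test log that shows up in the full smartctl output, e.g.:

smartctl -a /dev/nvme0 | grep -A 5 "Self-test Log"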
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Thanks

dd if=/dev/nda0 of=/dev/null bs=512 skip=299943888 count=40
dd: /dev/nda0: Input/output error
32+0 records in
32+0 records out
16384 bytes transferred in 0.007795 secs (2101909 bytes/sec)


smartctl -a /dev/nvme0
smartctl 7.4 2023-08-01 r5530 [FreeBSD 14.1-RELEASE-p3 amd64] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       UMIS LENSE40256GMSP34MESTB3A
Serial Number:                      SS0L25152X3RC0AF114X
Firmware Version:                   2.3.7182
PCI Vendor/Subsystem ID:            0x1cc4
IEEE OUI Identifier:                0x044a50
Total NVM Capacity:                 256,060,514,304 [256 GB]
Unallocated NVM Capacity:           0
Controller ID:                      6059
NVMe Version:                       1.3
Number of Namespaces:               1
Namespace 1 Size/Capacity:          256,060,514,304 [256 GB]
Namespace 1 Utilization:            0
Namespace 1 Formatted LBA Size:     512
Namespace 1 IEEE EUI-64:            504a04 c500000000
Local Time is:                      Wed Aug 14 20:27:21 2024 AEST
Firmware Updates (0x12):            1 Slot, no Reset required
Optional Admin Commands (0x0017):   Security Format Frmw_DL Self_Test
Optional NVM Commands (0x0016):     Wr_Unc DS_Mngmt Sav/Sel_Feat
Log Page Attributes (0x03):         S/H_per_NS Cmd_Eff_Lg
Maximum Data Transfer Size:         32 Pages
Warning  Comp. Temp. Threshold:     80 Celsius
Critical Comp. Temp. Threshold:     84 Celsius

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
0 +     6.50W    6.50W       -    0  0  0  0        0       0
1 +     4.60W    4.60W       -    1  1  1  1        5       5
2 +     3.90W    3.90W       -    2  2  2  2        5       5
3 -     1.50W    1.50W       -    3  3  3  3     4000    4000
4 -   0.0050W    0.50W       -    4  4  4  4    20000   30000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
0 +     512       0         1

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        43 Celsius
Available Spare:                    98%
Available Spare Threshold:          3%
Percentage Used:                    100%
Data Units Read:                    5,950,084 [3.04 TB]
Data Units Written:                 985,528,704 [504 TB]
Host Read Commands:                 82,219,540
Host Write Commands:                10,135,637,294
Controller Busy Time:               492,801
Power Cycles:                       53
Power On Hours:                     32,993
Unsafe Shutdowns:                   16
Media and Data Integrity Errors:    2,809
Error Information Log Entries:      3,018
Warning  Comp. Temperature Time:    0
Critical Comp. Temperature Time:    0
Temperature Sensor 1:               43 Celsius

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0       3018     2  0x005f  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  1       3017     3  0x0068  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  2       3016     2  0x0063  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  3       3015     4  0x0073  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  4       3014     2  0x006d  0x0281 0xe800            0     1     -  Unknown Command Specific Status 0x40
  5       3013     1  0x0061  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  6       3012     4  0x007b  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  7       3011     3  0x006f  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  8       3010     3  0x006c  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
  9       3009     1  0x006b  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
10       3008     2  0x007b  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
11       3007     2  0x0079  0x0281 0x7801            0     1     -  Unknown Command Specific Status 0x40
12       3006     2  0x007d  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
13       3005     2  0x0079  0x0281 0x7801            0     1     -  Unknown Command Specific Status 0x40
14       3004     2  0x0072  0x0281  0x7d1            0     1     -  Unknown Command Specific Status 0x40
15       3003     1  0x006b  0x0281  0x000            0     1     -  Unknown Command Specific Status 0x40
... (48 entries not read)

Self-test Log (NVMe Log 0x06)
Self-test status: No self-test in progress
Num  Test_Description  Status                       Power_on_Hours  Failing_LBA  NSID Seg SCT Code
0   Extended          Completed: failed segments            32993        61616     1   7   -    -


Suggestive of a hardware issue?

Failing_LBA
61616


Er ... yes?  :)
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Except I also see posts like this, which suggests that the failures might not be what they seem to be.

Quote from: Greelan on August 14, 2024, 12:46:39 PM
Except I also see posts like this, which suggests that the failures might not be what they seem to be.

Not sure what you mean? That discussion still concerns bad blocks that were recovered and remapped only after many, many retries. If you want to keep dealing with randomly failing drives, sure, you can keep doing so...

More a matter of hoping... Lol

Oh, well, off to get a new one.

I always discard these drives. The thing is, if something critical, such as the kernel, happens to land in the bad spot(s), you have an unbootable box. Not something you normally want to deal with.

Agreed. Fortunately a reinstall of OPNsense is pretty straightforward with a configuration backup.
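
For anyone following along: the running configuration lives in /conf/config.xml, so pulling a copy off the box before swapping the disk is enough to restore everything after a fresh install (the address below is just a placeholder):

scp root@192.168.1.1:/conf/config.xml ./opnsense-config-backup.xml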

August 14, 2024, 01:27:37 PM #9 Last Edit: August 14, 2024, 01:34:17 PM by meyergru
This SSD has had it - and it told you so. Just look at the "Percentage Used": it is at 100%. That is no wonder, given the 504 TBytes that have been written to it over roughly 33,000 power-on hours (about 3.8 years).

UMIS is a Lenovo in-house brand; these consumer-grade SSDs were built into laptops and are not made for write-intensive loads such as this one (you wrote ~170 times as much as you read). The disk has already used 2% of its available spare, so it is practically guaranteed that bad blocks have occurred during writes - the disk's capacity has been overwritten roughly 2,000 times.
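
Rough back-of-the-envelope figure, assuming the decimal units smartctl reports:

echo $((504 * 1000 / 256))    # ~1970 full-drive overwrites (504 TB written / 256 GB capacity)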

The posts you pointed to have a different cause: the non-Pro Samsung models (and the Pro models as well) once suffered from a firmware problem. SSDs use flash memory, which stores data as charge in cells. Those cells lose charge over time, especially when they are not rewritten for long periods.

I noticed that with a Samsung 980 Pro, which got very slow after about a year. It held game installations that had been written only once. Over time the cells degraded, ECC errors occurred that had to be corrected, and read performance was severely degraded. Samsung fixed that with a firmware update that periodically "refreshes" cells that have not been written for a while - further adding to cell wear even without anybody actually writing data.

Kingston's KC3000 seems to suffer from similar issues.

Even with prosumer-grade disks and sizes below 1 TByte (i.e. without much spare area), you can expect a maximum lifetime of about 4-5 years unless you log to RAM or reduce log levels. The RRD database also puts a heavy write load on the disk.

That is to say: the bigger, the better; QLC is a no-go; better to use TLC (industrial grade) than MLC.

Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Appreciate the insights. Interesting regarding the level of writes, since this box has only ever been used for OPN. I thought routers/firewalls weren't that disk intensive? I don't have OPN configured to do excessive firewall logging.

Any suggestions for a replacement?

Quote from: Greelan on August 14, 2024, 01:35:48 PMI thought routers/firewalls weren't that disk intensive? I don't have OPN configured to do excessive firewall logging.

If you enable Insights (NetFlow), it will eat your drive for lunch if it's a small one. RRD was already mentioned, plus the ZFS filesystem is CoW - so that does not treat SSDs too gently either.

Yes, but there is a difference between RRD and Netflow: RRD is stored in /var/db/rrd, which is always placed on disk, whereas Netflow lives in /var/log/flowd*, which can be kept in RAM if you enable that under System: Settings: Miscellaneous.

I always do that, because I usually do not care about /var/log after a reboot.
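
To get a feel for how much data each of those actually holds, and to confirm whether /var/log is already memory-backed, something along these lines works (output will vary; a tmpfs entry for /var/log means the RAM disk is active):

du -sh /var/db/rrd /var/log/flowd*
mount | grep " /var/log "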

ZFS certainly does a lot of writes. However, regardless of CoW, a rewrite costs the same as a write on flash memory: a new (or reallocated) block is used anyway, so under the hood flash acts like copy-on-write from a wear-out perspective in any case.


As for a recommendation, I would probably try a Lexar NM790 or NM800 Pro with 1 TByte. They are rated for >= 1 PByte of lifetime writes, and the 2 TByte models are not much more expensive. There are also versions with an integrated heatsink.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Returning to say I've replaced the disk and it was a very smooth process. Huge kudos to Franco, Ad and Jos and all other contributors for making it so!

Happy to hear. Just curious, how old was this ZFS install?


Cheers,
Franco