Disk read errors

Started by Greelan, August 13, 2024, 01:49:07 PM

Almost exactly 3 years. It was a conversion from a UFS install, so the disk/system is around 4 years old or so.

Ok, this could coincide with

community/23.7/23.7.12:o system: change ZFS transaction group defaults to avoid excessive disk wear

We had to apply this change because ZFS was wearing out disks too quickly with its metadata writes, even when absolutely no data was written in the sync interval. You could say that ZFS is an always-write file system. If you are always writing data anyway, it is the actual data that wears the drive, not the metadata itself. ;)

In your case it had probably been wearing out the disk before this change was put in place. That's at least two years' worth of increased wear.


Cheers,
Franco
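
For the curious, the periodic metadata writes franco describes can be watched on an otherwise idle box. The pool name below is only an example (OPNsense ZFS installs typically use zroot):

# Watch pool write activity in 10-second intervals; even with no data being
# written you will see small periodic writes from transaction group commits
zpool iostat zroot 10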

Interesting. That change got past me. I had excessive wear on my DEC750 SSD, bought in 2022, after only somewhat more than a year.

The disk is now at 56% usage, but currently increasing very slowly.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

The interesting value for nominal/guaranteed endurance can be viewed with smartctl -a or smartctl -x. For an NVMe drive it's "Percentage Used:", while for a SATA drive it's the "Percentage Used Endurance Indicator".
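
On the shell that looks roughly like this (the device names are examples only, adjust them to your hardware):

# NVMe: the health log printed by -a includes "Percentage Used"
smartctl -a /dev/nvme0 | grep -i "percentage used"

# SATA SSD: the endurance indicator is part of the device statistics shown by -x
smartctl -x /dev/ada0 | grep -i "percentage used endurance"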

In this particular case from one of your first posts:

Percentage Used:                    100%

So the disk is worn out according to specs and apparently in reality, too.

I monitor the wear indicators for my NAS systems in Grafana like in the attached screen shot.

Kind regards,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)
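
As a rough sketch of how such a wear value can be exported for graphing (only one possible approach, not necessarily the setup in the screenshot; the metric name and device path are made up for illustration):

#!/bin/sh
# Hypothetical cron job: print the NVMe wear level as a single metric line
# that a collector (for example node_exporter's textfile collector) can pick up.
DEV=/dev/nvme0
USED=$(smartctl -a "$DEV" | awk -F: '/Percentage Used/ {gsub(/[ %]/, "", $2); print $2}')
echo "ssd_percentage_used{device=\"$DEV\"} $USED"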

Quote from: Patrick M. Hausen on August 24, 2024, 01:18:44 PM
The interesting value for nominal/guaranteed endurance can be viewed with smartctl -a or smartctl -x. For an NVMe drive it's "Percentage Used:", while for a SATA drive it's the "Percentage Used Endurance Indicator".

In this particular case from one of your first posts:

Percentage Used:                    100%

So the disk is worn out according to specs and apparently in reality, too.

I monitor the wear indicators for my NAS systems in Grafana like in the attached screen shot.

Kind regards,
Patrick
Yeah, we already established that, and that's why the disk has been replaced. The last post before yours wasn't from me xD

Quote from: franco on August 24, 2024, 05:32:26 AM
Ok, this could coincide with

community/23.7/23.7.12:o system: change ZFS transaction group defaults to avoid excessive disk wear

We had to apply this change because ZFS was wearing out disks too quickly with its metadata writes, even when absolutely no data was written in the sync interval. You could say that ZFS is an always-write file system. If you are always writing data anyway, it is the actual data that wears the drive, not the metadata itself. ;)

In your case it had probably been wearing out the disk before this change was put in place. That's at least two years' worth of increased wear.


Cheers,
Franco

Franco, was this change applied to existing systems, not just new installations?

Because two months into my new disk, I already have 23 TB of writes ...
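
For anyone wanting to check their own total: on an NVMe drive smartctl reports lifetime writes as "Data Units Written" and prints the converted TB figure in brackets next to it. The device name is an example; the quick division just puts 23 TB over roughly two months into perspective:

# Lifetime host writes; smartctl shows the TB equivalent in brackets
smartctl -a /dev/nvme0 | grep -i "data units written"

# 23 TB in about 61 days works out to roughly 377 GB per day
echo "23000 / 61" | bc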

It's in effect for all systems beginning with 23.7.12 unless the sysctl is overridden. Note this lowers the writes but does not eliminate them. IMO it's a ZFS design flaw to flush metadata for an unchanged file system; it's probably keeping track of itself more than of the actual data, but it is what it is.


Cheers,
Franco
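
To see which interval your own installation is running with, or to override it, the transaction group tunable can be checked on the shell; the exact value OPNsense ships since 23.7.12 is not quoted in this thread, so inspect your own system:

# Show the current ZFS transaction group commit interval (seconds)
sysctl vfs.zfs.txg.timeout

# A temporary override lasts until reboot; make it persistent under
# System > Settings > Tunables if you really want to change it
# sysctl vfs.zfs.txg.timeout=<seconds>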