Fix for potential ZFS corruption (and other) issues?

Started by meyergru, December 09, 2023, 12:34:03 AM

December 09, 2023, 12:34:03 AM Last Edit: December 09, 2023, 09:08:26 AM by meyergru
Having followed some potential ZFS corruption issues that were found with the ongoing OpenZFS 2.2 release but had really been lurking since 2.0, I noticed that Proxmox has just issued new kernels and ZFS utilities with the fixed OpenZFS 2.2.2.

I have now also seen a video by Lawrence Systems about pfSense having fixed those issues as well.

@Franco: Will those FreeBSD fixes make it into 23.7.10 or a hotfix version?

P.S.: There are also some security-relevant fixes in the latest pfSense release which probably triggered this comment by Tom...
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 440 up, Bufferbloat A+

Fixes are in 13.2-p7, but I doubt that warrants a hotfix for OPNsense. Just the regular upkeep with FreeBSD for the next patch release.

The circumstances that trigger this data corruption bug are so rare that they are hardly ever encountered on a firewall. Really. That's why the bug has been discovered only recently. It's been lurking in the code for years.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Lots of scrambling for other systems reliant on ZFS, Proxmox and TrueNAS for a start. pfSense is not known for its proactive patch schedule, especially for CE, but they have moved pretty quickly on this one. I hope OPNsense follows suit.

I think the security-relevant fixes (e.g. potential TCP injection) are the more relevant part of this. However, since my experiences back in the day with the "superior" ReiserFS, I am somewhat sensitive to potential data corruption in filesystems.

This is especially true for ZFS, which was supposed to be the be-all and end-all of filesystems. Having recently jumped on the bandwagon on a larger scale, I have been underwhelmed by some of what I have learned about it since. To illustrate:

1. ZFS supposedly is the best filesystem for Proxmox because it allows for live migration. Yes, but it also allows for linear snapshots only...

2. ZFS is not so well integrated in Linux - at least I had data corruption when NCQ was enabled. That bug is still present.

3. The latest data corruption problems, which have only just been fixed in OpenZFS 2.2.2.

Obviously, data corruption in a supposedly bullet-proof filesystem like ZFS is doubleplusungood (tm), however unlikely it may be. Interestingly enough, these corruptions apparently occur when data is appended to a file, something which OPNsense does all the time while logging.
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 440 up, Bufferbloat A+

It's triggered when files are written and a second process, one that uses system calls to detect and efficiently handle holes, copies that file before the blocks are flushed to stable storage. In that case it might get a hole (a run of zeroes) instead of the actual data.

That does not happen with logging.
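
For illustration, this is roughly what "using system calls to detect holes" looks like from user space. It is only a minimal sketch in Python, assuming a platform where os.SEEK_DATA and os.SEEK_HOLE are available (FreeBSD and Linux, for example); the function name data_extents is made up for this example and is not taken from any copy tool.

```python
import errno
import os


def data_extents(path):
    """Return the (start, end) byte ranges the filesystem reports as data.

    Hypothetical helper for illustration only. Everything outside the
    returned ranges is reported as a hole, which a hole-aware copy tool
    would skip and leave as zeroes in the destination.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Find the next region that contains data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole up to end of file
                raise
            # Find where that data region ends (next hole or end of file).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

A copier built on this would read only the reported ranges. If the filesystem wrongly reports freshly written, not yet flushed data as a hole, that part of the copy ends up as zeroes, which is the failure described above. A daemon that simply appends to a log file never goes down this path.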
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)