I'm running OPNsense 23.7 on an Intel N100 with 8 GB RAM and a 256 GB NVMe. The only additional plugins I've installed are os-ddclient, os-udpbroadcastrelay, and os-wireguard. I haven't changed any logging configuration.
S.M.A.R.T. shows me 6 GB of "Data Units Written" per day, but the disk usage itself doesn't increase (used space stays constant at 21 GB). Because of that I've configured the RAM disk for /tmp and /var/log, but the "Data Units Written" figure remains unchanged. Using top's I/O stats I can't see any process that is writing continuously.
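For reference, these are roughly the commands I've been using to check (assuming smartmontools is installed; the NVMe device path is a guess, adjust for your system):

# cumulative NVMe write counter (one "data unit" is 512,000 bytes)
smartctl -a /dev/nvme0

# per-process I/O statistics, sorted by total I/O (watch the WRITE column)
top -m io -o total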
So my question is: what causes these disk writes, and how can I stop them?
SMART shows the value increasing by 6 GB every day? It's not a daily value; it's cumulative since the SSD was built.
I'm not 100% sure on this, but I saw something similar and have since disabled NetFlow, since I don't use it, and the multiple gigabytes of writes per day seem to have stopped.
Also, NetFlow's backup, which does tar/gzip with a single thread, was taking forever, even blocking shutdown for minutes on end, with no way to turn on pigz for multi-threaded compression. So I'm living life without NetFlow now, lol.
NetFlow records more or less every single packet. That's a bit simplified, because packets are consolidated into "flows", but still. So those writes are expected. Don't enable it if you don't use it.
Quote from: Patrick M. Hausen on January 03, 2024, 09:19:05 PM
SMART shows the value increasing by 6 GB every day? It's not a daily value; it's cumulative since the SSD was built.
The machine was set up 5 days ago and you can see a daily increase. And there is constant write I/O visible in my monitoring tool.
Quote from: Patrick M. Hausen on January 03, 2024, 09:41:56 PM
NetFlow records more or less every single packet. That's a bit simplified, because packets are consolidated into "flows", but still. So those writes are expected. Don't enable it if you don't use it.
NetFlow isn't enabled (the listening and WAN interface fields have nothing selected).
I had experience with high writes, too. In my case the high volume of writes was caused by Zenarmor.
I solved this by using a rotational hard disk for /var and moving Zenarmor from /usr/local/zenarmor to /var/zenarmor.
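A rough sketch of that relocation (the Zenarmor engine's service name and the exact paths are assumptions here; stop the engine before moving anything):

# stop the Zenarmor packet engine first (service name may differ on your install)
service eastpect stop

# move the data onto the disk mounted at /var and leave a symlink behind
mv /usr/local/zenarmor /var/zenarmor
ln -s /var/zenarmor /usr/local/zenarmor

service eastpect start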
What monitoring tool are those screenshots from?
Quote from: CJ on January 04, 2024, 02:41:22 PM
What monitoring tool are those screenshots from?
That doesn't answer my question ;) But to answer yours: I'm using CheckMK.
Found it.
To recap the problem: my NVMe has constant write I/O of ~120 kB/s, but the used space doesn't increase. So it must be caused by something rewriting existing files.
Note that I don't have any RAM disks enabled at the moment. I've identified two options to eliminate the disk writes:
1) As mentioned in https://forum.netgate.com/post/866236, I've modified /etc/fstab to mount the root partition with the noatime option (see the sketch after this list). This reduces the write I/O to ~60 kB/s.
2) I've disabled the Round Robin Database (found in the GUI under Reporting > Settings), because I don't need those graphs. This reduces the write I/O to nearly 0.
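For option 1, the change amounts to adding noatime to the root filesystem's mount options. A minimal sketch of the relevant /etc/fstab line, assuming a typical UFS install (device label and layout will differ per system):

# Device           Mountpoint   FStype   Options       Dump   Pass#
/dev/gpt/rootfs    /            ufs      rw,noatime    1      1

On a ZFS install the root dataset isn't mounted via fstab; the equivalent there would be setting the pool property, e.g. zfs set atime=off zroot.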
I wonder why there isn't much information about this. Maybe most people just don't care about SSD wear-out.
A typical mSATA SSD used for embedded systems, like the Transcend m370 series, has a TBW rating of 180 TB for the 64 GB model.
At 120 kB/s of writes, and using multiples of 1000 for the calculation, this means you can write to that disk for 180 TB / 120 kB/s = 1,500,000,000 seconds ≈ 416,667 hours ≈ 17,361 days ≈ 47.5 years before reaching the specified TBW.
Similarly, your initial figure of 6 GB per day results in 30,000 days, or roughly 82 years.
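The same back-of-the-envelope numbers as shell arithmetic (180 TB TBW assumed, decimal units throughout):

echo $(( 180000000000000 / 120000 / 86400 ))    # 120 kB/s  -> 17361 days
echo $(( 180000000000000 / 6000000000 ))        # 6 GB/day  -> 30000 days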
Yeah, we're talking about microscopic waste, but electricity + wear-out != sustainability ;)
Quote from: Patrick M. Hausen on January 05, 2024, 04:59:57 PM
A typical mSATA SSD used for embedded systems, like the Transcend m370 series, has a TBW rating of 180 TB for the 64 GB model.
At 120 kB/s of writes, and using multiples of 1000 for the calculation, this means you can write to that disk for 180 TB / 120 kB/s = 1,500,000,000 seconds ≈ 416,667 hours ≈ 17,361 days ≈ 47.5 years before reaching the specified TBW.
Similarly, your initial figure of 6 GB per day results in 30,000 days, or roughly 82 years.
As I run everything on virtual machines (including OPNsense), writes are always bad.
It is not only about wear on the SSDs used.
More importantly, every write costs replication bandwidth, local backup space and cloud backup space, and also causes the disks to wear.
9 GB per day (at 120 kB/s) does not seem like much, but it translates to 9 GB of traffic, 18 GB of storage (replication), 18 GB of backups, 18 GB of traffic for offsite backups and 18 GB of paid storage for the offsite backups. Per day.
Do this for 50 VMs and you find that a continuous write of 120 kB/s is really a big waste of resources and adds a lot to the operational costs.
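For reference, 120 kB/s over a day works out to roughly this (decimal bytes, then expressed in GiB):

echo $(( 120000 * 86400 ))    # -> 10368000000 bytes ≈ 10.4 GB ≈ 9.7 GiB per day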
I can find neither "UFS" nor "ZFS" in this thread, but I'm assuming ZFS:
https://github.com/opnsense/core/commit/269b9fbaf
It was patched in 23.7.12:
https://github.com/opnsense/changelog/blob/930133dafada3dbbe5bb63ffd5e5f4d9ecd61437/community/23.7/23.7.12#L21
Cheers,
Franco
Hi Franco,
According to my understanding of ZFS, it won't make much of a difference whether you flush transaction groups every 5 or every 90 seconds. What needs to be flushed to disk will be flushed to disk. And ZFS never does in-place overwrites. So either you write one transaction group every 5 seconds, or 18 times the data every 90 seconds ...
If @tverweij, who is running virtualised instances, is indeed using ZFS, that specifically is a bad idea. Due to its copy-on-write nature you cannot thin provision virtual disks (well, you can, but it doesn't make sense), because ZFS will eventually write to every single disk block and so blow the disk up to its configured maximum size.
For virtual disks it's much better to use UFS and manage snapshots and backups at the hypervisor host level.
HTH,
Patrick
We looked at the data at hand, and the TXG sync writes some 5 MB (a bit more or less) per sync, and that's just metadata without ANY changes.
You can observe this with:
# zpool iostat -v 1
And by fiddling with the timeout value vfs.zfs.txg.timeout.
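For example (a sketch; 90 seconds is just the illustrative value from earlier in the thread, and to persist it you'd add the tunable under System > Settings > Tunables):

# check the current transaction group timeout (in seconds)
sysctl vfs.zfs.txg.timeout

# change it for the running system
sysctl vfs.zfs.txg.timeout=90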
Cheers,
Franco
PS: On my current install it's about 1.2 MB per sync, but I know we've seen more before.
On my OPNsense, with the transaction group timeout restored to 5 seconds, it's 600-800 kB every flush. So that matches the ~120 kB/s some of the other posters observe (600-800 kB per 5 seconds is 120-160 kB/s).
I wasn't aware that it's just metadata, so thanks. Will be a point to raise at the next ZFS production users call.
I think my argument about not using ZFS for virtual disks still holds.
Kind regards,
Patrick
I'm unsure how the disk size plays into this. A bigger disk may mean more metadata.
20 GB of writes per day was observable with some users...
Cheers,
Franco
Quote from: Patrick M. Hausen on January 30, 2024, 10:41:30 PM
What needs to be flushed to disk will be flushed to disk. And ZFS never does in-place overwrites. So either you write one transaction group every 5 seconds, or 18 times the data every 90 seconds ...
If @tverweij, who is running virtualised instances, is indeed using ZFS, that specifically is a bad idea. Due to its copy-on-write nature you cannot thin provision virtual disks (well, you can, but it doesn't make sense), because ZFS will eventually write to every single disk block and so blow the disk up to its configured maximum size.
For virtual disks it's much better to use UFS and manage snapshots and backups at the hypervisor host level.
That is a good word of advice.
Maybe it would be a good idea to add this advice to the installation docs?
I chose ZFS because the docs say:
"Install (UFS|ZFS) - Choose UFS or ZFS filesystem. ZFS is in most cases the best option as it is the most reliable option, but it does require enough capacity (a couple of gigabytes at least)."
Quote from: Patrick M. Hausen on January 30, 2024, 10:41:30 PM
Hi Franco,
According to my understanding of ZFS, it won't make much of a difference whether you flush transaction groups every 5 or every 90 seconds. What needs to be flushed to disk will be flushed to disk. And ZFS never does in-place overwrites. So either you write one transaction group every 5 seconds, or 18 times the data every 90 seconds ...
If @tverweij, who is running virtualised instances, is indeed using ZFS, that specifically is a bad idea. Due to its copy-on-write nature you cannot thin provision virtual disks (well, you can, but it doesn't make sense), because ZFS will eventually write to every single disk block and so blow the disk up to its configured maximum size.
For virtual disks it's much better to use UFS and manage snapshots and backups at the hypervisor host level.
HTH,
Patrick
When running a virtualized OPNsense on ZFS, the issue of the virtual disk blowing up to its full configured size can be addressed by turning on "discard" in the Proxmox VM settings and setting up a daily ZFS trim cron job within OPNsense.
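A sketch of the OPNsense side of that (zroot is assumed as the pool name; scheduling is shown as a plain root crontab entry, though the GUI cron page works too if a matching action is available):

# run a TRIM pass on the pool and check its progress
zpool trim zroot
zpool status -t zroot

# example crontab line for a daily trim at 03:30
30 3 * * * /sbin/zpool trim zroot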