Update issues when upgrading from version 25.7.11 to 26.1.x

Started by LP, June 15, 2026, 02:46:06 PM

Hello everyone,

We run OPNsense at a total of 4 sites. Unfortunately, we are experiencing update issues at 3 of them. Here are the details:

OPNsense (virtualized)
Installed version: 25.7.11_9
Hypervisor: VMware ESXi-7.0U3w-24784741-standard
Hardware: HPE ProLiant DL325 Gen10 Plus v2 server

After updating OPNsense to version 26.1.x, the following errors occur during boot:

Startup log excerpt:

Trying to mount root from ufs:/dev/ufs/OPNsense [rw,noatime]...
Root mount waiting for: CAM
Mounting filesystems...
tunefs: soft updates remains unchanged as enabled
tunefs: issue TRIM to the disk remains unchanged as enabled
** /dev/ufs/OPNsense

...

(da0:mptt0:0:1:0): UNMAP failed, switching to WRITE SAME(16) with UNMAP BIO_DELETE
(da0:mptt0:0:1:0): UNMAP. CDB: 42 00 00 00 00 00 00 00 08 00
(da0:mptt0:0:1:0): CAM status: SCSI Status Error
(da0:mptt0:0:1:0): SCSI status: Check Condition
(da0:mptt0:0:1:0): SCSI sense: ILLEGAL REQUEST asc:24,0 (Invalid field in CDB)
(da0:mptt0:0:1:0): Command byte 7 is invalid
(da0:mptt0:0:1:0): Error 22, Unretryable error
g_vfs_done():ufs/OPNsense[DELETE(offset=55167287296, length=4096)]error=5

...

(da0:mptt0:0:1:0): WRITE SAME(16). CDB: 93 08 00 00 00 00 00 d8 d1 8f 00 00 00 40 00 00
(da0:mptt0:0:1:0): CAM status: SCSI Status Error
(da0:mptt0:0:1:0): SCSI status: Check Condition
(da0:mptt0:0:1:0): SCSI sense: Vendor Specific asc:80,85 (Vendor Specific ASC)
(da0:mptt0:0:1:0): Info: 0
(da0:mptt0:0:1:0): Error 5, Unretryable error
g_vfs_done():ufs/OPNsense[DELETE(offset=7275184128, length=32768)]error=5

Even after OPNsense has booted up, error messages appear spontaneously on the login screen:

FreeBSD/amd64                 (ttyv0)

login: (da0:pvscsi0:0:1:0): WRITE SAME(16). CDB: 93 08 00 00 00 00 02 65 58 cf 00 00 00 40 00 00
(da0:pvscsi0:0:1:0): CAM status: SCSI Status Error
(da0:pvscsi0:0:1:0): SCSI status: Check Condition
(da0:pvscsi0:0:1:0): SCSI sense: Vendor Specific asc:80,85 (Vendor Specific ASC)
(da0:pvscsi0:0:1:0): Info: 0
(da0:pvscsi0:0:1:0): Error 5, Unretryable error
g_vfs_done():ufs/OPNsense[DELETE(offset=20580499456, length=32768)]error=5
g_vfs_done():ufs/OPNsense[DELETE(offset=20580564992, length=32768)]error=5
g_vfs_done():ufs/OPNsense[DELETE(offset=20580466688, length=32768)]error=5

The one site without issues differs in a single respect: it is virtualized on Proxmox rather than VMware.

Research into the problem revealed that the log entries may indicate an issue with the TRIM command (UNMAP) in virtualized environments. The VM resides on a thin-provisioned SSD storage pool. OPNsense attempts to free up unused storage blocks, but the virtualization host or the emulated controller does not correctly interpret the SCSI commands (UNMAP / WRITE SAME). This can lead to file system errors (Error 5, Error 22).
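To check what the guest is actually asking the disk to do, the failing CDB from the log can be decoded by hand. The following is only a sketch in plain POSIX sh; the function name decode_ws16 and the 512-byte sector size are my own assumptions and are not taken from the log.

```shell
#!/bin/sh
# Decode a WRITE SAME(16) CDB given as 16 space-separated hex bytes.
# Assumption: 512-byte logical sectors (verify on the real system).
decode_ws16() {
    # Split the single argument into positional parameters $1..$16.
    set -- $1
    if [ "$#" -ne 16 ] || [ "$1" != "93" ]; then
        echo "not a WRITE SAME(16) CDB" >&2
        return 1
    fi
    # Byte 1, bit 3 (0x08) is the UNMAP flag.
    unmap=$(( (0x$2 & 0x08) != 0 ))
    # Bytes 2-9: 64-bit starting LBA, big-endian.
    lba=$(( 0x$3$4$5$6$7$8$9${10} ))
    # Bytes 10-13: 32-bit number of logical blocks, big-endian.
    blocks=$(( 0x${11}${12}${13}${14} ))
    echo "unmap=$unmap lba=$lba blocks=$blocks bytes=$(( blocks * 512 ))"
}

# CDB copied from the first WRITE SAME(16) error in the boot log.
decode_ws16 "93 08 00 00 00 00 00 d8 d1 8f 00 00 00 40 00 00"
# prints: unmap=1 lba=14209423 blocks=64 bytes=32768
```

The UNMAP bit is set and the request covers 64 blocks, i.e. 32768 bytes at 512 bytes per sector, which matches the length=32768 in the g_vfs_done() line. So the rejected command is a TRIM-style delete, not a regular data write.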

Quick fix I tried: disable UNMAP/TRIM

echo 'vfs.unmap_enabled=0' >> /boot/loader.conf.local
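For comparison, here is a hedged sketch of two other knobs that, as far as I know, exist on stock FreeBSD: the per-disk delete_method sysctl of the da driver and the TRIM flag that tunefs(8) stores in the UFS superblock. The disk unit (0) and the label /dev/ufs/OPNsense are taken from the boot log above. The function name trim_off_commands is hypothetical, and the script only prints the commands instead of running them. Whether either setting makes the errors go away is not confirmed anywhere in this thread.

```shell
#!/bin/sh
# Dry run: print (do not execute) commands that turn off TRIM/BIO_DELETE
# for a UFS root on FreeBSD. Values below come from the boot log above.
DISK_UNIT=0
UFS_DEV=/dev/ufs/OPNsense

trim_off_commands() {
    # 1) Tell the da driver to stop issuing UNMAP / WRITE SAME for this
    #    disk. A runtime sysctl like this is lost on reboot.
    echo "sysctl kern.cam.da.${DISK_UNIT}.delete_method=DISABLE"
    # 2) Clear the TRIM flag in the UFS superblock. tunefs cannot change an
    #    active file system, so run it from single-user mode while the
    #    file system is unmounted or mounted read-only.
    echo "tunefs -t disable ${UFS_DEV}"
    # 3) Print the current tunable settings to verify the change.
    echo "tunefs -p ${UFS_DEV}"
}

trim_off_commands
```

Printing first and running by hand is deliberate: these commands touch the root file system of a firewall, so each one should be reviewed, and a snapshot taken, before it is executed.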

Long-term configuration (host side)

If TRIM is to be used (to keep the VM disk thin):

Proxmox: Set the controller to "VirtIO SCSI single" and enable the "Discard" option for the disk.
VMware: Check that the virtual hardware version is up to date and the correct controller type (e.g., VMware Paravirtual) is selected.


So, I switched the storage controller in VMware to Paravirtual. I also updated the firmware versions for all host server components, as outdated RAID controller firmware can sometimes be the cause. I booted OPNsense in single-user mode and ran a filesystem check using the `fsck` command; the output confirmed that the "FILE SYSTEM IS CLEAN." I tested the update repeatedly.

Despite this, the errors persist when starting OPNsense after the update, and I had to roll back to a pre-update snapshot. Does anyone have any ideas regarding the cause or how to fix this? It is possible that the VM's .vmdk file is corrupted, but right now I'm a bit stumped. I'm happy to provide further information if needed. I would appreciate any help!

Best regards,
Luca

The easiest approach would be to prepare a new VM on ZFS, install all updates and plugins, import the configuration file of the VM being replaced, and then swap the VMs.

Hi,
Is your thin-provisioned SSD storage pool an iSCSI LUN or DAS?

I ask because your problem sounds more like a SCSI error.
I wouldn't use a PVSCSI controller for a VM like this; instead, I'd use a standard LSILogic SAS controller.
That's because a PVSCSI only really shows its speed when you have a huge amount of random I/O (e.g., HANA DB). But that's not the case here.
Why don't you set up a new OPNsense VM with an LSILogic controller and import the configuration (just to test whether your SCSI errors are gone)?

Markus

Hi,

Thanks for your replies. The thin-provisioned SSD storage pool is DAS (storage installed directly in the server). I've considered setting up a new OPNsense instance and importing the configuration, though having to do that three times would obviously be a hassle. The issue is that we use OPNsense in school environments in conjunction with Linuxmuster as the school server; during the initial setup, various keys and configurations are exchanged and provisioned. I haven't found any information on whether I can simply swap out the OPNsense VM, given the dependencies involving LDAP and proxy authentication. I suspect the trust relationship would break if I replaced the VM. I'll try checking the Linuxmuster forum to see if anyone has experience with this and can provide some insight. That likely leaves setting up a new OPNsense VM in parallel, importing the config, and testing to see if everything works.

Does the configuration backup save everything, or do any manual adjustments need to be made after restoring the config?

Thanks!

Best regards,
Luca

Quote from: LP on June 16, 2026, 10:30:04 AM
Does the configuration backup save everything, or do any manual adjustments need to be made after restoring the config?

Anything set or changed through the UI is committed to the config file, which is what gets imported. Anything else, i.e. changes made directly in the filesystem, is not.

Exactly the same problem here on Hyper-V 2019, Gen 2.
Hardware: HPE DL360 Gen10, 2x Xeon Silver 4210R, HPE Smart Array P408i-a SR Gen10 with battery.

It's not only the errors: the VM also freezes about once a day, especially when there is a lot of traffic.



Isn't there an option to choose whether or not to "quiesce" the guest file system when creating the snapshot? Can you try not stopping I/O during snapshot creation?
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Nothing that I can find.

But comment 6 at the link states:
thomaslauer 2019-05-20 19:05:15 UTC

Hi, i have 120 PFSense VMs from 2.3.4 to 2.4.4-2 all Hyperv VMs with GEN2 and UFS.
and some vms with Hyperv GEN2 and UFS. All this VM has the same issue.

I have only one VM with GEN2 and ZFS. This VM has no SCSI Errors during the snapshot.


Comment 7 states:
Nick 2019-05-31 16:53:28 UTC

I am having this problem with PFSense (2.4.4-RELEASE-p3) running on Hyper-V (Windows 2012 R2). In my case, replication might run fine for a while (hours, days) but at some point there is a SCSI Status Error during a WRITE operation and PFSense/FreeBSD will become locked up or partially working but eventually will not respond to network or UI requests.  I would love a resolution to this.  For the moment, I've disabled replication and it's been fine.

Perhaps useful, perhaps not:  I've been running PFSense for years as a replicating VM on Hyper-V W2K2012R2 without issues.  It was just this week when I started having problems.  PFSense was previously running many different versions (2.2, 2.3, 2.4.3).  When I started having problems this week I had not upgraded PFSsense or the hypervisor.  As far as I can tell "nothing changed".

Nick

The problem is the one described in comment 7: "at some point there is a SCSI Status Error during a WRITE operation and PFSense/FreeBSD will become locked up or partially working but eventually will not respond to network or UI requests."

I hope the solution is the one in comment 6: "I have only one VM with GEN2 and ZFS. This VM has no SCSI Errors during the snapshot."
So I am going to test this and see what it brings, although I have had other problems with ZFS in the past ...


I use production checkpoints, which are needed for backup.
But I also tried standard: I made a standard snapshot by hand and removed it again.

Both production and standard snapshots cause the errors to be displayed in the OPNsense VM.
No difference that I can see.

Quote from: tverweij on Today at 06:33:53 PM
I use production checkpoints, which are needed for backup.
But I also tried standard: I made a standard snapshot by hand and removed it again.

Both production and standard snapshots cause the errors to be displayed in the OPNsense VM.
No difference that I can see.


Wait a minute.
I just disabled the option "Create standard checkpoints if the guest does not support creation of production checkpoints".
And now it cannot create a checkpoint anymore.

This means standard checkpoints are being used; production checkpoints are somehow not supported by the OPNsense VM.

Production is not working because FreeBSD is not supported in Hyper-V. Only Microsoft can change that. Standard seems to have the issues you describe - again IMHO only MS can change that.

You could shut down the VM at night, take a snapshot, and boot it up again.

Or schedule configuration backups to Nextcloud, git, SFTP, ... your choice really. And not rely on VM snapshots for backup.
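As a rough illustration of a backup that does not depend on VM snapshots, the sketch below copies the configuration file to a timestamped file. /conf/config.xml is where OPNsense keeps its configuration. The function name backup_config and the destination directory /var/backups/opnsense are placeholders of mine. Pushing the copy to Nextcloud, git, or SFTP is left out on purpose.

```shell
#!/bin/sh
# Copy a config file to a timestamped backup and print the path of the copy.
# Defaults: the OPNsense config path and a hypothetical local backup directory.
backup_config() {
    src=${1:-/conf/config.xml}
    dest_dir=${2:-/var/backups/opnsense}   # placeholder location
    # Refuse to continue if the source file cannot be read.
    [ -r "$src" ] || { echo "cannot read $src" >&2; return 1; }
    mkdir -p "$dest_dir" || return 1
    out="$dest_dir/config-$(date +%Y%m%d-%H%M%S).xml"
    # -p preserves the modification time and mode of the original file.
    cp -p "$src" "$out" && echo "$out"
}
```

Called without arguments from a nightly cron job, this would leave one dated copy per run. Only what is stored in the config file is covered, so changes made directly in the filesystem still need separate handling.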

Also, there is a perfectly capable snapshot mechanism within OPNsense if you install with ZFS. So if I were to insist on running an unsupported guest OS in my hypervisor, I would at least not count on any of the advanced hypervisor mechanisms like snapshots working, and would use different means instead.

HTH,
Patrick