OPNsense Forum

Archive => 24.7, 24.10 Legacy Series => Topic started by: kevindd992002 on August 09, 2024, 06:25:54 PM

Title: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 06:25:54 PM
Yesterday at around 12PM GMT+8, my whole network at home went down. I wasn't at home until now so I thought it was just an ISP issue. But then I checked my opnsense VM (proxmox hypervisor) and saw that opnsense was rebooting in an endless loop as shown here:

https://youtu.be/EovhQ0KbWnc

I forgot which version I am on, but I last updated within the past month or so, so it's pretty recent. I have cloud backups to my Google Drive, and for some reason the latest backup I see there is from July 24, 2024. I don't remember making any changes since then, so I guess that's fine if I need to restore from backup.

However, before restoring, I want to know what caused the issue in the first place, so I'm fine with troubleshooting this. I don't have a lot going on there aside from running the AdGuard Home plugin and a couple of other plugins that I'm not yet actively using.

Thoughts?
Title: Re: Corrupted opnsense installation
Post by: meyergru on August 09, 2024, 06:46:54 PM
The root cause seems to be a bad directory inode at /. You also get a warning about an improperly unmounted root volume.

However, it could well be that the root filesystem got corrupted because it was full. I do not know how you configured your logging, but AdGuard Home might be the culprit. As it stands, you could try to boot FreeBSD into single-user mode and repair the filesystem, or at least verify that it is full, as I suspect.

To get the VM up and running again, I would either roll back to an earlier Proxmox snapshot (I use this (https://github.com/Corsinvest/cv4pve-autosnap)), restore a backup, or even reinstall and reload the OpnSense configuration. The latter is probably the best way to proceed, because you seem to be on UFS and I would recommend ZFS.
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 06:53:02 PM
Ahh, that makes sense.

I also forgot to mention that I never took even a single snapshot of this VM! I'm new to Proxmox and opnsense (I'm a longtime pfsense user but I had enough of their BS) so forgive my noobness.

The reason I chose UFS over ZFS for opnsense is that proxmox is already running on ZFS. According to my research, running a ZFS vDisk on a ZFS Proxmox OS is useless and redundant. Is this not the case?
Title: Re: Corrupted opnsense installation
Post by: Patrick M. Hausen on August 09, 2024, 06:56:35 PM
It is the case, but only if you take regular snapshots ;) The ZFS foundation on the host does not protect your UFS from getting corrupted by an unclean shutdown, but it gives you a chance to roll back.
Title: Re: Corrupted opnsense installation
Post by: meyergru on August 09, 2024, 06:59:06 PM
Correct, ZFS on ZFS does not exactly help with performance, but OpnSense does not need disk performance and having an indestructible filesystem like ZFS is a plus - as you can see here.

Using something like cv4pve-autosnap prevents you from ever coming into the situation where you think: Damn, I wish I had done a snapshot before this. And it comes cheap if Proxmox is running on ZFS.
Title: Re: Corrupted opnsense installation
Post by: Patrick M. Hausen on August 09, 2024, 07:02:31 PM
I advise against ZFS on ZFS because it thwarts thin provisioning of virtual disks. Apart from that of course it works.
Title: Re: Corrupted opnsense installation
Post by: doktornotor on August 09, 2024, 07:10:26 PM
Quote from: kevindd992002 on August 09, 2024, 06:53:02 PM
running a ZFS vDisk on a ZFS Proxmox OS is useless and redundant. Is this not the case?

Shrug. You've seen the result with UFS... Don't use thin provisioning. I wouldn't use snapshots on the host either. Other than that, my experience with UFS has been nothing but a pure nightmare; I won't touch it even with a 10ft pole.
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 07:44:09 PM
Quote from: meyergru on August 09, 2024, 06:59:06 PM
Correct, ZFS on ZFS does not exactly help with performance, but OpnSense does not need disk performance and having an indestructible filesystem like ZFS is a plus - as you can see here.

Using something like cv4pve-autosnap prevents you from ever coming into the situation where you think: Damn, I wish I had done a snapshot before this. And it comes cheap if Proxmox is running on ZFS.

So ZFS on ZFS it is then. Do I still get the advantage of ZFS even if I just have one disk for Proxmox (copies = 2) and one vDisk for opnsense?

As for the thin provisioning concern, I have thin provisioning enabled on the Proxmox ZFS disk, but I can't see an option for the opnsense VM vDisk, so I'm not sure what it is currently set to. Should I also avoid thin provisioning on vDisks if I want ZFS on ZFS?
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 08:54:10 PM
I checked the size of /dev/vtbdp03 which is the freebsd-ufs partition and only 7.1G out of 54G is used.
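
For reference, 7.1G out of 54G is roughly 13% used, which is far from full. The arithmetic, plus a generic fullness check, can be sketched in Python as below. This is only an illustration: the helper names and the 90% threshold are assumptions made for this example and are not part of OPNsense.

```python
import shutil


def usage_percent(used: float, total: float) -> float:
    """Return used space as a percentage of total space."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * used / total


def is_nearly_full(path: str, threshold: float = 90.0) -> bool:
    """Report whether the filesystem holding `path` is at or above `threshold` percent.

    The 90% default is an arbitrary value chosen for this sketch.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage_percent(usage.used, usage.total) >= threshold


if __name__ == "__main__":
    # Figures reported in the post above: 7.1G used out of 54G.
    print(f"{usage_percent(7.1, 54):.1f}% used")
    print("root nearly full:", is_nearly_full("/"))
```

Note that this only reflects the state at the time of the check. It cannot tell whether the filesystem was full at the moment the corruption happened.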

Title: Re: Corrupted opnsense installation
Post by: Patrick M. Hausen on August 09, 2024, 08:56:55 PM
It does not matter what the VM sees inside the virtualised environment. With ZFS in the VM, the disk will always grow to the maximum size provisioned outside of the VM, even if only a fraction is really used inside.

That doesn't matter much if the disk is just 30 or maybe 60 G in size. Any reasonable hypervisor host should have that much space for a VM.
Title: Re: Corrupted opnsense installation
Post by: meyergru on August 09, 2024, 09:00:28 PM
The advantage of ZFS for the OpnSense VM in itself is that it is nearly indestructible, even by problems in the VM itself, unlike UFS. ZFS has checksumming on all levels and is a pure transactional COW filesystem, whatever the host does.

Redundancy like raid-z2 should be handled on the Proxmox host. The thin provisioning of the VM disks there will not be effective, because the COW in the VM guest overwrites everything at some point, but you can use "discard" for the VM disk to reuse those free blocks.
Title: Re: Corrupted opnsense installation
Post by: Patrick M. Hausen on August 09, 2024, 09:02:08 PM
Quote from: meyergru on August 09, 2024, 09:00:28 PM
[...] but you can use "discard" for the VM disk to reuse those free blocks.
Can you give me a pointer to that? Interesting, I wonder how that should work. How does the host know which blocks are free in the guest? TRIM?
Title: Re: Corrupted opnsense installation
Post by: meyergru on August 09, 2024, 09:07:05 PM
Correct. The trim command is used to discard the blocks. Under Proxmox, you can even enable SSD emulation.

BTW: If /dev/vtbdp03 is only 7.1 of 54 GByte, then maybe it was not logging that caused the problem. But I do not know which partitions are used under UFS, so I do not know if that one holds the log data.
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 09:09:51 PM
Quote from: meyergru on August 09, 2024, 09:07:05 PM
Correct. The trim command is used to discard the blocks. Under Proxmox, you can even enable SSD emulation.

BTW: If /dev/vtbdp03 is only 7.1 of 54 GByte, then maybe it was not logging that caused the problem. But I do not know which partitions are used under UFS, so I do not know if that one holds the log data.

Ok, I think I'm convinced to use ZFS on the VM. However, I need to have this corrupted VM running first so I can just copy the settings from the old to the new VM manually, side-by-side. Or can I use a UFS opnsense backup xml to reload the config in a ZFS opnsense?

These are the partitions in a UFS system:

https://media.discordapp.net/attachments/1271505834261614713/1271537474346287147/image.png?ex=66b7b30d&is=66b6618d&hm=8b8941d81ef22e5536dc55d384c91b787e610f4e9614211d3f68fd740733d484&=&format=webp&quality=lossless
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 09, 2024, 09:12:37 PM
I also tried fixing the ufs fs:

https://cdn.discordapp.com/attachments/1271505834261614713/1271540560439541952/image.png?ex=66b7b5ed&is=66b6646d&hm=7777d32610eeb1cc0f9e01dae43fe4d59b4758ec48462719151f91294be6156f&

Then tried booting normally again and I didn't see the mounting warning anymore. However, it still rebooted like before and then I'm back to square one with the mounting warning/error.
Title: Re: Corrupted opnsense installation
Post by: doktornotor on August 09, 2024, 09:17:03 PM
Attempting to fix UFS... hmmm, good luck. I'd boot from ISO and try to mount the broken thing read-only, hoping the kernel won't panic on that.
Title: Re: Corrupted opnsense installation
Post by: Patrick M. Hausen on August 09, 2024, 09:18:10 PM
Quote from: kevindd992002 on August 09, 2024, 09:09:51 PM
Or can I use a UFS opnsense backup xml to reload the config in a ZFS opnsense?
Yes, of course. OPNsense itself (and the configuration backup/restore) is completely oblivious of the filesystem the underlying OS uses.
Title: Re: Corrupted opnsense installation
Post by: meyergru on August 09, 2024, 09:19:37 PM
Yes, you could use another FreeBSD/OpnSense VM and attach the virtual disk to it and try to mount it and copy /conf/config.xml, hoping that this file is intact.
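
Once the file has been copied off the damaged disk, a quick sanity check can tell whether it at least parses before you try to restore it. The Python sketch below is a rough illustration, not an official OPNsense tool. It assumes the configuration's root element is named `opnsense` and contains a `system` section; the function name and the file path are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Union


def config_looks_intact(data: Union[str, bytes]) -> bool:
    """Return True if `data` parses as XML and has the expected top-level shape.

    Assumption for this sketch: the root element is <opnsense> and it
    contains a <system> child. A truncated or corrupted file fails to parse.
    """
    try:
        root = ET.fromstring(data)
    except ET.ParseError:
        return False
    return root.tag == "opnsense" and root.find("system") is not None


if __name__ == "__main__":
    # Hypothetical location of the copy rescued from the old disk.
    path = "rescued-config.xml"
    try:
        with open(path, "rb") as handle:
            print("looks intact:", config_looks_intact(handle.read()))
    except FileNotFoundError:
        print(f"{path} not found; adjust the path to your rescued copy")
```

Passing this check only means the XML is well-formed. It does not prove that every setting survived, so comparing against the latest cloud backup is still worthwhile.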
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 10, 2024, 10:17:49 AM
Do I need to reinstall all packages even after installing with the recovery xml?
Title: Re: Corrupted opnsense installation
Post by: newsense on August 10, 2024, 10:32:37 AM
All the OPNsense plugins will be reinstalled when you check for updates.

For the third party ones you'll have to reinstall, hopefully you have copies of the config files as they're not included in the OPNsense one.
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 11, 2024, 04:57:35 PM
Got it. In the conf file:

1. Will I be able to see which version it is from? And is it best to create a new opnsense vm that's exactly the same version as my old corrupted opnsense VM?

2. Does it list all third-party plugins that I have installed? Aside from AdGuard Home, I'm pretty sure there are a few more third-party plugins that I installed but haven't used yet. I just want to make sure I'm not missing anything.
Title: Re: Corrupted opnsense installation
Post by: kevindd992002 on August 29, 2024, 06:42:21 PM
Quote from: newsense on August 10, 2024, 10:32:37 AM
All the OPNsense plugins will be reinstalled when you check for updates.

For the third party ones you'll have to reinstall, hopefully you have copies of the config files as they're not included in the OPNsense one.

To follow-up on this, the opnsense plugins do not get reinstalled when you "check for updates". They get reinstalled when you click on the "resolve plugins conflict" button under System -> Firmware -> Status.
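
To see in advance which plugins a saved configuration expects, the list can be read from the backup file itself. The Python sketch below rests on an assumption that is not confirmed anywhere in this thread: that the plugin names are stored as a comma-separated string under `system/firmware/plugins` in config.xml. Check your own file first. The function name and the plugin names in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import List, Union


def recorded_plugins(data: Union[str, bytes]) -> List[str]:
    """Return the plugin names recorded in a configuration backup.

    Assumption for this sketch: the names live in a comma-separated
    <plugins> element under <system><firmware>. Returns an empty list
    when that element is missing or empty.
    """
    root = ET.fromstring(data)
    node = root.find("./system/firmware/plugins")
    if node is None or not node.text:
        return []
    return [name.strip() for name in node.text.split(",") if name.strip()]


if __name__ == "__main__":
    # Made-up sample; the plugin names are placeholders, not real packages.
    sample = (
        "<opnsense><system><firmware>"
        "<plugins>os-example-a, os-example-b</plugins>"
        "</firmware></system></opnsense>"
    )
    for name in recorded_plugins(sample):
        print(name)
```

As noted earlier in the thread, third-party plugins and their own configuration files are not included in the OPNsense configuration, so they may not appear in such a list.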
Title: Re: Corrupted opnsense installation
Post by: franco on August 29, 2024, 09:00:38 PM
FWIW, since 2021 the console does sync plugins back during an update, because it has no screen real estate for a "resolve" step.

https://github.com/opnsense/core/commit/7165b665eb


Cheers,
Franco