OPNsense Forum

English Forums => 25.1, 25.4 Production Series => Topic started by: FullyBorked on May 22, 2025, 02:59:32 PM

Title: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 22, 2025, 02:59:32 PM
Super long, frustrating story, as short as I can make it. Had some severe weather and lost the PSU in my OPNsense box. I had a spare Dell PC with a quad-port NIC that I keep as a backup in case of a hardware failure. I need to know if I've hit a bug that I need to report or if this is just ignorance of the restore process.

So, to try and recover, here is what I did. I created my bootable ISO, installed my backup PC in place, and booted the OPNsense media.

The config import option presented in the console on install seems unable to mount a separate USB drive with my backup XML. I don't know if this means the USB drive is formatted in a format it cannot read or if there is something more specific it's looking for. No matter: in the past I just booted the live environment, imported the backup, updated the interfaces and moved on. So that's the path I headed down.

Booted into the live environment, imported my backup file, and I am met with the expected interface assignment warning.  That's where this all fell apart for me.

I have the following interface:

-I navigate to Interfaces > Assignments, then change the WAN and LAN interfaces to match the physical NICs they are now plugged into on the new hardware. Click save, but it warns that it can't save due to the VLAN parent interfaces being wrong and resets my assignment.
-OK, so navigate to Interfaces > Devices > VLANs and attempt to change their parent interfaces. But then it won't let me save due to the VLAN naming. Apparently my naming is old and has the interface name in the name (ix0_vlan10, for example), so I decide to just fix it to start with "vlan" as the warning dictates. But it won't let me save it because "Interface cannot be changed while assigned". So it sorta deadlocks me...
-Fine, so let's just go delete the assignment. Can't, it's in a group...
-Fine, go remove the VLANs from the groups they're in in the firewall config.
-Back to assignments, delete the VLAN assignments, assign LAN and WAN to their correct NICs, and click apply, and I lose my connectivity to OPNsense entirely. No idea why...

I repeated this process maybe 5 or 6 times, trying slight variations each time. Every time it ended in loss of connectivity to the LAN interface on my firewall.

With all that said, is this user error? Do I not understand the process? Or has my old vlan naming config deadlocked me into being unable to restore to dissimilar hardware?  If so how do I fix this config so that I don't get so locked up in this situation next time?   

This time I was able to rob a WAY too large PSU out of a gaming computer to get my original OPNsense box powered back up, and I have a replacement part ordered. But if this had been an issue that I couldn't repair, I would have been reinstalling and configuring OPNsense from scratch.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: EricPerl on May 22, 2025, 11:14:16 PM
I just did an import on bare metal (from a VM config). I used a FAT USB.
The drive had the entire /conf directory copied from the source. You need at least /conf/config.xml.
Before the import, I searched/replaced all physical device names (vtnet -> igc in this case).
My vlan devices were already vlan0_<VLANID> so I didn't have to convert these.
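
The search/replace step described above can be sketched in Python. This is only a minimal illustration, assuming the backup is an unencrypted config.xml read as plain text; the function name `rename_devices` and the sample device names are made up for the example and are not part of OPNsense. It replaces all names in a single pass so that a swap such as igc0 <-> igc1 does not chain, and it refuses to touch longer names that merely start with a mapped name (vtnet1 inside vtnet10).

```python
import re


def rename_devices(xml_text: str, mapping: dict[str, str]) -> str:
    """Replace physical device names (e.g. vtnet0 -> igc0) in config text.

    Hypothetical helper: works on the raw text, like an editor's
    search/replace, rather than parsing the XML.
    """
    if not mapping:
        return xml_text
    # Longest names first so the alternation prefers the most specific match.
    names = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9])("
        + "|".join(re.escape(n) for n in names)
        + r")(?![A-Za-z0-9])"
    )
    # One pass over the text: each occurrence is replaced exactly once.
    return pattern.sub(lambda m: mapping[m.group(1)], xml_text)


# Usage (paths are placeholders):
#   text = open("config.xml", encoding="utf-8").read()
#   text = rename_devices(text, {"vtnet0": "igc0", "vtnet1": "igc1"})
#   open("config.xml", "w", encoding="utf-8").write(text)
```

Note that a legacy VLAN device name such as vtnet0_vlan10 would also have its vtnet0 prefix rewritten, which matches what a plain editor search/replace would do. Keep a copy of the original file before overwriting it.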
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: cookiemonster on May 22, 2025, 11:21:28 PM
It has to be any FAT except exFAT
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 23, 2025, 01:22:20 AM
Quote from: EricPerl on May 22, 2025, 11:14:16 PM
I just did an import on bare metal (from a VM config). I used a FAT USB.
The drive had the entire /conf directory copied from the source. You need at least /conf/config.xml.
Before the import, I searched/replaced all physical device names (vtnet -> igc in this case).
My vlan devices were already vlan0_<VLANID> so I didn't have to convert these.

Ah OK, so I can't just put my backup XML on a USB drive and import it in the console? Also, I encrypt my backups, so opening and adjusting doesn't seem like an option; I'd have to make an unencrypted backup to make that sort of modification. I could make an unencrypted one now, but that wouldn't have helped when my machine was borked.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 23, 2025, 01:23:25 AM
Quote from: cookiemonster on May 22, 2025, 11:21:28 PM
It has to be any FAT except exFAT

It's formatted as FAT32, but from the above post it sounds like it was looking for a config file copied from the actual OS instead of my backup XML that gets generated nightly.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: EricPerl on May 23, 2025, 01:35:23 AM
The file needs to be /conf/config.xml on the USB drive (NOT at the root).

I have a last known good version of the entire /conf of the appliance (which contains all versions).
Straight scp...
You can compare with a version obtained from your backups. I expect them to match.
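
As a rough sketch of the layout described above, the following Python copies a backup onto a mounted USB drive as conf/config.xml. The function name `stage_config` and the paths in the usage comment are placeholders for illustration; the only detail taken from this thread is that the importer expects /conf/config.xml rather than a file at the root of a FAT-formatted drive.

```python
import shutil
from pathlib import Path


def stage_config(backup_xml: Path, usb_root: Path) -> Path:
    """Copy a backup to <usb_root>/conf/config.xml and return the new path.

    Hypothetical helper: usb_root is wherever the USB drive is mounted
    on the machine preparing it.
    """
    target = usb_root / "conf" / "config.xml"
    # Create the conf directory if the drive does not have one yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.copyfile(backup_xml, target)
    return target


# Usage (both paths are placeholders):
#   stage_config(Path("my-nightly-backup.xml"), Path("/media/usb"))
```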
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 23, 2025, 03:47:02 AM
Quote from: EricPerl on May 23, 2025, 01:35:23 AM
The file needs to be /conf/config.xml on the USB drive (NOT at the root).

I have a last known good version of the entire /conf of the appliance (which contains all versions).
Straight scp...
You can compare with a version obtained from your backups. I expect them to match.

So maybe if I had renamed the backup to config.xml and created the /conf/ directory structure, it might have been able to find/read it?


But even then as long as I have encrypted backups I'll still have the issue of figuring out the vlan/interface deadlock post restore. Unless there is some way to open the encrypted XML file outside of opnsense.  I feel like if this wasn't user error it might need a bug report.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 23, 2025, 03:54:50 AM
I feel like maybe we've strayed off the issue at hand.  IMO at least editing backup conf files before a restore seems broken.  So either I need to fix my vlan config to match current standards so I don't get deadlocked or some other better understanding of the recovery process and interface assignments (if it's some user error on my part). 

Starting with the first one, what's the best way to rename these vlans to match current naming convention? 
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: Patrick M. Hausen on May 23, 2025, 08:29:38 AM
"vlan0*" as far as I know. Can be "vlan07" or "vlan0.7" or anything but must start with a 0.

Unless things changed again, recently :-)
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: ThuTex on May 23, 2025, 10:02:19 AM
Quote from: FullyBorked on May 22, 2025, 02:59:32 PM
Apparently my naming is old and has the interface name in the name (ix0_vlan10)

this is also something i had noticed after my last upgrade.
i think it's a good idea to move to that (standard?) naming for vlans, but they might want to have had something to prevent legacy machines from breaking on it (like automatically changing the names when upgrading, if that would be possible for example)

the best solution would just indeed be updating the configs to be vlan0.x instead of ix0_vlanx
the "maybe the devs can think of old users" option would be to have something in the restore that catches old naming conventions and automatically updates it.

i think it's just a scenario that's not been thought of to test (old interface naming of vlans, and restoring that to a different host is likely not a daily use case)

guess i'll be adding the "update vlan naming" to my todo list as well before i get bitten in the same way you did.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: EricPerl on May 23, 2025, 09:27:05 PM
Quote from: FullyBorked on May 23, 2025, 03:54:50 AM
I feel like maybe we've strayed off the issue at hand.  IMO at least editing backup conf files before a restore seems broken.  So either I need to fix my vlan config to match current standards so I don't get deadlocked or some other better understanding of the recovery process and interface assignments (if it's some user error on my part).

Starting with the first one, what's the best way to rename these vlans to match current naming convention?

I've done 3 or 4 configuration imports over the past 9 months. PCIe passthrough -> vtnet (twice), vtnet -> bare metal igc.
The only time I've run into trouble was user error (mangling filenames while building an ISO, which would not have occurred with just /conf/config.xml).

All my VLANs follow the vlan0.VLANID convention (not _ per previous message). I never had to tinker with these.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: ThuTex on May 23, 2025, 11:53:42 PM
i'm not sure when the switch from interface_vlanx to vlan0.x was implemented, but for my current machine (installed 2 years ago i believe - not sure if it was a clean install or still also an import back then) it was still with interface_vlanx and that was causing several issues with the current version (like when i updated mtu on an interface that interface just became unreachable - but after updating the device name it worked fine)

but, if you are aware of this (maybe it was mentioned in changelogs... should really start reading those again), then it's not too much of a big deal to update them as long as you have an alternate interface available.
in case of a restore however, i can very much see how this will break things.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 24, 2025, 12:11:04 AM
Quote from: ThuTex on May 23, 2025, 11:53:42 PM
i'm not sure when the switch from interface_vlanx to vlan0.x was implemented, but for my current machine (installed 2 years ago i believe - not sure if it was a clean install or still also an import back then) it was still with interface_vlanx and that was causing several issues with the current version (like when i updated mtu on an interface that interface just became unreachable - but after updating the device name it worked fine)

but, if you are aware of this (maybe it was mentioned in changelogs... should really start reading those again), then it's not too much of a big deal to update them as long as you have an alternate interface available.
in case of a restore however, i can very much see how this will break things.

Any guidance on how to properly update the vlan naming without blowing up everything?  Will it simply let me rename them?  Looks like I'll have to remove them from any groups, which wouldn't be too bad.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 24, 2025, 03:48:00 AM
Did some testing. I can't figure out how to rename a VLAN without blowing out its interface config and firewall rules. So far it looks like renaming is a total rebuild. Is that really the case?
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: EricPerl on May 24, 2025, 03:51:29 AM
Download + Edit offline + Restore?
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 24, 2025, 03:57:34 AM
Quote from: EricPerl on May 24, 2025, 03:51:29 AM
Download + Edit offline + Restore?

That's funny enough what I was just looking at, seems like it should work as long as you don't miss any.  Might try one of my less critical vlans and see what the outcome is.
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 24, 2025, 04:08:00 AM
Quote from: FullyBorked on May 24, 2025, 03:57:34 AM
Quote from: EricPerl on May 24, 2025, 03:51:29 AM
Download + Edit offline + Restore?

That's funny enough what I was just looking at, seems like it should work as long as you don't miss any.  Might try one of my less critical vlans and see what the outcome is.


Initial test seems to have worked without issue. 
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: FullyBorked on May 24, 2025, 04:23:41 AM
Quote from: FullyBorked on May 24, 2025, 04:08:00 AM
Quote from: FullyBorked on May 24, 2025, 03:57:34 AM
Quote from: EricPerl on May 24, 2025, 03:51:29 AM
Download + Edit offline + Restore?

That's funny enough what I was just looking at, seems like it should work as long as you don't miss any.  Might try one of my less critical vlans and see what the outcome is.


Initial test seems to have worked without issue. 

Ok, I think the renaming via config file modification seems to have been successful. 
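
For anyone repeating the download, edit offline, restore approach, here is a minimal Python sketch of the rename step. It assumes an unencrypted config.xml, legacy device names of the form <parent>_vlan<tag> (such as ix0_vlan10), and the vlan0.<tag> target form mentioned in this thread; the function name `rename_legacy_vlans` is made up for the example. It works on the raw text, so every reference is renamed consistently, and it returns the old-to-new mapping so the result can be checked for missed or unexpected names. Free-text fields such as descriptions that happen to contain a legacy name are rewritten as well.

```python
import re

# Matches names like ix0_vlan10 or igb1_vlan20 that are not part of a longer token.
LEGACY_VLAN = re.compile(
    r"(?<![A-Za-z0-9_.])([a-z]+[0-9]+)_vlan([0-9]+)(?![A-Za-z0-9_])"
)


def rename_legacy_vlans(xml_text: str) -> tuple[str, dict[str, str]]:
    """Rewrite <parent>_vlan<tag> names to vlan0.<tag> throughout the text.

    Hypothetical helper. Returns the new text and the old -> new mapping.
    Raises ValueError if two legacy names would map to the same new name
    (the same tag on two different parents), since that needs a manual choice.
    """
    mapping: dict[str, str] = {}
    for match in LEGACY_VLAN.finditer(xml_text):
        mapping[match.group(0)] = f"vlan0.{match.group(2)}"
    new_names = list(mapping.values())
    if len(new_names) != len(set(new_names)):
        raise ValueError(f"name collision, rename by hand: {mapping}")
    new_text = LEGACY_VLAN.sub(lambda m: mapping[m.group(0)], xml_text)
    return new_text, mapping


# Usage (path is a placeholder):
#   text = open("config.xml", encoding="utf-8").read()
#   text, renamed = rename_legacy_vlans(text)
#   print(renamed)  # review before restoring the edited file
```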
Title: Re: Recoving to dissimilar hardware ended in deadlock of interface assignment
Post by: ThuTex on May 24, 2025, 10:01:35 AM
Quote from: FullyBorked on May 24, 2025, 12:11:04 AM
Any guidance on how to properly update the vlan naming without blowing up everything?

i see i was a bit late as you've already got it done, but here's how i did it:

- make sure you can reach the firewall from at least 2 networks (i used my main machine + smartphone on 2 different vlans)
- create a temporary useless vlan
- go to assignments, take your main machine vlan, change it to the useless vlan you just created

(you'll no longer be able to access the fw through your main machine)

- login with your phone
- go back to devices, update the name
- back to assignments, set the original vlan back to its right assignment, save, and for good measure, restart dhcp (because it now has no clue about dhcp on your newly named original interface)

then, repeat the process for all other vlans, which you can now comfortably do from your main machine, and then restart dhcp so all interfaces are picked up.

(and to end, if you added temporary rules for your phone to access the firewall, remove them once done, of course)

bonus: to be sure that you don't completely F up, also go to snapshots, create a new snapshot before starting, and set the unchanged config to next-active.
if things go wrong, just reboot the firewall so it ends back up at the previous config and you can start over without having locked yourself out.
once finalized, set your changed config to be the next-active again.