OPNsense Forum

Archive => 22.7 Legacy Series => Topic started by: cpower on August 10, 2022, 06:25:05 am

Title: OPNsense 22.7_4: Loss of Network Connectivity
Post by: cpower on August 10, 2022, 06:25:05 am
Hey all,

We upgraded to 22.7_4 and promptly lost some network connectivity after the upgrade. But this wasn't an all-or-nothing loss-- a few strange "patches" seemed to have corrected part of the issue, though not all. To explain:

Initially, we lost connection to the firewall itself via OpenVPN. This was because OPNsense was unable to resolve the FQDN of our IdP, and we were able to regain access by pointing OPNsense to our IdP's DNS servers, allowing their IP addresses to resolve. In fact, from the console, the only IPs that can be pinged at all are those that are set in /etc/resolv.conf (set indirectly by the web interface).

From the web interface, we are unable to ping any external host when we use Interfaces > Diagnostics > Ping with Source Address set to Default. However, if we set Source Address to any of our other interfaces (including our LAN and WAN interfaces), we receive a successful ping result. This is reflected on the console-- for example, if we execute ping www.google.com -- we receive the following.

Code:
PING www.google.com (142.250.138.103): 56 data bytes
The process hangs until it times out. The good news is that it appears that the IP itself was resolved-- it's the actual ping that's failing. However, if we explicitly set any of the interface IPs, i.e. ping -S 10.0.0.1 www.google.com -- we receive the following expected output.

Code:
PING www.google.com (142.250.138.147) from 10.0.0.1: 56 data bytes
64 bytes from 142.250.138.147: icmp_seq=0 ttl=105 time=9.189 ms
64 bytes from 142.250.138.147: icmp_seq=1 ttl=105 time=9.136 ms
64 bytes from 142.250.138.147: icmp_seq=2 ttl=105 time=9.166 ms
64 bytes from 142.250.138.147: icmp_seq=3 ttl=105 time=9.067 ms
...

Again, this is successful with all explicitly-defined interfaces. curl and other similar tools behave the same way: they fail when we don't explicitly specify an interface and succeed when we do. Likewise, all machines that are behind the firewall have maintained their network connectivity, can reach out to the internet, and are otherwise operating normally. Tunnels that reach out to other datacenters are likewise operational.
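
The default-versus-explicit source comparison described above could be scripted roughly like this. It is only a sketch, assuming the FreeBSD ping(8) flags -c (count), -t (overall timeout) and -S (source address); the function names are made up for illustration, and the target and source addresses are whatever the caller passes in.

```shell
#!/bin/sh
# Sketch only: compare ping results for the default source address
# against explicitly chosen source addresses (FreeBSD ping(8) flags).

# ping_loss: read ping output on stdin and print the packet-loss
# percentage from the summary line (e.g. "100.0").
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}

# probe TARGET [SOURCE...]: ping TARGET with the default source first,
# then once per explicit SOURCE address. All arguments are placeholders.
probe() {
    target=$1
    shift
    loss=$(ping -c 3 -t 5 "$target" 2>&1 | ping_loss)
    echo "default source: ${loss:-unknown}% loss"
    for src in "$@"; do
        loss=$(ping -c 3 -t 5 -S "$src" "$target" 2>&1 | ping_loss)
        echo "source $src: ${loss:-unknown}% loss"
    done
}
```

For the case above, this would be called as `probe www.google.com 10.0.0.1`, adding any further interface addresses as extra arguments.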

When we looked at the firewall log live view while running the failed ping command (the one that doesn't specify an interface), we noticed that the Source appears to be 0.0.0.0 for whatever reason. My guess is that this is the issue, but I don't really know how to resolve that.
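
One way to narrow down a 0.0.0.0 source might be to look at which gateway and interface the default route currently points at, since the source address of locally generated traffic is normally derived from the outgoing route. The helper below is a hedged sketch that pulls single fields out of `route -n get default` output on FreeBSD; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: extract one field from `route -n get default` output.

# default_route_field FIELD: read the route output on stdin and print
# the value of FIELD, e.g. "gateway" or "interface".
default_route_field() {
    awk -v key="$1" -F': *' '
        { gsub(/^ +/, "", $1) }      # drop the leading padding before the key
        $1 == key { print $2 }
    '
}

# Intended use on the firewall console (not executed here):
#   route -n get default | default_route_field gateway
#   route -n get default | default_route_field interface
```

If the interface reported there has no IPv4 address, or there is no default route at all, that would be worth noting alongside the 0.0.0.0 source seen in the log.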

As it stands, we can no longer check for updates from OPNsense itself-- both the console and the web UI have lost the ability to pull data externally. I will note that the console can still ping machines that are on the LAN so my guess is that this issue has something to do with the gateway itself (possibly), but we hadn't changed any of the settings prior to the update.

We did execute a connectivity audit... it was painfully slow. The current output is as follows.

Code:
***GOT REQUEST TO AUDIT CONNECTIVITY***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Wed Aug 10 04:12:52 UTC 2022
Checking connectivity for host: pkg.opnsense.org -> 89.149.211.205
PING 89.149.211.205 (89.149.211.205): 1500 data bytes

--- 89.149.211.205 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
Checking connectivity for repository (IPv4): https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Operation timed out
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Operation timed out
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Operation timed out
Unable to update repository OPNsense
Error updating repositories!
Checking connectivity for host: pkg.opnsense.org -> 2001:1af8:4f00:a005:5::
ping: UDP connect: No route to host
Checking connectivity for repository (IPv6): https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Non-recoverable resolver failure
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Non-recoverable resolver failure
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Non-recoverable resolver failure
Unable to update repository OPNsense
Error updating repositories!
***DONE***

Any help that y'all could give on this would be phenomenal. Thanks in advance!
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: Nadir22 on August 11, 2022, 09:05:38 am
I have exactly the same issue after upgrading from 22.1.10 to 22.7.

I can also confirm the issue with the source address: with the default source I can't ping any external hosts. To ping the OPNsense mirrors I had to set the source address to the WAN interface.

Thank you for figuring this out because at least this is shedding some light on what's causing the issue with firmware updates no longer working.
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: fxsaddict on August 11, 2022, 12:27:39 pm
I don't understand the discussion: did you solve the problem?
My 22.1.10 firewall backup was stored in Nextcloud, so right now I can't get to the file: Murphy's law!
What can I do?
Thanks
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: workswiththeweb on August 11, 2022, 03:24:09 pm
TLDR: Disable Settings -> General -> Gateway switching (reboot might be necessary)

I ran into this too. Thankfully I have a few OPNsense instances I use as backup backdoor management VPN servers at various data centers. These are extremely basic, SHTF OOB VPN only. Two upgraded just fine all the way to 22.7.1, the others appeared to break at 22.7_4.

The updater output:

Code:
***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Thu Aug 11 07:41:13 CDT 2022
Fetching changelog information, please wait... fetch: transfer timed out
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Operation timed out
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Operation timed out
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Operation timed out
Unable to update repository OPNsense
Error updating repositories!
pkg: Repository OPNsense cannot be opened. 'pkg update' required
Checking integrity... done (0 conflicting)
Your packages are up to date.
***DONE***

At first, I thought this was a mirror or DNS problem. However, switching mirrors didn't help, and my DNS servers checked out.

After fiddling around with the few settings that were different for each environment I found disabling Settings -> General -> Gateway switching allowed the package updater to work again. One instance did need a reboot though.

The ones that didn't work either had dual-WAN capability or were OVA copies of instances that had dual WAN.

As for why that change works, I can't say. I have my own monkeys to manage and fires to put out. I can say that turning it back on after an update to 22.7.1 appears to break it again. If someone needs data from me just let me know.
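
If the web UI is hard to reach, the state of the gateway switching option could in principle be checked from the console by looking at the configuration file. The sketch below assumes the option is stored as a `<gw_switch_default>` element in /conf/config.xml and that the element's presence means the option is on; both the element name and that meaning are assumptions on my part, so verify against your own config before relying on it.

```shell
#!/bin/sh
# Sketch only: report whether the (assumed) gateway switching flag
# appears in an OPNsense config file.

# gw_switching_enabled FILE: succeed if FILE contains a
# <gw_switch_default> element (assumed name, see the note above).
gw_switching_enabled() {
    grep -q '<gw_switch_default' "$1"
}

# Intended use on the firewall console (not executed here):
#   if gw_switching_enabled /conf/config.xml; then
#       echo "gateway switching appears to be enabled"
#   else
#       echo "gateway switching appears to be disabled"
#   fi
```

This only reads the file; the change itself is still best made through the web UI so that the routing configuration is reapplied.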
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: workswiththeweb on August 11, 2022, 10:53:10 pm
This is still bothering me; something else is going on that I can't put my finger on. My home router has gateway switching enabled, and works. I can confirm that turning gateway switching off on the affected instances does allow the update process to function again though. My home router has much more going on compared to my other instances.

I'll try to find some time to look at this more.
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: workswiththeweb on September 03, 2022, 06:39:09 pm
For anyone finding this via Google, I had one server still acting up. I ended up doing a reset with opnsense-bootstrap found here:
https://github.com/opnsense/update#opnsense-bootstrap

Code:
# pkg install ca_root_nss
# fetch https://raw.githubusercontent.com/opnsense/update/master/src/bootstrap/opnsense-bootstrap.sh.in
# sh ./opnsense-bootstrap.sh.in -r 22.7

After reinstalling a few plugins and rebooting again everything worked well. Make sure to back up your configs regularly.
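
Since the bootstrap reinstalls the system, a timestamped copy of the configuration taken beforehand is cheap insurance. A minimal sketch, assuming the config lives at /conf/config.xml (the usual OPNsense location); the function name and the /root destination are just illustrative defaults.

```shell
#!/bin/sh
# Sketch only: make a timestamped copy of the OPNsense configuration
# before running anything invasive such as opnsense-bootstrap.

# backup_config [SRC] [DEST_DIR]: copy SRC into DEST_DIR under a
# timestamped name and print the path of the copy.
backup_config() {
    src=${1:-/conf/config.xml}
    dest_dir=${2:-/root}
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_dir/config-$stamp.xml"
    cp -p "$src" "$dest" && echo "$dest"
}
```

Running `backup_config` with no arguments would copy /conf/config.xml into /root; the copy should still be moved off the box, since a local backup does not help if the disk is wiped.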
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: nycspud on September 04, 2022, 12:18:19 am
Updated to 22.7, then 22.7.2. I have a multi-WAN config.
WAN links drop as soon as I check for updates. WAN links also drop as soon as I try to back up to Google Drive.
The only way to get the WAN links back up is a reboot.
I did try to disable the Gateway Switching option, but it made no difference.

I tried opnsense-bootstrap, but the WAN links dropped as soon as I ran the opnsense-bootstrap.sh command.
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: ckishappy on September 04, 2022, 01:54:24 pm
Same here, but neither opnsense-bootstrap nor disabling default gateway switching helped. I still have 100% packet loss for the updates...

I have a multi-WAN setup with 3 single WAN gateways and two gateway groups for the WANs. It used to work well, but it has been struggling since 22.7...

Please advise.

***GOT REQUEST TO AUDIT CONNECTIVITY***
Currently running OPNsense 22.7.3_2 (amd64/OpenSSL) at Sun Sep  4 13:43:45 CEST 2022
Checking connectivity for host: pkg.opnsense.org -> 89.149.211.205
PING 89.149.211.205 (89.149.211.205): 1500 data bytes

--- 89.149.211.205 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
Checking connectivity for repository (IPv4): https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Operation timed out
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Operation timed out
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Operation timed out
Unable to update repository OPNsense
Error updating repositories!
Checking connectivity for host: pkg.opnsense.org -> 2001:1af8:4f00:a005:5::
ping: UDP connect: No route to host
Checking connectivity for repository (IPv6): https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Non-recoverable resolver failure
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Non-recoverable resolver failure
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Non-recoverable resolver failure
Unable to update repository OPNsense
Error updating repositories!
***DONE***
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: ckishappy on September 05, 2022, 09:36:44 pm
Quick update: when I switch off the WireGuard VPN and disable gateway switching, the firewall firmware can be updated again. Not sure what the problem really is.
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: lehrhardt on September 07, 2022, 06:15:17 pm
Hey there,

We just updated our firewall cluster to version 22.7.3_2 and also experienced connectivity problems. We have a multi-WAN setup with 3 different upstream providers. The firewall itself lost outbound connectivity.

We ran tcpdump and saw that the packets had a source IP address of 0.0.0.0.

If we explicitly create SNAT rules on the WAN interface that translate to a public IP from the corresponding WAN IP space, traffic works.

We cross-checked with another setup running version 22.1 and did not see this behaviour there; the source IP was set to a correct address.
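
For anyone wanting to confirm the same symptom, a capture filtered on the zero source address is one way to do it. The sketch below counts such packets in `tcpdump -n` text output; the function name and the interface name em0 in the comment are placeholders, not taken from any setup in this thread.

```shell
#!/bin/sh
# Sketch only: count IPv4 packets with source 0.0.0.0 in the text
# output of `tcpdump -n`.

# count_zero_src: read tcpdump lines on stdin and print how many have
# 0.0.0.0 as the source address (with or without a source port).
# Note: DHCP discover packets legitimately use 0.0.0.0 as the source
# and will be counted as well.
count_zero_src() {
    grep -c ' IP 0\.0\.0\.0[ .]'
}

# Intended use on the firewall console (em0 is a placeholder WAN NIC):
#   tcpdump -n -i em0 -c 50 | count_zero_src
# or capture only the suspect packets directly:
#   tcpdump -n -i em0 'src host 0.0.0.0'
```

A non-zero count while running a ping or an update check from the firewall itself would match the behaviour described above.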
Title: Re: OPNsense 22.7_4: Loss of Network Connectivity
Post by: nycspud on September 08, 2022, 08:33:54 pm
Disabling WireGuard allowed me to check for updates without the WAN links dropping immediately.

Of course, when I tried to update I kept getting an error message saying that opnsense was not a valid repository, or something like that. I then ran the opnsense-bootstrap.sh script. It upgraded OPNsense to 22.7.4; then I reinstalled a few plugins and restored my config from backup.

With WireGuard enabled, the WAN links drop immediately when checking for updates. I can't even ping them from the CLI.
I do have the WireGuard kernel module installed, so that possibly has something to do with it.