No Internet Connection After Updating to 23.1.10

Started by Arszilla, June 25, 2023, 09:52:30 AM

Hi there,

I initially updated my OPNsense (running in a Proxmox VM) when the 23 release was made. I was either running 23.0.X or 23.1.1. I don't exactly remember the release I had prior to the update currently. Regardless, since it's been a while since I updated my OPNsense and I was performing maintenance on my homelab, updating other stuff, I decided to update OPNsense as well. The update installed a new kernel and required a reboot, which happened automatically. However, upon reboot, I lost all internet connectivity in my household.

I can verify that my ISP router has internet: when I connect to it via Ethernet and remove the routing/DNS settings pointing to my OPNsense, I can access the internet. When I revert this so all traffic goes through OPNsense again, my devices get an internal IP address, so I can reach my servers etc., but I cannot ping 1.1.1.1 or google.com. Pinging google.com produces no output at all, presumably because the domain cannot be resolved to an IP address. Pinging 1.1.1.1 returns "no route to host".

I tried downgrading to whatever I had (presumably 23.1.1) using opnsense-update -kr 23.1.1, however, since I have no internet, I cannot get the kernel/packages.

I have read that several people faced issues with the 23.1.6 update, due to the AdGuard Home plugin being broken for a bit. I have AdGuard Home in my network, but not on OPNsense as a plugin. It's on a separate VM/container (i.e. independent), and it was fully functional before the update, as I've updated it 2-3 days before I decided to update OPNsense.

I am trying to figure out why I am not getting any internet on my WAN and how to fix it, but I can't work out either the cause or the solution. So I am curious how I can downgrade/revert the update before I try something drastic, such as spinning up a new OPNsense VM and migrating the traffic there using a backup of my configuration.

Does anyone know how I can fix this, or has anyone experienced such an issue?

Thanks!

Upon checking some logs, I saw that the previous OPNsense version I had was 23.1.3_4, which got bumped up to 23.1.10_1

Is the WAN interface up?
Please check the status in your OPNsense web GUI.
Are you able to ping the router's LAN address from your OPNsense?
Do you have an "Upstream Gateway" configured in your WAN interface?
i want all services to run with wirespeed and therefore run this dedicated hardware configuration:

AMD Ryzen 7 9700x
ASUS Pro B650M-CT-CSM
64GB DDR5 ECC (2x KSM56E46BD8KM-32HA)
Intel XL710-BM1
Intel i350-T4
2x SSD with ZFS mirror
PiKVM for remote maintenance

private user, no business use

I am unsure how to verify the WAN interface is up, but checking Interface Statistics, I see WAN has 7 packets in and 0 out. LAN (I have it named VLAN0), has 21898 in, 21572 out.

For reference, my topology looks like this (even though this graph is a bit out of date, as I deleted some machines, moved 'em around etc.): https://imgbox.com/D8Sne1Hk

For further context, on my Technicolor modem, given by my ISP, the DNS is set to 10.10.10.1, which is VLAN10, and has been like that for months. I tried changing it to 192.168.0.2 (WAN Internal IP of my OPNsense), but no change.

Trying to ping 192.168.0.1 (the modem) results in "ping: sendto: no route to host", just like 1.1.1.1.

On Gateways (Single), I have WAN_GW set to 192.168.0.1 (Gateway) with 255 (upstream) priority. Similarly, on the WAN interface settings, IPv4 Upstream Gateway is set to this WAN_GW.
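
One quick sanity check for the "no route to host" reply when pinging the gateway itself: that error usually means the firewall has no connected route covering 192.168.0.1, for example because the WAN address or prefix length is not what was expected. Below is a minimal Python sketch of that check. The /24 prefix length is an assumption (the thread never states the WAN netmask), and gateway_on_link is a hypothetical helper name, not anything from OPNsense.

```python
import ipaddress

def gateway_on_link(wan_cidr: str, gateway: str) -> bool:
    """Return True if the gateway lies inside the WAN interface's subnet."""
    wan = ipaddress.ip_interface(wan_cidr)
    return ipaddress.ip_address(gateway) in wan.network

# WAN address and gateway from the posts above; the "/24" is assumed.
print(gateway_on_link("192.168.0.2/24", "192.168.0.1"))
# A mis-set /32 mask would leave the gateway outside the connected subnet.
print(gateway_on_link("192.168.0.2/32", "192.168.0.1"))
```

If the check passes for the address actually shown by ifconfig igb1, the subnet configuration itself is probably fine and the problem lies elsewhere (missing connected route, interface assignment, or the link itself).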

Checking ifconfig igb1 (WAN) on terminal, the "status" is "active".

At this point, I am seeing 2 options:

  • Reset the OPNsense installation, re-configure everything.
  • Nuke the OPNsense VM, somehow recreate the VM and configuration while potentially having no access to Proxmox console and OPNsense, due to the default settings.

I'm open for recommendations, if you have any.

June 25, 2023, 07:25:12 PM #6 Last Edit: June 25, 2023, 07:29:49 PM by Arszilla
So, I tried resetting the OPNsense installation. However, for whatever reason, I cannot access 192.168.1.1 (Default IP address for WAN) or 192.168.0.20 (IP address for LAN) in any shape or form, when I am directly connected to my ISP modem.

I also tried spinning up a new VM, however it's more problematic than anticipated. The issue is, I have OPNsense running on my Proxmox via PCI passthrough for the NICs. Since I can't run 2 VMs at the same time (using the same NICs), I had to shut the faulty OPNsense down, and spin up the new one so I can perform a clean installation. However, since Proxmox can only support "serial" connections for "qm terminal", and OPNsense uses VGA by default, I am unable to login to the VM console via Proxmox CLI, since I can't access Proxmox's Web UI when the faulty OPNsense is down.

I'm still trying to solve this problem, but I could seriously use a hand, as I am lost and tired.

Additionally, my OPNsense panel looks like the following: https://imgbox.com/k4Pz2z7X

Why aren't you running your OPNsense directly on hardware?
A VM router that depends on the host is always problematic in my opinion.

Because I do not have the hardware to run it. I only have an HP ProLiant DL360p Gen8 server. Getting a Protectli or a Lenovo m720q is a bit out of my budget. Plus, this setup has been working since November 2022. However, this update screwed me over.

I should have backed up/snapshotted my VM before updating. Sadly, that's on me.

So a small status update: after resetting my router, I decided to re-assign my interfaces. As a result WAN changed from a static IP to a DHCP IP (192.168.0.20 instead of 192.168.0.2). As a result, I became able to ping somewhat. Instead of having "no route to host", I just get packet losses now.

Quote from: Arszilla on June 25, 2023, 09:40:37 PM
So a small status update: after resetting my router, I decided to re-assign my interfaces. As a result WAN changed from a static IP to a DHCP IP (192.168.0.20 instead of 192.168.0.2). As a result, I became able to ping somewhat. Instead of having "no route to host", I just get packet losses now.

Why do you have a 192.168.0.x WAN?  What are you pinging and from where?  Can you post a network diagram?

While I agree with the previous poster about this being one of the reasons to run your router on its own hardware, I can understand not having the budget.  Do you have access to a consumer router that you can hook up temporarily until you get your VM fixed?  You should be able to pick one up used for cheap since all you need is basic connectivity.

Quote from: CJRoss on June 26, 2023, 01:08:02 PM
Why do you have a 192.168.0.x WAN?  What are you pinging and from where?  Can you post a network diagram?

While I agree with the previous poster about this being one of the reasons to run your router on its own hardware, I can understand not having the budget.  Do you have access to a consumer router that you can hook up temporarily until you get your VM fixed?  You should be able to pick one up used for cheap since all you need is basic connectivity.

First off, I've shared my (somewhat outdated) topology in one of my previous replies: https://imgbox.com/D8Sne1Hk

Secondly, I realized that my ISP modem had a "bridge mode", which I have toggled now. As a result, my WAN is now my public IP address. The rest of my LAN/VLANs etc. are as follows:

  • WAN: Public IP from ISP (igb1 - DHCP)
  • LAN: 10.10.0.1/24 (igb0 - Static)
  • VLAN10: 10.10.10.1/24 (vlan01 - static) (tied to igb0)
  • VLAN20: 10.10.20.1/24 (vlan02 - static) (tied to igb0)
  • VLAN30: 10.10.30.1/24 (vlan03 - static) (tied to igb0)
  • VLAN40: 10.10.40.1/24 (vlan04 - static) (tied to igb0)
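
As a sanity check on the addressing plan above, here is a small Python sketch that confirms none of the internal subnets overlap. It only uses the addresses listed in this post; find_overlaps is a hypothetical helper name, and the WAN is left out because it is a DHCP-assigned public IP.

```python
import ipaddress
from itertools import combinations

# Interface addresses as listed above (WAN omitted: DHCP public IP).
INTERFACES = {
    "LAN": "10.10.0.1/24",
    "VLAN10": "10.10.10.1/24",
    "VLAN20": "10.10.20.1/24",
    "VLAN30": "10.10.30.1/24",
    "VLAN40": "10.10.40.1/24",
}

def find_overlaps(interfaces):
    """Return pairs of interface names whose subnets overlap."""
    nets = {
        name: ipaddress.ip_interface(cidr).network
        for name, cidr in interfaces.items()
    }
    return [
        (a, b)
        for a, b in combinations(nets, 2)
        if nets[a].overlaps(nets[b])
    ]

print(find_overlaps(INTERFACES))  # prints []
```

An empty list means the plan itself is consistent, so overlapping subnets can be ruled out as the cause of the missing connectivity.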

Even with the change to WAN, I have no internet access. I can share my firewall rules to bring further clarity:

As I see it, I have 2 options currently:

  • Temporarily repurpose a laptop as an OPNsense box, recreate the current configuration on it by hand (I can't really import it due to interface differences), recreate the VM in Proxmox in the meantime, then reinstate the VM and remove the laptop from the topology.
  • Factory reset the pre-existing OPNsense, set the WAN and LAN values to the prior values, somehow recreate the VLANS.

Do note that no device is in LAN or uses LAN (10.10.0.1/24) directly.

Lastly, under the new WAN IP, I tried to run "traceroute 1.1.1.1" on my OPNsense directly (via console/terminal) and got "traceroute: findsaddr: failed to connect to peer for src addr selection".
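
That traceroute error typically suggests the system could not pick a source address for the destination, which is what happens when there is no usable default route. A rough Python sketch for checking this from a saved routing table follows. It assumes the text comes from running netstat -rn -f inet on the OPNsense shell and that the columns are Destination, Gateway, Flags, Netif; default_routes is a hypothetical helper name.

```python
def default_routes(netstat_output: str):
    """Return (gateway, interface) pairs for default routes.

    Assumes FreeBSD-style `netstat -rn -f inet` text, where each route
    line has the columns: Destination, Gateway, Flags, Netif.
    """
    routes = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Only lines whose destination column is literally "default" count.
        if len(fields) >= 4 and fields[0] == "default":
            routes.append((fields[1], fields[3]))
    return routes
```

An empty result would mean there is no default route at all, which would be consistent with both the traceroute error and the earlier "no route to host" replies.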

I've been able to repurpose my laptop to act as an OPNsense firewall for the time being, so I can connect to my internal infra again. I will be nuking my VM in Proxmox and hoping to restore the network. However, I still haven't figured out what the problem with my connection is, though I am beginning to suspect it's my firewall rules.

Any insight on this?

After a successful re-install and reconfiguration, my OPNsense is back up and is running as intended. I am not sure what the cause for all this pain and suffering was in the first place, but I am slowly (but surely) rewriting my firewall rules.