After 3 days, LAN interface completely drops connectivity

Started by dfw3xam1n3r, April 10, 2023, 03:51:01 PM

IPv4 and IPv6 stop working after three days. Nothing short of a reboot resolves it; after a reboot, it comes back up. Is anyone else having this issue?
OPNsense 24.7.7  - QEMU/KVM (Ubuntu), i9-9900K 16 core @ 5ghz, 16GB RAM, 64GB SSD, 2 dedicated SFP+ NICs

Is this virtualized? What make/model are the dedicated NICs?
OPNsense 24.7.7 running on:
Dell Optiplex 3050
Intel I5-7600 @ 3.5Ghz (4 Cores)
Intel I350-T4 Nic
8G DDR4
256G SSD

Yes, virtualized. Both NICs are (using lshw -C network):

product: RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
vendor: Realtek Semiconductor Co., Ltd.

Are you running Zenarmor? That could well be the reason for this, as there are some netmap issues being worked on.

I am, in fact, yes. OK, that makes more sense now that you mention it. I'm hoping that's all it is.

Check out this thread, in particular this post from Franco: https://forum.opnsense.org/index.php?topic=32114.msg161656#msg161656

You can install the netmap testing kernel from the OPNsense command line with the command outlined in that post, then make sure Zenarmor is configured to run in "Routed Mode (L3 Mode, Reporting + Blocking) with emulated netmap driver" to see if that fixes your issue.
I had the same issue with connection stalls after 2-3 days on all Zenarmor protected interfaces and am currently trying it out as well.

Gotcha, really appreciate the help. Installed the update and restarted. We'll see how it goes! Thanks!

April 11, 2023, 12:44:38 PM #7 Last Edit: April 11, 2023, 01:29:44 PM by dfw3xam1n3r
Well, that lasted less than a day: LAN dropped again after applying the netmap patch. :/ I realized I forgot to switch Zenarmor from the native to the emulated driver, so we'll see what happens from here.

Especially watch out for your "MBUF Usage" on the OPNsense Dashboard. If you notice it increasing very quickly (by several thousand over the span of an hour), you might be suffering from another MBUF leak, which should however be fixed in the latest build from Franco.
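
If you'd rather not eyeball the dashboard, here is a rough Python sketch of the same check. It assumes you collect the output of `netstat -m` yourself at intervals and that its summary line looks like "33240/3840/37080 mbufs in use (current/cache/total)". The function names and the 2000-per-hour threshold are made up for illustration, so adjust them to whatever "several thousand per hour" means on your box.

```python
import re

# Assumed shape of the summary line from `netstat -m` on FreeBSD:
#   "33240/3840/37080 mbufs in use (current/cache/total)"
MBUF_LINE = re.compile(r"^\s*(\d+)/(\d+)/(\d+) mbufs in use", re.MULTILINE)


def parse_mbufs_in_use(netstat_output):
    """Return the 'current' mbuf count from captured `netstat -m` output."""
    match = MBUF_LINE.search(netstat_output)
    if match is None:
        raise ValueError("no 'mbufs in use' line found")
    return int(match.group(1))


def growth_per_hour(samples):
    """Average mbuf growth per hour between the first and last sample.

    `samples` is a list of (unix_seconds, mbuf_count) tuples, oldest first.
    """
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must span a positive time interval")
    return (c1 - c0) * 3600.0 / (t1 - t0)


def looks_like_leak(samples, threshold_per_hour=2000.0):
    """True if usage grows faster than the (hypothetical) threshold."""
    return growth_per_hour(samples) > threshold_per_hour
```

Feed it two or more samples taken some minutes apart. A steady climb well above the threshold would point at a leak, while a flat or slowly wandering count would not.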

Yeah so far that's pretty low. But I didn't pay attention to it before, so not sure what it looked like before these changes.

So far it's ok, I think (attached).

Well darn it, it dropped again early this morning, even with all of that in place.

Realistically, I wouldn't virtualize your firewall, as you end up chasing these edge issues. Combined with Realtek NICs, which have known issues with BSD, this is just a recipe for constant troubleshooting.

I had a similar issue some days ago. The bare-metal LAN interface (Intel) of my VoIP network lost carrier and dropped its IPv4 address. A reboot solved the issue, but I don't know whether it is related to yours.
OPNsense 24.7.11_2-amd64

We have the exact same issue running 23.1 on a WatchGuard M370 appliance.

The LAN port appears up, but connectivity is lost and it is not visible from the LAN network, even with arp -a.

The problem happens infrequently, roughly every 7-14 days, and is very difficult to track down. The VPN and WAN interfaces keep working and the firewall management is accessible when this happens (through the VPN). Zenarmor is activated, but it is not really doing much besides reporting: Routed Mode (L3 Mode, Reporting + Blocking) with native netmap driver.

Will try the emulated driver to see if that fixes the issue. The logs show nothing noteworthy from the time the issue happened.

Just installed the latest 23.1.6 patches, but I'm not feeling optimistic since this has happened multiple times already.

Any ideas on tracking down the issue?