OPNsense Forum

English Forums => 25.7 Series => Topic started by: quirkyferret on September 11, 2025, 06:20:26 PM

Title: 25.7.2 - Wan gateway consistently going down at random intervals
Post by: quirkyferret on September 11, 2025, 06:20:26 PM
This problem started weeks ago. It has been intermittent but is getting worse: I am now losing the WAN gateway and have to restart the appliance at intervals ranging from 5 minutes to 3 hours, and I can never expect it to stay up for an entire day.

Initially I had some upstream problems with my ISP, but these appear to have been sorted out, while the OPNsense issue remains. A setup that had been working for years started showing this behavior. I was a fair bit behind on my updates, but having now gotten to 25.7.2, the issue is still present and as bad as ever.

The gateway shows as down and does not respond to ping. There is nothing in the device health logs, no issues in the internal LAN or DMZ configs, and nothing that I can see in the firewall logs.

A packet capture shows outbound packets from the WAN to various devices and absolutely no return traffic. I've replaced the Ethernet cable on the WAN-to-gateway connection. I've tried using other devices in troubleshooting sessions; none of them have problems connecting to the gateway when OPNsense is in this state, but OPNsense continues to show the problem even if the cable is physically disconnected and replaced.

When the issue is present, there is no way to recover other than rebooting the appliance. I've tried disabling the interface and re-enabling it. I've tried shutting down extraneous services (CrowdSec, VPN, etc.).

I honestly don't know what the next logical step is at this point. Reimage the entire device and rebuild all my configs from scratch? Try to harangue the ISP into getting me a packet capture from their end? Not sure how that would go. There are definitely very limited troubleshooting options in the portal I have access to, and all of those show the gateway working fine when OPNsense drops out.

The NICs are Intel, as I know some people have reported problems with Realtek ones.

Title: Re: 25.7.2 - Wan gateway consistently going down at random intervals
Post by: meyergru on September 11, 2025, 11:51:34 PM
Is ASPM disabled? FreeBSD cannot handle it well. Use the tunable "hw.pci.enable_aspm = 0" to disable ASPM if your BIOS does not support disabling it for your NICs (most do not).

If your NICs are I226, take a look at this: https://forum.opnsense.org/index.php?topic=48695
Title: Re: 25.7.2 - Wan gateway consistently going down at random intervals
Post by: xjan on September 12, 2025, 03:30:03 AM
Saw a similar issue today where the WAN became unresponsive after upgrading from 25.1.12 to 25.7.2 last week. Unfortunately it was at a remote site, so I had limited diagnostic options. About an hour before the WAN became unresponsive, I saw latency for all traffic going through the WAN interface skyrocket (500-1500 ms of extra latency compared to normal). After that, WAN connectivity dropped completely, but LAN traffic between multiple VLANs kept working. A reboot fixed the issue, but we'll see whether it becomes a recurring problem.
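
One way to catch that kind of latency ramp before the link drops is to log ping round-trip times and flag samples that jump well above a rolling baseline. Below is a minimal Python sketch, assuming ping reply lines in the usual `time=12.3 ms` form. The function names, the 20-sample window, and the 500 ms threshold (taken from the extra latency mentioned above) are illustrative choices, not anything OPNsense provides, and 192.0.2.1 is just a placeholder documentation address.

```python
import re
from collections import deque
from statistics import median

# Matches the RTT field of a ping reply line, e.g. "time=11.2 ms".
RTT_RE = re.compile(r"time=(?P<ms>\d+(?:\.\d+)?) ms")


def parse_rtt_ms(line):
    """Return the RTT in milliseconds from one ping reply line, or None."""
    m = RTT_RE.search(line)
    return float(m["ms"]) if m else None


def find_spikes(rtts, window=20, extra_ms=500.0):
    """Yield (index, rtt, baseline) for samples that exceed the rolling
    median of recent non-spike samples by at least extra_ms."""
    recent = deque(maxlen=window)
    for i, rtt in enumerate(rtts):
        if recent:
            baseline = median(recent)
            if rtt - baseline >= extra_ms:
                yield i, rtt, baseline
                continue  # keep spikes out of the baseline
        recent.append(rtt)


# Made-up sample lines for illustration only (not from a real capture).
SAMPLE_LINES = [
    "64 bytes from 192.0.2.1: icmp_seq=0 ttl=64 time=11.2 ms",
    "64 bytes from 192.0.2.1: icmp_seq=1 ttl=64 time=12.0 ms",
    "64 bytes from 192.0.2.1: icmp_seq=2 ttl=64 time=10.8 ms",
    "64 bytes from 192.0.2.1: icmp_seq=3 ttl=64 time=812.5 ms",
    "64 bytes from 192.0.2.1: icmp_seq=4 ttl=64 time=1430.0 ms",
]

rtts = [r for r in map(parse_rtt_ms, SAMPLE_LINES) if r is not None]
for idx, rtt, base in find_spikes(rtts):
    print(f"sample {idx}: {rtt:.1f} ms (baseline {base:.1f} ms)")
```

Fed with timestamped ping output against the WAN gateway, something like this would at least show whether the latency increase consistently precedes the drop.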

The NIC is an 'RTL8111/8168/8211/8411 PCI Express Gigabit Ethernet Controller' and the CPU is an Intel Celeron N3150. The os-realtek-re plugin is installed. Although Realtek isn't ideal, the machine has been stable for several years with earlier versions of OPNsense.


After the incident I dug through the logs but didn't find anything too interesting (that I could make sense of). This notice, which was repeated at least three times (in Log Files: General), did look a bit suspicious, though:
<7>[557293] sonewconn: pcb 0xfffff800438fc000 (0.0.0.0:53 (proto 6)): Listen queue overflow: 49 already in queue awaiting acceptance (208 occurrences), euid 0, rgid 0, jail 0
For now I have configured the "hw.pci.enable_aspm = 0" tunable just in case.
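
For anyone who wants to count these sonewconn notices across a longer log, here is a minimal Python sketch that pulls the fields out of such a line. The regular expression is based only on the single line quoted above, so treat the pattern and the name `parse_sonewconn` as assumptions rather than a reference for the kernel's log format. Protocol 6 is TCP, so the notice describes a TCP listener on port 53 (DNS) whose accept queue filled up.

```python
import re

# Pattern derived from the one quoted notice; other variants may exist.
SONEWCONN_RE = re.compile(
    r"sonewconn: pcb (?P<pcb>0x[0-9a-f]+) "
    r"\((?P<addr>[^\s()]+):(?P<port>\d+) \(proto (?P<proto>\d+)\)\): "
    r"Listen queue overflow: (?P<queued>\d+) already in queue awaiting acceptance"
    r"(?: \((?P<occurrences>\d+) occurrences\))?"
)

# IANA protocol numbers for the two common cases.
PROTO_NAMES = {6: "tcp", 17: "udp"}


def parse_sonewconn(line):
    """Return a dict of fields from a listen-queue-overflow notice, or None."""
    m = SONEWCONN_RE.search(line)
    if m is None:
        return None
    return {
        "address": m["addr"],
        "port": int(m["port"]),
        "protocol": PROTO_NAMES.get(int(m["proto"]), m["proto"]),
        "queued": int(m["queued"]),
        # Assume a single occurrence when no count is present in the line.
        "occurrences": int(m["occurrences"] or 1),
    }


SAMPLE = (
    "<7>[557293] sonewconn: pcb 0xfffff800438fc000 (0.0.0.0:53 (proto 6)): "
    "Listen queue overflow: 49 already in queue awaiting acceptance "
    "(208 occurrences), euid 0, rgid 0, jail 0"
)

print(parse_sonewconn(SAMPLE))
```

If the timestamps of these notices line up with the latency spike, whichever DNS service is bound to port 53 on the box would be worth a closer look. On its own the notice only says that the listener was not accepting TCP connections fast enough.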