Losing WAN connection periodically

Started by jstarta, August 21, 2025, 09:38:14 PM

The fw WAN is likely not in a /30.
So let's ask: OP, what subnet is your FW WAN getting from DHCP (or, now, whatever OS is connecting to the ISP)?

Not sure what version of OPNsense you are running, but note that the FreeBSD 14.3-RELEASE notes list a fix for the igc driver.
https://www.freebsd.org/releases/14.3R/relnotes/



Quote from: BrandyWine on August 30, 2025, 06:21:37 AMThe fw WAN is likely not in a /30.
So let's ask.... OP, what subnet is your FW WAN getting from dhcp, or now whatever OS is connecting to the ISP?

It's a /22, but I have a static IP, so I'll always get the same IP from the ISP.

Quote from: BrandyWine on August 30, 2025, 08:26:50 AMNot sure what version of OPNsense you are running, but duly noted freeBSD 14.3-RELEASE has a noted fix for igc driver.
https://www.freebsd.org/releases/14.3R/relnotes/



It's on the latest.



I'll have a look at those links you've sent. So far the switch to running OPNsense as a VM under Proxmox has been flawless.

September 02, 2025, 06:32:38 PM #34 Last Edit: September 02, 2025, 06:34:51 PM by BrandyWine
Quote from: jstarta on August 31, 2025, 11:01:55 AMIts /22, but I have a static IP so I'll always get the same IP from the ISP
Is that /22 an rfc1918 block, or ISP public? Do you also get DHCP?
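
For anyone wanting to answer the RFC 1918 question quickly, here is a minimal sketch using Python's standard ipaddress module. The function name classify_wan and the example addresses are hypothetical, not taken from this thread. It also checks the 100.64.0.0/10 shared range (RFC 6598) that CG-NAT deployments commonly use.

```python
import ipaddress

# The three RFC 1918 private blocks.
RFC1918 = [
    ipaddress.ip_network(n)
    for n in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
]
# RFC 6598 shared address space, commonly used for CG-NAT.
CGNAT = ipaddress.ip_network("100.64.0.0/10")


def classify_wan(cidr):
    """Classify an IPv4 WAN address given as 'address/prefix'.

    Returns 'rfc1918', 'cgnat', or 'other'. 'other' only means the
    address is outside those ranges; it is not proof that the address
    is publicly routable.
    """
    iface = ipaddress.ip_interface(cidr)
    if iface.version != 4:
        raise ValueError("this sketch only handles IPv4")
    if any(iface.ip in net for net in RFC1918):
        return "rfc1918"
    if iface.ip in CGNAT:
        return "cgnat"
    return "other"
```

Call it with the address and prefix shown on the WAN interface, for example classify_wan("203.0.113.10/22") with a documentation-range address standing in for the real one.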

Quote from: BrandyWine on September 02, 2025, 06:32:38 PM
Quote from: jstarta on August 31, 2025, 11:01:55 AMIts /22, but I have a static IP so I'll always get the same IP from the ISP
Is that /22 an rfc1918 block, or ISP public? Do you also get DHCP?

No, it's not. When I initially went with this ISP I had them disable CG-NAT (In case that's what you were thinking it might be).

I've had 5 days uptime with zero problems since installing it as a VM (Under Proxmox), so it seems it's an issue with BSD drivers.

September 03, 2025, 04:49:18 AM #36 Last Edit: September 03, 2025, 04:53:14 AM by BrandyWine
Quote from: jstarta on September 02, 2025, 09:34:25 PMI've had 5 days uptime with zero problems since installing it as a VM (Under Proxmox), so it seems it's an issue with BSD drivers.
Using the same hardware?
When did you get to the newer OPNsense 25.7.x (FreeBSD 14.3)?

Maybe there's a conflict between the FreeBSD kernel driver and the controller firmware. There must be millions of devices with the Intel i226 running FreeBSD 14.3 (that number is a guess).

I don't recall from the thread: did you try an older version of OPNsense (https://docs.opnsense.org/releases.html)? That would have given you a definitive answer on whether FreeBSD 14.3 with the updated igc driver was the cause.

I want to chime in and say I have had the same problem of WAN experiencing 100% Packet Loss periodically since upgrading to 25.7. I have two CWWK mini PC routers, one with an N100 CPU and the other with an N350 CPU. Both come with four i226 ports. I have a Verizon Fios connection (residential).

The 100% packet loss happens about once a day. Interestingly, when the N350 mini PC router experienced 100% packet loss, the machine continued to function on the LAN side, but with no WAN connection. I reviewed all the logs but couldn't find anything wrong. When the N100 mini PC router experienced 100% packet loss, it would reboot itself, and the WAN connection would come back.

I have tried all the suggestions in this thread, and nothing worked. Here are the things I have tried:

  • Setting hw.pci.enable_aspm to 0 didn't make a difference.
  • Setting hw.igc.eee_setting to 0 didn't make a difference.
  • I have checked the WAN DHCP leases, and they seemed to work as expected.
  • I have turned off VT-d in the BIOS, but it didn't make a difference.
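
Since a tunable set in the wrong place may silently not apply, it can be worth confirming the running values after a reboot. Below is a minimal sketch that parses the text printed by `sysctl hw.pci.enable_aspm hw.igc.eee_setting` and reports anything that is not 0. It assumes both names are readable through sysctl on your build and that the output uses the usual "name: value" layout; the helper names parse_sysctl and unapplied are hypothetical.

```python
# Values the list above tried to set (taken from this thread).
EXPECTED = {
    "hw.pci.enable_aspm": "0",
    "hw.igc.eee_setting": "0",
}


def parse_sysctl(text):
    """Turn 'name: value' lines into a dict, ignoring other lines."""
    values = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            values[name.strip()] = value.strip()
    return values


def unapplied(text, expected=EXPECTED):
    """Return tunables whose running value differs from the expected one.

    A value of None means the name did not appear in the output at all.
    """
    actual = parse_sysctl(text)
    return {
        name: actual.get(name)
        for name, want in expected.items()
        if actual.get(name) != want
    }
```

An empty result from unapplied() means both settings were in effect, which would support the conclusion that they really made no difference.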

For my next troubleshooting step, I will install Proxmox on one of the mini PCs and virtualize OPNsense to see if the problem persists.

Quote from: pdhsker on September 06, 2025, 04:10:06 PM[...]
I have checked the WAN DHCP leases, and it seemed to work as expected.[...]

How about ARP? Don't just check presence - check that the MACs are correct. Incorrect ARP is unlikely outside of a bridged link with multiple devices, but you never can tell.
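
A minimal sketch of that check: it parses `arp -an` output and compares the gateway's entry against the MAC you expect. The line layout in the regex ("? (ip) at mac on ifname ...") is an assumption about FreeBSD's output, and the function names are hypothetical.

```python
import re

# Assumed FreeBSD `arp -an` layout, e.g.:
#   ? (203.0.113.1) at 00:11:22:33:44:55 on igc1 expires in 1187 seconds [ethernet]
ARP_LINE = re.compile(
    r"\((?P<ip>[0-9.]+)\) at "
    r"(?P<mac>[0-9a-fA-F:]{17}|\(incomplete\)) on (?P<ifname>\S+)"
)


def parse_arp(text):
    """Map IP -> (mac, interface). A later duplicate IP overwrites an earlier one."""
    return {
        m["ip"]: (m["mac"].lower(), m["ifname"])
        for m in ARP_LINE.finditer(text)
    }


def check_gateway(text, gateway_ip, expected_mac):
    """Return 'ok', 'mismatch', 'incomplete', or 'missing' for the gateway entry."""
    entry = parse_arp(text).get(gateway_ip)
    if entry is None:
        return "missing"
    mac, _ifname = entry
    if mac == "(incomplete)":
        return "incomplete"
    return "ok" if mac == expected_mac.lower() else "mismatch"
```

The expected MAC has to come from a known-good moment, for example noted down while the WAN is working, so that it can be compared against the table captured during an outage.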

Quote from: pdhsker on September 06, 2025, 04:10:06 PMI have tried all the suggestions in this thread, and nothing worked.

An ARP issue would be very strange considering DHCP is working OK.

Do you have IPv6 enabled on the WAN side? If so, try disabling IPv6.

Quote from: BrandyWine on September 07, 2025, 08:53:57 AM
Quote from: pdhsker on September 06, 2025, 04:10:06 PMI have tried all the suggestions in this thread, and nothing worked.

An arp issue would be very strange considering DHCP is working ok.

Do you have IPv6 enabled on WAN side? If so try disabling IPv6.


I did have IPv6 enabled. I will try to disable it and see if that makes any difference.

On the other hand, I have installed Proxmox on the N350 mini PC router, then OPNsense as a VM. It has been more than 24 hours, and the router has been stable so far.

Quote from: pdhsker on September 08, 2025, 04:36:18 AM
Quote from: BrandyWine on September 07, 2025, 08:53:57 AM
Quote from: pdhsker on September 06, 2025, 04:10:06 PMI have tried all the suggestions in this thread, and nothing worked.

An arp issue would be very strange considering DHCP is working ok.

Do you have IPv6 enabled on WAN side? If so try disabling IPv6.


I did have IPv6 enabled. I will try to disable it and see if that makes any difference.

On the other hand, I have installed Proxmox on the N350 mini pc router, then Opnsense as a VM. It has been more than 24 hours, and the router has been stable so far.

Making it a VM isn't really fixing the issue. It's a workaround with an impact on performance, plus now you have to manage Proxmox.

Quote from: BrandyWine on September 08, 2025, 07:15:48 AMMaking it a VM isn't really fixing the issue. It's a workaround with an impact on performance, plus now you have to manage Proxmox.

I know, but my purpose was to find out whether the problem was caused by mistakes in my settings or by the driver. The OPNsense VM has been stable for five days, so I am pretty sure the problem was with the driver. Surprisingly, I didn't see many people reporting this problem, so I wonder if a combination of driver issues and my settings causes it.

Quote from: jstarta on August 27, 2025, 01:54:46 AMroot@OPNsense:~ # pciconf -lbcevV igc1
igc1@pci0:89:0:0:       class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x1462 subdevice=0xb0b1

Ok, your 226V looks weird. subvendor 1462? 1462 is vendor MSI, and subdevice b0b1?

Wow, that's weird. MSI does not make the 226V, so their ID should not be used at all. Drivers do depend on hardware IDs, so perhaps that's your issue.

All the 226Vs that I have seen thus far report as a full Intel device: "class=0x020000 rev=0x04 hdr=0x00 vendor=0x8086 device=0x125c subvendor=0x8086 subdevice=0x0000"
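
To compare several ports or machines, here is a minimal sketch that pulls the ID fields out of a `pciconf -l` line like the ones quoted above and flags a subvendor that differs from the vendor. The function names are hypothetical. Note that a differing subvendor is not by itself proof of a fault, since board makers can program their own subsystem IDs; the check only makes the difference easy to spot.

```python
import re

# Matches key=0xHEX pairs such as vendor=0x8086 or subdevice=0xb0b1.
FIELD = re.compile(r"(\w+)=(0x[0-9a-fA-F]+)")


def parse_pciconf_line(line):
    """Return the hex fields of one `pciconf -l` line as a dict of ints."""
    return {key: int(value, 16) for key, value in FIELD.findall(line)}


def subvendor_differs(line):
    """True when the subvendor ID is not the same as the vendor ID."""
    ids = parse_pciconf_line(line)
    return ids["vendor"] != ids["subvendor"]
```

Fed the igc1 line from the quote above, subvendor_differs() reports a difference (0x8086 versus 0x1462); fed the all-Intel line, it does not.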

I would go to the Intel i226 Firmware thread and flash the 226s. The 226s are still physical devices no matter what runs above that layer. After that I would think about just going back to native OPNsense, no VM.

Quote
Field          Description
Subvendor ID   Identifies the manufacturer of the PCI device.
Subdevice ID   Identifies the specific model or version of the device.

These identifiers are crucial for device drivers and operating systems to correctly recognize and manage PCI devices.