WAN interface flapping with 22.1.2

Started by foxmanb, March 03, 2022, 01:45:18 PM

Same WAN flapping issue on Protectli 6-port hardware with Intel NICs.  No MAC spoofing enabled.  Suricata enabled only on LAN.  Multi-WAN setup; the primary (flapping) WAN is static IPv4 via Comcast.  Disabling "Gateway monitoring" fixes the issue, but obviously breaks the failover group and is not a long-term solution.  I have verified that the monitor IP was still up and responding during the "flaps"...



Previous versions of OPNsense did not display this issue.

Just created a forum account to report this too.

I had upgraded to 22.1.2_1 a few days ago and had not seen any real issues.

However, when I enabled IPS last night, the WAN interface immediately started flapping (going up and down), and CPU usage kept swinging from 0% to 100% and back again.

I switched off IPS and rebooted the router but this did not resolve the issue.

I had to revert to the previous version using opnsense-revert -r 22.1.1 opnsense - thanks to a poster further up this thread for that!

Immediately after reverting the WAN connection went stable again. I should mention that I am using MAC spoofing.

I have a similar problem. I just installed OPNsense today. At first it worked; then I upgraded the version I had downloaded to the newest version, and now the WAN interface goes UP DOWN UP DOWN all the time. When I configure the WAN with a static IP, that doesn't happen. On the same interface I also get this message on the console:
arpresolve: can't allocate llinfo for x.x.x.x on igb1
(x.x.x.x is the gateway address of the WAN IP.) This WAN is connected through a WLAN bridge, so x.x.x.x is not actually on the same physical network. Still, it's always reachable, and I don't get this message with pfSense.
Last but not least, I also use MAC spoofing on that interface.
Any ideas?

Further to my post above, I've since (as a result of an error during the nginx plugin install) seen a crash report.

Of interest is the dmesg.boot log:

arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
igb3: link state changed to UP
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
igb3: link state changed to DOWN
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
igb3: link state changed to UP
arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3
igb3: link state changed to DOWN


Hope this is useful information!
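
For anyone who wants to put a number on the flapping, here is a minimal Python sketch that counts link state transitions per interface in a saved dmesg dump. It assumes the lines look exactly like the excerpt above; the function name `count_link_changes` is just an illustrative choice, not part of any OPNsense tooling.

```python
import re
from collections import Counter

# Matches kernel lines such as "igb3: link state changed to DOWN".
LINK_RE = re.compile(r"^(?P<ifname>\w+): link state changed to (?P<state>UP|DOWN)$")


def count_link_changes(dmesg_text):
    """Count UP/DOWN transitions per interface in a dmesg dump."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = LINK_RE.match(line.strip())
        if match:
            counts[(match["ifname"], match["state"])] += 1
    return counts


if __name__ == "__main__":
    sample = "\n".join([
        "arpresolve: can't allocate llinfo for <WAN_Gateway_IP> on igb3",
        "igb3: link state changed to UP",
        "igb3: link state changed to DOWN",
    ])
    for (ifname, state), n in sorted(count_link_changes(sample).items()):
        print(ifname, state, n)
```

Running it against dumps taken before and after a change (driver swap, MAC spoofing off, and so on) gives a rough way to compare how often the link actually drops.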

I've been chasing this same issue for a few days now.  The WAN gets flappy when I'm using a spoofed MAC. I'm running dual DHCP WAN with CARP.  Once I upgraded, my WAN circuit started acting up.  Removing the spoofed MAC fixed it.  Still chasing an answer as to why.

March 14, 2022, 02:13:02 PM #20 Last Edit: March 14, 2022, 02:14:52 PM by ToniE
Similar problems here; the only working solution for me was to reinstall 21.7.8.

Are there any updates on this issue? I'm running OPNsense in a VM with a dedicated passthrough PCI card and I have the same problem. Is it a kernel problem? Is there a workaround? I do need MAC spoofing.

22.1.3 still has this issue for me.

Also still having issues on 22.1.3. Disabling gateway monitoring (which is not an option in my multi-WAN environments) fixes the issue, since the link is not really down.


Quote: Dpinger problem?
Yes, the dpinger logs show missed pings and latency that don't actually exist.  I have increased my latency threshold and ping frequency to compensate, but it still occurs.  Note that I have a static IP on the connection that's flapping.  If I replace the OPNsense box with a SonicWall (that's also doing gateway monitoring), the issue disappears.  There is another thread on the forum about dpinger creating static routes to the gateway monitors and possibly causing issues.

2022-03-20T09:45:45-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Alarm latency 19767us stddev 1454us loss 37%
2022-03-20T09:43:46-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 0 RTT: 20989us RTTd: 5447us Loss: 20%)
2022-03-20T09:43:46-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Clear latency 20989us stddev 5447us loss 20%
2022-03-20T09:38:21-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 1 RTT: 27705us RTTd: 11063us Loss: 37%)
2022-03-20T09:38:21-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Alarm latency 27705us stddev 11063us loss 37%
2022-03-20T05:35:41-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 0 RTT: 490329us RTTd: 1441182us Loss: 0%)
2022-03-20T05:35:41-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Clear latency 490329us stddev 1441182us loss 0%
2022-03-20T05:35:31-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 1 RTT: 506742us RTTd: 1463061us Loss: 0%)
2022-03-20T05:35:31-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Alarm latency 506742us stddev 1463061us loss 0%
2022-03-20T01:36:23-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 0 RTT: 24121us RTTd: 12174us Loss: 0%)
2022-03-20T01:36:23-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Clear latency 24121us stddev 12174us loss 0%
2022-03-20T01:35:19-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 1 RTT: 600018us RTTd: 1670674us Loss: 3%)
2022-03-20T01:35:19-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Alarm latency 600018us stddev 1670674us loss 3%
2022-03-19T13:36:19-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 0 RTT: 21108us RTTd: 7988us Loss: 0%)
2022-03-19T13:36:19-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Clear latency 21108us stddev 7988us loss 0%
2022-03-19T13:35:16-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 1 RTT: 814390us RTTd: 2117391us Loss: 0%)
2022-03-19T13:35:16-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Alarm latency 814390us stddev 2117391us loss 0%
2022-03-19T09:37:04-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 0 RTT: 23782us RTTd: 9052us Loss: 23%)
2022-03-19T09:37:04-04:00 Warning dpinger WAN_Comcast_GWv4 8.8.8.8: Clear latency 23782us stddev 9052us loss 23%
2022-03-19T09:36:23-04:00 Notice dpinger GATEWAY ALARM: WAN_Comcast_GWv4 (Addr: 8.8.8.8 Alarm: 1 RTT: 26176us RTTd: 11155us Loss: 37%)
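
To make sense of logs like these, a small Python sketch along the following lines can pair each alarm with its clear and show which limit a reading exceeds. It assumes the "Warning dpinger" line format quoted above. The function names and the limits passed to `alarm_cause` are hypothetical, so substitute the thresholds configured on your own gateway.

```python
import re
from datetime import datetime

# Matches the "Warning dpinger ..." lines in the format quoted above.
LINE_RE = re.compile(
    r"^(?P<ts>\S+) Warning dpinger (?P<gw>\S+) (?P<addr>\S+): "
    r"(?P<state>Alarm|Clear) latency (?P<lat>\d+)us "
    r"stddev (?P<sd>\d+)us loss (?P<loss>\d+)%$"
)


def parse_events(lines):
    """Return alarm/clear events sorted oldest first."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip the "GATEWAY ALARM" notices and anything else
        events.append({
            "time": datetime.fromisoformat(match["ts"]),
            "gateway": match["gw"],
            "state": match["state"],
            "latency_ms": int(match["lat"]) / 1000,
            "loss_pct": int(match["loss"]),
        })
    return sorted(events, key=lambda e: e["time"])


def alarm_windows(events):
    """Pair each Alarm with the next Clear for the same gateway."""
    open_alarms = {}
    windows = []
    for event in events:
        gateway = event["gateway"]
        if event["state"] == "Alarm":
            open_alarms.setdefault(gateway, event)
        elif gateway in open_alarms:
            start = open_alarms.pop(gateway)
            seconds = (event["time"] - start["time"]).total_seconds()
            windows.append((start["time"], event["time"], seconds))
    return windows


def alarm_cause(event, latency_ms_limit, loss_pct_limit):
    """List which of the supplied (hypothetical) limits a reading exceeds."""
    causes = []
    if event["latency_ms"] >= latency_ms_limit:
        causes.append("latency")
    if event["loss_pct"] >= loss_pct_limit:
        causes.append("loss")
    return causes
```

Feeding the log above through `parse_events` and `alarm_windows` shows how long each alarm lasted, and `alarm_cause` helps separate latency-driven alarms from loss-driven ones, which look like two different patterns in this log.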

I had the WAN flapping issue with MAC spoofing and resolved it by installing the updated FreeBSD 13 em driver, though I don't know whether you have Intel NICs as well.
To generate the driver file, I spun up a FreeBSD VM, ran pkg search intel-em-kmod, and installed the package. I then copied the if_em_updated.ko driver to /boot/modules/ as per Franco's reply in this post: https://forum.opnsense.org/index.php?topic=20905.0
I also disabled Suricata on the WAN interface and turned off flow control on all NICs.
Now running a non-flapping OPNsense 22.1.3 with WAN DHCP and MAC spoofing.
@Franco - can this driver be added to OPNsense? It seems to resolve a number of stability issues.

Interesting. I am using Intel NICs (Protectli boxes), but Suricata is only on the LAN; WAN 1 is static and WAN 2 is DHCP.  No MAC spoofing...

Looks like the driver you're using will be included in FreeBSD 13.1.

On the systems where I can reproduce this issue I also have Intel NICs, but not em. I have ixl (X710) or igc (I225).
To test this Intel driver theory, I changed to a virtual WAN interface (vtnet on KVM and virtio), and there was no issue with MAC spoofing + IDS/IPS on!

March 21, 2022, 07:59:00 PM #29 Last Edit: May 14, 2022, 11:33:50 AM by firewall
Also seeing this issue on a NUC with 6x Intel eth devices, no MAC spoofing whatsoever, and Suricata enabled on LAN/WLAN only. This seems to be a rather prevalent issue--has it been acknowledged by project staff?

EDIT - SOLVED: This issue continued for 2 months until an issue with Intel driver config on FBSD13 was called out. The resolution was posted here and the fix described worked for me (thanks @tracerrx!):
https://forum.opnsense.org/index.php?topic=27299.msg137350#msg137350

EDIT 2: Called it too soon. Still broken.