Intermittent issue on all gateways when one has packet loss.

Started by TomT, March 16, 2022, 09:42:01 AM

Hi

I had some strange network issues yesterday when one gateway had packet loss.

My WAN is PPPoE and I have a WireGuard VPN to Private Internet Access (PIA).

OPT1 LAN - 192.168.1.x
OPT3 Wifi - 10.10.1.x

All LAN devices use the WAN as their default route; all Wi-Fi devices use the PIA WireGuard VPN.
This is all working well and has been stable for quite a long time.

Yesterday all devices started having intermittent issues accessing the internet. My PC, connected to the LAN, had issues with PuTTY sessions and with SIP phones, which would disconnect and instantly reconnect. Ping would drop a couple of packets and then carry on as normal. My WAN connection has been up for 32+ days and all looks fine.

What I noticed was that the PIA gateway was reporting packet loss. Once that hit 20%, the PIA connection went down and the network devices, both LAN and Wi-Fi, had a short spell of issues. Once PIA reconnected it all worked fine until the next bout of packet loss.
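To make the threshold behaviour concrete, here is a minimal sketch of a loss-based monitor. It is not the actual gateway monitoring code: the class name, the sliding window of 10 probes, and the "down at or above the threshold" rule are all assumptions for illustration. Only the 20% figure comes from what I observed.

```python
from collections import deque

# The 20% figure is the loss level at which the PIA gateway was seen to drop.
LOSS_DOWN_THRESHOLD = 20.0


class GatewayMonitor:
    """Hypothetical sliding-window loss monitor (illustrative only)."""

    def __init__(self, window=10, threshold=LOSS_DOWN_THRESHOLD):
        # Keep only the most recent `window` probe results.
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def record(self, replied):
        """Store one probe result: True if a reply came back, False if lost."""
        self.samples.append(bool(replied))

    def loss_percent(self):
        """Percentage of lost probes in the current window."""
        if not self.samples:
            return 0.0
        lost = sum(1 for ok in self.samples if not ok)
        return 100.0 * lost / len(self.samples)

    def is_down(self):
        """Assumed rule: the gateway counts as down at or above the threshold."""
        return self.loss_percent() >= self.threshold
```

With a window of 10 probes, two lost replies are enough to reach 20% and flip the gateway to down under this assumed rule, which would fit the short, repeating outages described above.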

I disabled Gateway monitoring on the PIA gateway and that seemed to stop the issues. I've since changed the PIA server I connect to and that seems to have resolved the issue and Gateway monitoring is working fine.

While this was happening, CPU usage on my firewall was around 10%, memory usage was at 6%, and there were no issues with disk space.

Why would one gateway having packet loss affect another gateway?
Any ideas how I can investigate this?

Thanks

Hi

Does anyone have any ideas on this?
How can I find out what happens when one gateway has high packet loss?

Thanks

Without further context, maybe:

Firewall: Settings: Advanced: disable "Kill states on gateway failure".

The setting no longer exists in 22.1 because state killing was turned off for good there, which would have fixed the issue if this is indeed the cause.
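As a toy model of why this could hit the LAN as well, assume (this is not confirmed anywhere in this thread) that state killing flushes every firewall state, not only the ones routed through the failed gateway. The `State` type, the `surviving_states` helper, and the addresses are hypothetical and only illustrate that assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """One tracked connection and the gateway it leaves through (hypothetical)."""
    src: str
    dst: str
    gateway: str


def surviving_states(states, failed_gateway, kill_states=True):
    """Return the states left after `failed_gateway` is marked down.

    Assumption: with state killing enabled, the whole state table is
    flushed, so `failed_gateway` is deliberately ignored here. With state
    killing disabled, nothing is removed.
    """
    if kill_states:
        return []
    return list(states)


states = [
    State("192.168.1.10", "203.0.113.5", "WAN"),   # e.g. an SSH session from the LAN
    State("10.10.1.20", "198.51.100.7", "PIA"),    # e.g. a Wi-Fi client on the VPN
]

after_kill = surviving_states(states, "PIA", kill_states=True)
after_keep = surviving_states(states, "PIA", kill_states=False)
```

Under that assumption the LAN session over the WAN is dropped together with the VPN traffic each time the PIA gateway is marked down, and it survives once state killing is disabled.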


Cheers,
Franco

Hi Franco.

Thanks for the reply.
I'm thinking about swapping the VPN back to the original server to see if I start getting packet loss again.

If I do, are there any logs I can capture that may help narrow down what is happening?

Thanks

Well... I can still only assume:

You are on 21.7.x and you have the gateway kill activated.

If you could clear up my assumptions by naming your current version (the forum is 21.7 legacy after all) and posting a screenshot of your gateway kill option, that would help us get (any) further.


Cheers,
Franco

Hi Franco,

I'm currently running OPNsense 21.7.2_1-amd64
I do plan to update either this afternoon or tomorrow.

I've ticked 'Kill states on gateway failure' and re-enabled gateway monitoring, and will see how things go.

Thanks

@TomT the point is that you should NOT tick the kill state option.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Hi,

The checkbox is called 'Disable State Killing on Gateway Failure', which suggests that ticking it will stop the states from being killed.

However, I may be misreading this!

Ah, sorry, my bad. Of course the state killing should be disabled.

Ok, thanks for clearing it up... I'm feeling lucky about this being the right fix.


Cheers,
Franco