OPNsense Forum
Archive => 22.1 Legacy Series => Topic started by: binaryanomaly on January 31, 2022, 11:50:27 am
-
Hi,
For quite a while I have been experiencing occasional connection interruptions and can observe packet loss on OPNsense (not just since 22.1). I suspect my ISP, but I do not have enough evidence to approach them yet.
I have already activated gateway monitoring. Interestingly, packet loss is displayed as 0.0% in System -> Gateways -> Single, although Reporting -> Health -> Quality displays packet loss for the gateway.
Which one is correct?
How can I investigate this further in OPNsense?
Thanks
-
Not in OPNsense, but I run a Smokeping instance to keep an eye on ISP issues. It's hard to argue with a widely accepted graphical measurement.
Bart...
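For a quick cross-check next to the graphs, the loss figure can also be recomputed from a plain ping run. A minimal sketch, assuming the usual summary line printed by ping on Linux or FreeBSD; the function name packet_loss_percent is made up for illustration:

```python
import re

# Matches the Linux ("9 received") and FreeBSD ("9 packets received") summary lines.
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss_percent(ping_output: str) -> float:
    """Return packet loss in percent, computed from the sent/received counts."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return 100.0 * (sent - received) / sent
```

Feed it the captured output of something like ping -c 100 against the gateway and log the result with a timestamp; a long series of these is easier to compare against the two OPNsense views than a single reading.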
-
Thanks a lot! Set it up.
Would you mind sharing your config?
-
This is the abridged content of my targets file /etc/smokeping/config.d/Targets
*** Targets ***
probe = FPing
menu = Top
title = Network Latency Grapher
+ UK
menu = UK
title = Britain
++ BBC
menu = BBC
title = BBC
host = www.bbc.co.uk
+ US
menu = US
title = United States
++ RedHat
menu = Red Hat
title = Red Hat
host = www.redhat.com
-
Thanks, so pretty standard config.
+ Remote
menu = Remote
title = Remote check
++ cloudflare
menu = Cloudflare
title = 1.1.1.1 check
#probe = FPingNormal
host = 1dot1dot1dot1.Cloudflare-dns.com
My graph looks like the one below; I'm not sure why I'm getting "u" as the unit on the y-axis.
Anything of concern?
Running fping -s 1.1.1.1 on the command line returns 2.43 ms avg, which I do not recognize in the graph.
-
Those are SI prefixes (u = micro = 1/1,000,000). Either you live in the Cloudflare building or you have some DNS issue that returns a local host for an external hostname.
Try something not behind a CDN like bbc.co.uk.
Bart...
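To make the unit question concrete, here is a small sketch of the conversion, assuming the axis values are seconds with an SI prefix; the table and function names are made up for illustration:

```python
# Multipliers that turn a value in <prefix>seconds into milliseconds.
# Only the prefixes likely to show up on a latency axis are listed.
TO_MS = {"": 1e3, "m": 1.0, "u": 1e-3, "n": 1e-6}


def to_milliseconds(value: float, prefix: str) -> float:
    """Convert a latency given in <prefix>seconds to milliseconds."""
    return value * TO_MS[prefix]
```

A reading of 500 u is therefore about half a millisecond, while the 2.43 ms reported by fping would sit at 2.43 m on the axis, which is why a graph in the u range points at a host much closer than 1.1.1.1.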
-
Thanks, indeed there was something wrong with the host resolution, using the IP now.
Also it seems to show some packet loss, I'll have to investigate further.
Thanks for your help so far 👍🏻
-
The sweet spot is jitter around 15-30 ms, since that's well within the domain of your ISP. Your internal latency will be around 2-3 ms and more affected by the quality (or lack thereof) of your infrastructure.
I get a solid 15 ms to sites behind CDN and I don't get worried until that doubles.
-
The results are pretty good. Consistently 2-3 ms end to end.
After digging deeper I have now found an rx_no_dma_resources issue on the NIC.
It looks as if this could be the root cause of the intermittent issues I am experiencing. I have no idea why this suddenly appeared; it may be related to a kernel upgrade or similar on the vmhost itself.
I'm still confused, though, that OPNsense reports packet loss in the Reporting/Health/Quality section but not in System/Gateway/Single.
@Franco: This might be a bug or I am not getting how this is intended to work.
-
After digging deeper I now found a rx_no_dma_resources issue on the NIC.
It looks as if this could be the root cause of the intermittent issues I am experiencing.
Where did you find this error?
-
If I recall correctly this was an error message in dmesg of the vmhost.
The root cause of all of this packet loss was a bad Ethernet cable. It took me weeks to identify, though, since there were no clear error messages or indications and I only had breaking problems intermittently.
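For anyone chasing a similar intermittent fault, comparing NIC counters over time can narrow things down faster than waiting for a log message. A minimal sketch, assuming the driver exposes its statistics as "name: value" lines the way ethtool -S <interface> prints them on Linux (that the counter mentioned above is available there is an assumption; it was seen in dmesg here). The function names are made up for illustration:

```python
def parse_counters(text: str) -> dict:
    """Parse 'name: value' lines into a dict, skipping headers and non-numeric values."""
    counters = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        value = value.strip()
        if sep and value.isdigit():
            counters[name.strip()] = int(value)
    return counters


def increased(before: str, after: str) -> dict:
    """Return the counters that grew between two snapshots, with their deltas."""
    old, new = parse_counters(before), parse_counters(after)
    return {
        name: new[name] - old.get(name, 0)
        for name in new
        if new[name] > old.get(name, 0)
    }
```

Take one snapshot, wait until the next interruption, take another, and look only at the error and drop counters in the result; traffic counters will of course grow as well.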