Packet Loss and Stability help

Started by brybro, June 22, 2024, 05:40:52 PM

June 22, 2024, 05:40:52 PM Last Edit: June 22, 2024, 05:55:28 PM by brybro
Hello All,

Preface: Please forgive my vocabulary and correct me as necessary, I am open to feedback in every capacity
- I am willing to restart from scratch
- I am willing to read / test / learn
- I will always assume you have best intentions for me
- I want to stick with OPNsense and get good at using it (not just copy others), but it is a hobby level lifestyle for me

Novice here. I have spent 10s-50s of hours over the last year reading, watching videos, and playing with OPNsense on a fresh-install box, inspired at first by Network Chuck and Home Network Guy, and I am now enjoying the rabbit hole. My setup mainly follows the Home Network Guy guide: https://homenetworkguy.com/how-to/set-up-a-fully-functioning-home-network-using-opnsense/ (I would love to get a full VLAN and LAGG setup going, but I have not accomplished that yet.)

Running:
OPNsense 24.1.9_3-amd64
FreeBSD 13.2-RELEASE-p11
OpenSSL 3.0.14


Network architecture:
ARRIS SURFboard SB8200 Modem - OPNsense Router (Fanless Desktop Computer Mini PC Intel Core i7 7500U 16GB RAM 128GB SSD) - Smart Unmanaged Switch (TL-SG108E) - hardwired to devices right now

ISP: Breezeline broadband 1000 Mbps

I have been trying to solve connection drops, which I would describe as stability problems.

I will post anything that may be helpful but really I am at a loss on where to start with this forum post.

Situation:
- Basic operations install and updates on router all functional
- Everything connects to WAN and "internet" for all things as desired (all greens in the services dashboard)
- Everything runs, but it seems to be unstable. A couple of example scenarios:
-- 1) Start Netflix: it loads perfectly and quickly, but buffers often
-- 2) PlayStation 5 (NAT 2 on my static IP address): sometimes a game of Call of Duty plays with no problem, but often it stutters, lags, and drops the connection
-- 3) PC (static IP) playing a game on Steam: it loads and plays fine, then all of a sudden drops with a server connection error message

After reading other posts, I used the VSee network test to try to illustrate my situation:


The drops above happened when I loaded Netflix and then again when I pressed play on a show to stream.
(FYI, I see similar graphs when gaming too.)

I appreciate any help to further evaluate and try to solve this.

Sincerely,
Brybro

June 22, 2024, 05:47:07 PM #1 Last Edit: June 22, 2024, 05:49:41 PM by brybro
Also, if helpful: I saw in another post a suggestion to check gateway monitoring. I am not sure what to make of this yet, if anything, but here is a first snapshot:



BryBro

OPNsense doesn't lose packets.

What you're describing there is a functional setup with an upstream connectivity issue.

PingInfoView from Nirsoft can help you monitor both your WAN GW and an upstream server such as 8.8.8.8.

When experiencing outages keep track of it in a spreadsheet, verify the status of your modem if applicable and see if it's not synchronized, and finally talk to your ISP about it.

That graph you showed, Quality | WAN_DHCP, shows the metrics of a probe that checks the GW provided by your ISP via DHCP on WAN. By default, when using DHCP on WAN, this is what it checks; it should be the directly connected 1st L3 HOP.

loss - packets lost, meaning OPN sent an ICMP packet but didn't receive a response
delay - the time it took for the reply to come back to OPN
stddev - the standard deviation of the delay, i.e. how much the reply times vary around the average (jitter)
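
To make the three metrics concrete, here is a minimal Python sketch. It is not how OPNsense itself computes them; the function name `summarize_probes` and the sample values are made up for illustration, and treating stddev as the population standard deviation of the reply times is an assumption.

```python
import statistics

def summarize_probes(rtts_ms):
    """Summarize a window of probes.

    rtts_ms: one entry per probe sent; a float round-trip time in ms,
    or None when no reply came back (hypothetical input format).
    """
    replies = [r for r in rtts_ms if r is not None]
    sent = len(rtts_ms)
    # loss: share of probes that never got a reply
    loss_pct = 100.0 * (sent - len(replies)) / sent if sent else 0.0
    # delay: average round-trip time of the replies that did arrive
    delay_ms = statistics.fmean(replies) if replies else None
    # stddev: spread of the reply times around that average (assumed definition)
    stddev_ms = statistics.pstdev(replies) if replies else None
    return {"loss_pct": loss_pct, "delay_ms": delay_ms, "stddev_ms": stddev_ms}

# Made-up sample: 6 probes sent, 2 of them unanswered
summary = summarize_probes([10.0, None, 12.0, 14.0, None, 12.0])
print(summary)
```

With two of six probes unanswered, the loss works out to roughly 33%, the same order as the figure in the graph.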

What it shows here, in your specific case, is that you are experiencing 33% packet loss when probing your GW.

As new mentioned this looks like upstream connectivity issue e.g, behind your OPNsense.

Regards,
S.
Networking is love. You may hate it, but in the end, you always come back to it.

OPNSense HW
APU2D2 - deceased
N5105 - i226-V | Patriot 2x8G 3200 DDR4 | L 790 512G - VM HA(SOON)
N100   - i226-V | Crucial 16G  4800 DDR5 | S 980 500G - PROD

Thank you both for the great info and help.

Ok, I have PingInfoView downloaded and 1st run going now
- I have the gateway address from what OPNsense told me it was in the system>gateway>config area (looks like the gateway is correct on PingInfoView as atlanticbb.net)
- I am comparing that to dns.google 8.8.8.8
-- currently 0% failed on both (76 and counting attempts set every 10 seconds)

I will monitor and see what happens there.

For clarity in my head, what I am understanding from your replies:
- I seem to have a functional setup of router / OPNsense
- upstream means in this case before we get to my router, in other words maybe an ISP issue

I'm researching the term L3 HOP now to make sure I understand that terminology but thus far I think it means the very next site my home setup connects to on the ISP side (?).

- Just a little confused on the word "Behind" in this phrase: " looks like upstream connectivity issue e.g, behind your OPNsense."

already thankful for both of you,
BryBro


An L3 HOP is effectively a router: a device performing routing, which operates on the 3rd layer of the OSI model (L3). HOP in this context means a device in the path from source to destination.

When you do a trace to 8.8.8.8, the devices between the source and the destination are all L3 HOPs through which the packet traveled to reach its destination.

Quote from: brybro on June 22, 2024, 06:39:46 PM
- Just a little confused on the word "Behind" in this phrase: " looks like upstream connectivity issue e.g, behind your OPNsense."

Behind from the point of view of your LAN: you source traffic from inside the LAN and it goes LAN > OPN > ISP, so in this context I meant the ISP side is behind OPNsense.

Regards,
S.

If you have a Windows PC at home, here are some more useful troubleshooting tools:
https://www.clouddirect.net/knowledge-base/KB0011455/using-traceroute-ping-mtr-and-pathping

Similar tools are also available on Linux/Mac.

Quote from: Seattle2k on June 25, 2024, 10:31:13 PM
If you have a Windows PC at home, here are some more useful troubleshooting tools:

Thank you for the suggestion, I will read up

Following up after a few days of using PingInfoView (hours of data over multiple days):
- Not too many failures overall, which is a good thing for connectivity
- I added my router address to see if I was losing connection there, and that has had 0 failed requests
- I have had no more than 1-2% failed in any data collection session (the typical % failed over this last weekend is much lower, closer to 0.5% or less)
-- When I do have a failure, both my local ISP GW and dns.google fail at the same time
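
For anyone wanting to check the same thing from exported ping logs, here is a minimal Python sketch of the comparison. The `(timestamp, ok)` tuple format, the function names, and the sample data are all made up for illustration; this is not PingInfoView's export format.

```python
def failed_timestamps(samples):
    """Return the set of timestamps at which a ping to this target failed."""
    return {ts for ts, ok in samples if not ok}

def failure_pct(samples):
    """Percentage of failed pings in a session."""
    return 100.0 * len([1 for _, ok in samples if not ok]) / len(samples)

# Hypothetical 10-second samples for three targets
router  = [("19:00:00", True), ("19:00:10", True),  ("19:00:20", True), ("19:00:30", True)]
gateway = [("19:00:00", True), ("19:00:10", False), ("19:00:20", True), ("19:00:30", True)]
google  = [("19:00:00", True), ("19:00:10", False), ("19:00:20", True), ("19:00:30", True)]

# Failures that hit the ISP GW and 8.8.8.8 at the same moment
shared = failed_timestamps(gateway) & failed_timestamps(google)
print("simultaneous failures:", sorted(shared))
print("router failures:", sorted(failed_timestamps(router)))
print("gateway failed %:", failure_pct(gateway))
```

If the shared set matches the gateway failures while the router set stays empty, the drop sits somewhere past the router rather than inside the LAN.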

Two thoughts:

1) If I was not an online gamer this probably would not matter, but all the failures seem to happen when gaming (both PC and PS5 independently tested) -  or at least that's when it bothers me  ;)

2) Tonight I noticed that my connection was really "bad" (still only 1.2% failure), but it was while I was gaming and my wife was upstairs streaming Netflix. (The dropped connection "kicks" me from a game.)
Stats: 6 "request timeout" on both 8.8.8.8 and my GW, all at identical timestamps, over 1 hour of gaming on the PS5 (plugged in) while Netflix was streaming over wifi
- ?? Bufferbloat ?? --> I am running a shaper (FlowQueue-CoDel with bandwidth set to about 85% of my normal up/down), set up based on this: https://forum.opnsense.org/index.php?topic=7423.0

I think I am going to try throttling the wifi bandwidth next.

Pressing On,
BryBro


Also, if it helps, from today:


No one was home using heavy bandwidth at 0730.
The losses between 1900-2030 correlate exactly with the errors from tonight that I described in my last post.

Press On,
BryBro

June 26, 2024, 10:06:28 AM #9 Last Edit: June 26, 2024, 10:09:47 AM by Seimus
Regarding bufferbloat and FQ_C: most people probably don't know this yet, but we have an official guide in the docs that was co-authored with the help of the bufferbloat community (creators and testers of the FQ_C algorithm).

https://docs.opnsense.org/manual/how-tos/shaper_bufferbloat.html

Regarding your issue, let's summarize.
1. You monitor your OPNsense LAN IP (GW IP) - no issues seen
2. You monitor your ISP 1st HOP, GW that OPNsense uses for default route - issues seen
3. You monitor an internet IP (8.8.8.8 example) - issues seen

If the above is true, then the issue is on your ISP side, starting at the 1st HOP, meaning your ISP GW. So there is either a cable issue between OPNsense and the ISP GW, or an issue with the ISP GW itself.
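
The reasoning above can be written down as a small decision sketch in Python. The function name `locate_fault` and the wording of the results are made up for illustration; it only encodes the three checks from the summary, nothing OPNsense-specific.

```python
def locate_fault(lan_ok, isp_gw_ok, internet_ok):
    """Narrow down where loss starts, checking the nearest target first.

    lan_ok:      pings to the OPNsense LAN IP succeed
    isp_gw_ok:   pings to the ISP 1st HOP (default GW) succeed
    internet_ok: pings to an internet IP such as 8.8.8.8 succeed
    """
    if not lan_ok:
        # Loss already appears before traffic leaves the local network
        return "local side (client, switch, cabling, or OPNsense LAN)"
    if not isp_gw_ok:
        # LAN is clean, so loss starts between OPNsense and the ISP GW
        return "ISP side, starting at the 1st HOP (cable to the ISP GW, or the ISP GW itself)"
    if not internet_ok:
        # Both nearer targets are clean, so loss starts further out
        return "beyond the ISP GW"
    return "no issue seen"

# The situation described in this thread: LAN fine, GW and 8.8.8.8 both failing
result = locate_fault(lan_ok=True, isp_gw_ok=False, internet_ok=False)
print(result)
```

Checking the nearest target first matters: once the 1st HOP shows loss, failures to anything further away are expected and do not add new information.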

P.S. What is your WAN BW (download/upload)?

Regards,
S.