Monitoring packet loss/QoS issues

Started by secdoc, February 10, 2024, 09:27:31 PM

Is there a way to monitor other interfaces/VLANs for packet loss? I have been troubleshooting QoS-related issues and am seeing high packet loss in video conferencing apps such as Zoom, but I cannot narrow down where it is occurring. I have been able to rule out the ISP and the FW itself as the source of the drops, but I do not know whether it could be in the switching infrastructure or cabling, and I am not seeing anything for those like the health/quality monitoring available for the WAN gateway...

Any thoughts or ideas would be greatly appreciated...

Please provide additional details. Where is the QoS configured? What hardware is used (routers, switches, etc.)?

Apologies. Here are the details on the hardware/environment:

FW:

OPNsense DEC850
OPNsense 23.10.2-amd64 Business Edition
FreeBSD 13.2-RELEASE-p7
OpenSSL 1.1.1w
Licensed until 2024-12-20


Switches:

Mikrotik Switches running SWOS 2.13
1x CRS326-24
2x CRS312-4C-8XG
2x CRS309-1G-8S
2x CRS305-1G-4S


As far as QoS, I have not specified anything explicit. I have ATT Fiber Internet with a 5Gbps symmetric link. This is a business link, so they prefer data-type services. I have had them come out, and they have replaced the ONT and some fiber. When I use Zoom or other conferencing services, where latency and jitter can have an impact, I get poor quality and see the following when looking at Statistics in Zoom, for example. I am trying to determine where the issue is. If I use https://packetlosstest.com/ I get the results in the attachment.

As can be seen, the latency and jitter are pretty bad. I am just trying to determine the best approach to understand what to look at and where. I am also including screenshots of the gateway monitor and switch stats/error pages for reference.



How are you connecting the DEC850 SFP+ to the ISP?

Also, I see your switches are port-channel'd together. How are they connected to the DEC850?


Quote from: lilsense on February 12, 2024, 12:51:20 PM
How are you connecting the DEC850 SFP+ to the ISP?

Also, I see your switches are port-channel'd together. How are they connected to the DEC850?

I am connecting via a CAT6e cable. From testing, if I connect directly to the DEC850 and bypass the switching, I do not see the issue, so I am guessing it is a switching or cable issue, but I cannot find anything definitive to narrow it down... I was hoping that, since the DEC850 is the router/FW/DHCP host for the networks, I would be able to see some type of report or monitor showing what I may not be able to see in the switches...
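One way to approximate a per-VLAN loss monitor is to ping a known host on each VLAN from a machine that sits behind the switches and log the loss per target. Below is a minimal Python sketch, assuming the system `ping` accepts `-c` (as on Linux and FreeBSD). The VLAN names and addresses are hypothetical placeholders, not values from this thread.

```python
import re
import subprocess

# Hypothetical targets: one always-on host per VLAN (replace with real ones).
TARGETS = {"vlan10-office": "192.168.10.1", "vlan20-av": "192.168.20.1"}

# Matches the summary line printed by Linux and FreeBSD ping, e.g.
# "10 packets transmitted, 9 received, ..." or "... 9 packets received, ..."
SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def parse_loss(ping_output: str) -> float:
    """Return packet loss in percent, computed from ping's summary line."""
    match = SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("ping reported zero packets transmitted")
    return 100.0 * (sent - received) / sent


def probe(host: str, count: int = 20) -> float:
    """Ping `host` `count` times and return the loss percentage."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True, text=True, check=False,
    )
    return parse_loss(result.stdout)


def report() -> None:
    """Probe every configured target once and print one line per VLAN."""
    for name, host in TARGETS.items():
        try:
            print(f"{name} ({host}): {probe(host):.1f}% loss")
        except ValueError as err:
            print(f"{name} ({host}): no result ({err})")
```

Calling `report()` on a schedule from a wired host on each switch would show whether loss follows a particular switch or uplink rather than the firewall.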

The QoS is localized to the particular switch/device. So, I'd recommend investigating the traffic on the Mikrotiks.

You stated a 5Gbps Internet link, but your connection to the ISP is 1Gbps over Cat6e? Nonetheless, the issue may be on the switch ports, where you can check whether you see any errors.

Quote from: lilsense on February 12, 2024, 03:25:24 PM
The QoS is localized to the particular switch/device. So, I'd recommend investigating the traffic on the Mikrotiks.

You stated a 5Gbps Internet link, but your connection to the ISP is 1Gbps over Cat6e? Nonetheless, the issue may be on the switch ports, where you can check whether you see any errors.

The connection speed (i.e., 1Gbps) depends on the device. (I tried uploading images of all the switches, but the forum said the images were too large.) The ATT ONT connects directly to the FW, so it does not show in the switches. I have a 10Gbps SFP fiber connection to the distro SW as a trunk link.

February 12, 2024, 04:42:36 PM #8 Last Edit: February 12, 2024, 05:44:17 PM by secdoc
Here is the distro SW error page. I am definitely seeing Rx MAC Errors on the port associated with the FW interface. Also, I am not using these as L3 switches, so I am not sure how to deal with QoS from an L2 perspective...

You have tons of underruns as well for the laptop, which is an issue. I'd recommend checking the cables and monitoring the traffic during the Zoom call.
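To tell whether the error counters are still incrementing during a call or are only historical, it can help to compare two readings taken before and after the call. Here is a rough Python sketch. The counter names and numbers are hypothetical and would be copied by hand from the switch statistics page, and it assumes the counters were not reset or wrapped between readings.

```python
def counter_deltas(before: dict[str, int], after: dict[str, int]) -> dict[str, int]:
    """Return how much each counter grew between two snapshots."""
    return {name: after[name] - before[name] for name in before if name in after}


def error_ratio(deltas: dict[str, int], error_key: str, frames_key: str) -> float:
    """Return errors per received frame over the interval (0.0 if no frames)."""
    frames = deltas.get(frames_key, 0)
    if frames <= 0:
        return 0.0
    return deltas.get(error_key, 0) / frames


# Hypothetical readings of one port, taken before and after a test call.
before_call = {"rx_frames": 1_000, "rx_mac_errors": 5}
after_call = {"rx_frames": 3_000, "rx_mac_errors": 25}

deltas = counter_deltas(before_call, after_call)
ratio = error_ratio(deltas, "rx_mac_errors", "rx_frames")
print(f"new errors: {deltas['rx_mac_errors']}, ratio: {ratio:.4f}")
```

A delta of zero during a bad call would suggest the counters are old and the problem is elsewhere.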

February 13, 2024, 12:18:00 AM #10 Last Edit: February 13, 2024, 02:35:15 AM by secdoc
Quote from: lilsense on February 12, 2024, 05:55:46 PM
You have tons of underruns as well for the laptop, which is an issue. I'd recommend checking the cables and monitoring the traffic during the Zoom call.
The laptop has a 1Gbps interface connected to a multi-gig (10Gbps) port, so I kind of expect that to a certain extent. All of the cables are pre-made fiber or CAT6e cables, and from a "cable tester" perspective all of them have been vetted... Would the Rx MAC Errors point to anything? They are apparent on the SW/FW ports...

I am also running smokeping to get some statistics, and for the most part it shows that I am not seeing the amount of packet loss that Zoom and the other apps report. I have posted samples for Quad9. Periods where there was specific loss are likely associated with fiber or ISP work.
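When comparing ping-style samples against what a conferencing app reports, loss and jitter can be computed directly from the raw round-trip times. Here is a small Python sketch, assuming jitter is taken as the mean absolute difference between consecutive received RTTs (one common definition, not necessarily the one Zoom or packetlosstest.com uses) and that lost probes are recorded as `None`. The sample values are made up.

```python
from typing import Optional, Sequence


def loss_percent(rtts_ms: Sequence[Optional[float]]) -> float:
    """Return the share of probes that got no reply, in percent."""
    if not rtts_ms:
        raise ValueError("no samples")
    lost = sum(1 for rtt in rtts_ms if rtt is None)
    return 100.0 * lost / len(rtts_ms)


def mean_jitter_ms(rtts_ms: Sequence[Optional[float]]) -> float:
    """Mean absolute difference between consecutive received RTTs.

    Lost probes are skipped, so the neighbours on either side of a gap
    are compared with each other (a simplification).
    """
    received = [rtt for rtt in rtts_ms if rtt is not None]
    if len(received) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(received, received[1:])]
    return sum(diffs) / len(diffs)


# Made-up samples in milliseconds; None marks a probe with no reply.
samples = [20.0, 24.0, None, 22.0, 30.0]
print(f"loss: {loss_percent(samples):.1f}%  jitter: {mean_jitter_ms(samples):.2f} ms")
```

Short ICMP probes to a DNS resolver may behave differently from sustained UDP media streams, so low loss here does not rule out loss under load.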

Are there any other thoughts?

Have you tried the New Jersey server? I am getting a lot of drops from the Georgia server.

Quote from: lilsense on February 13, 2024, 12:57:50 AM
Have you tried the New Jersey server? I am getting a lot of drops from the Georgia server.

The jitter and latency were even worse with New Jersey...

I'd recommend posting your issues on the MikroTik forum.

https://forum.mikrotik.com/