Huge network latency since upgrade to 20.7

Started by dzikus, August 04, 2020, 08:21:05 AM

August 04, 2020, 08:21:05 AM Last Edit: August 04, 2020, 08:23:15 AM by dzikus
Hello,

After upgrading one of my routers to 20.7, a single Broadcom 1000Base-T interface shows huge latency (the router has a 4-port network card and the problem occurs on only one port). Latency was ~0.5 ms before; right now it is ~20-25 ms. I already set "VLAN Hardware Filtering" to disabled, with some success, but on 20.1.9_1 it worked like a charm and currently it does not.

The attached screenshot shows a host monitored through this interface, on 20.1.9_1 and then on 20.7 after ~22:00. Please help: what else can I tune to bring this latency down? Hardware checksum offload, hardware TCP segmentation offload, hardware large receive offload and VLAN Hardware Filtering are all already disabled.

Additional info: bge1 is also the only interface in this router that carries VLAN tags.

Is this a ping via LAN? Man, you also had quite some serious packet loss when it was working with lower latency.  :o :o :o

No, it's connected through bge1 (vlan1), there are also:
bge1: flags=8d43<UP,BROADCAST,RUNNING,PROMISC,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500
bge1_vlan200: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
bge1_vlan100: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
bge1_vlan99: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
bge1_vlan50: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500

That was just an example host with some packet loss. I attached another one reachable through bge1 without such loss (which is negligible for this problem anyway).
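For anyone comparing the flag lists above by hand, here is a minimal sketch that parses the ifconfig header lines and shows which flags the parent interface has that a VLAN child does not. The helper name parse_flags is made up for illustration. On FreeBSD, OACTIVE generally means the driver's transmit queue is full; whether that is related to the latency here is not established in this thread.

```python
import re

def parse_flags(line):
    """Return (interface name, set of flag names) from an ifconfig header line."""
    m = re.match(r"^(\S+): flags=[0-9a-f]+<([^>]*)>", line)
    if m is None:
        raise ValueError(f"not an ifconfig header line: {line!r}")
    name, flags = m.groups()
    return name, (set(flags.split(",")) if flags else set())

# Header lines copied from the post above.
lines = [
    "bge1: flags=8d43<UP,BROADCAST,RUNNING,PROMISC,OACTIVE,SIMPLEX,MULTICAST> metric 0 mtu 1500",
    "bge1_vlan200: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500",
]

parent_name, parent_flags = parse_flags(lines[0])
child_name, child_flags = parse_flags(lines[1])

# Flags set on the parent but not on the VLAN child.
only_on_parent = parent_flags - child_flags
print(parent_name, "has extra flags:", sorted(only_on_parent))
```

The same function can be fed every header line from a full ifconfig dump to spot interfaces that report OACTIVE.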

Look at the blue bit in the graph at 5ms .. this is not allowed in LAN .. it means 1 of 10 pings fails ..

On a WAN with cable or similar, ok, but internally, maybe you should replace the NIC.

Quote from: mimugmail on August 04, 2020, 09:58:23 AM
Look at the blue bit in the graph at 5ms .. this is not allowed in LAN .. it means 1 of 10 pings fails ..

On a WAN with cable or similar, ok, but internally, maybe you should replace the NIC.

This is still just an example host that is not directly connected to this NIC (there are many devices between this router and the host in my graphs), so you should not worry about the loss but about the latency. And in smokeping, light blue means 1 of 20 pings, not 10!
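To put numbers on the two readings of the light blue band, here is a small sketch of the arithmetic. The function name loss_percent is hypothetical, and the pings-per-cycle values are simply the two figures mentioned in this thread (20 and 10), not values read from any smokeping config.

```python
def loss_percent(lost, pings_per_cycle):
    """Packet loss as a percentage of the pings sent in one measurement cycle."""
    if pings_per_cycle <= 0:
        raise ValueError("pings_per_cycle must be positive")
    return 100.0 * lost / pings_per_cycle

# One lost ping means a different loss rate depending on the cycle size.
print(loss_percent(1, 20))  # one lost ping out of 20 per cycle
print(loss_percent(1, 10))  # one lost ping out of 10 per cycle
```

So the same colour band stands for half the loss rate when the cycle sends 20 pings instead of 10.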

If you set 20 pings per cycle, yes .. I usually use 10 in every 60 sec.
Maybe there's a known bug in FreeBSD 12.1 with the Broadcom NIC.

Did you already google for known limitations?


Hm, me either. I don't use bge so I can't dive deeper into it.

August 04, 2020, 11:42:30 AM #8 Last Edit: August 04, 2020, 11:49:51 AM by dzikus
Quote from: mimugmail on August 04, 2020, 11:35:50 AM
Hm, me either. I don't use bge so I can't dive deeper into it.

Deleting all VLANs on bge1 resolves the issue.

20.7 with broadcom:
bge1@pci0:2:0:1:   class=0x020000 card=0x1f5b1028 chip=0x165f14e4 rev=0x00 hdr=0x00
and VLANs gives you huge latencies; before FreeBSD 12 (up to 20.1.9_1) everything was working fine.

https://github.com/freebsd/freebsd/commits/35c027f3215c305ddf9814e895b7f4c880521eb8/sys/dev/bge
This is the history of this shitty Broadcom driver in FreeBSD. I don't see any VLAN-related commits since the 12 release.