Lots of TCP ack_rtt and UDP packets go missing

Started by brujoand, October 20, 2024, 10:01:12 AM

Last Edit: October 20, 2024, 10:16:23 AM by brujoand
The setup:
So I've got an AliExpress QOTOM i7 box (5x Intel NICs, 8 GB DDR4, 32 GB SSD) running OPNsense 24.7.6-amd64 at the gate of my network.

Behind it is a Linksys 24-port PoE+ switch. The line is 600/600 fiber. There are a couple of UniFi AC Pros and a bunch of random machines.

I'm also running a BGP router on OPNsense for my k8s cluster; that part is working fine. My k8s cluster is in my LAB VLAN, with its own subnet. The basic VLAN setup is:
LAN 1 10.1.1.1/24
IOT 10 10.10.1.1/24
JAIL 20 10.20.1.1/24
LAB 30 10.30.1.1/24

For good measure, my BGP subnet is 10.31.1.1/24, and the k8s pods and services are 10.32.1.1/24 and 10.33.1.1/24.

The problem:
Away from home I run WireGuard on all my clients to get access to my home network. As I start streaming a video I notice it's lagging. My current connection is super fast (1+ Gbps), and my homelab has, as I mentioned, 600 Mbps. Weird.

The debug:
For my first test I do some iperf3 runs between two machines on LAB. Good results: 0 retransmits and basically the full 1 Gbps. Great.
Then I test between LAN and LAB.

iperf3 -c 10.30.1.16 -p 5201
Connecting to host 10.30.1.16, port 5201
[  5] local 10.1.1.49 port 58086 connected to 10.30.1.16 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  31.6 MBytes   265 Mbits/sec  226   31.1 KBytes
[  5]   1.00-2.00   sec  55.9 MBytes   469 Mbits/sec    5    208 KBytes
[  5]   2.00-3.00   sec  81.8 MBytes   686 Mbits/sec   25    276 KBytes
[  5]   3.00-4.00   sec  89.9 MBytes   754 Mbits/sec   18    290 KBytes
[  5]   4.00-5.00   sec  97.2 MBytes   816 Mbits/sec    7    406 KBytes
[  5]   5.00-6.00   sec  55.2 MBytes   463 Mbits/sec  168    126 KBytes
[  5]   6.00-7.00   sec  72.5 MBytes   608 Mbits/sec    3    303 KBytes
[  5]   7.00-8.00   sec  94.8 MBytes   795 Mbits/sec   55    208 KBytes
[  5]   8.00-9.00   sec  82.0 MBytes   688 Mbits/sec   20    279 KBytes
[  5]   9.00-10.00  sec  97.5 MBytes   817 Mbits/sec   33    188 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec   759 MBytes   636 Mbits/sec  560             sender
[  5]   0.00-10.00  sec   757 MBytes   635 Mbits/sec                  receiver

iperf Done.


That's weird. What about UDP?
iperf3 -c 10.30.1.16 -p 5201 -u
Connecting to host 10.30.1.16, port 5201
[  5] local 10.1.1.49 port 50916 connected to 10.30.1.16 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   129 KBytes  1.05 Mbits/sec  91
[  5]   1.00-2.00   sec   127 KBytes  1.04 Mbits/sec  90
[  5]   2.00-3.00   sec   129 KBytes  1.05 Mbits/sec  91
[  5]   3.00-4.00   sec   127 KBytes  1.04 Mbits/sec  90
[  5]   4.00-5.00   sec   129 KBytes  1.05 Mbits/sec  91
[  5]   5.00-6.00   sec   129 KBytes  1.05 Mbits/sec  91
[  5]   6.00-7.00   sec   127 KBytes  1.04 Mbits/sec  90
[  5]   7.00-8.00   sec   129 KBytes  1.05 Mbits/sec  91
[  5]   8.00-9.00   sec   127 KBytes  1.04 Mbits/sec  90
[  5]   9.00-10.00  sec   129 KBytes  1.05 Mbits/sec  91
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  0.000 ms  0/906 (0%)  sender
[  5]   0.00-10.00  sec  1.25 MBytes  1.05 Mbits/sec  0.052 ms  0/906 (0%)  receiver

iperf Done.


Nope, that looks fine. Maybe do it fast?

iperf3 -c 10.30.1.16 -p 5201 -u -b 0
Connecting to host 10.30.1.16, port 5201
[  5] local 10.1.1.49 port 45989 connected to 10.30.1.16 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   107 MBytes   897 Mbits/sec  77419
[  5]   1.00-2.00   sec   106 MBytes   893 Mbits/sec  77104
[  5]   2.00-3.00   sec   107 MBytes   895 Mbits/sec  77339
[  5]   3.00-4.00   sec   107 MBytes   894 Mbits/sec  77133
[  5]   4.00-5.00   sec   106 MBytes   892 Mbits/sec  77039
[  5]   5.00-6.00   sec   107 MBytes   894 Mbits/sec  77189
[  5]   6.00-7.00   sec   107 MBytes   895 Mbits/sec  77175
[  5]   7.00-8.00   sec   107 MBytes   896 Mbits/sec  77346
[  5]   8.00-9.00   sec   107 MBytes   893 Mbits/sec  77126
[  5]   9.00-10.00  sec   107 MBytes   893 Mbits/sec  77230
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.04 GBytes   894 Mbits/sec  0.000 ms  0/772100 (0%)  sender
[  5]   0.00-10.01  sec  1.02 GBytes   876 Mbits/sec  0.011 ms  0/770388 (0%)  receiver

iperf Done.


No, that looks pretty good to me. Running the tests the other way yields basically the same results, except for the UDP full-speed test, which now has a lot of loss:

iperf3 -c 10.1.1.49 -p 5201 -u -b 0
Connecting to host 10.1.1.49, port 5201
[  5] local 10.30.1.16 port 41748 connected to 10.1.1.49 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   113 MBytes   949 Mbits/sec  81970
[  5]   1.00-2.00   sec   113 MBytes   949 Mbits/sec  81930
[  5]   2.00-3.00   sec   113 MBytes   949 Mbits/sec  81890
[  5]   3.00-4.00   sec   113 MBytes   946 Mbits/sec  81660
[  5]   4.00-5.00   sec   113 MBytes   948 Mbits/sec  81800
[  5]   5.00-6.00   sec   113 MBytes   949 Mbits/sec  81940
[  5]   6.00-7.00   sec   113 MBytes   949 Mbits/sec  81940
[  5]   7.00-8.00   sec   113 MBytes   949 Mbits/sec  81930
[  5]   8.00-9.00   sec   113 MBytes   949 Mbits/sec  81940
[  5]   9.00-10.00  sec   113 MBytes   949 Mbits/sec  81930
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.10 GBytes   949 Mbits/sec  0.000 ms  0/818930 (0%)  sender
[  5]   0.00-10.27  sec   350 MBytes   286 Mbits/sec  0.043 ms  565615/818919 (69%)  receiver

iperf Done.


That is a lot of lost datagrams, and to boot it's _always_ 69%. Not just almost: exactly. Why? This has to mean something. But what?

So I open the OPNsense firewall live view, filtering on dest_port=5201. I can see the packets fly across the screen, all green and nice. But now the loss jumps to 82%, then 81%. I close the firewall log, rerun the test, and we are back to 69%.

Confused, I start monitoring the load on the OPNsense box, which usually sits around 0.2; with the firewall log open and the iperf3 test running, it maxes out at about 0.6, and the resulting test now shows about 38% loss. Okay, breathe... maybe this is just a side effect of something else and not the root cause.
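
A quick sanity check on that number, using the receiver summary from the run above (this is just arithmetic on the reported counts, not a diagnosis):

```shell
# Recompute the receiver-side loss percentage from the iperf3 summary:
# 565615 lost out of 818919 datagrams.
awk 'BEGIN { printf "%.0f%% loss\n", 100 * 565615 / 818919 }'
# -> 69% loss
```

A loss ratio that stays pinned to one value at line rate, and moves when the router gets busier, does smell more like a fixed-capacity bottleneck than random network loss.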

Still confused, I tried to test from a LAB machine into the OPNsense box itself:
iperf3 -c 10.30.1.1 -p 5201
Connecting to host 10.30.1.1, port 5201
[  5] local 10.30.1.16 port 52086 connected to 10.30.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   105 MBytes   884 Mbits/sec  270    147 KBytes
[  5]   1.00-2.00   sec   104 MBytes   871 Mbits/sec  232   83.4 KBytes
[  5]   2.00-3.00   sec   104 MBytes   874 Mbits/sec  348    150 KBytes
[  5]   3.00-4.00   sec   103 MBytes   867 Mbits/sec  175    141 KBytes
[  5]   4.00-5.00   sec   106 MBytes   885 Mbits/sec  223    106 KBytes
[  5]   5.00-6.00   sec   105 MBytes   881 Mbits/sec  226    103 KBytes
[  5]   6.00-7.00   sec   103 MBytes   864 Mbits/sec  202   96.2 KBytes
[  5]   7.00-8.00   sec   105 MBytes   883 Mbits/sec  239    143 KBytes
[  5]   8.00-9.00   sec   105 MBytes   879 Mbits/sec  321   97.6 KBytes
[  5]   9.00-10.00  sec   103 MBytes   861 Mbits/sec  197    151 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.02 GBytes   875 Mbits/sec  2433             sender
[  5]   0.00-10.00  sec  1.02 GBytes   874 Mbits/sec                  receiver

iperf Done.


Okay, so now there are even more retransmits over TCP. And over UDP at full speed there is a bit of loss:

iperf3 -c 10.30.1.1 -p 5201 -u -b 0
Connecting to host 10.30.1.1, port 5201
[  5] local 10.30.1.16 port 60044 connected to 10.30.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Total Datagrams
[  5]   0.00-1.00   sec   113 MBytes   950 Mbits/sec  81980
[  5]   1.00-2.00   sec   113 MBytes   949 Mbits/sec  81890
[  5]   2.00-3.00   sec   112 MBytes   942 Mbits/sec  81340
[  5]   3.00-4.00   sec   113 MBytes   949 Mbits/sec  81930
[  5]   4.00-5.00   sec   113 MBytes   949 Mbits/sec  81940
[  5]   5.00-6.00   sec   113 MBytes   949 Mbits/sec  81920
[  5]   6.00-7.00   sec   113 MBytes   949 Mbits/sec  81910
[  5]   7.00-8.00   sec   113 MBytes   945 Mbits/sec  81590
[  5]   8.00-9.00   sec   113 MBytes   949 Mbits/sec  81930
[  5]   9.00-10.00  sec   113 MBytes   949 Mbits/sec  81930
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-10.00  sec  1.10 GBytes   948 Mbits/sec  0.000 ms  0/818360 (0%)  sender
[  5]   0.00-10.00  sec  1.08 GBytes   931 Mbits/sec  0.015 ms  14627/818360 (1.8%)  receiver

iperf Done.


Just for fun I try to run iperf3 from a LAB machine to the OPNsense box, but hitting it on one of its other VLAN IPs (the JAIL address), just to see.
TCP shows the same retransmits, but over UDP things get weird:
iperf3 -c 10.20.1.1 -p 5201 -u
Connecting to host 10.20.1.1, port 5201
iperf3: error - unable to read from stream socket: Resource temporarily unavailable


But when I check the server logs, I can see that we at least made contact. I guess that part is the TCP control connection?
Accepted connection from 10.30.1.16, port 47122
[  5] local 10.30.1.1 port 5201 connected to 10.30.1.16 port 32967
[ ID] Interval           Transfer     Bitrate         Jitter    Lost/Total Datagrams
[  5]   0.00-1.00   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   1.00-2.02   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   2.02-3.00   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   4.00-5.01   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   5.01-6.01   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   6.01-7.00   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   7.00-8.01   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   8.01-9.00   sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]   9.00-10.01  sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
[  5]  10.01-11.00  sec  0.00 Bytes  0.00 bits/sec  0.000 ms  0/0 (0%)
.....
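
This matches how iperf3 behaves: even with `-u`, the session setup happens over a TCP control connection to port 5201, and only the test traffic itself is UDP. "Contact but zero datagrams" therefore means TCP got through and the UDP datagrams never arrived. A capture on the OPNsense side would show whether they even reach the box (the interface name here is a placeholder):

```shell
# On the OPNsense box, watch for the UDP test traffic itself.
# If the TCP handshake shows up but no UDP datagrams follow, they are
# being dropped before they reach the host. Replace igc0 with the real NIC.
tcpdump -ni igc0 'host 10.30.1.16 and udp port 5201'
```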


Getting desperate, I head over to my interface settings, and under LAN I check 'Overwrite global settings' along with 'Disable hardware checksum offload', 'Disable hardware TCP segmentation offload', and 'Disable hardware large receive offload'; for good measure I also disable VLAN hardware offloading. Suddenly all TCP tests run flawlessly. Problem solved?
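
For what it's worth, those GUI checkboxes map roughly onto FreeBSD's per-interface offload flags, so the same state can be inspected and toggled from a shell on the box (igc0 is a placeholder for the actual NIC; changes made this way don't survive a reboot):

```shell
# Show the currently enabled capabilities: look for TXCSUM, TSO, LRO and
# VLAN_HWTAGGING in the options= line of the output.
ifconfig igc0
# Turn off checksum, TCP segmentation, large receive and VLAN hardware offload.
ifconfig igc0 -txcsum -rxcsum -tso -lro -vlanhwtag
```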

iperf3 -c 10.30.1.1 -p 5201
Connecting to host 10.30.1.1, port 5201
[  5] local 10.30.1.16 port 40168 connected to 10.30.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   113 MBytes   946 Mbits/sec    0    400 KBytes
[  5]   1.00-2.00   sec   112 MBytes   940 Mbits/sec    0    447 KBytes
[  5]   2.00-3.00   sec   111 MBytes   930 Mbits/sec    0    447 KBytes
[  5]   3.00-4.00   sec   112 MBytes   937 Mbits/sec    0    489 KBytes
[  5]   4.00-5.00   sec   111 MBytes   934 Mbits/sec    0    489 KBytes
[  5]   5.00-6.00   sec   112 MBytes   936 Mbits/sec    0    489 KBytes
[  5]   6.00-7.00   sec   111 MBytes   930 Mbits/sec    0    489 KBytes
[  5]   7.00-8.00   sec   111 MBytes   935 Mbits/sec    0    489 KBytes
[  5]   8.00-9.00   sec   112 MBytes   942 Mbits/sec    0    489 KBytes
[  5]   9.00-10.00  sec   111 MBytes   932 Mbits/sec    0    489 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.09 GBytes   936 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.09 GBytes   934 Mbits/sec                  receiver

iperf Done.


The congestion window is much lower though, but stable. However, after about 10 minutes:
iperf3 -c 10.30.1.1 -p 5201
Connecting to host 10.30.1.1, port 5201
[  5] local 10.30.1.16 port 51446 connected to 10.30.1.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   109 MBytes   915 Mbits/sec  216    212 KBytes
[  5]   1.00-2.00   sec   105 MBytes   885 Mbits/sec  321   77.8 KBytes
[  5]   2.00-3.00   sec   105 MBytes   880 Mbits/sec  290    151 KBytes
[  5]   3.00-4.00   sec   107 MBytes   895 Mbits/sec  166    208 KBytes
[  5]   4.00-5.00   sec   105 MBytes   881 Mbits/sec  210   69.3 KBytes
[  5]   5.00-6.00   sec   105 MBytes   883 Mbits/sec  207    110 KBytes
[  5]   6.00-7.00   sec   106 MBytes   892 Mbits/sec  236    105 KBytes
[  5]   7.00-8.00   sec   106 MBytes   892 Mbits/sec  246    154 KBytes
[  5]   8.00-9.00   sec   105 MBytes   884 Mbits/sec  203   97.6 KBytes
[  5]   9.00-10.00  sec   105 MBytes   881 Mbits/sec  278    115 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.03 GBytes   889 Mbits/sec  2373             sender
[  5]   0.00-10.00  sec  1.03 GBytes   887 Mbits/sec                  receiver

iperf Done.


So maybe changing these hardware offloading settings made "something" reload. It seems like "something", some kind of resource, is actually being exhausted. I'm just not network-smart enough to figure out what. I've been looking at this for about a month now and I'm not sure what else to do.
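
If some resource really is being exhausted, FreeBSD exposes counters that would show it directly; these would be worth watching on the OPNsense box during a failing run (the dev.igc sysctl tree is a guess based on the likely NIC driver):

```shell
# Network buffer (mbuf) usage; non-zero "denied" or "delayed" counts mean
# the box ran out of buffers at some point.
netstat -m
# Per-interface input/output errors and drops.
netstat -i
# Driver-level queue/drop counters, if the driver exposes them.
sysctl dev.igc.0 2>/dev/null | grep -i drop
```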

I did, however, do a tcpdump when running the iperf3 tests from a LAB machine to the OPNsense box on the LAB network. And basically there is a huge amount of 'tcp.analysis.ack_rtt'.
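
Assuming the dump was written to a file (lab.pcap is a made-up name here), the same capture can be quantified with tshark instead of eyeballed:

```shell
# Count the segments Wireshark's analysis engine flags as retransmitted.
tshark -r lab.pcap -Y 'tcp.analysis.retransmission' | wc -l
# Average ACK RTT per one-second interval, to see when latency spikes.
tshark -r lab.pcap -q -z 'io,stat,1,AVG(tcp.analysis.ack_rtt)tcp.analysis.ack_rtt'
```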

The MTU is 1500 everywhere except for WireGuard, which is 1420.

Traceroute from a LAB machine to the OPNsense box only works using TCP:
traceroute 10.30.1.1
traceroute to 10.30.1.1 (10.30.1.1), 30 hops max, 60 byte packets
1  * * *
2  * * *
3  * * *
4  * * *
...
29  * * *
30  * * *


traceroute 10.30.1.1 -T
traceroute to 10.30.1.1 (10.30.1.1), 30 hops max, 60 byte packets
1  tindsense (10.30.1.1)  0.319 ms  0.299 ms *
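
One more data point that would help narrow this down: Linux traceroute sends UDP probes by default, so the all-timeouts run above is consistent with "UDP dies, TCP survives". An ICMP-mode run (needs root) would separate "UDP probes are dropped" from "the replies are blocked":

```shell
# ICMP ECHO probes instead of the default UDP probes.
traceroute -I 10.30.1.1
```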


So I mean, something is wrong. I just... Help?