Abysmal performance in one direction over QSFP+ (Can't disable VLAN_HWTSO)

Started by Kangie, July 11, 2023, 12:07:50 AM

Hi All,

I'm currently diagnosing some apparent connectivity issues on my home network between my OPNsense router (version 23.1.11) and a MikroTik switch (CRS354-48G-4S+2Q+RM).

I'm connecting from a ConnectX-3 VPI card with its ports in Ethernet mode, using the mlx4en driver.

Symptoms:

Traffic in one direction (TX from the router) seems fine; however, anything sent _to_ the router sees a ton of retransmissions and abysmal speed.

kangie@obsidian ~ $ doas iperf3 -c 10.xx.xx.1 --bidir
Connecting to host 10.xx.xx.1, port 5201
[  5] local 10.xx.xx.46 port 41176 connected to 10.xx.xx.1 port 5201
[  7] local 10.xx.xx.46 port 41190 connected to 10.xx.xx.1 port 5201
[ ID][Role] Interval           Transfer     Bitrate         Retr  Cwnd
[  5][TX-C]   0.00-1.00   sec   107 KBytes   880 Kbits/sec    6   2.83 KBytes       
[  7][RX-C]   0.00-1.00   sec  84.4 MBytes   708 Mbits/sec                 
[  5][TX-C]   1.00-2.00   sec  0.00 Bytes  0.00 bits/sec    5   1.41 KBytes       
[  7][RX-C]   1.00-2.00   sec  60.8 MBytes   510 Mbits/sec                 
[  5][TX-C]   2.00-3.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  7][RX-C]   2.00-3.00   sec   110 MBytes   920 Mbits/sec                 
[  5][TX-C]   3.00-4.00   sec  0.00 Bytes  0.00 bits/sec    0   1.41 KBytes       
[  7][RX-C]   3.00-4.00   sec  22.9 MBytes   192 Mbits/sec                 
[  5][TX-C]   4.00-5.00   sec  0.00 Bytes  0.00 bits/sec    8   1.41 KBytes       
[  7][RX-C]   4.00-5.00   sec  80.4 MBytes   675 Mbits/sec                 
[  5][TX-C]   5.00-6.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  7][RX-C]   5.00-6.00   sec  86.1 MBytes   722 Mbits/sec                 
[  5][TX-C]   6.00-7.00   sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  7][RX-C]   6.00-7.00   sec   111 MBytes   928 Mbits/sec                 
[  5][TX-C]   7.00-8.00   sec  0.00 Bytes  0.00 bits/sec    2   2.83 KBytes       
[  7][RX-C]   7.00-8.00   sec   102 MBytes   857 Mbits/sec                 
[  5][TX-C]   8.00-9.00   sec  0.00 Bytes  0.00 bits/sec    2   1.41 KBytes       
[  7][RX-C]   8.00-9.00   sec  48.1 KBytes   394 Kbits/sec                 
[  5][TX-C]   9.00-10.00  sec  0.00 Bytes  0.00 bits/sec    1   1.41 KBytes       
[  7][RX-C]   9.00-10.00  sec  9.90 KBytes  81.1 Kbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID][Role] Interval           Transfer     Bitrate         Retr
[  5][TX-C]   0.00-10.00  sec   107 KBytes  88.0 Kbits/sec   27             sender
[  5][TX-C]   0.00-10.00  sec  9.90 KBytes  8.11 Kbits/sec                  receiver
[  7][RX-C]   0.00-10.00  sec   658 MBytes   552 Mbits/sec  113             sender
[  7][RX-C]   0.00-10.00  sec   657 MBytes   551 Mbits/sec                  receiver
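
To make runs easier to compare, the summary rows above can be reduced to just the direction, sender bitrate and retransmit count. This is only a rough sketch: `summarize_iperf3` is a made-up helper name, and it assumes the plain-text iperf3 layout shown in this post (rows ending in "sender", with an optional [TX-C]/[RX-C] role column).

```shell
#!/bin/sh
# Reduce iperf3 text output (read from stdin) to one line per "sender"
# summary row, in the form: <role> <bitrate> <unit> <retransmits>
# The role is "-" when the run was not started with --bidir.
summarize_iperf3() {
    awk '
        $NF == "sender" {
            role = "-"
            # Pick out TX-C or RX-C if a role column is present.
            if (match($0, /\[(TX|RX)-C\]/)) {
                role = substr($0, RSTART + 1, 4)
            }
            # Count fields from the end, because "[  5]" and "[SUM]"
            # split into a different number of leading fields.
            print role, $(NF - 3), $(NF - 2), $(NF - 1)
        }
    '
}

# Example (not executed here):
#   iperf3 -c 10.xx.xx.1 --bidir | summarize_iperf3
```

With -P5 this prints one line per stream plus the [SUM] row, since each of those ends in "sender".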


Based on the one-sidedness of this output, I suspected that TSO (or similar) was enabled (which it was!). However, having disabled all offloading via the UI (Interfaces > Settings) and rebooted the node, I still see VLAN_HWTSO in my ifconfig output and can't seem to disable it:

root@OPNsense:/home/kangie # ifconfig mlxen0
mlxen0: flags=8847<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        description: TrunkToCRS354 (opt12)
        options=8c00a8<VLAN_MTU,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE>
        ether f4:52:14:10:a3:51
        inet 10.xx.xx.1 netmask 0xffffff00 broadcast 10.xx.xx.255
        groups: IG_LOCAL IG_OUT_WAN
        media: Ethernet autoselect (40Gbase-CR4 <full-duplex,rxpause,txpause>)
        status: active
        nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>


I'm able to toggle flags (e.g. debug) by running ifconfig mlxen0 -debug; however, running ifconfig mlxen0 -vlanmtu -vlanhwtso has no impact on the card's options.
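
For checking after each attempt whether a capability is still listed, something like the following might help. It is only a sketch: `has_ifcap` is a hypothetical helper name, it just parses the options= line in the format of the ifconfig output above, and it does not change any interface settings.

```shell
#!/bin/sh
# Exit 0 if the capability named in $1 appears in the "options=" line of
# ifconfig output read from stdin, and exit 1 otherwise.
has_ifcap() {
    awk -v cap="$1" '
        # Skip the "nd6 options=" line, which lists IPv6 ND flags instead.
        /options=/ && !/nd6 options=/ {
            lt = index($0, "<")
            gt = index($0, ">")
            # Split the comma-separated list between "<" and ">".
            n = split(substr($0, lt + 1, gt - lt - 1), caps, ",")
            for (i = 1; i <= n; i++) {
                if (caps[i] == cap) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example (not executed here):
#   ifconfig mlxen0 -vlanhwtso
#   ifconfig mlxen0 | has_ifcap VLAN_HWTSO && echo "VLAN_HWTSO still set"
```

Matching whole list entries rather than grepping for a substring avoids false hits when one flag name is a prefix of another.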

What am I missing here? Is there anything that I've failed to consider?

Edit: I have a spare ConnectX-3 which displays the same performance issues, so it could be the DAC or the switch. I'll need to dig out and configure my fibre switch and its DAC to do some comparative testing; however, I _suspect_ that this issue is on the router side:

Immediately after a power cycle, the UI shows the following interface stats:

Base Interface:
TrunkToCRS354 interface (opt12, mlxen0)
Status    up
MAC address    f4:52:14:10:a3:51 - Mellanox Technologies, Inc.
MTU    1500
IPv4 address    10.xx.0.1/24
Media    40Gbase-CR4 <full-duplex,rxpause,txpause>
In/out packets    4 / 0 (724 bytes / 0 bytes)
In/out packets (pass)    0 / 0 (0 bytes / 0 bytes)
In/out packets (block)    724 / 0 (4 bytes / 0 bytes)
In/out errors    2698 / 0
Collisions    0


VLAN I'm connecting from:

WLAN interface (opt19, vlan013)
Status    up
MAC address    f4:52:14:10:a3:51 - Mellanox Technologies, Inc.
MTU    1500
IPv4 address    10.xx.xx.1/24
IPv6 link-local    fe80::f652:14ff:fe10:a351/64
IPv6 address    2403:580c:b8e6:4:f652:14ff:fe10:a351/64
Media    40Gbase-CR4 <full-duplex,rxpause,txpause>
In/out packets    6223 / 8343 (1.77 MB / 4.62 MB)
In/out packets (pass)    6097 / 8342 (1.76 MB / 4.62 MB)
In/out packets (block)    10968 / 1 (126 bytes / 64 bytes)
In/out errors    0 / 0
Collisions    0

An update:

Note: for all iperf3 runs here, the server was running _on_ OPNsense.

I swapped the QSFP+ card out for an X520:

$ iperf3 -c opnsense.infra.footclan.ninja -P5 --bidir
. . .
[SUM][TX-C]   0.00-10.00  sec  1.08 GBytes   930 Mbits/sec    0             sender
[SUM][TX-C]   0.00-10.01  sec  1.07 GBytes   922 Mbits/sec                  receiver
. . .
[SUM][RX-C]   0.00-10.00  sec   259 MBytes   217 Mbits/sec  1889             sender
[SUM][RX-C]   0.00-10.01  sec   254 MBytes   213 Mbits/sec                  receiver


I was still seeing poor performance in one direction from the router, so I enabled flow control, which didn't seem to make much of a difference (i.e. I still see retransmits in one direction and horrible speed). However, when running tests in both directions that aren't simultaneous, it seems far better:

$ iperf3 -c opnsense.infra.footclan.ninja -P5
. . .
[SUM]   0.00-10.00  sec  1.09 GBytes   932 Mbits/sec    0             sender
[SUM]   0.00-10.00  sec  1.08 GBytes   928 Mbits/sec                  receiver


and

$ iperf3 -c opnsense.infra.footclan.ninja -P5 -R
. . .
[SUM]   0.00-10.00  sec  1.09 GBytes   933 Mbits/sec  1450             sender
[SUM]   0.00-10.00  sec  1.08 GBytes   928 Mbits/sec                  receiver


I _still_ see a lot of retransmits, although there are fewer now. Between my two SFP+-connected hosts I've managed to get ~5 Gbit/s with 5 parallel iperf3 streams (-P5) in the "good" direction, and ~500 Mbit/s in the bad direction.

$ iperf3 -c opnsense.infra.footclan.ninja -P5 --bidir
. . .
[SUM][TX-C]   0.00-10.00  sec   603 MBytes   506 Mbits/sec    4             sender
[SUM][TX-C]   0.00-10.00  sec   593 MBytes   497 Mbits/sec                  receiver
. . .
[SUM][RX-C]   0.00-10.00  sec  5.70 GBytes  4.89 Gbits/sec   22             sender
[SUM][RX-C]   0.00-10.00  sec  5.69 GBytes  4.89 Gbits/sec                  receiver


However, when running single-direction tests I don't see nearly the same issue:

$ iperf3 -c opnsense.infra.footclan.ninja -P5
. . .
[SUM]   0.00-10.00  sec  9.35 GBytes  8.03 Gbits/sec    0             sender
[SUM]   0.00-10.00  sec  8.25 GBytes  7.09 Gbits/sec                  receiver



$ iperf3 -c opnsense.infra.footclan.ninja -P5 -R
. . .
[SUM]   0.00-10.01  sec  6.19 GBytes  5.31 Gbits/sec   25             sender
[SUM]   0.00-10.00  sec  6.18 GBytes  5.31 Gbits/sec                  receiver


Any thoughts or insight? Is my CRS354 unhappy, or is there something in my OPNsense configuration that's likely to be the cause?

Also, mods, could you please move this to the 23.1 forum? I didn't realise I was posting in 22.7.