20Gb WAN - Slow with OpnSense

Started by redbaron73, May 08, 2023, 09:54:21 PM

Previous topic - Next topic
I have a 20Gb WAN connection. My OPNsense firewall is a Dell PowerEdge R640 with an Intel Xeon Silver 4110 CPU, 64GB of RAM, and 2x Mellanox ConnectX-5 dual-port 100Gb NICs.


I have my WAN configured as lagg0 (VLAN 300) and my LAN as lagg1 (VLAN 10).

I have tested with iperf3 on the LAN and with speedtest.net, and I can never exceed 3,000 Mb/s.

When booting the same system into a Debian live environment and configuring the interfaces to match the OPNsense setup, I am able to get 18Gb from speedtest and 930Gb on the LAN using ntttcp.
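
For reference, the kind of test I am running looks roughly like this (a sketch from memory; the address and stream count are just examples):

# on a host on one side of the firewall
iperf3 -s
# on a host on the other side (example LAN address)
iperf3 -c 10.0.10.2 -P 8 -t 30
# WAN test from a LAN client via speedtest.net; under Debian live I use ntttcp for the LAN test instead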

I am at a loss as to the configuration necessary for FreeBSD/OPNsense. I have applied what I know from Linux and have listed the tunables below. I would appreciate any help. Let me know if there is more information I should be providing.


/boot/loader.conf

# dynamically generated tunables settings follow
hw.ibrs_disable="1"
hw.ixl.enable_head_writeback="0"
hw.syscons.kbd_reboot="0"
kern.ipc.maxsockbuf="614400000"
kern.random.fortuna.minpoolsize="128"
kern.randompid="1"
net.enc.in.ipsec_bpf_mask="2"
net.enc.in.ipsec_filter_mask="2"
net.enc.out.ipsec_bpf_mask="1"
net.enc.out.ipsec_filter_mask="1"
net.inet.icmp.drop_redirect="1"
net.inet.icmp.icmplim="0"
net.inet.icmp.log_redirect="0"
net.inet.icmp.reply_from_interface="1"
net.inet.ip.accept_sourceroute="0"
net.inet.ip.forwarding="1"
net.inet.ip.intr_queue_maxlen="1000"
net.inet.ip.portrange.first="1024"
net.inet.ip.random_id="1"
net.inet.ip.redirect="0"
net.inet.ip.sourceroute="0"
net.inet.ipsec.async_crypto="1"
net.inet.rss.bits="2"
net.inet.rss.enabled="1"
net.inet.tcp.abc_l_var="52"
net.inet.tcp.blackhole="2"
net.inet.tcp.delayed_ack="0"
net.inet.tcp.drop_synfin="1"
net.inet.tcp.log_debug="0"
net.inet.tcp.minmss="536"
net.inet.tcp.mssdflt="1240"
net.inet.tcp.recvbuf_max="614400000"
net.inet.tcp.recvspace="65536"
net.inet.tcp.sendbuf_inc="65536"
net.inet.tcp.sendbuf_max="614400000"
net.inet.tcp.sendspace="65536"
net.inet.tcp.soreceive_stream="1"
net.inet.tcp.syncookies="1"
net.inet.tcp.tso="0"
net.inet.udp.blackhole="1"
net.inet.udp.checksum="1"
net.inet.udp.maxdgram="57344"
net.inet6.ip6.forwarding="1"
net.inet6.ip6.intr_queue_maxlen="1000"
net.inet6.ip6.prefer_tempaddr="0"
net.inet6.ip6.redirect="0"
net.inet6.ip6.use_tempaddr="0"
net.isr.bindthreads="1"
net.isr.defaultqlimit="2048"
net.isr.dispatch="deferred"
net.isr.maxthreads="-1"
net.link.bridge.pfil_bridge="0"
net.link.bridge.pfil_local_phys="0"
net.link.bridge.pfil_member="1"
net.link.bridge.pfil_onlyip="0"
net.link.ether.inet.log_arp_movements="1"
net.link.ether.inet.log_arp_wrong_iface="1"
net.link.tap.user_open="1"
net.link.vlan.mtag_pcp="1"
net.local.dgram.maxdgram="8192"
net.pf.share_forward="1"
net.pf.share_forward6="1"
net.pf.source_nodes_hashsize="1048576"
net.route.multipath="0"
security.bsd.see_other_gids="0"
security.bsd.see_other_uids="0"


dmesg | grep mlx5_core

mlx5_core0: <mlx5_core> mem 0xc8000000-0xc9ffffff at device 0.0 numa-domain 0 on pci8
mlx5_core0: INFO: mlx5_port_module_event:707:(pid 12): Module 0, status: plugged and enabled
mlx5_core0: INFO: health_watchdog:579:(pid 0): PCIe slot advertised sufficient power (75W).
mlx5_core0: INFO: init_one:1660:(pid 0): cannot find SR-IOV PCIe cap
mlx5_core: INFO: (mlx5_core0): E-Switch: Total vports 1, l2 table size(65536), per vport: max uc(128) max mc(2048)
mlx5_core0: Failed to initialize SR-IOV support, error 2
mlx5_core1: <mlx5_core> mem 0xc6000000-0xc7ffffff at device 0.1 numa-domain 0 on pci8
mlx5_core1: INFO: mlx5_port_module_event:707:(pid 12): Module 1, status: plugged and enabled
mlx5_core1: INFO: health_watchdog:579:(pid 0): PCIe slot advertised sufficient power (75W).
mlx5_core1: INFO: init_one:1660:(pid 0): cannot find SR-IOV PCIe cap
mlx5_core: INFO: (mlx5_core1): E-Switch: Total vports 1, l2 table size(65536), per vport: max uc(128) max mc(2048)
mlx5_core1: Failed to initialize SR-IOV support, error 2
mlx5_core2: <mlx5_core> mem 0xe4000000-0xe5ffffff at device 0.0 numa-domain 0 on pci10
mlx5_core2: INFO: mlx5_port_module_event:707:(pid 12): Module 0, status: plugged and enabled
mlx5_core2: INFO: health_watchdog:579:(pid 0): PCIe slot advertised sufficient power (75W).
mlx5_core2: INFO: init_one:1660:(pid 0): cannot find SR-IOV PCIe cap
mlx5_core: INFO: (mlx5_core2): E-Switch: Total vports 1, l2 table size(65536), per vport: max uc(128) max mc(2048)
mlx5_core2: Failed to initialize SR-IOV support, error 2
mlx5_core3: <mlx5_core> mem 0xe2000000-0xe3ffffff at device 0.1 numa-domain 0 on pci10
mlx5_core3: INFO: mlx5_port_module_event:707:(pid 12): Module 1, status: plugged and enabled
mlx5_core3: INFO: health_watchdog:579:(pid 0): PCIe slot advertised sufficient power (75W).
mlx5_core3: INFO: init_one:1660:(pid 0): cannot find SR-IOV PCIe cap
mlx5_core: INFO: (mlx5_core3): E-Switch: Total vports 1, l2 table size(65536), per vport: max uc(128) max mc(2048)
mlx5_core3: Failed to initialize SR-IOV support, error 2



dmesg | grep mce

mce0: Ethernet address: 10:70:fd:b3:57:9e
mce0: link state changed to DOWN
mce1: Ethernet address: 10:70:fd:b3:57:9f
mce1: link state changed to DOWN
mce2: Ethernet address: 10:70:fd:b3:57:f6
mce2: link state changed to DOWN
mce3: Ethernet address: 10:70:fd:b3:57:f7
mce3: link state changed to DOWN
mce0: ERR: mlx5e_ioctl:3516:(pid 48967): tso4 disabled due to -txcsum.
mce0: ERR: mlx5e_ioctl:3529:(pid 49276): tso6 disabled due to -txcsum6.
mce1: ERR: mlx5e_ioctl:3516:(pid 51468): tso4 disabled due to -txcsum.
mce1: ERR: mlx5e_ioctl:3529:(pid 51744): tso6 disabled due to -txcsum6.
mce2: ERR: mlx5e_ioctl:3516:(pid 54872): tso4 disabled due to -txcsum.
mce2: ERR: mlx5e_ioctl:3529:(pid 55595): tso6 disabled due to -txcsum6.
mce3: ERR: mlx5e_ioctl:3516:(pid 57614): tso4 disabled due to -txcsum.
mce3: ERR: mlx5e_ioctl:3529:(pid 57835): tso6 disabled due to -txcsum6.
mce0: link state changed to UP
mce2: link state changed to UP
mce2: link state changed to DOWN
mce2: link state changed to UP
mce1: link state changed to UP
mce3: link state changed to UP
mce3: link state changed to DOWN
mce3: link state changed to UP
mce1: promiscuous mode enabled
mce3: promiscuous mode enabled
mce0: promiscuous mode enabled
mce2: promiscuous mode enabled
mce0: promiscuous mode disabled
mce2: promiscuous mode disabled
mce0: promiscuous mode enabled
mce2: promiscuous mode enabled

For what it's worth (though probably not what you want to hear): I had similar challenges. After extensive Googling, experimenting, and tuning, I managed to get my Chelsio 10Gbit card to push through approximately 5 Gb/s; some of that is documented in previous threads here. Meanwhile, a minimal Debian Linux environment on the same hardware, with no tuning at all, gets very close to line rate in iperf and handles NAT firewalling without breaking a sweat (CPU 85% idle). This experience led me to abandon OPNsense for now in favour of Debian with a set of nftables rules and services configured through the shell. I might come back one day, but it seems to me there is something quite crippling either in FreeBSD itself or in its configuration within OPNsense.

Yes, this is probably the route I will end up taking if there isn't a quick fix.


Yes, that is where I took the majority of my tunables from. My hardware specs greatly exceed the author's, so I was expecting similar results.

Have you seen this thread?

https://forum.opnsense.org/index.php?topic=26038.msg125602#msg125602

I just lurk here, and have a regular cable (1Gbit) setup.

There seem to be professionals who have gotten close to what you are going for.

Is Mellanox supported by FreeBSD?

You might need to use a VM (again I have read a lot but never done what you are trying to do).

Cheers,

Can you download the speedtest pkg and test locally without NAT?

Quote from: mimugmail on May 09, 2023, 06:12:17 AM
Can you download the speedtest pkg and test locally without NAT?

This is how I am testing -- direct from the CLI itself.
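
Roughly what I am doing, for reference (the package name is an assumption from ports and may differ; any speedtest CLI gives the same picture):

pkg install py39-speedtest-cli   # assumed ports package name for speedtest-cli
speedtest-cli --secure           # runs on the firewall itself, so no NAT in the path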

Quote from: Koldnitz on May 09, 2023, 04:15:52 AM
Have you seen this thread?

https://forum.opnsense.org/index.php?topic=26038.msg125602#msg125602

I just lurk here, and have a regular cable (1Gbit) setup.

There seem to be professionals who have gotten close to what you are going for.

Is Mellanox supported by FreeBSD?

You might need to use a VM (again I have read a lot but never done what you are trying to do).

Cheers,

That thread didn't add anything technical for tuning beyond what the other post already covered.

I hesitate to think that adding a hypervisor layer would improve performance, since it seems like just another potential bottleneck, but I am certainly willing to try it.

@redbaron73, did you check whether you need to update the NIC firmware?

https://network.nvidia.com/support/firmware/connectx5en/

The last update was 2022-11-30 (version 16.35.2000 LTS).

Unless you need OEM specific: https://network.nvidia.com/support/firmware/dell/

You might also want to check for BIOS (a new one just got released) and CPU microcode updates too.

Quote from: benyamin on May 09, 2023, 10:09:55 PM
@redbaron73, did you check whether you need to update the NIC firmware?

https://network.nvidia.com/support/firmware/connectx5en/

The last update was 2022-11-30 (version 16.35.2000 LTS).

Unless you need OEM specific: https://network.nvidia.com/support/firmware/dell/

You might also want to check for BIOS (a new one just got released) and CPU microcode updates too.

Thanks -- yes, I am running the latest firmware, 16.35.2000 LTS.

The only thing I am not confident about is the driver version, but from what I have read, updating the driver is not a trivial task.

What driver version are you on?

sysctl -a | grep mlx
...should reveal all.

v3.7.1 (from 2021-07-21) is the latest from NVIDIA anyway.

You don't really update NIC drivers in FreeBSD.

You probably also want to explicitly disable TSO (LSO) and LRO:
ifconfig mce0 -rxcsum -txcsum -tso -lro
ifconfig mce1 -rxcsum -txcsum -tso -lro
ifconfig mce2 -rxcsum -txcsum -tso -lro
ifconfig mce3 -rxcsum -txcsum -tso -lro


And disable hardware LRO too:
sysctl dev.mce.0.conf.hw_lro=0
sysctl dev.mce.1.conf.hw_lro=0
sysctl dev.mce.2.conf.hw_lro=0
sysctl dev.mce.3.conf.hw_lro=0
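
A quick way to confirm both of those actually stuck (repeat per interface):

ifconfig -m mce0 | grep -E 'options|capabilities'   # TSO4/TSO6/LRO should no longer appear under options=
sysctl dev.mce.0.conf.hw_lro                        # should now report 0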


There might be similar hardware tunables for TSO/LSO too. This should reveal them:
sysctl -a | grep mce
sysctl -a | grep mlx


You have net.inet.tcp.tso="0" set too, so you might want to review your net.inet.tcp.lro tunables as well.

I believe these are the defaults:
net.inet.tcp.tso: 1
net.inet.tcp.path_mtu_discovery: 1
net.inet.tcp.lro.without_m_ackcmp: 0
net.inet.tcp.lro.with_m_ackcmp: 0
net.inet.tcp.lro.would_have_but: 0
net.inet.tcp.lro.extra_mbuf: 0
net.inet.tcp.lro.lockcnt: 0
net.inet.tcp.lro.compressed: 0
net.inet.tcp.lro.wokeup: 0
net.inet.tcp.lro.fullqueue: 0
net.inet.tcp.lro.lro_cpu_threshold: 50
net.inet.tcp.lro.entries: 8
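
On a live system you can also dump the whole subtree by giving sysctl the node name as a prefix:

sysctl net.inet.tcp.lro
sysctl net.inet.tcp.tso net.inet.tcp.path_mtu_discovery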

Thank you for those suggestions. I already had those options disabled in the UI, but I verified each setting manually and tested again locally. The results are the same.


Quote from: benyamin on May 10, 2023, 02:14:16 AM
You probably also want to explicitly disable TSO (LSO) and LRO:
<snip>

Quote from: benyamin on May 10, 2023, 01:51:18 AM
What driver version are you on?

sysctl -a | grep mlx
...should reveal all.

v3.7.1 (from 2021-07-21) is the latest from NVIDIA anyway.

You don't really update NIC drivers in FreeBSD.


Confirmed - v3.7.1
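
For reference, the driver version is also visible in the boot messages, which makes a quick cross-check (the exact banner text may vary by release):

dmesg | grep -i mlx5   # the mlx5en(4) driver prints its version string at boot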