Messages - rungekutta

#46
Quote from: testo_cz on October 31, 2021, 08:20:27 AM
I'm sure there are people using cxgbe-based NICs. Have you searched this forum?
Yes, and without much coming up. But yes, I understand Chelsio cards are very popular with both TrueNAS and pfSense due to strong support in FreeBSD, so I would expect them to work very well. Indeed I use the same card (Chelsio T420-CR) in TrueNAS, where it easily saturates 10Gb against other hosts (not the OPNsense box!).

Quote from: testo_cz on October 31, 2021, 08:20:27 AM
Paste some boot messages from the driver here (dmesg | grep cxgbe), so the queue/buffer/netmap configuration can be double-checked.


root@XXX:~ # dmesg | grep cxgbe
cxgbe0: <port 0> on t4nex0
cxgbe0: Ethernet address: 00:07:43:11:2b:10
cxgbe0: 8 txq, 8 rxq (NIC)
cxgbe1: <port 1> on t4nex0
cxgbe1: Ethernet address: 00:07:43:11:2b:18
cxgbe1: 8 txq, 8 rxq (NIC)
cxgbe1: tso4 disabled due to -txcsum.
cxgbe1: tso6 disabled due to -txcsum6.
cxgbe0: tso4 disabled due to -txcsum.
cxgbe0: tso6 disabled due to -txcsum6.
cxgbe1: link state changed to UP
cxgbe0: link state changed to UP
556.329305 [1130] generic_netmap_attach     Emulated adapter for cxgbe1 created (prev was NULL)
556.329322 [1035] generic_netmap_dtor       Emulated netmap adapter for cxgbe1 destroyed
556.331828 [1130] generic_netmap_attach     Emulated adapter for cxgbe1 created (prev was NULL)
556.373055 [ 320] generic_netmap_register   Emulated adapter for cxgbe1 activated
556.381334 [1130] generic_netmap_attach     Emulated adapter for cxgbe0 created (prev was NULL)
556.381356 [1035] generic_netmap_dtor       Emulated netmap adapter for cxgbe0 destroyed
556.384000 [1130] generic_netmap_attach     Emulated adapter for cxgbe0 created (prev was NULL)
556.384271 [ 320] generic_netmap_register   Emulated adapter for cxgbe0 activated
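For what it's worth, the generic_netmap_attach lines above show netmap attaching in emulated mode rather than using the driver's native netmap support. A few follow-up checks, as a sketch only - exact sysctl names may differ between FreeBSD versions:

# how netmap decided to attach (see netmap(4)) and what the driver actually loaded with
sysctl dev.netmap.admode
sysctl -a | grep -i netmap
sysctl hw.cxgbe
# how the adapter's interrupt vectors/queues are being used
vmstat -i | grep t4nex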


Quote from: testo_cz on October 31, 2021, 08:20:27 AM
When you switch the VM's network adapters from PCI passthrough to, for example, a Linux bridge + VirtIO, do you get more throughput with iperf3?
(My smaller HW-based Proxmox host with a 4-vCore OPNsense VM gives up to 3Gbps from a WAN PC, through OPNsense, to a LAN PC; iperf3 -P2 -t60 ...)
Haven't tried virtual NICs. If there's one thing people seem to have problems with, it's that... The recommendation seems to be to use passthrough if possible.
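If I do get around to testing virtual NICs, my understanding is that on the Proxmox side it would look roughly like the below (VM ID, NIC index and bridge name are placeholders, not my actual config):

# give the VM a VirtIO NIC on a Linux bridge instead of the passed-through port
qm set 100 --net1 virtio,bridge=vmbr1
# and remove the corresponding PCIe passthrough entry
qm set 100 --delete hostpci1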
#47
Thank you blblblb and testo_cz. I'll get back to your suggestions shortly. Just to confirm first that the issue definitely is with OPNsense: I spun up a minimal Debian VM on the same host and ran the same test, and it easily saturates the 10Gb link.
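The test itself is just stock iperf3 between the two hosts, roughly as below (addresses as they appear in the logs; the second client run uses -R for the reverse direction):

# on the server end (192.168.200.10)
iperf3 -s
# on the Debian VM (192.168.200.216), normal then reverse direction
iperf3 -c 192.168.200.10
iperf3 -c 192.168.200.10 -R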

Log snippets from the other end of iPerf3, Debian VM for comparison:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.200.216, port 40042
[  5] local 192.168.200.10 port 5201 connected to 192.168.200.216 port 40044
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   1.00-2.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   2.00-3.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   3.00-4.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   4.00-5.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   5.00-6.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   6.00-7.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   7.00-8.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   8.00-9.00   sec  1.10 GBytes  9.41 Gbits/sec
[  5]   9.00-10.00  sec  1.10 GBytes  9.41 Gbits/sec
[  5]  10.00-10.00  sec  1.64 MBytes  9.37 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  11.0 GBytes  9.41 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.200.216, port 40046
[  5] local 192.168.200.10 port 5201 connected to 192.168.200.216 port 40048
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  1.10 GBytes  9.44 Gbits/sec    0   3.00 MBytes
[  5]   1.00-2.00   sec  1.10 GBytes  9.41 Gbits/sec    0   3.00 MBytes
[  5]   2.00-3.00   sec  1.08 GBytes  9.31 Gbits/sec  1527   1.83 MBytes
[  5]   3.00-4.00   sec  1.09 GBytes  9.41 Gbits/sec    0   2.20 MBytes
[  5]   4.00-5.00   sec  1.10 GBytes  9.41 Gbits/sec    0   2.20 MBytes
[  5]   5.00-6.00   sec  1.09 GBytes  9.41 Gbits/sec    0   2.20 MBytes
[  5]   6.00-7.00   sec  1.09 GBytes  9.41 Gbits/sec    0   2.20 MBytes
[  5]   7.00-8.00   sec  1.09 GBytes  9.38 Gbits/sec  920   2.26 MBytes
[  5]   8.00-9.00   sec  1.09 GBytes  9.36 Gbits/sec  921   2.31 MBytes
[  5]   9.00-10.00  sec  1.10 GBytes  9.41 Gbits/sec    0   2.32 MBytes
[  5]  10.00-10.00  sec   402 KBytes  8.89 Gbits/sec    0   2.32 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.9 GBytes  9.39 Gbits/sec  3368             sender
-----------------------------------------------------------


Same setup and hardware, OPNsense instead of Debian:

-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.200.1, port 19291
[  5] local 192.168.200.10 port 5201 connected to 192.168.200.1 port 27695
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   160 MBytes  1.34 Gbits/sec
[  5]   1.00-2.00   sec   169 MBytes  1.42 Gbits/sec
[  5]   2.00-3.00   sec   168 MBytes  1.41 Gbits/sec
[  5]   3.00-4.00   sec   172 MBytes  1.44 Gbits/sec
[  5]   4.00-5.00   sec   179 MBytes  1.50 Gbits/sec
[  5]   5.00-6.00   sec   178 MBytes  1.49 Gbits/sec
[  5]   6.00-7.00   sec   187 MBytes  1.57 Gbits/sec
[  5]   7.00-8.00   sec   197 MBytes  1.65 Gbits/sec
[  5]   8.00-9.00   sec   168 MBytes  1.41 Gbits/sec
[  5]   9.00-10.00  sec   166 MBytes  1.39 Gbits/sec
[  5]  10.00-10.04  sec  8.47 MBytes  1.72 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.04  sec  1.71 GBytes  1.46 Gbits/sec                  receiver
-----------------------------------------------------------
Server listening on 5201
-----------------------------------------------------------
Accepted connection from 192.168.200.1, port 53951
[  5] local 192.168.200.10 port 5201 connected to 192.168.200.1 port 65343
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   136 MBytes  1.14 Gbits/sec    0   1.03 MBytes
[  5]   1.00-2.00   sec   136 MBytes  1.14 Gbits/sec   23    766 KBytes
[  5]   2.00-3.00   sec   134 MBytes  1.12 Gbits/sec    0    892 KBytes
[  5]   3.00-4.00   sec   130 MBytes  1.09 Gbits/sec    0   1018 KBytes
[  5]   4.00-5.00   sec   129 MBytes  1.09 Gbits/sec    0   1.12 MBytes
[  5]   5.00-6.00   sec   135 MBytes  1.13 Gbits/sec    0   1.24 MBytes
[  5]   6.00-7.00   sec   132 MBytes  1.11 Gbits/sec    0   1.36 MBytes
[  5]   7.00-8.00   sec   132 MBytes  1.11 Gbits/sec    8   1.08 MBytes
[  5]   8.00-9.00   sec   144 MBytes  1.21 Gbits/sec    0   1.19 MBytes
[  5]   9.00-10.00  sec   132 MBytes  1.11 Gbits/sec    0   1.27 MBytes
[  5]  10.00-10.00  sec   126 KBytes  1.19 Gbits/sec    0   1.27 MBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.31 GBytes  1.12 Gbits/sec   31             sender
#49
I installed OPNsense on a USB stick and tested the same hardware on bare metal. Unfortunately the results are more or less exactly the same: WAN is limited to 1.4Gb/s with or without Suricata, and iperf3 between OPNsense and another 10Gb box averages around 1Gb/s over the 10Gb link.

I will try Linux on the same hardware next, to see whether it's a hardware problem or whether the limitation is with OPNsense.
#50
Nobody..? I guess I was hoping some FreeBSD whiz would come along and tell me to run some obscure PCIe command line tools or some other profiling to help me find the problem...   ;)
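In the meantime, the closest thing I know to obscure PCIe command-line tools on FreeBSD is pciconf, so this is what I plan to check myself - a sketch, the grep pattern may need adjusting to whatever pciconf -l actually shows for the Chelsio:

# find the Chelsio and check the negotiated PCIe link width/speed in its capability lines
pciconf -l | grep t4nex
pciconf -lcv | grep -A 10 t4nex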

I guess I could try to install OPNsense bare metal on a USB stick, move over the config, and see how it performs that way. If it's better, then the problem obviously lies with the virtualization somehow; otherwise it's some hardware problem, as not even iperf gets decent performance.
#51
Hi,

Just upgraded my WAN to 10Gb, so I'm trying to get OPNsense up to 10Gb too. The hardware is powerful but performance is poor... I need some help troubleshooting!

Hardware:
ASRock Rack X470D4U motherboard with Ryzen 3700X CPU (8c/16t, 3.6GHz base, up to 4.4GHz turbo)
32GB RAM
Intel i350-T4 quad gigabit NIC
Chelsio T420-CR dual 10Gb SFP+
MikroTik CRS328-24P-4S+RM switch

Software environment:
Proxmox 7.0
OPNsense running virtualised, with the Intel i350-T4 and Chelsio T420-CR in PCIe passthrough to the VM

And before you say anything... yes I also suspect that it's the virtualisation that somehow causes my performance issues... but I want to be sure before I migrate to bare metal as the virtualisation provides many benefits including easy snapshot backups etc.

Description of symptoms:
I noticed that WAN speed was poor (approx 1.4Gb/s), with Suricata still only consuming approx 50% of total CPU. There was no improvement with Suricata disabled, and in that case the CPU was mostly idle according to top.

So I moved on to testing my internal network with iperf3. I verified that my NAS (TrueNAS, Chelsio T420-CR) and another Proxmox node (Ryzen 5950X, Mellanox ConnectX-4 Lx) saturate 10Gb/s no problem via iperf3. Both machines, however, going via the same switch and network cards into the Chelsio T420-CR in OPNsense, hardly manage to break 1Gb/s:

Connecting to host 192.168.200.1, port 5201
[  5] local 192.168.200.10 port 22966 connected to 192.168.200.1 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   124 MBytes  1.04 Gbits/sec    0    757 KBytes       
[  5]   1.00-2.00   sec   128 MBytes  1.07 Gbits/sec    0   1.30 MBytes       
[  5]   2.00-3.00   sec   132 MBytes  1.11 Gbits/sec   30    810 KBytes       
[  5]   3.00-4.00   sec   124 MBytes  1.04 Gbits/sec    0    937 KBytes       
[  5]   4.00-5.00   sec   133 MBytes  1.12 Gbits/sec    0   1.04 MBytes       
[  5]   5.00-6.00   sec   128 MBytes  1.07 Gbits/sec    0   1.16 MBytes       
[  5]   6.00-7.00   sec   144 MBytes  1.21 Gbits/sec    0   1.28 MBytes       
[  5]   7.00-8.00   sec   138 MBytes  1.15 Gbits/sec    0   1.31 MBytes       
[  5]   8.00-9.00   sec   133 MBytes  1.11 Gbits/sec    0   1.31 MBytes       
[  5]   9.00-10.00  sec   123 MBytes  1.03 Gbits/sec    0   1.31 MBytes       
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.28 GBytes  1.10 Gbits/sec   30             sender
[  5]   0.00-10.01  sec  1.27 GBytes  1.09 Gbits/sec                  receiver


CPU is approx 65% idle during the test, so it's hardly the bottleneck.
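That said, an overall idle figure can hide a single pegged core or queue, so these are worth watching while the test runs (a sketch; see top(1), vmstat(8) and netstat(1)):

# per-CPU / per-thread view - look for one core stuck in interrupt or system time
top -PSH
# are interrupts spread across the rx/tx queues or landing on a single vector?
vmstat -i | grep t4nex
# per-second packet, error and drop counters on the LAN port while iperf3 runs
netstat -I cxgbe1 -w 1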

Here is the output of "ifconfig -v cxgbe1":

cxgbe1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=28c00b8<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU,VLAN_HWCSUM,VLAN_HWTSO,LINKSTATE,HWRXTSTMP>
ether 00:07:43:11:2b:18
inet6 fe80::207:43ff:fe11:2b18%cxgbe1 prefixlen 64 scopeid 0x2
inet 192.168.200.1 netmask 0xffffff00 broadcast 192.168.200.255
media: Ethernet autoselect (10Gbase-Twinax <full-duplex,rxpause,txpause>)
status: active
nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
plugged: SFP/SFP+/SFP28 1X Copper Passive (Copper pigtail)
vendor: FS PN: SFPP-PC015 SN: S2108004672-1 DATE: 2021-08-11
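One thing that stands out (also in the dmesg earlier) is that the checksum and TSO offloads are off - no TXCSUM/RXCSUM/TSO4/TSO6 in the options line. If I remember correctly OPNsense controls these via the checkboxes under Interfaces > Settings, but for a quick test they could be flipped back on by hand - a sketch, and any manual change may be reverted the next time the interface is reconfigured:

# re-enable hardware checksum offload and TSO on the LAN port for a test
ifconfig cxgbe1 rxcsum txcsum tso4 tso6
# confirm the flags now appear in the options line
ifconfig cxgbe1 | grep options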


/boot/loader.conf.local contains:

t4fw_cfg_load="YES"
if_cxgbe_load="YES"

# Disabling cxgbe caps
hw.cxgbe.toecaps_allowed="0"
hw.cxgbe.rdmacaps_allowed="0"
hw.cxgbe.iscsicaps_allowed="0"
hw.cxgbe.fcoecaps_allowed="0"
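A few additions I've seen suggested for 10Gb forwarding, in case they help - a sketch only, the values are untested guesses for this box and the cxgbe queue tunable names vary between FreeBSD releases (see cxgbe(4)):

# spread netisr work over all cores and pin the threads
net.isr.maxthreads="-1"
net.isr.bindthreads="1"
# per-port queue counts for the 10G ports (older releases call these nrxq10g/ntxq10g)
hw.cxgbe.nrxq="8"
hw.cxgbe.ntxq="8"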


I have no idea where to go from here. Any ideas on how to troubleshoot to find the bottleneck?
#52
Thanks. I'm interested in tuning Suricata as it's already using a fair bit of CPU on a 1Gb WAN, and as I will be getting 10Gb soon, I foresee it becoming a bottleneck pretty quickly.

On initial setup, and when I switched to Hyperscan, I noticed a big difference. Everything else has been very marginal at best, so before starting to hack configuration files I'm interested in more quantifiable data on the actual expected gains. Have you made any such comparisons...?
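For reference, by "hack configuration files" I mean the kind of knobs below in Suricata's YAML - purely illustrative, and I'm not even sure where OPNsense expects such overrides to live, so treat it as a sketch rather than a recipe:

# pattern matcher: "hs" is Hyperscan, the one change that clearly helped here
mpm-algo: hs

# heavier detection profile and CPU pinning are the usual next suspects
detect:
  profile: high
threading:
  set-cpu-affinity: yes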
#53
What is the difference in performance and/or other behavior before and after? Do you really need to tune Suricata on a 100Mbit connection?
#54
Same thing here. Everything else being equal (same rules etc.), Suricata went from 10-20% to typically 20-40% CPU. That's on an 8-core Ryzen, so there's plenty of room left, but the difference is clear pre and post the 21.7.3 upgrade.

I've got gigabit WAN and still max it out with IPS enabled; Suricata then uses 300-350% CPU, i.e. the equivalent of pegging 3-4 cores.
#55
Hardware and Performance / Re: new hardware
August 23, 2021, 03:43:36 PM
The higher-end OPNsense hardware uses embedded Epyc (3101 and 3201) and comes with 10Gb, but "only" claims 2Gb/s with IDS. Nice and small and fanless, but again, not quite enough grunt to do the line capacity full justice.

Meanwhile, higher-end enterprise stuff with beefy Xeons or Epycs will certainly do the job, but it's not something you want to put in your bedroom cupboard or living room...

Until AMD or Intel update their embedded series, I think the way forward is a custom build with a desktop CPU on a server motherboard (in order to get enough ports). Current-gen desktop CPUs are crazy fast and quite power efficient, particularly the Ryzen 5000 series.

Edit: and for motherboards, something like the SuperMicro X11SCL-LN4F or the ASRock Rack X470D4U or X570D4U.
#56
Hardware and Performance / Re: new hardware
August 23, 2021, 12:19:38 PM
Re heat and power consumption, I think it's quite the opposite these days. The Ryzens are much more power efficient than Intel, due to various things, including AMD's 7nm manufacturing process.

See here from Anandtech where they provide some examples and measurements:
https://www.anandtech.com/show/16343/intel-core-i710700-vs-core-i710700k-review-is-65w-comet-lake-an-option/23

PS I don't know what would happen these days if you removed the cooler... but I note that video is from 2001 ;-)
#57
Hardware and Performance / Re: new hardware
August 23, 2021, 11:41:04 AM
I'm in a similar position. I have been eyeing up the SuperMicro embedded Epyc boards (Epyc 3101 or 3201) - in many ways they are perfect, with 4 built-in NICs, a free PCIe slot to add more as required, low power and a small form factor. But they are getting quite old, still based on Zen 1, and I don't know if they have enough grunt to drive 10Gb. Current-gen Ryzen (or Epyc) is nearly 50% faster single-core according to benchmarks.

Given that, and if noise is an issue, I think a recent-ish business desktop (e.g. Dell OptiPlex or HP EliteDesk) with an added NIC is the best bang for the buck. However, there would typically be only one PCIe slot available - so OK for a quad-gigabit NIC or a dual SFP+ card, but not both at the same time. That, however, is what I'm looking for, so I'm currently leaning towards a custom Ryzen or Xeon E-2200 build based on a server uATX motherboard (SuperMicro or ASRock Rack), with a carefully chosen case and fans to keep noise levels to a minimum. Should be doable.
#58
Thanks. Noise unfortunately is a concern... this is for domestic use and the fibre terminates in the bedroom (cupboard) where I currently have the Qotom. I could of course trunk the WAN somewhere else and then back out again (I've got servers in the cellar..). But I'm also considering a quiet Ryzen build. Should be doable with careful selection of parts (chassis and CPU fan). Something like the ASRock Rack X470D4U has enough PCIe slots for both a quad-gigabit NIC and a dual SFP+ 10Gb card. And I've got those NICs already.
#59
Quote from: rungekutta on August 15, 2021, 09:48:59 AM
Does anyone know if this would work, or if the Dell is hardwired to assume that the x16 card is a GPU? I've seen some reports like that for earlier versions.
I can answer my own question here after receiving guidance on a Dell forum. Unfortunately the PCIe x16 slot is reserved for a GPU and not general purpose; that leaves only the single x4 slot available for general use.

Shame. Would have been a killer setup otherwise, in terms of performance per $ and in a small and quiet package.

Guess I need to look at server motherboards for this, or an appliance like DEC850.
#60
Hah, true, but no, I need one 10Gb port for WAN and one for LAN. As for the rest, I like to keep the DMZ and guest/IoT networks on their own NICs in the firewall for simplicity, but 1Gb is enough there. I guess I could solve it with VLANs in OPNsense and trunk all networks into the same physical port, but so far I have avoided that.
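(For the record, the VLAN route would mean tagged interfaces on top of one physical port - in OPNsense that's set up under Interfaces > Other Types > VLAN if I remember right, which at the shell level boils down to something like the below; VLAN IDs and addresses are made up for illustration.)

# tagged DMZ and guest/IoT networks on top of the same physical port
ifconfig vlan10 create vlan 10 vlandev cxgbe1
ifconfig vlan10 inet 192.168.10.1/24
ifconfig vlan20 create vlan 20 vlandev cxgbe1
ifconfig vlan20 inet 192.168.20.1/24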