I updated to OPNsense 21.1-amd64 on Thursday evening and realized the next morning that permanently running video streams routed between LAN and WAN were significantly slower, and broke down entirely soon after.
For background: I have OPNsense running within my local network to separate network segments, so there is another router in front of the Internet. This allows me to run 1 GBit speed tests through OPNsense entirely within my own infrastructure.
So, in short, I have LAN <-> OPNsense <-> WAN <-> FritzBox <-> Internet, which allows me to run stress tests from my LAN into the WAN without going through the slower Internet connection of the provider. If it matters, I am running OPNsense on a 4-core Intel Xeon E5506, 20 GB RAM, 2x Broadcom NetXtreme II BCM5709, 4x Intel 82580. Sensei is currently deactivated.
After some analysis I figured out that IDS/IPS is the root cause here. I came from 20.7.8_4, where everything was fine, and as I read at https://opnsense.org/opnsense-21-1-marvelous-meerkat-released/ no changes were made to Suricata in this release. I did not make any changes to the related setup, rules, etc.
So, I made some interesting iperf3 measurements.
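For reference, the numbers below were taken roughly like this (the server sits on the WAN-net host, the client on the LAN-net host; the address is a placeholder, run durations were 30 or 60 seconds):
iperf3 -s                        # on the host in the WAN net (server side)
iperf3 -c <wan-host-ip> -t 60    # on the host in the LAN net, TCP through OPNsense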
OPNsense v20.7.8_4:
Host LAN-net <-> Host WAN-net with IDS/IPS activated: ~550 MBit/s
OPNsense v21.1:
Host LAN-net <-> Host WAN-net with IDS/IPS activated:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 1.68 GBytes 240 Mbits/sec 128 sender
[ 5] 0.00-60.17 sec 1.68 GBytes 239 Mbits/sec receiver
Host LAN-net <-> Host WAN-net no IDS/IPS:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 2.54 GBytes 729 Mbits/sec 978 sender
[ 5] 0.00-30.01 sec 2.54 GBytes 728 Mbits/sec receiver
OPNsense <-> Host WAN-net no IDS/IPS:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 3.29 GBytes 942 Mbits/sec 408 sender
[ 5] 0.00-30.00 sec 3.29 GBytes 941 Mbits/sec receiver
As a result, the update dropped throughput from ~550 Mbit/s to ~240 Mbit/s, a loss of ~310 MBit/s or roughly 56%, which I cannot explain but can measure. My overall routing capacity between LAN and WAN seems to be around 729 Mbit/s, which is acceptable for me, as quite a few video streams were passing through the firewall during the measurement and I have no comparable value from before the update.
Any suggestions as to what causes this IDS/IPS impact? Can someone confirm this behavior on their setup as well? For future Internet-only connections this might be sufficient, but currently I am unhappy with the result, as it came purely with the update to 21.1.
Looking forward to hints, ideas and comments.
I have seen the same behavior: 50-60% performance loss in Suricata.
It feels like every update somehow reduces the overall Suricata performance.
Why is that?
HW:
https://bsd-hardware.info/?probe=453d257afe
EDIT:
With the hardware listed above I was able to reach gigabit speed with more than ~40,000 Suricata rules on older software builds. After sorting out rules that I don't need I am down to ~25,000, but the performance is still decreasing.
Posted in the Suricata sub-board, but it certainly appears that Suricata performance has degraded with the 21.1 upgrade.
In my case, my Xbox access to Game Pass basically dropped to nil with Suricata enabled and went back to normal (~500 MBit/s) with it disabled... no performance issues prior to the OPNsense upgrade, same rules (ALL) etc.
I tried disabling the new policy approach but it didn't seem to matter.
FWIW....
It's probably more iflib patching on FreeBSD stable/12 ... You can install the older kernel to see...
# opnsense-update -zkr 20.7.8
# opnsense-shell reboot
Cheers,
Franco
Thanks for the reply.
I downgraded the kernel and rebooted.
The performance is nonetheless far below my expectations:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 44.5 MBytes 373 Mbits/sec 47 626 KBytes
[ 5] 1.00-2.00 sec 78.7 MBytes 660 Mbits/sec 0 711 KBytes
[ 5] 2.00-3.00 sec 77.4 MBytes 649 Mbits/sec 1 559 KBytes
[ 5] 3.00-4.00 sec 78.7 MBytes 660 Mbits/sec 0 656 KBytes
[ 5] 4.00-5.00 sec 77.4 MBytes 650 Mbits/sec 0 741 KBytes
[ 5] 5.00-6.00 sec 74.9 MBytes 628 Mbits/sec 5 585 KBytes
[ 5] 6.00-7.00 sec 78.7 MBytes 660 Mbits/sec 0 680 KBytes
[ 5] 7.00-8.00 sec 78.7 MBytes 660 Mbits/sec 0 764 KBytes
[ 5] 8.00-9.00 sec 78.6 MBytes 660 Mbits/sec 8 618 KBytes
[ 5] 9.00-10.00 sec 78.7 MBytes 660 Mbits/sec 0 710 KBytes
I don't understand this:
> The performance is nonetheless far below my expectations
Well, is it the same as 20.7.8 with the old kernel or not?
Because... there have been no other moving parts in the major iteration that would cause this.
If your expectations are higher in any case, you should probably go straight to https://bugs.freebsd.org/bugzilla/
Cheers,
Franco
I made an odd finding.
The performance stays at around 650 Mbit/s. It doesn't matter whether 22,000 or 10,000 rules are loaded.
I think a lot of the fixes in the -next kernels helped Suricata performance. I recently upgraded to 20.7.8 and also replaced my 20.7.5-next kernel with the 20.7.8 one, and noticed a speed drop: my WAN inspection went from 1Gb down to 630-650Mbps. Turning off Suricata while running the 20.7.8 kernel brought the speed back to -next levels.
Tim
I am on OPNsense 21.1 and I don't have any problems?
iperf3 -c 10.0.3.1 -u -t 60 -i 10 -b 1000M
Connecting to host 10.0.3.1, port 5201
[ 5] local 10.0.3.2 port 60596 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-10.00 sec 1.10 GBytes 943 Mbits/sec 813711
[ 5] 10.00-20.00 sec 1.10 GBytes 943 Mbits/sec 813645
[ 5] 20.00-30.00 sec 1.10 GBytes 943 Mbits/sec 813746
[ 5] 30.00-40.00 sec 1.10 GBytes 943 Mbits/sec 813787
[ 5] 40.00-50.00 sec 1.10 GBytes 943 Mbits/sec 813730
[ 5] 50.00-60.00 sec 1.10 GBytes 943 Mbits/sec 813777
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-60.00 sec 6.58 GBytes 943 Mbits/sec 0.000 ms 0/4882396 (0%) sender
[ 5] 0.00-60.00 sec 6.56 GBytes 939 Mbits/sec 0.011 ms 20901/4882368 (0.43%) receiver
iperf Done.
Hardware:
AMD Ryzen 3 2200G with Radeon Vega Graphics (4 cores)
8GB RAM
Intel PRO/1000 PT Dual Port Server Adapter (PCI-e 4x) (driver: EM)
When I was on OPNsense 20.1.8_1 it was:
iperf3 -c 10.0.3.31 -u -t 60 -i 10 -b 1000M
Connecting to host 10.0.3.31, port 5201
[ 5] local 10.0.3.1 port 44924 connected to 10.0.3.31 port 5201
[ ID] Interval Transfer Bitrate Total Datagrams
[ 5] 0.00-10.00 sec 1.16 GBytes 1000 Mbits/sec 856118
[ 5] 10.00-20.00 sec 1.16 GBytes 1.00 Gbits/sec 856870
[ 5] 20.00-30.00 sec 1.16 GBytes 1000 Mbits/sec 857061
[ 5] 30.00-40.00 sec 1.16 GBytes 1.00 Gbits/sec 856166
[ 5] 40.00-50.00 sec 1.16 GBytes 1000 Mbits/sec 857113
[ 5] 50.00-60.00 sec 1.16 GBytes 1.00 Gbits/sec 857192
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-60.00 sec 6.98 GBytes 1000 Mbits/sec 0.000 ms 0/5140520 (0%) sender
[ 5] 0.00-60.00 sec 3.34 GBytes 479 Mbits/sec 0.046 ms 2680818/5140353 (52%) receiver
iperf Done.
Next week I'm going to upgrade to a 10GbE NIC and fiber; I will test whether there is a decrease in performance...
@annoniempjuh looking at these numbers I assume that you are using Suricata in IDS mode. My throughput was with IPS mode.
What decreased the performance between 20.7.5 and 20.7.8?
Suricata is in IPS mode ;)
I only tested v20.1.8_1 and v21.1.
@annoniempjuh you tested iperf3 with UDP; using UDP I get similar numbers.
My result:
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 44.5 MBytes 373 Mbits/sec 47 626 KBytes
[ 5] 1.00-2.00 sec 78.7 MBytes 660 Mbits/sec 0 711 KBytes
[ 5] 2.00-3.00 sec 77.4 MBytes 649 Mbits/sec 1 559 KBytes
[ 5] 3.00-4.00 sec 78.7 MBytes 660 Mbits/sec 0 656 KBytes
[ 5] 4.00-5.00 sec 77.4 MBytes 650 Mbits/sec 0 741 KBytes
[ 5] 5.00-6.00 sec 74.9 MBytes 628 Mbits/sec 5 585 KBytes
[ 5] 6.00-7.00 sec 78.7 MBytes 660 Mbits/sec 0 680 KBytes
[ 5] 7.00-8.00 sec 78.7 MBytes 660 Mbits/sec 0 764 KBytes
[ 5] 8.00-9.00 sec 78.6 MBytes 660 Mbits/sec 8 618 KBytes
[ 5] 9.00-10.00 sec 78.7 MBytes 660 Mbits/sec 0 710 KBytes
was with plain settings: iperf3 -c <serverip>
What I meant with "What decreased the performance between 20.7.5 and 20.7.8?" was referring to klamath's post.
This question still remains unanswered. Maybe franco can shed a little light on this.
Quote from: seed on February 04, 2021, 03:24:02 PM
[...]
didn't notice I was using UDP...
iperf3 -c 10.0.3.1
Connecting to host 10.0.3.1, port 5201
[ 5] local 10.0.3.2 port 44238 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 80.9 MBytes 679 Mbits/sec 0 243 KBytes
[ 5] 1.00-2.00 sec 63.4 MBytes 532 Mbits/sec 0 243 KBytes
[ 5] 2.00-3.00 sec 39.6 MBytes 332 Mbits/sec 0 243 KBytes
[ 5] 3.00-4.00 sec 49.5 MBytes 416 Mbits/sec 1 243 KBytes
[ 5] 4.00-5.00 sec 56.8 MBytes 476 Mbits/sec 0 243 KBytes
[ 5] 5.00-6.00 sec 54.5 MBytes 457 Mbits/sec 0 246 KBytes
[ 5] 6.00-7.00 sec 48.3 MBytes 405 Mbits/sec 1 246 KBytes
[ 5] 7.00-8.00 sec 44.4 MBytes 372 Mbits/sec 0 243 KBytes
[ 5] 8.00-9.00 sec 74.6 MBytes 626 Mbits/sec 0 246 KBytes
[ 5] 9.00-10.00 sec 35.9 MBytes 301 Mbits/sec 0 5.66 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 548 MBytes 460 Mbits/sec 2 sender
[ 5] 0.00-10.00 sec 546 MBytes 458 Mbits/sec receiver
iperf Done.
it's indeed slower than I expected ::)
@seed
There is a -next release of the 20.7.x branch that included a lot of fixes for Intel drivers with iflib and netmap. Here is the thread: https://forum.opnsense.org/index.php?topic=17363.0
I think the -next kernels have been removed for some reason; I'm trying to see if someone can restore them, as they are the fix for IDS/IPS running on Intel cards.
My NIC hardware:
Ethernet Connection X722 for 10GbE SFP+
Ethernet Connection X722 for 10GBASE-T
I350 Gigabit Network Connection
Guys, -next is what led to 21.1. The test kernels have been removed.
So if you want to compare, stock 21.1 vs. 20.7.x is the best option.
Cheers,
Franco
Today I upgraded OPNsense and my server to 10 Gbit NICs.
hardware:
Intel Ethernet Converged Network Adapter X540-T2 (OPNsense)
Mellanox ConnectX-3 CX311A (unRAID server)
MikroTik Cloud Smart Switch 326-24G-2S+RM (switch)
Iperf results:
suricata OFF = cpu usage 40% / 51%
iperf3 -c 10.0.3.1 -t 60 -i 10
Connecting to host 10.0.3.1, port 5201
[ 5] local 10.0.3.2 port 35558 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-10.00 sec 3.03 GBytes 2.60 Gbits/sec 0 252 KBytes
[ 5] 10.00-20.00 sec 2.99 GBytes 2.57 Gbits/sec 0 246 KBytes
[ 5] 20.00-30.00 sec 2.98 GBytes 2.56 Gbits/sec 0 243 KBytes
[ 5] 30.00-40.00 sec 2.96 GBytes 2.54 Gbits/sec 0 209 KBytes
[ 5] 40.00-50.00 sec 2.93 GBytes 2.52 Gbits/sec 0 277 KBytes
[ 5] 50.00-60.00 sec 2.97 GBytes 2.55 Gbits/sec 0 260 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 17.9 GBytes 2.56 Gbits/sec 0 sender
[ 5] 0.00-60.00 sec 17.9 GBytes 2.56 Gbits/sec receiver
iperf Done.
iperf3 -c 10.0.3.1 -t 60 -i 10 -R
Connecting to host 10.0.3.1, port 5201
Reverse mode, remote host 10.0.3.1 is sending
[ 5] local 10.0.3.2 port 36642 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 3.82 GBytes 3.28 Gbits/sec
[ 5] 10.00-20.00 sec 3.89 GBytes 3.35 Gbits/sec
[ 5] 20.00-30.00 sec 3.82 GBytes 3.28 Gbits/sec
[ 5] 30.00-40.00 sec 3.75 GBytes 3.22 Gbits/sec
[ 5] 40.00-50.00 sec 3.60 GBytes 3.09 Gbits/sec
[ 5] 50.00-60.00 sec 3.76 GBytes 3.23 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 22.6 GBytes 3.24 Gbits/sec 8384 sender
[ 5] 0.00-60.00 sec 22.6 GBytes 3.24 Gbits/sec receiver
iperf Done.
suricata ON = cpu usage 59% / 76%
iperf3 -c 10.0.3.1 -t 60 -i 10
Connecting to host 10.0.3.1, port 5201
[ 5] local 10.0.3.2 port 37868 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-10.00 sec 2.80 GBytes 2.40 Gbits/sec 0 5.66 KBytes
[ 5] 10.00-20.00 sec 2.81 GBytes 2.42 Gbits/sec 0 272 KBytes
[ 5] 20.00-30.00 sec 2.78 GBytes 2.38 Gbits/sec 0 223 KBytes
[ 5] 30.00-40.00 sec 2.79 GBytes 2.40 Gbits/sec 0 240 KBytes
[ 5] 40.00-50.00 sec 1.53 GBytes 1.32 Gbits/sec 4 1.41 KBytes
[ 5] 50.00-60.01 sec 0.00 Bytes 0.00 bits/sec 2 1.41 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.01 sec 12.7 GBytes 1.82 Gbits/sec 6 sender
[ 5] 0.00-61.65 sec 12.7 GBytes 1.77 Gbits/sec receiver
iperf Done.
iperf3 -c 10.0.3.1 -t 60 -i 10 -R
Connecting to host 10.0.3.1, port 5201
Reverse mode, remote host 10.0.3.1 is sending
[ 5] local 10.0.3.2 port 38420 connected to 10.0.3.1 port 5201
[ ID] Interval Transfer Bitrate
[ 5] 0.00-10.00 sec 1.40 GBytes 1.21 Gbits/sec
[ 5] 10.00-20.00 sec 1.37 GBytes 1.17 Gbits/sec
[ 5] 20.00-30.00 sec 1.40 GBytes 1.20 Gbits/sec
[ 5] 30.00-40.00 sec 1.39 GBytes 1.19 Gbits/sec
[ 5] 40.00-50.00 sec 1.40 GBytes 1.20 Gbits/sec
[ 5] 50.00-60.00 sec 1.41 GBytes 1.21 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 8.37 GBytes 1.20 Gbits/sec 18 sender
[ 5] 0.00-60.00 sec 8.37 GBytes 1.20 Gbits/sec receiver
iperf Done.
I upgraded the kernel on my 20.7 install to the 21.1 kernel and noticed a speed drop when IPS is enabled.
IDS only:
Speedtest by Ookla
Server: ZochNet - Lincoln, TX (id = 20875)
ISP: Grande Communications
Latency: 22.87 ms (0.73 ms jitter)
Download: 926.23 Mbps (data used: 1.0 GB)
Upload: 46.38 Mbps (data used: 66.3 MB)
Packet Loss: 0.0%
IDS/IPS:
Speedtest by Ookla
Server: ZochNet - Lincoln, TX (id = 20875)
ISP: Grande Communications
Latency: 19.63 ms (1.04 ms jitter)
Download: 666.98 Mbps (data used: 626.3 MB)
Upload: 45.06 Mbps (data used: 67.3 MB)
Packet Loss: 0.0%
I did not have these speed issues when using the 20.7.5-next kernel branch with my hardware.
My monitoring shows increased CPU usage.
Before with 20.7.x it maxed out at 22%.
With 21.1.x it goes up to 80%.
In all tests I still get my maximum bandwidth of 200 Mbit.
It does not happen with Suricata disabled; the weird thing is that Suricata is not set to listen on LAN, only on DMZ.
Is it possible that only XEN-virtualized OPNsense instances are affected? I have the impression that I have read a few times here in the forum about performance drops when switching to 21.1 in connection with XEN. But please take it with a grain of salt: I can't judge it myself, since I don't use XEN.
Quote from: thowe on February 08, 2021, 09:54:38 AM
Is it possible that only XEN virtualized OPNsense instances could be affected?
No, I do not think so. I'm also running directly on bare metal and was affected here as well. Franco pointed out the kernel as responsible, and this could be confirmed. I can currently live with only IDS and no IPS activated until this is under better control, but to my mind, downgrading the kernel is currently the appropriate solution to the performance drop.
Well, the issue still lies with FreeBSD stable/12 branch going backwards on performance. It is really an uphill battle to get fixes AND avoid regressions.
Cheers,
Franco
I haven't seen performance issues on my setup between these versions, but since there have been quite some fixes around netmap in different kernels, it's often a good idea to check whether it's Suricata causing issues or netmap.
There is a simple "bridge" tool for netmap available in the kernel source directory; if people want to check netmap behaviour on their hardware, they can always build the tool and create a bridge between the card and the host stack to rule out certain driver issues.
To build it, you need the kernel source directory in place (/usr/src); you can use the build tools (https://github.com/opnsense/tools) to check out all sources on your machine.
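If the sources aren't checked out yet, something along these lines should work; the branch name is an assumption, pick the one matching your version:
git clone -b stable/21.1 https://github.com/opnsense/src.git /usr/src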
When the sources are in place, you can build the tools using the following commands (on amd64):
cd /usr/src/tools/tools/netmap/
make bridge
Next, make sure netmap isn't in use (no Suricata or Sensei) and create a bridge between the physical interface and the host stack. Assuming the interface in question is called vmx2, the command would look like:
/usr/obj/usr/src/amd64.amd64/tools/tools/netmap/bridge -i netmap:vmx2 -i netmap:vmx2
Wait a few seconds and start the test again with OPNsense in between. When netmap isn't interfering, the test with and without the bridge should show roughly the same numbers.
Best regards,
Ad
Is there any tooling that OPNsense can provide or recommend that users experiencing slowness with IDS/IPS can run to help close the loop on this issue? I started using OPNsense a few months ago, and it seems that every minor and major release has some regression with netmap/iflib. I think having some tooling that users can run to provide meaningful feedback to the dev team for chasing down these issues would be useful to all.
Quote from: AdSchellevis on February 08, 2021, 03:47:18 PM
[...]
> I started using opnsense a few months ago and it seems that every minor and major release has some regression with netmap/iflib.
Interesting analysis... maybe because 20.7 was the first release that had the netmap/iflib combo in the kernel to deal with, and FreeBSD support for it is still meagre at best.
Basically, the Sensei people had to go and sponsor work around netmap to get it somewhere back to where any version before 20.7 was. ;)
The iflib author hasn't done anything of value on his iflib work since and has moved on to the WireGuard kernel integration instead. It's a great outlook, really, with things like this going on in an OS.
Cheers,
Franco
@klamath I'm not sure why you're quoting my message, but as said, if it's purely about netmap, one could try to use the bridge tool to pinpoint issues, if they exist, for their setup. It's definitely not the case that there are issues in all releases; we test the hardware we provide on a periodic basis and haven't seen a lot of (major) issues ourselves.
Quite some reports about performance are related to overly optimistic assumptions (e.g. expecting 1Gbps IPS on an APU board) or to drivers which aren't very well supported (we ship what's offered upstream; if netmap support isn't great in FreeBSD, it very likely isn't great on our end either). IPS needs quite some computing power and isn't comparable to normal routing/firewall functions at all in terms of requirements.
When it comes to testing, we tend to offer test kernels and release candidates on a periodic basis. To help catch issues up front, please do test, document the behaviour when experiencing issues, and try to tie them to FreeBSD bug reports if they exist. When fixes are available upstream, we often assess whether we can backport them into our system. Quite some fixes have been merged in the last versions for various drivers (with quite some help from the Sensei people, as Franco mentioned). I haven't seen side effects in terms of performance myself, but that doesn't mean they don't exist for some drivers.
Best regards,
Ad
Thank you for the response. The reason I quoted you is that I was trying to pinpoint what you want us to test for and what artifacts we can send back for follow-up. I am not trying to point fingers or assign blame to anyone or any project; I am trying to see if there is some tooling that we users can run and report back with, to cut down on the "we didn't see that in our testing" back and forth, as that doesn't help resolve the issues at hand.
If at the end of the day this level of back and forth is helpful and there is no standardized troubleshooting tooling we can use, so be it, but I would like to work on getting this issue resolved.
That being said, what should I and the other people having issues with the 21.1 kernel with IDS/IPS enabled provide you with next, to narrow down the issue?
Quote from: AdSchellevis on February 08, 2021, 09:07:25 PM
[...]
@klamath no problem. To increase the chances of an issue gaining traction, the best thing to do is to write down what you tested (exact equipment) and between which (kernel) versions you noticed a difference in performance. Like I tried to explain earlier, a lot of these issues aren't alike, so conditions matter. Between 20.7.x and 21.1.x the number of changes is more limited; 20.1 isn't very comparable due to the upstream change to iflib (not our choice, just a fact of life).
If the number of moving parts is limited, it's easier to point to specific changes. Sometimes a simple iperf3 test with machines on both ends of the firewall is already enough to notice a difference. When the problem is "my network card worked great before iflib", I'm afraid people are really looking for volunteer kernel engineers. Maybe it helps to ask the vendor for better FreeBSD support, I don't know; realistically, if it's not equipment we use, chances aren't very large that people will spend a lot of time on these types of issues over here. (Sometimes tracking these issues down costs many days; if the vendor doesn't really care, there aren't a lot of people who will spend that much of their spare time.)
Best regards,
Ad
Running version 20.7.8 Business Edition
Kernel 21.1 (FreeBSD cerberus.underworld.local 12.1-RELEASE-p12-HBSD FreeBSD 12.1-RELEASE-p12-HBSD #0 3c6040c7243(stable/21.1)-dirty: Mon Jan 25 12:27:52 CET 2021 root@sensey:/usr/obj/usr/src/amd64.amd64/sys/SMP amd64)
Hardware: https://www.supermicro.com/en/products/system/Mini-ITX/SYS-E300-9D-8CN8TP.cfm
Ram: 64 GB ECC
When Suricata is enabled with IDS/IPS protection, the max WAN speed is capped at around 650-670Mbps; with IPS mode disabled I can achieve the full 1Gb/s down.
I did not see any of these issues with 20.7.5-NEXT, but since I upgraded to 20.7.8 and the -next kernels were removed, I have no way to restore my performance to satisfactory levels.
It seems that FreeNAS is having the same issues around iflib: https://jira.ixsystems.com/plugins/servlet/mobile#issue/NAS-107593
They posted a workaround that may or may not help people here:
sysctl net.iflib.min_tx_latency=1
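The sysctl applies at runtime only; checking and setting it looks like this, and to keep it across reboots it can be added under System > Settings > Tunables (a sketch, not something I have verified long-term):
sysctl net.iflib.min_tx_latency      # show the current value
sysctl net.iflib.min_tx_latency=1    # apply the workaround until reboot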
Hi,
that workaround does not help on my system (igb driver). iperf3 stays at ~600Mb/s ...
Best regards, Space
Any better on 21.1.1?
I have issues with high CPU load, and monit alerts me every now and then that the CPU is too high. It seems very strange: the high load doesn't appear to be caused by excessive traffic and is present even when almost nothing is going on.
CPU load is between 50% and 100% even when traffic is as low as 3Mbit/s in and out.
This was never a problem before 21.1.
Nothing has been changed except the OPNsense upgrades; no firewall changes or IPS rules.
My system is an APU 2d4, which has Intel NICs.
Did you check which process spikes?
I did not, but I will keep an eye on it.
I did some testing changing the IPS pattern matcher from Hyperscan to Aho-Corasick, and it made a big change in CPU load... After a few hours, I switched back to Hyperscan and the load issue did not reappear.
I suspect Suricata at the moment, but it may need some more investigation.
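For reference, the GUI pattern matcher toggle maps to Suricata's mpm-algo setting in suricata.yaml; as far as I can tell the relevant values are (a sketch, names per the Suricata documentation):
mpm-algo: hs    # Hyperscan
mpm-algo: ac    # Aho-Corasick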
So this update to 21.7 coincided with the time my ISP bumped me from 300 down to gigabit...
I'm using a 2nd-gen i5 with minimal rules enabled, and even just monitoring LAN, I also get CPU spikes in the suricata process above 100%, so multiple threads I assume...
My down speed suffers as a result, and I get 300-400 megabits down, versus bursting at/above 920 with suricata disabled.
With the hardware and config I have, this really shouldn't be the case...
How many rules and which pattern matcher? My oldest i5 to test with was a 5th gen.
I'm using Hyperscan, with 46,355 rules enabled... that may not be "minimal" I suppose, but it's a lot less than I ran without issue before the update and before the ISP upgraded me to gigabit from 300 down...
With suricata disabled, I run 0-3% CPU most of the time, sometimes spiking to 9% if I do a speed test...
And yes, this is with all hardware acceleration disabled on the NICs, as netmap still doesn't support any of it / there are bugs in some hardware and drivers...
I have tried Aho-Corasick, but it didn't seem to affect CPU.
On the newest Atom you get around 300 Mbit too; maybe the 2nd gen doesn't give you more.
https://www.routerperformance.net/opnsense/opnsense-performance-scope7-1510-21-1-6/
Interesting... it could be, or it could be that some kind of thread config is needed to really get suricata running reliably on this machine.
The entire reason I use this machine (a corebooted ThinkPad T430) is that it runs coreboot... I could virtualize on my rackmount VM hosts, but that defeats the purpose of my paranoia's love of coreboot.
So, in IDS-only mode, it can get up to around 400-500 megabits, with only 325 or so rules enabled...
I find all of this troubling because someone else with a Protectli i5 dual core on Reddit mentioned getting up to 600 down (maxing out his connection) with an older version of OPNsense and Suricata around 11 months ago.
20.1 is indeed a bit faster than 20.7; after 20.7 everything is the same. IDS mode usually gives you wire speed, so your CPU might just be too old.
Some things that might help: decrease the ring size of the network interface being monitored, disable HT, enable/disable some of the BIOS settings from [1], and make sure the system is set to performance mode for CPU frequency scaling.
[1] https://www.academia.edu/33882347/Suricata_Extreme_Performance_Tuning
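For the ring size, iflib-based drivers expose loader tunables that can go into /boot/loader.conf.local; the tunable names are from iflib(4), but the driver name (igb here) and the values are only illustrative, so experiment with what fits your NIC:
dev.igb.0.iflib.override_nrxds="1024"   # RX descriptors per ring
dev.igb.0.iflib.override_ntxds="1024"   # TX descriptors per ring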
I ended up adjusting the ring size, and I am also using stream dropping; most of the things I catch tend to be small and not involved in large streams of data...
I'm also having the same issue right now.
Currently running version 21.1.
2021-08-23T23:22:08 dpinger[15656] WAN-GW 10.x.x.x: sendto error: 64
2021-08-23T23:22:04 dpinger[15656] WAN-GW 10.x.x.x: sendto error: 50
2021-08-23T23:21:21 dpinger[84453] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 0 RTT: 37243us RTTd: 157235us Loss: 0%)
2021-08-23T23:21:21 dpinger[15656] WAN-GW 10.x.x.x: Clear latency 37243us stddev 157235us loss 0%
2021-08-23T23:20:24 dpinger[60798] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 1 RTT: 558880us RTTd: 2270229us Loss: 11%)
2021-08-23T23:20:22 dpinger[15656] WAN-GW 10.x.x.x: Alarm latency 558880us stddev 2270229us loss 11%
2021-08-23T23:20:15 dpinger[15656] WAN-GW 10.x.x.x: sendto error: 55
2021-08-23T23:16:31 dpinger[99790] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 0 RTT: 33898us RTTd: 191777us Loss: 5%)
2021-08-23T23:16:31 dpinger[15656] WAN-GW 10.x.x.x: Clear latency 33898us stddev 191777us loss 5%
2021-08-23T23:15:13 dpinger[15656] WAN-GW 10.x.x.x: sendto error: 55
2021-08-23T23:15:12 dpinger[15656] WAN-GW 10.x.x.x: sendto error: 55
2021-08-23T23:12:08 dpinger[42452] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 0 RTT: 13809us RTTd: 67732us Loss: 0%)
2021-08-23T23:12:08 dpinger[15656] WAN-GW 10.x.x.x: Clear latency 13809us stddev 67732us loss 0%
2021-08-23T23:11:32 dpinger[63388] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 1 RTT: 664889us RTTd: 1880460us Loss: 10%)
2021-08-23T23:11:32 dpinger[15656] WAN-GW 10.x.x.x: Alarm latency 664889us stddev 1880460us loss 10%
2021-08-23T23:10:38 dpinger[65637] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 1 RTT: 1367329us RTTd: 4004130us Loss: 28%)
2021-08-23T23:10:38 dpinger[15656] WAN-GW 10.x.x.x: Alarm latency 1367329us stddev 4004130us loss 28%
2021-08-23T23:01:29 dpinger[39719] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 0 RTT: 21698us RTTd: 87994us Loss: 0%)
2021-08-23T23:01:29 dpinger[15656] WAN-GW 10.x.x.x: Clear latency 21698us stddev 87994us loss 0%
2021-08-23T23:01:20 dpinger[38796] GATEWAY ALARM: WAN-GW (Addr: 10.x.x.x Alarm: 1 RTT: 25364us RTTd: 94972us Loss: 15%)
I have some intermittent issues here.
After I turn off IPS mode, the connection is stable.
I'm also having a problem with Sensei crashing repeatedly when Suricata is enabled.
WAN is 1G and my LAN is connected at 10G.
Do you guys have any ideas about this?
:D :D
Are you running Suricata and Sensei on the same interface? It seems that Suricata is crashing and that this is causing your gateway monitoring to flap. Can you include logs from Suricata?
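The engine log can be pulled from the GUI under Services > Intrusion Detection > Log File, or from disk; the path below assumes Suricata's default log directory:
tail -f /var/log/suricata/suricata.log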
Hi klamath,
Nope, Suricata is running on WAN; Sensei is running on LAN with a LACP configuration.
Here is the Suricata log from when I started IPS:
2021-08-25T00:25:18 suricata[97785] [100262] <Notice> -- all 2 packet processing threads, 4 management threads initialized, engine started.
2021-08-25T00:25:18 suricata[97785] [101682] <Notice> -- opened netmap:igb1/T from igb1: 0x68baedfd300
2021-08-25T00:25:18 suricata[97785] [101682] <Notice> -- opened netmap:igb1^ from igb1^: 0x68baedfd000
2021-08-25T00:25:18 suricata[97785] [100590] <Notice> -- opened netmap:igb1^ from igb1^: 0x68b51534300
2021-08-25T00:25:17 suricata[97785] [100590] <Notice> -- opened netmap:igb1/R from igb1: 0x68b51534000
2021-08-25T00:24:50 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'ET.phpBB3_register_stage2' is checked but not set. Checked in 2010896 and 0 other sigs
2021-08-25T00:24:50 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'ET.phpBB3_register_stage4' is checked but not set. Checked in 2010897 and 0 other sigs
2021-08-25T00:24:50 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'ET.phpBB3_test' is checked but not set. Checked in 2010894 and 3 other sigs
2021-08-25T00:24:50 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_FLOWBIT(306)] - flowbit 'ms.rdp.synack' is checked but not set. Checked in 2014384 and 1 other sigs
2021-08-25T00:24:39 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_DEPRECATED(203)] - keyword 'ssh.softwareversion' is deprecated and will be removed soon. Use 'ssh.software' instead. See https://suricata-ids.org/about/deprecation-policy/
2021-08-25T00:24:39 suricata[97785] [100262] <Warning> -- [ERRCODE: SC_WARN_DEPRECATED(203)] - keyword 'ssh.softwareversion' is deprecated and will be removed soon. Use 'ssh.software' instead. See https://suricata-ids.org/about/deprecation-policy/
2021-08-25T00:24:25 suricata[97536] [100148] <Notice> -- This is Suricata version 5.0.5 RELEASE running in SYSTEM mode
2021-08-25T00:24:24 suricata[65748] [100258] <Notice> -- Stats for 'igb1': pkts: 21435809, drop: 1700473 (7.93%), invalid chksum: 0
2021-08-25T00:24:23 suricata[65748] [100258] <Notice> -- Signal Received. Stopping engine.
Quote from: klamath on August 24, 2021, 04:30:43 PM
[...]
That looks OK. I am wondering if you can include the logs from when IDS fails; it seems to be running successfully here.
Yes, it looks good when I start both IPS and Sensei.
The problem usually occurs after a few hours.
When it happens, the Sensei engine turns off automatically, and I need to turn off IPS first before I can start IPS and Sensei again.
netstat -ihw 1 also shows no drops:
input (Total) output
packets errs idrops bytes packets errs bytes colls
3.7k 0 0 549K 2.3k 0 404K 0
3.5k 0 0 671K 2.3k 0 594K 0
4.5k 0 0 1.1M 3.0k 0 1.5M 0
3.9k 0 0 782K 2.4k 0 664K 0
4.3k 0 0 615K 3.0k 0 578K 0
3.9k 0 0 500K 2.0k 0 352K 0
3.2k 0 0 741K 2.1k 0 710K 0
3.3k 0 0 680K 1.5k 0 470K 0
3.0k 0 0 395K 1.5k 0 245K 0
4.2k 0 0 771K 2.2k 0 528K 0
2.4k 0 0 312K 1.3k 0 222K 0
3.5k 0 0 874K 1.9k 0 689K 0
2.7k 0 0 317K 1.4k 0 194K 0
4.2k 0 0 832K 2.2k 0 584K 0
3.6k 0 0 619K 2.1k 0 410K 0
4.8k 0 0 1.3M 3.2k 0 1.4M 0
2.9k 0 0 405K 1.9k 0 301K 0
22k 0 0 1.7M 20k 0 1.6M 0
5.3k 0 0 2.4M 3.1k 0 1.2M 0
5.5k 0 0 1.3M 4.2k 0 1.4M 0
3.6k 0 0 779K 2.6k 0 854K 0
input (Total) output
packets errs idrops bytes packets errs bytes colls
4.8k 0 0 1.2M 3.1k 0 1.2M 0
4.8k 0 0 987K 3.3k 0 807K 0
4.0k 0 0 736K 1.8k 0 316K 0
4.1k 0 0 930K 2.5k 0 777K 0
3.7k 0 0 544K 1.8k 0 313K 0
2.9k 0 0 478K 1.3k 0 243K 0
3.8k 0 0 614K 1.8k 0 343K 0
4.7k 0 0 1.2M 3.3k 0 1.5M 0
4.3k 0 0 596K 1.8k 0 275K 0
3.7k 0 0 808K 2.0k 0 725K 0
3.8k 0 0 736K 2.2k 0 483K 0
5.1k 0 0 869K 4.0k 0 594K 0
6.1k 0 0 886K 4.2k 0 1.3M 0
3.9k 0 0 536K 2.4k 0 354K 0
3.8k 0 0 580K 1.9k 0 398K 0
3.3k 0 0 519K 1.7k 0 302K 0
3.3k 0 0 508K 1.3k 0 236K 0
2.6k 0 0 413K 1.5k 0 399K 0
4.0k 0 0 568K 2.0k 0 384K 0
2.7k 0 0 426K 1.4k 0 340K 0
3.3k 0 0 730K 1.6k 0 425K 0
input (Total) output
packets errs idrops bytes packets errs bytes colls
2.8k 0 0 509K 1.5k 0 439K 0
5.3k 0 0 1.9M 2.9k 0 746K 0
4.1k 0 0 1.4M 3.1k 0 1.6M 0
7.8k 0 0 4.4M 3.2k 0 1.7M 0
2.7k 0 0 679K 1.6k 0 553K 0
2.7k 0 0 571K 1.3k 0 352K 0
2.4k 0 0 661K 1.3k 0 348K 0
3.3k 0 0 500K 1.7k 0 262K 0
3.1k 0 0 471K 2.1k 0 411K 0
I will let you know if the problem occurs again.
So far I don't think hardware is the issue here.
I'm running a Core i7 with 32GB RAM, 10G LACP on LAN, and the ISP speed is just 100Mbps.
That should be enough to handle the load, right?
Quote from: klamath on August 24, 2021, 06:40:37 PM
[...]
Hi @klamath,
OPNsense 21.7.3_3-amd64
FreeBSD 12.1-RELEASE-p20-HBSD
OpenSSL 1.1.1l 24 Aug 2021
Hardware: Dell R720
CPU 1 Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz Model 62 Stepping 4 2600 MHz 8core
CPU 2 Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz Model 62 Stepping 4 2600 MHz 8core
Ram : DDR-3 64.00 GB Presence Detected Dual Rank 1866 MHz
Ethernet:
NIC Slot 6 Intel(R) Ethernet Converged Network Adapter X540-T2 (WAN,DMZ)
Integrated NIC 1 Intel(R) GbE 4P I350-t rNDC (LAN,MANAGEMENT)
When Suricata is enabled with IDS/IPS protection, the max WAN speed is capped at around 650-670Mbps; with IPS mode disabled I can achieve the full 827Mb/s down.
I can't say that the Ethernet cards we use are incompatible with Suricata IPS on FreeBSD, because as you have seen, they worked properly with the previous kernel.
At the same time, when I watch the dpinger service, the situation is as follows:
2021-11-12T02:35:16 dpinger[78904] send_interval 1000ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr
2021-11-11T13:01:05 dpinger[62032] WAN_GWv4_ X: sendto error: 55
2021-11-11T02:35:29 dpinger[72741] GATEWAY ALARM: WAN_GWv4_ (Addr: XAlarm: 0 RTT: 13002us RTTd: 125us Loss: 0%)
2021-11-11T02:35:29 dpinger[62032] WAN_GWv4_ X.255.0.37: Clear latency 13002us stddev 125us loss 0%
2021-11-11T02:35:17 dpinger[38016] GATEWAY ALARM: WAN_GWv4_ (Addr: X.255.0.37 Alarm: 1 RTT: 12983us RTTd: 102us Loss: 25%)
2021-11-11T02:35:17 dpinger[62032] WAN_GWv4_ X.255.0.37: Alarm latency 12983us stddev 102us loss 25%
2021-11-11T02:35:14 dpinger[62032] send_interval 1000ms loss_interval 2000ms time_period 60000ms report_interval 0ms data_len 0 alert_interval 1000ms latency_alarm 500ms loss_alarm 20% dest_addr X.255.0.37 bind_addr X.255.0.38 identifier "WAN_GWv4_ "
2021-11-10T17:00:24 dpinger[89102] WAN_GWv4_ X.255.0.37: sendto error: 55
It would be great if we could find a solution and suggestions for this problem. Thank you for your valuable information sharing.
Hello!
If you have a chance, please review https://www.academia.edu/33882347/Suricata_Extreme_Performance_Tuning
I had to disable most of the Intel prefetching options in the BIOS and reduce the TX and RX queues for the NICs. Once I did that, I could run IDS/IPS without any speed issues; a sketch of the queue tunables is below.
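Reducing the queues can be done with iflib loader tunables in /boot/loader.conf.local as well (tunable names per iflib(4); the driver name ix and the values are illustrative, adjust for your NIC):
dev.ix.0.iflib.override_nrxqs="2"   # number of RX queues
dev.ix.0.iflib.override_ntxqs="2"   # number of TX queues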
Note that starting/stopping Suricata will cause dpinger to output errors like the ones you listed.
Quote from: h4ck3r on November 12, 2021, 01:01:06 PM
[...]