Call for testing: official netmap kernel

Started by mb, September 16, 2020, 06:53:51 PM

Quote from: sy on September 28, 2020, 06:21:02 PM
Hi @athurdent,
The Traffic Graph problem existed in the early 20.7 releases, but it is no longer present in 20.7.2 and 20.7.3. What is your OPNsense version?
Odd, I'm on OPNsense 20.7.3-amd64 with this experimental kernel, double checked. Happens with igb and vtnet here.
I cannot attach my screenshot from the iPad, as it seems to be too big, so you will have to take my word for it. 😅

I can confirm that I do not get a kernel panic when using vmxnet3 in ESXi with IDS or Sensei and the 20.7.3-netmap kernel.

But the absence of a kernel panic is all I can confirm; it is still very slow. My simple but trusted and reliable test is to run iperf3 between the OPNsense VM and another local VM on the same ESXi host.

iperf3 -c 192.168.1.1 -R (using -R to simulate download speed to machine)

With IDS off
[ ID] Interval           Transfer     Bitrate                    Retr
[  5]   0.00-10.01  sec  2.27 GBytes  1.95 Gbits/sec    0

With IDS on
[ ID] Interval           Transfer     Bitrate                    Retr
[  5]   0.00-10.00  sec  1.06 GBytes   915 Mbits/sec  2375


Look at the number of retransmissions. It's not bound by memory or CPU: CPU never goes over 20% and memory has many GB left.
E3-1285V6 - 4 CPUs allocated and 12 GB RAM.
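
For anyone who wants to compare runs like the two above without eyeballing them, here is a minimal Python sketch that pulls the bitrate and the Retr count out of an iperf3 summary line. The function name `parse_summary` and the regular expression are my own illustration, not part of iperf3, and the pattern assumes the plain-text layout pasted above (iperf3 also has JSON output, which would be more robust).

```python
import re

# Matches an iperf3 sender summary line such as:
# [  5]   0.00-10.00  sec  1.06 GBytes   915 Mbits/sec  2375
SUMMARY = re.compile(
    r"\[\s*\d+\]\s+[\d.]+-[\d.]+\s+sec\s+"
    r"[\d.]+\s+[KMGT]?Bytes\s+"
    r"(?P<rate>[\d.]+)\s+(?P<unit>[KMG]?)bits/sec\s+"
    r"(?P<retr>\d+)"
)

# Multipliers to convert the printed unit to Mbit/s.
SCALE = {"": 1e-6, "K": 1e-3, "M": 1.0, "G": 1e3}


def parse_summary(line):
    """Return (bitrate in Mbit/s, retransmissions) for one summary line."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError("not an iperf3 summary line: %r" % line)
    return float(m["rate"]) * SCALE[m["unit"]], int(m["retr"])


ids_off = parse_summary("[  5]   0.00-10.01  sec  2.27 GBytes  1.95 Gbits/sec    0")
ids_on = parse_summary("[  5]   0.00-10.00  sec  1.06 GBytes   915 Mbits/sec  2375")

for label, (mbps, retr) in (("IDS off", ids_off), ("IDS on", ids_on)):
    print(f"{label}: {mbps:.0f} Mbit/s, {retr} retransmissions")
```

Run against the two lines above, this makes the jump from 0 to 2375 retransmissions easy to spot next to the throughput drop.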



Hi @gauthig, thanks for the figures. A couple of questions:

1. By "slow", do you mean 915 Mbps, or do you see lower speeds than this?
2. How was the situation with OPNsense 20.1.x?
3. How does Sensei bypass mode behave?



By slow I mean a drop from 1.95 Gbit/s to 0.915 Gbit/s, roughly a 50% reduction.
In 20.1.x I was seeing about 1.7 Gbit/s, so a much smaller drop with netmap enabled.
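
The arithmetic behind those percentages can be checked with a few lines of Python. This is only a sketch: `percent_drop` is a hypothetical helper, and the 20.1.x comparison assumes the same 1.95 Gbit/s IDS-off baseline, which the post does not actually state.

```python
def percent_drop(before, after):
    """Relative throughput loss, as a percentage of the baseline."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return (before - after) / before * 100.0


# Figures in Gbit/s, taken from the post above.
ids_drop = percent_drop(1.95, 0.915)
# Assumption: 20.1.x had the same 1.95 Gbit/s baseline with IDS off.
older_drop = percent_drop(1.95, 1.7)

print(f"20.7.3-netmap with IDS: {ids_drop:.1f}% lower")
print(f"20.1.x with IDS (assumed same baseline): {older_drop:.1f}% lower")
```

Under that assumption, the drop works out to about 53% on the test kernel versus about 13% on 20.1.x.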

I only showed Suricata on LAN. I'll re-run with Sensei in normal and bypass mode and send the results. By the way, my ELK stack is on another ESXi host with a 10 Gbit/s link, so the ELK CPU/memory load will not impact OPNsense/Sensei.

Thanks, looking forward to Sensei results.

Does anyone know if OPNsense 20.7.3 and 20.1.9 have different Suricata releases, or is it the same major release?

@mb, as far as I know it's Suricata 4.x vs Suricata 5.x.



Quote from: gauthig on September 29, 2020, 03:07:42 AM
By slow I mean a drop from 1.95 Gbit/s to 0.915 Gbit/s, roughly a 50% reduction.
In 20.1.x I was seeing about 1.7 Gbit/s, so a much smaller drop with netmap enabled.

I only showed Suricata on LAN. I'll re-run with Sensei in normal and bypass mode and send the results. By the way, my ELK stack is on another ESXi host with a 10 Gbit/s link, so the ELK CPU/memory load will not impact OPNsense/Sensei.

This is normal for an IDS: it inspects every packet, and if you enable all rules this figure is even optimistic. Disabling some rules may give a noticeable performance increase with IDS enabled.

Good news - I got the 20.7.3-netmap kernel working great with both Suricata (WAN) and Sensei (LANs).

I posted some stats yesterday that showed a 50% drop, which never happened in 20.1.x (normally about a 3-5% drop on my hardware). Here are the new results:

Internal testing - controlled, no other network noise
iperf3 -c 192.168.1.1 -R (using -R to simulate download speed to machine)

With Sensei off
[ ID] Interval                Transfer         Bitrate            Retr
[  5]   0.00-10.37  sec  4.04 GBytes  3.34 Gbits/sec   0

With Sensei on
[ ID] Interval                Transfer          Bitrate         Retr
[  5]   0.00-10.37  sec  3.92 GBytes  3.14 Gbits/sec  0

Going in or out from the LAN to the internet (passing through both the Suricata and Sensei netmap interfaces)
Speedtest
Both IDS off
Download: 903.26 Mbit/s   Upload: 956.85 Mbit/s
Both IDS on
Download: 891.26 Mbit/s   Upload: 910.85 Mbit/s

What changed since yesterday: ESXi. I recently upgraded from ESXi 6.7 to 7.0. I found an article about memory delays, for which a patch is coming. In the meantime it was recommended to RESERVE ALL memory for latency-sensitive VMs. I did that, and now netmap is working great. Looking forward to the test kernel being rolled into the next production update, along with the ESXi patch. I was even able to drastically reduce the resources allocated to OPNsense; here is what the final test numbers above were run on:
E3-1285V6 - 2 CPUs, 6 GB RAM

Hi @gauthig,

I tried the reserve-all-memory option for my OPNsense VM in ESXi 6.7; unfortunately it made no difference for my setup. Currently I cannot upgrade to ESXi 7 due to hardware compatibility issues. However, my OPNsense upgrade was done in place and was the only thing that changed in my environment, so I'm guessing that the performance issues are with OPNsense 20.7 and the new netmap kernel.

Just wanted to add my thanks: it is working on my end and has fixed my flaky gigabit connection, which used to range from 400-600 Mbps and now sits solid at a gigabit!

I was able to improve my download bandwidth by increasing the network transmit and receive descriptors; I'm now getting around 600 Mbps. That is not as high as when I was on OPNsense 20.1.9, but I'm fairly satisfied with my download speed now. Hopefully there are further improvements in the future that will increase performance even more.

Just as an FYI... I did try a number of VMware ESXi settings, such as memory reservation, CPU affinity, CPU reservation, and setting the VM's latency sensitivity (under VM Options/Advanced) to High, with no improvements.

With the Sensei 1.6.1 update I am now getting around 700 Mbps thanks to the SSL/TLS performance improvements. During the test it actually reaches 750-780 Mbps, but it cannot sustain that and drops to around 700 Mbps. So great work on improving the throughput in Sensei 1.6.1.

I noticed that this increase only shows up for SSL/TLS connections (HTTPS), as that is what the improvements were meant to address. Can these improvements also be made for unencrypted connections? On an HTTP-only speed test the results stay around 530 Mbps for me.

Yup, agreed. Ookla from my ISP now gives 138 Mbps instead of 60 to 75 Mbps. Ookla measures over SSL connections.

Also, there is no longer a large speed difference between unencrypted and encrypted Usenet downloads.

@gauthig, @keanu, @almodovaris, @xpendable, thanks for the update and insights.

Glad to hear that TLS/SSL speeds are up. We're also scrutinizing the HTTP processor to see if we can save more CPU cycles.

Quote from: athurdent on September 28, 2020, 09:01:19 AM
Using Netmap, e.g. by turning on Sensei on LAN or Suricata in IPS mode on WAN, the traffic graph on the dashboard stops working for those interfaces, and Zabbix agent is unable to gather interface statistics, too.

Curious if there has been any progress in resolving this?  Feels like it's been broken a long time.