OK, back to basics here. I couldn't leave well enough alone, so I did more testing tonight because I just couldn't believe that my CPU couldn't even do single-threaded gigabit. Here's my test scenario:

Test Scenario 1:
Physical Linux Server (CentOS 7) on VLAN 2 (iperf3 client)
Virtual Linux Server (CentOS 7) on VLAN 24 (iperf3 server)
Dell PowerEdge R430 w/Intel X520-SR2 and HardenedBSD 12-STABLE (BUILD-LATEST 2020-08-31)

Single Threaded:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 1.00 GBytes 863 Mbits/sec 0 sender
[ 4] 0.00-10.00 sec 1.00 GBytes 860 Mbits/sec receiver

6 Parallel Threads:
[ ID] Interval Transfer Bandwidth Retr
[SUM] 0.00-10.00 sec 2.23 GBytes 1.91 Gbits/sec 938 sender
[SUM] 0.00-10.00 sec 2.22 GBytes 1.90 Gbits/sec receiver

Notice the common theme here: the single-threaded test tops out at ~850 Mbps, which is pretty close to what I get with OPNsense. Note this is THROUGH the firewall and not from the firewall. Also note my system did have IPv6 addresses from my ISP on each of the interfaces, though I was only testing IPv4 traffic.

Test Scenario 2:
Physical Linux Server (CentOS 7) on VLAN 2 (iperf3 client)
Virtual Linux Server (CentOS 7) on VLAN 24 (iperf3 server)
Dell PowerEdge R430 w/Intel X520-SR2 and FreeBSD 12.1-RELEASE

Single Threaded:
[ ID] Interval Transfer Bandwidth Retr
[ 4] 0.00-10.00 sec 9.75 GBytes 8.38 Gbits/sec 573 sender
[ 4] 0.00-10.00 sec 9.75 GBytes 8.38 Gbits/sec receiver

6 Parallel Threads:
[ ID] Interval Transfer Bandwidth Retr
[SUM] 0.00-10.00 sec 10.5 GBytes 9.05 Gbits/sec 3607 sender
[SUM] 0.00-10.00 sec 10.5 GBytes 9.04 Gbits/sec receiver

I couldn't believe my eyes. I had to triple check that it was in fact pushing 8.38 Gbps THROUGH the FreeBSD 12.1 server and wasn't taking some magical alternate path somehow. It was, in fact, going through the FreeBSD router. As you can see, the parallel test is about 1 Gbps less than wire speed. Excellent!

Also note my system did have IPv6 addresses from my ISP on each of the interfaces, though I was only testing IPv4 traffic.

I thought I would enable pf (with pfctl) on the FreeBSD 12.1 router to see how that affected performance. I'm not sure how much adding rules impacts throughput, but I did notice a measurable drop in the single-threaded test (down to 6.23 Gbps), while the drop in the parallel test was negligible (8.94 Gbps).

As of right now, it seems very strange to me that HardenedBSD exhibits the same single-threaded throughput I see on OPNsense, and similarly low parallel throughput, compared with FreeBSD.

I am willing to accept that I am not accounting for something here. However, with near wire speed throughput on FreeBSD and nothing close to it on HardenedBSD on the exact same hardware, it seems to me that something is very different with HardenedBSD.

What are your thoughts?
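To put the gap in numbers, here is a minimal Python sketch that works only from the figures quoted above. It assumes iperf3 reports transfer totals in binary units (1 GByte = 2**30 bytes) and bandwidth in decimal units, which is consistent with 9.75 GBytes in 10 seconds showing up as 8.38 Gbits/sec. The names gbits_per_sec, percent_drop, and throughput are made up for this illustration.

```python
# Sender-side figures copied from the test scenarios above, in Gbits/sec.
GIB = 2**30  # assumption: iperf3 "GBytes" are binary (2**30 bytes)


def gbits_per_sec(gbytes, seconds):
    """Convert an iperf3 transfer total into decimal Gbits/sec."""
    return gbytes * GIB * 8 / seconds / 1e9


def percent_drop(before, after):
    """Relative loss, in percent, when going from `before` to `after`."""
    return (before - after) / before * 100


throughput = {
    "HardenedBSD 12-STABLE": {"single": 0.863, "parallel": 1.91},
    "FreeBSD 12.1-RELEASE": {"single": 8.38, "parallel": 9.05},
    "FreeBSD 12.1-RELEASE + pf": {"single": 6.23, "parallel": 8.94},
}

hbsd = throughput["HardenedBSD 12-STABLE"]
fbsd = throughput["FreeBSD 12.1-RELEASE"]
fbsd_pf = throughput["FreeBSD 12.1-RELEASE + pf"]

# Cross-check the reported bandwidth against the reported transfer total.
print(f"9.75 GBytes in 10 s = {gbits_per_sec(9.75, 10):.2f} Gbits/sec")

# How much faster FreeBSD forwards than HardenedBSD on the same hardware.
for mode in ("single", "parallel"):
    print(f"{mode}: FreeBSD is {fbsd[mode] / hbsd[mode]:.1f}x HardenedBSD")

# Cost of enabling pf on FreeBSD.
for mode in ("single", "parallel"):
    print(f"{mode}: pf costs {percent_drop(fbsd[mode], fbsd_pf[mode]):.1f}%")
```

The same helpers can be pointed at any other pair of runs by adding an entry to the dictionary.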
@minimugmail

Quote from: hax0rwax0r on September 02, 2020, 07:34:01 am
pfSense 2.4.5p1 1500MTU receiving from WAN, vmx3 NICs, all hardware offloading disabled, default ruleset
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 31.5 GBytes 4.50 Gbits/sec 11715 sender
[ 5] 0.00-60.00 sec 31.5 GBytes 4.50 Gbits/sec receiver

OpenWRT 19.07.3 1500MTU receiving from WAN, vmx3 NICs, default ruleset
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 47.5 GBytes 6.81 Gbits/sec 44252 sender
[ 5] 0.00-60.00 sec 47.5 GBytes 6.81 Gbits/sec receiver

OPNsense 20.7.2 1500MTU receiving from WAN, vmx3 NICs, all hardware offloading disabled, default ruleset
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-60.00 sec 6.83 GBytes 977 Mbits/sec 459 sender
[ 5] 0.00-60.00 sec 6.82 GBytes 977 Mbits/sec receiver
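For anyone who wants to compare pasted runs like these without eyeballing them, the following is a rough Python sketch that pulls the bitrate out of iperf3 summary lines and ranks the three systems. The regular expression only targets the summary format shown in this thread, and the names SUMMARY, parse_summary, and runs are hypothetical.

```python
import re

# Matches iperf3 summary lines as pasted in this thread, e.g.
# "[ 5] 0.00-60.00 sec 31.5 GBytes 4.50 Gbits/sec 11715 sender"
SUMMARY = re.compile(
    r"\[\s*(?:\d+|SUM)\]\s+"
    r"[\d.]+-(?P<end>[\d.]+)\s+sec\s+"
    r"[\d.]+\s+[KMG]Bytes\s+"
    r"(?P<rate>[\d.]+)\s+(?P<unit>[KMG])bits/sec"
    r"(?:\s+(?P<retr>\d+))?\s+(?P<role>sender|receiver)"
)
SCALE = {"K": 1e3, "M": 1e6, "G": 1e9}


def parse_summary(line):
    """Return bits/sec, retransmits and role for one summary line, or None."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "bps": float(m.group("rate")) * SCALE[m.group("unit")],
        "retr": int(m.group("retr")) if m.group("retr") else None,
        "role": m.group("role"),
        "seconds": float(m.group("end")),
    }


# Receiver-side summary lines copied from the three runs above.
runs = {
    "pfSense 2.4.5p1": "[ 5] 0.00-60.00 sec 31.5 GBytes 4.50 Gbits/sec receiver",
    "OpenWRT 19.07.3": "[ 5] 0.00-60.00 sec 47.5 GBytes 6.81 Gbits/sec receiver",
    "OPNsense 20.7.2": "[ 5] 0.00-60.00 sec 6.82 GBytes 977 Mbits/sec receiver",
}

parsed = {name: parse_summary(line) for name, line in runs.items()}
slowest = min(p["bps"] for p in parsed.values())

# Print each system's bitrate and how it compares with the slowest run.
for name, p in sorted(parsed.items(), key=lambda kv: -kv[1]["bps"]):
    print(f"{name}: {p['bps'] / 1e9:.2f} Gbits/sec ({p['bps'] / slowest:.1f}x slowest)")
```

Sender lines work the same way and additionally populate the retransmit count.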
PS E:\Util> .\iperf3.exe -c 192.168.10.8 -p 26574
Connecting to host 192.168.10.8, port 26574
[ 4] local 192.168.12.4 port 50173 connected to 192.168.10.8 port 26574
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-1.00 sec 49.1 MBytes 412 Mbits/sec
[ 4] 1.00-2.00 sec 52.5 MBytes 440 Mbits/sec
[ 4] 2.00-3.00 sec 51.8 MBytes 434 Mbits/sec
[ 4] 3.00-4.00 sec 52.4 MBytes 439 Mbits/sec
[ 4] 4.00-5.00 sec 52.1 MBytes 438 Mbits/sec
[ 4] 5.00-6.00 sec 52.6 MBytes 441 Mbits/sec
[ 4] 6.00-7.00 sec 52.4 MBytes 440 Mbits/sec
[ 4] 7.00-8.00 sec 46.4 MBytes 389 Mbits/sec
[ 4] 8.00-9.00 sec 49.0 MBytes 411 Mbits/sec
[ 4] 9.00-10.00 sec 51.6 MBytes 433 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bandwidth
[ 4] 0.00-10.00 sec 510 MBytes 428 Mbits/sec sender
[ 4] 0.00-10.00 sec 510 MBytes 428 Mbits/sec receiver
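As a quick check on how steady this run was, here is a small Python sketch over the ten per-second samples above. It only re-derives the summary line from the interval lines and reports the spread. The variable names are made up for this example.

```python
from statistics import mean, pstdev

# Per-second samples copied from the run above: (MBytes, Mbits/sec).
intervals = [
    (49.1, 412), (52.5, 440), (51.8, 434), (52.4, 439), (52.1, 438),
    (52.6, 441), (52.4, 440), (46.4, 389), (49.0, 411), (51.6, 433),
]

total_mbytes = sum(mb for mb, _ in intervals)
rates = [rate for _, rate in intervals]

avg_rate = mean(rates)       # should line up with the summary bandwidth
spread = pstdev(rates)       # population standard deviation of the samples
low, high = min(rates), max(rates)

print(f"total transfer: {total_mbytes:.1f} MBytes")
print(f"mean rate: {avg_rate:.1f} Mbits/sec (min {low}, max {high})")
print(f"std dev: {spread:.1f} Mbits/sec")
```

A tight spread like this suggests a steady ceiling rather than bursty loss, though a 10-second sample is too short to say much more.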