OPNsense Forum

English Forums => Hardware and Performance => Topic started by: cleverfoo on January 18, 2020, 08:55:59 pm

Title: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 18, 2020, 08:55:59 pm
Howdy folks, I'm running some tests on OPNsense 19.7.9_1-amd64 vs pfSense 2.4.4-RELEASE-p3 (amd64). Both are running as virtual machines on the same host, with no tuning but all patches applied. All I'm doing is installing iperf3 on each firewall and running it in server mode for the tests. Here are the results:

OPNsense:

% iperf3 -c 172.16.160.204
Connecting to host 172.16.160.204, port 5201
[  5] local 172.16.160.144 port 50482 connected to 172.16.160.204 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  6.99 MBytes  58.6 Mbits/sec                 
[  5]   1.00-2.00   sec  32.9 MBytes   276 Mbits/sec                 
[  5]   2.00-3.00   sec  33.0 MBytes   277 Mbits/sec                 
[  5]   3.00-4.00   sec  32.4 MBytes   272 Mbits/sec                 
[  5]   4.00-5.00   sec  31.9 MBytes   268 Mbits/sec                 
[  5]   5.00-6.00   sec  31.0 MBytes   260 Mbits/sec                 
[  5]   6.00-7.00   sec  31.1 MBytes   261 Mbits/sec                 
[  5]   7.00-8.00   sec  30.8 MBytes   259 Mbits/sec                 
[  5]   8.00-9.00   sec  31.2 MBytes   261 Mbits/sec                 
[  5]   9.00-10.00  sec  31.0 MBytes   260 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   292 MBytes   245 Mbits/sec                  sender
[  5]   0.00-10.00  sec   292 MBytes   245 Mbits/sec                  receiver

pfSense:
% iperf3 -c 172.16.160.190
Connecting to host 172.16.160.190, port 5201
[  5] local 172.16.160.144 port 49663 connected to 172.16.160.190 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  96.7 MBytes   811 Mbits/sec                 
[  5]   1.00-2.00   sec   112 MBytes   935 Mbits/sec                 
[  5]   2.00-3.00   sec   111 MBytes   935 Mbits/sec                 
[  5]   3.00-4.00   sec   112 MBytes   935 Mbits/sec                 
[  5]   4.00-5.00   sec   112 MBytes   936 Mbits/sec                 
[  5]   5.00-6.00   sec   112 MBytes   939 Mbits/sec                 
[  5]   6.00-7.00   sec   112 MBytes   938 Mbits/sec                 
[  5]   7.00-8.00   sec   112 MBytes   939 Mbits/sec                 
[  5]   8.00-9.00   sec   109 MBytes   914 Mbits/sec                 
[  5]   9.00-10.00  sec   112 MBytes   944 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec  1.07 GBytes   923 Mbits/sec                  sender
[  5]   0.00-10.01  sec  1.07 GBytes   922 Mbits/sec                  receiver

The virtual machines are running under Proxmox VE (Linux/KVM) with the same hardware settings (see screenshots). Needless to say, I can get full gigabit performance through pfSense but only about a quarter of that through OPNsense - is this expected?
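
For reference, the test itself is nothing fancy - roughly the following, where the install step is just one way of getting iperf3 onto the firewalls and the IPs are my two LAN addresses:

Code:
# on each firewall VM: install iperf3 from packages and run it as a server
pkg install iperf3
iperf3 -s

# from my workstation on the same LAN
iperf3 -c 172.16.160.204    # OPNsense
iperf3 -c 172.16.160.190    # pfSense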

Big thanks for a great product and great community
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 18, 2020, 08:56:45 pm
VM settings
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 18, 2020, 08:57:22 pm
and the second one
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: Antaris on January 18, 2020, 11:27:07 pm
I'm not sure that layering the Proxmox firewall over a firewall distro is OK at all...
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: allebone on January 18, 2020, 11:33:52 pm
I'm using Unraid and not having an issue. As a test, can you give the OPNsense VM 2 CPUs rather than one, to check if it's a CPU bottleneck? Otherwise it might not be liking the NIC drivers. Some people have tried e1000 to fix similar issues on pfSense, so I'm not sure if it's a similar thing here.
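
If you want to do it from the Proxmox shell rather than the GUI, something like this should work (VM ID 100 and bridge vmbr0 are placeholders for whatever your OPNsense VM actually uses):

Code:
# give the OPNsense VM a second core
qm set 100 --sockets 1 --cores 2
# optionally switch the NIC model from VirtIO to Intel E1000
qm set 100 --net0 e1000,bridge=vmbr0
# stop/start the VM so the new virtual hardware is picked up
qm stop 100 && qm start 100

Keep in mind that changing the NIC model also changes the device name inside the guest (vtnet0 vs em0), so OPNsense will ask you to reassign its interfaces afterwards.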
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: newsense on January 19, 2020, 03:38:58 am
It certainly looks like the tests use the firewall itself as an endpoint, which isn't really a relevant measurement.

Did you try iperf between two endpoints on each side of the firewalls ?


Other than that, I'm also considering a config/driver issue that's not up to par on OPNsense yet, but there's not much info here to troubleshoot with.
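
If it does turn out to be a driver issue, one quick thing to rule out is the hardware offloads on the VirtIO NIC - just a sketch, assuming the LAN interface shows up as vtnet0 inside the guest:

Code:
# check which offload capabilities are currently enabled
ifconfig vtnet0
# temporarily disable checksum/TSO/LRO offloading, then re-run the iperf3 test
ifconfig vtnet0 -txcsum -rxcsum -tso -lro

The same toggles can be set persistently in the GUI under Interfaces: Settings.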
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: mimugmail on January 19, 2020, 06:54:58 am
Try installing speedtest-cli instead, as running iperf locally on the firewall is painfully slow.
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: opnfwb on January 19, 2020, 08:48:02 pm
Here are my numbers. Both of these are fresh out of the box installs, OPNsense 19.7.9 and pfSense 2.4.4p3, both are X86_64.

Hypervisor Specs:
VMware ESXi 6.7u3
2x Intel Xeon E5620
All VMs are running open-vm-tools, including the firewalls

Specs on both firewall VMs are as follows:
2x CPU
4GB RAM
2x VMXnet3 NICs (one WAN, one LAN)

I have two other VMs running as the iperf3 server and client. The "server" VM is on the WAN side of these firewalls, and the client VM is on the "LAN" side. This tests the traffic throughput of the router itself. Never run these tests with the router/firewall acting as the client or server; you will not get accurate results.
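
In other words, the traffic flow is client -> LAN -> firewall -> WAN -> server, and the commands are simply the following (192.168.1.231 is my WAN-side server, as seen in the results below):

Code:
# on the "server" VM on the WAN side of the firewall under test
iperf3 -s

# on the client VM on the LAN side, so every stream has to be routed through the firewall
iperf3 -c 192.168.1.231 -t 15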

pfSense 2.4.4p3:
Code:
Accepted connection from 192.168.1.230, port 56492
[  5] local 192.168.1.231 port 5201 connected to 192.168.1.230 port 45828
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   314 MBytes  2.64 Gbits/sec
[  5]   1.00-2.00   sec   459 MBytes  3.85 Gbits/sec
[  5]   2.00-3.00   sec   407 MBytes  3.41 Gbits/sec
[  5]   3.00-4.00   sec   393 MBytes  3.30 Gbits/sec
[  5]   4.00-5.00   sec   351 MBytes  2.94 Gbits/sec
[  5]   5.00-6.00   sec   372 MBytes  3.12 Gbits/sec
[  5]   6.00-7.00   sec   424 MBytes  3.55 Gbits/sec
[  5]   7.00-8.00   sec   410 MBytes  3.44 Gbits/sec
[  5]   8.00-9.00   sec   443 MBytes  3.71 Gbits/sec
[  5]   9.00-10.00  sec   393 MBytes  3.30 Gbits/sec
[  5]  10.00-11.00  sec   448 MBytes  3.76 Gbits/sec
[  5]  11.00-12.00  sec   428 MBytes  3.59 Gbits/sec
[  5]  12.00-13.00  sec   404 MBytes  3.39 Gbits/sec
[  5]  13.00-14.00  sec   419 MBytes  3.51 Gbits/sec
[  5]  14.00-15.00  sec   445 MBytes  3.73 Gbits/sec
[  5]  15.00-15.04  sec  16.1 MBytes  3.26 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-15.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-15.04  sec  5.98 GBytes  3.42 Gbits/sec                  receiver

OPNsense 19.7.9 (no tuning, Unbound using lots of CPU)
Code:
Accepted connection from 192.168.1.232, port 15150
[  5] local 192.168.1.231 port 5201 connected to 192.168.1.232 port 46858
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   304 MBytes  2.55 Gbits/sec
[  5]   1.00-2.00   sec  88.9 MBytes   746 Mbits/sec
[  5]   2.00-3.00   sec   371 MBytes  3.11 Gbits/sec
[  5]   3.00-4.00   sec   164 MBytes  1.38 Gbits/sec
[  5]   4.00-5.00   sec   420 MBytes  3.52 Gbits/sec
[  5]   5.00-6.00   sec  79.4 MBytes   666 Mbits/sec
[  5]   6.00-7.00   sec   400 MBytes  3.36 Gbits/sec
[  5]   7.00-8.00   sec  97.7 MBytes   820 Mbits/sec
[  5]   8.00-9.00   sec   403 MBytes  3.38 Gbits/sec
[  5]   9.00-10.00  sec   399 MBytes  3.35 Gbits/sec
[  5]  10.00-11.00  sec   104 MBytes   872 Mbits/sec
[  5]  11.00-12.00  sec   374 MBytes  3.14 Gbits/sec
[  5]  12.00-13.00  sec  74.0 MBytes   621 Mbits/sec
[  5]  13.00-14.00  sec   289 MBytes  2.42 Gbits/sec
[  5]  14.00-15.00  sec   135 MBytes  1.13 Gbits/sec
[  5]  15.00-15.04  sec  3.24 MBytes   675 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-15.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-15.04  sec  3.62 GBytes  2.07 Gbits/sec                  receiver

OPNsense 19.7.9 (Unbound in forwarding mode to Quad9 over DoT)
Code:
Accepted connection from 192.168.1.232, port 58840
[  5] local 192.168.1.231 port 5201 connected to 192.168.1.232 port 16760
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   214 MBytes  1.80 Gbits/sec
[  5]   1.00-2.00   sec   268 MBytes  2.25 Gbits/sec
[  5]   2.00-3.00   sec   312 MBytes  2.61 Gbits/sec
[  5]   3.00-4.00   sec   315 MBytes  2.64 Gbits/sec
[  5]   4.00-5.00   sec   273 MBytes  2.29 Gbits/sec
[  5]   5.00-6.00   sec   259 MBytes  2.17 Gbits/sec
[  5]   6.00-7.00   sec   201 MBytes  1.69 Gbits/sec
[  5]   7.00-8.00   sec   279 MBytes  2.34 Gbits/sec
[  5]   8.00-9.00   sec   311 MBytes  2.61 Gbits/sec
[  5]   9.00-10.00  sec   120 MBytes  1.01 Gbits/sec
[  5]  10.00-11.00  sec   237 MBytes  1.99 Gbits/sec
[  5]  11.00-12.00  sec   298 MBytes  2.50 Gbits/sec
[  5]  12.00-13.00  sec   322 MBytes  2.70 Gbits/sec
[  5]  13.00-14.00  sec   291 MBytes  2.44 Gbits/sec
[  5]  14.00-15.00  sec   303 MBytes  2.54 Gbits/sec
[  5]  15.00-15.03  sec  8.95 MBytes  2.26 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-15.03  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-15.03  sec  3.92 GBytes  2.24 Gbits/sec                  receiver

As we can see, OPNsense does seem to have some throughput limits out of the box. Still, I am seeing much higher throughput values than you are, so it's important to make sure your tests use servers/clients on the WAN and LAN sides of the firewall.

Finally, here's a screenshot of what 'top -aSCHIP' looks like on the OPNsense 19.7.9 VM; you can see the unexplained high CPU usage from unbound. You may want to check whether your OPNsense VM exhibits the same high CPU behavior, as that can also take away from the overall throughput.

Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: opnfwb on January 19, 2020, 10:37:43 pm
A quick reply regarding the OPNsense CPU utilization: in my case this seemed to be related to DHCPv6 being enabled out of the box. I'm not sure if OPNsense was trying to delegate a prefix to the LAN side over and over, causing the high CPU usage in unbound? My logs were filled with this:

Code:
kernel: pflog0: promiscuous mode disabled
kernel: pflog0: promiscuous mode enabled

These events were spamming the logs every second. As soon as I disabled DHCPv6 on WAN, the messages went away and idle CPU usage on OPNsense returned to normal.

Here are the results of a current iperf3 test, using the same VMs described in my post above. These throughput numbers are much more consistent now that OPNsense has normal CPU usage.
Code:
Accepted connection from 192.168.1.232, port 4084
[  5] local 192.168.1.231 port 5201 connected to 192.168.1.232 port 24664
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-1.00   sec   185 MBytes  1.55 Gbits/sec
[  5]   1.00-2.00   sec   323 MBytes  2.71 Gbits/sec
[  5]   2.00-3.00   sec   315 MBytes  2.64 Gbits/sec
[  5]   3.00-4.00   sec   344 MBytes  2.88 Gbits/sec
[  5]   4.00-5.00   sec   316 MBytes  2.65 Gbits/sec
[  5]   5.00-6.00   sec   357 MBytes  2.99 Gbits/sec
[  5]   6.00-7.00   sec   353 MBytes  2.96 Gbits/sec
[  5]   7.00-8.00   sec   349 MBytes  2.93 Gbits/sec
[  5]   8.00-9.00   sec   356 MBytes  2.98 Gbits/sec
[  5]   9.00-10.00  sec   345 MBytes  2.89 Gbits/sec
[  5]  10.00-11.00  sec   305 MBytes  2.56 Gbits/sec
[  5]  11.00-12.00  sec   348 MBytes  2.92 Gbits/sec
[  5]  12.00-13.00  sec   341 MBytes  2.86 Gbits/sec
[  5]  13.00-14.00  sec   343 MBytes  2.87 Gbits/sec
[  5]  14.00-15.00  sec   331 MBytes  2.77 Gbits/sec
[  5]  15.00-15.04  sec  14.8 MBytes  3.04 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bandwidth
[  5]   0.00-15.04  sec  0.00 Bytes  0.00 bits/sec                  sender
[  5]   0.00-15.04  sec  4.81 GBytes  2.75 Gbits/sec                  receiver
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: allebone on January 19, 2020, 11:07:55 pm
Interesting. Good find man :)
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 24, 2020, 04:55:35 pm
Hey folks, thanks for all the replies, and apologies for the slow response - I did not know that I had to ask to be notified as the creator of a thread. I'm going to try to answer all the questions in the thread:

1. Unbound doesn't seem to be the issue; see the attached screenshot from a run. As a matter of fact, the box seems to be snoozing, with reasonably high idle time.

2. Run with 2 CPUs (1 socket, 2 cores). No marked difference:
Code:
% iperf3 -c 172.16.160.204
Connecting to host 172.16.160.204, port 5201
[  5] local 172.16.160.144 port 53463 connected to 172.16.160.204 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  31.0 MBytes   260 Mbits/sec                 
[  5]   1.00-2.00   sec  29.7 MBytes   249 Mbits/sec                 
[  5]   2.00-3.00   sec  27.6 MBytes   231 Mbits/sec                 
[  5]   3.00-4.00   sec  26.1 MBytes   219 Mbits/sec                 
[  5]   4.00-5.00   sec  25.9 MBytes   217 Mbits/sec                 
[  5]   5.00-6.00   sec  25.8 MBytes   216 Mbits/sec                 
[  5]   6.00-7.00   sec  24.8 MBytes   208 Mbits/sec                 
[  5]   7.00-8.00   sec  24.6 MBytes   206 Mbits/sec                 
[  5]   8.00-9.00   sec  25.9 MBytes   218 Mbits/sec                 
[  5]   9.00-10.00  sec  25.2 MBytes   211 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   267 MBytes   224 Mbits/sec                  sender
[  5]   0.00-10.00  sec   266 MBytes   223 Mbits/sec                  receiver

iperf Done.

3. Run with 1 CPU core and the e1000 (Intel PRO/1000) NIC model
Code:
% iperf3 -c 172.16.160.204
Connecting to host 172.16.160.204, port 5201
[  5] local 172.16.160.144 port 53902 connected to 172.16.160.204 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  5.47 MBytes  45.9 Mbits/sec                 
[  5]   1.00-2.00   sec  27.0 MBytes   227 Mbits/sec                 
[  5]   2.00-3.00   sec  22.5 MBytes   189 Mbits/sec                 
[  5]   3.00-4.00   sec  28.3 MBytes   237 Mbits/sec                 
[  5]   4.00-5.00   sec  28.0 MBytes   235 Mbits/sec                 
[  5]   5.00-6.00   sec  28.2 MBytes   236 Mbits/sec                 
[  5]   6.00-7.00   sec  27.9 MBytes   234 Mbits/sec                 
[  5]   7.00-8.00   sec  27.9 MBytes   234 Mbits/sec                 
[  5]   8.00-9.00   sec  28.5 MBytes   239 Mbits/sec                 
[  5]   9.00-10.00  sec  28.1 MBytes   236 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   252 MBytes   211 Mbits/sec                  sender
[  5]   0.00-10.00  sec   251 MBytes   210 Mbits/sec                  receiver

iperf Done.

4. I didn't install speedtest-cli, since I'm running all the tests locally (on the same switch) and not really trying to test the IP circuit.

5. I disabled IPv6 on the WAN (I saw some of the same "promiscuous" log entries) and here are the results:

Code:
% iperf3 -c 172.16.160.204
Connecting to host 172.16.160.204, port 5201
[  5] local 172.16.160.144 port 53960 connected to 172.16.160.204 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  18.6 MBytes   156 Mbits/sec                 
[  5]   1.00-2.00   sec  28.6 MBytes   240 Mbits/sec                 
[  5]   2.00-3.00   sec  28.7 MBytes   240 Mbits/sec                 
[  5]   3.00-4.00   sec  28.7 MBytes   241 Mbits/sec                 
[  5]   4.00-5.00   sec  29.0 MBytes   243 Mbits/sec                 
[  5]   5.00-6.00   sec  28.8 MBytes   242 Mbits/sec                 
[  5]   6.00-7.00   sec  28.2 MBytes   237 Mbits/sec                 
[  5]   7.00-8.00   sec  29.1 MBytes   244 Mbits/sec                 
[  5]   8.00-9.00   sec  28.5 MBytes   239 Mbits/sec                 
[  5]   9.00-10.00  sec  27.7 MBytes   232 Mbits/sec                 
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-10.00  sec   276 MBytes   231 Mbits/sec                  sender
[  5]   0.00-10.00  sec   275 MBytes   231 Mbits/sec                  receiver

iperf Done.

TL;DR: things are still strangely slow. I really think this is a kernel or tuning issue, but I'm unsure where to even begin to look.
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: lewald on January 26, 2020, 11:44:52 am
Use "VirtIO (paravirtualized)" network in Proxmox.
And use "host" as CPu Type.

There is one thing that is broken with "VirtIO (paravirtualized)". It's IPS/Suricata.
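
From the Proxmox CLI the two changes would look roughly like this (VM ID 100 and bridge vmbr0 are placeholders; both settings are also available in the GUI):

Code:
# show what the VM currently uses
qm config 100 | grep -E 'cpu|net'
# switch to host CPU passthrough and the VirtIO NIC model
qm set 100 --cpu host
qm set 100 --net0 virtio,bridge=vmbr0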

Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 26, 2020, 03:50:08 pm
Thanks, but the original numbers (at the very top of the thread) were taken using VirtIO, without IPS enabled.
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: opnfwb on January 29, 2020, 04:40:42 pm
Just to confirm, does your setup look like the below diagram? OPNsense is not hosting the client or server portion of iperf, correct?
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: cleverfoo on January 29, 2020, 05:27:36 pm
Nope, it's not. I'm not using the WAN interface at all; I'm just testing the speed of the LAN port, hard-wired to a gigabit switch, like this:

[ OPNsense running iperf3 in server mode ] LAN <=== [ Gigabit switch ] <=== [ MacBook laptop with a 1Gb NIC ]

I don't doubt that I'd be able to get more by flowing traffic across both interfaces (WAN/LAN), but it still won't add up to a full 1Gbps, since a single interface can currently only route about a quarter of that. Most importantly, the setup is the same for pfSense, yet its performance is much higher.
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: opnfwb on January 29, 2020, 06:53:02 pm
Yes, I see what you mean with pfSense being able to sustain a higher throughput under the same circumstance.

Unfortunately, I'm not sure if this is useful data because it isn't telling us how fast either solution can actually route packets. In actual use, we'd be using pfSense or OPNsense as a firewall/router setup and we would want to see how quickly they can push traffic through themselves, rather than serving traffic directly through one interface.

I suppose the last things I would check would be to verify that OPNsense and pfSense both have open-vm-tools installed and running, and to run 'top -aSCHIP' in an SSH console on both of them to see what their CPU usage is during your transfer test. Watching them under load may reveal a bottleneck, especially on the OPNsense router, since that one seems to be underperforming in your tests.

Finally, if you feel like digging in a bit more, I would recommend testing with client and server VMs on Proxmox and two virtual switches inside Proxmox: one for the WAN port, and one private switch with no physical uplinks for the LAN port. This lets you measure the actual routing performance of both solutions, and you won't need an extra physical client.
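
The private LAN switch is just a Linux bridge with no physical ports. On the Proxmox host that is a few lines in /etc/network/interfaces, something like this (vmbr1 is an arbitrary name, the exact bridge option names can vary by Proxmox version, and it can also be created from the node's Network panel in the GUI):

Code:
auto vmbr1
iface vmbr1 inet manual
        bridge_ports none
        bridge_stp off
        bridge_fd 0

Attach the firewall's LAN NIC and the iperf3 client VM to vmbr1, keep the WAN NIC and the iperf3 server VM on your normal bridge, and every test stream is then forced through the router.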
Title: Re: OPNsense 4x slower than PFSense on same hardware
Post by: mimugmail on January 29, 2020, 07:13:55 pm
One of my clients pushes 2.6 Gbit/s with ESXi 6.7 and vmxnet3. With Chelsio 40G cards I was able to route 22 Gbit/s.