BUG: Low data throughput (bug in OPNsense or FreeBSD?)

Started by schnipp, November 15, 2023, 06:04:23 PM

Since migrating from OPNsense 19.7 to 20.1, data throughput when routing between internal network interfaces has degraded significantly. The primary cause at first was the migration to the new network stack based on "iflib", which also required changes to the network driver code. As I recall, the Intel network drivers initially had major problems in the ISR (interrupt service routine), which led to performance degradation. This problem appears to have been resolved by now. In fact, the firewall provides full gigabit throughput between LAN interfaces as long as the "Strongswan" and "Samplicate" services are disabled.

When the "Strongswan" service is activated, throughput drops massively. The same thing happens when the "Samplicate" service is activated.


  • Strongswan activated: 115MB/s --> approx. 80MB/s
  • Samplicate activated: 115MB/s --> approx. 65-75MB/s
  • Strongswan and Samplicate activated: 115MB/s --> approx. 50MB/s

Unfortunately, the low data throughput problem has been present since version 20.1 and affects all network traffic, regardless of whether it is intended for the IPsec tunnel or not. I could still live with deactivating "Samplicate", but I have to rely on IPsec (Strongswan). The problem doesn't only occur with my firewall (see link).

Detailed investigations regarding IPsec have shown that "Strongswan" (the IKE daemon) itself is not responsible for the loss of data throughput. The throughput drops as soon as there is at least one entry in the IPsec SPD (i.e., the IPsec part of the kernel). I wasn't able to do a more detailed analysis of "Samplicate".

Example:
=========

  • "Strongswan" activated and "Samplicate" deactivated
  • Data throughput approx. 80MB/s
  • Deletion of all entries in the Security Association Database (SAD)
      $ setkey -F
  • Data throughput approx. 80MB/s
  • Additionally, deletion of all entries in the Security Policy Database (SPD)
      $ setkey -FP
  • Data throughput approx. 115MB/s
  • Create an arbitrary security policy outside the IP address range of my own network (so the policy does not match any real traffic)
      $ echo 'spdadd 192.168.222.0/24 192.168.223.0/24 any -P in ipsec esp/tunnel/1234-1235/unique:30;' | setkey -c
  • Data throughput approx. 80MB/s
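
For anyone who wants to repeat this: the SPD state can be checked after each step with "setkey -DP" (it also shows the scope of each policy). Getting strongswan's own policies back after the flush should just be a matter of restarting the IPsec service; assuming the swanctl backend that current OPNsense versions use, reloading from the command line looks roughly like this (a sketch, not verified in every setup):

$ setkey -DP          # list the currently installed security policies and their scope
$ swanctl --load-all  # reload strongswan connections, which should reinstall its policies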

I don't know if this bug is in OPNsense itself or in the underlying FreeBSD 13.2. Has anyone here in the forum made similar observations or also had problems with data throughput in this regard? It would be great if this performance problem were finally solved after several years.


  • Board: Supermicro A2SDi-4C-HLN4F
  • OPNsense: 20.7.8_1
OPNsense 24.7.11_2-amd64

Surprisingly, disabling strongswan and manually creating a VTI-based IPsec tunnel via


$ ifconfig ipsec0 create reqid 987
$ ifconfig ipsec0 inet tunnel 192.168.222.3 192.168.223.5


which results in the following SPD:


$ setkey -DP

0.0.0.0/0[any] 0.0.0.0/0[any] any
in ipsec
esp/tunnel/192.168.223.5-192.168.222.3/unique:987
spid=452 seq=3 pid=4853 scope=ifnet ifname=ipsec0
refcnt=1
::/0[any] ::/0[any] any
in ipsec
esp/tunnel/192.168.223.5-192.168.222.3/unique:987
spid=454 seq=2 pid=4853 scope=ifnet ifname=ipsec0
refcnt=1
0.0.0.0/0[any] 0.0.0.0/0[any] any
out ipsec
esp/tunnel/192.168.222.3-192.168.223.5/unique:987
spid=453 seq=1 pid=4853 scope=ifnet ifname=ipsec0
refcnt=1
::/0[any] ::/0[any] any
out ipsec
esp/tunnel/192.168.222.3-192.168.223.5/unique:987
spid=455 seq=0 pid=4853 scope=ifnet ifname=ipsec0
refcnt=1


does not degrade performance.
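
For completeness: the test interface (together with the interface-scoped policies it installed) should be removable again with

$ ifconfig ipsec0 destroy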

So the question again: Does anybody have problems with internal network throughput (similar to the description in the initial post) when policy-based IPsec is activated at the same time?
OPNsense 24.7.11_2-amd64

Loading the "enc" module? I remember this was a design choice back in the day.


Cheers,
Franco

I did some more tests. Removing the "enc" interface by unloading its kernel module makes no difference; there is no performance improvement. So I unloaded some more kernel modules to check whether they have an impact:

  • pfsync.ko
  • if_gre.ko
  • if_lagg.ko
  • if_infiniband.ko
  • if_bridge.ko
  • bridgestp.ko
  • if_wg.ko

But none of these has any effect on the performance penalty. Unloading the ipsec kernel module does improve performance, but that is effectively the same as deleting all security policies. And IPsec is not the only performance issue: as I already mentioned, enabling "samplicate" also degrades performance. It looks like the performance penalty with "samplicate" only occurs if the interfaces involved in the data transmission are being monitored. Removing all flowd netgraph nodes from the network interfaces used in the data transmission increases performance again (and it returns to the maximum if the IPsec security policies mentioned above are also deleted during this test).
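
For reference, this is roughly how the flowd/netflow netgraph nodes can be inspected and removed from the command line (the node name "netflow" is an assumption based on my system; it may differ elsewhere):

$ ngctl list                # list all netgraph nodes; the netflow node is of type "netflow"
$ ngctl show netflow:       # show the hooks, i.e. which interfaces feed this node
$ ngctl shutdown netflow:   # shut the node down, detaching it from those interfaces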

Current insights are:

  • At least one global IPsec security policy for specific src/dst networks degrades performance, whereas interface-related security policies as shown in #2 do not
  • Flowd netgraph nodes (i.e., interfaces monitored by samplicate) on interfaces involved in data transmission degrade performance

I don't have a clue where the performance penalty occurs. The next steps I will take are:

  • Playing around with different IPsec security policies and their impact on network performance
  • Trying to perform network profiling using DTrace (a possible starting point is sketched after this list)
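
In case somebody wants to try the same: a common generic starting point for kernel profiling with DTrace is to sample on-CPU kernel stacks while the transfer is running (a plain FreeBSD sketch, nothing OPNsense-specific, and not yet verified on this box):

$ kldload dtraceall   # if the DTrace modules are not loaded yet
$ dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); } tick-30s { exit(0); }'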

Maybe somebody has further hints for solving the puzzle?
OPNsense 24.7.11_2-amd64

At least flowd makes me wonder whether some side effects are being triggered that cause increased disk I/O?

Flowd itself writes to a database; maybe there is something writing more logs? Sometimes old packages work fine but, after updates, generate warnings or other output that normally goes unnoticed - i.e., besides possible performance impacts...
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Disk I/O measured with "systat" is almost zero, so I think the disk is not the bottleneck.


But "systat -vmstat" showed something interesting:

  • If "strongswan" and "samplicate" are disabled, the CPU time consumed by interrupt handlers is <1% during a data transfer at approx. 115 MB/s
  • If "samplicate" and/or "strongswan" are enabled, the CPU time consumed by interrupt handlers rises to between 4% and 23%.

In my eyes, the second case looks weird. Generally, the time spent in interrupt handlers should be kept as short as possible, with the actual data processing happening outside the ISR.
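
To narrow down which interrupt sources and kernel threads are actually burning that time, the standard FreeBSD tools should already help (nothing OPNsense-specific here):

$ vmstat -i   # per-source interrupt counts and rates
$ top -SH     # kernel threads (including interrupt threads) sorted by CPU usage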

What is your opinion?
OPNsense 24.7.11_2-amd64

Missing hardware support for crypto?
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Yes, and that may also explain why the processes themselves do not get billed for the CPU time: maybe the cryptography is handled within the kernel (with Linux, this is the case).

Also, different ciphers can mean that some are implemented more efficiently than others. I learned that the hard way when a bugfix for a vulnerability exploit led to AVX instructions no longer being used for ZFS, slowing ZFS AES encryption down by 50%.
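
If missing hardware crypto support were the culprit, a quick first check would be whether the AES-NI driver is loaded and was detected at boot (generic FreeBSD checks, not specific to this board):

$ kldstat | grep -i aesni             # is the aesni(4) crypto driver loaded?
$ grep -i aesni /var/run/dmesg.boot   # did the kernel detect the CPU's AES instructions?
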
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

November 22, 2023, 07:01:52 PM #8 Last Edit: November 22, 2023, 07:05:06 PM by schnipp
No, regarding "strongswan" it has nothing to do with crypto. As already mentioned in #1, performance degrades as soon as at least one global security policy is in the SPD, even though the traffic that suffers from the degraded performance does not match that policy. So there is no IPsec traffic in this case.

Why the performance issue does not occur when I manually create a VTI interface needs more investigation. Maybe, in the medium term, a workaround could be to migrate all IPsec connections to VTI.
OPNsense 24.7.11_2-amd64

I am just trying to shed a little more light on it and do a few more tests with IPsec policies before I start with DTrace.

But I am a little surprised that apparently no one else has observed this behavior. Am I really the only one?
OPNsense 24.7.11_2-amd64

This seems to be an endless problem. I still encounter the same issues: LAN throughput is poor when policy-based IPsec VPN is enabled.

I ran a few tests again. In order to exclude as many external influences as possible, I deliberately kept the test setup simple: just two servers (each in its own subnet) with OPNsense in between. The devices are connected via gigabit full-duplex links. Performance tests are done with iPerf (using TCP connections).
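
For reproducibility, the measurement itself is nothing special; with iperf3 (the classic iperf2 syntax is nearly identical) it boils down to something like this, where <server-A> stands for the receiver's address:

$ iperf3 -s                     # on server A (receiver)
$ iperf3 -c <server-A> -t 30    # on server B (sender), 30 second TCP test routed through OPNsense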

iPerf gives the following bandwidth between the two servers:

strongswan activated:     ~600 MBit/s
strongswan deactivated: ~930 MBit/s

As you can see, there is a significant drop in performance when strongswan is activated. This surprises me, since the traffic between the two servers is not subject to any IPsec policy at all.

Unfortunately, I have no idea how to further analyze the problem - if this is even possible. I'm afraid that one has to really dig deep into the kernel to understand what's happening behind the scenes.

I fear the observed performance degradation could be linked to the following:

Quote
In order to pass traffic over an IPsec tunnel, we need a policy matching the traffic. By default when adding a phase 2 (or child) policy a "kernel route" is installed as well, which traps traffic before normal routing takes place.

I would actually have expected normal routing to take priority over the IPsec policies.

Maybe the developers can elaborate on the "kernel route" mechanism. In particular, it would be important to know whether the "kernel route" traps ALL traffic or whether the trap only applies to the traffic selectors defined in phase 2.

Asking developer questions here is fair, but I don't think anyone but FreeBSD developers can answer them:

https://github.com/opnsense/src/commit/542970fa2d3


Cheers,
Franco

Quote from: schnipp on November 27, 2023, 06:06:55 PM
I am just trying to shed a little more light on it and do a few more tests with IPsec policies before I start with DTrace.

But I am a little surprised that apparently no one else has observed this behavior. Am I really the only one?

It looks like DTrace is not fully integrated. My only idea is to address this issue via the FreeBSD bug tracker. To do this, I need as much technical information and experience as possible.

So again: who has experiences similar to those described in #1? Above all, I am interested in the experiences of users who have the same mainboard in use and can take throughput measurements in the LAN between two interfaces with IPsec activated/deactivated.
OPNsense 24.7.11_2-amd64