OPNsense Forum

English Forums => 25.1, 25.4 Production Series => Topic started by: dirtyfreebooter on February 10, 2025, 06:50:36 AM

Title: Wireguard Speed Issue ** SOLVED **
Post by: dirtyfreebooter on February 10, 2025, 06:50:36 AM
** SOLVED ** it was the firewall setting, Disable Reply-To. By default it's unchecked. After checking it, all my test setups immediately went to full speed in both directions.

(https://i.imgur.com/LkEHFk0.png)



i am seeing a weird issue with wireguard speeds going out of the firewall, but not in.

client (Debian 12 i7-13700 with Mellanox ConnectX-5 25g NIC) -> opnsense -> server (Windows 11 AMD 7950X3D with Intel E810 25g NIC)
No internet, this is a local test

the test run on the client is:

iperf3 --client <server ip> --no-delay --parallel 8
iperf3 --client <server ip> --no-delay --parallel 8 --reverse

every time i do the --reverse test, i can't seem to get any faster than 545 Mbits/sec.

i have now tested this with 6 different nics and 3 versions of opnsense on 3 different setups.

Setup 1:
Supermicro X11SCL-iF with Intel Xeon E 2278g (8c/16t 5Ghz)

NICs tested:

Setup 2:
Lenovo P3 Tiny with Intel i3 14100t (4c/8t 4.4Ghz)

NICs tested:

Setup 3:
Odroid H4 Ultra with Intel N-305 (8c/8t 3.8Ghz)

NICs tested:

All of these systems have enough CPU on OPNsense to do more than 500 Mbit/sec with Wireguard.

I have tried each system with 3 versions of OPNsense: 24.7.1, 24.10.2, and 25.1.

this is a vanilla install. install cpu-microcode-intel. configure wireguard via the road warrior instructions (https://docs.opnsense.org/manual/how-tos/wireguard-client.html). nothing else is added or configured on the systems.

Results


Looking at top when this is happening, the CPUs on all 3 systems, all 3 versions of OPNsense are basically idle. using maybe 5%-10% cpu max.

so all systems, no matter what i change on the iperf3 side, etc, are all stuck at 545 Mbit/sec in that one direction. I have tried various ethernet cables, OM3 fiber, DAC cables. it's all the same, 545 Mbit/sec.

MTUs on the systems are 1500 for NIC and 1420 for wireguard interfaces. upload direction is great performance. download 100% terrible all the time, 100% reproducible. from 24.7.1 to 25.1.

if i don't go through the wireguard interface and instead set up a NAT port forward, i get full speeds in both directions.

this must be some error or bad configuration on my part?
Title: Re: Wireguard Speed Issue
Post by: bartjsmit on February 10, 2025, 07:24:25 AM
Sounds like the wireguard encryption is slower than the decryption. There may also be an issue with the wireguard process being restricted to one thread. The protocol supports it but OPNsense may not: https://www.wireguard.com/performance/
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 10, 2025, 07:29:48 AM
the CPUs are all nearly idle when this is happening, the 545 Mbit/sec. i can see the wireguard kernel threads in top, 1 per cpu instances and they are all doing 0.5% - 1%. in the other direction the cpu usage scales with the NIC i am using, 1g/2.5g/10g/25g.

i guess it's possible there is some sort of bug, but i tried 24.7.1 - 25.1 and got the same results, so no one noticed that for more than a year?

which is why it all seems like it has to be some sort of configuration error on my part...
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 11, 2025, 12:20:04 AM
because i hate myself, lol, i went and installed pfSense CE 2.7.2 on the supermicro x11scl-if with Xeon 2278g. using the i226V 4 port NIC, installed pfSense. installed the wireguard package. configured wireguard, added a single peer and connected and immediately was able to do 2.5g up and down.

iperf3 --client 192.168.1.103 --omit 1 --time 5 --parallel 4 -f g
Connecting to host 192.168.1.103, port 5201
[  5] local 192.168.40.2 port 35428 connected to 192.168.1.103 port 5201
[  7] local 192.168.40.2 port 35430 connected to 192.168.1.103 port 5201
[  9] local 192.168.40.2 port 35446 connected to 192.168.1.103 port 5201
[ 11] local 192.168.40.2 port 35454 connected to 192.168.1.103 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  73.7 MBytes  0.62 Gbits/sec    0    521 KBytes       (omitted)
[  7]   0.00-1.00   sec  74.0 MBytes  0.62 Gbits/sec    0    524 KBytes       (omitted)
[  9]   0.00-1.00   sec  65.4 MBytes  0.55 Gbits/sec    0    450 KBytes       (omitted)
[ 11]   0.00-1.00   sec  64.2 MBytes  0.54 Gbits/sec    0    468 KBytes       (omitted)
[SUM]   0.00-1.00   sec   277 MBytes  2.33 Gbits/sec    0             (omitted)
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   0.00-1.00   sec  68.4 MBytes  0.57 Gbits/sec    0    546 KBytes
[  7]   0.00-1.00   sec  68.2 MBytes  0.57 Gbits/sec    0    549 KBytes
[  9]   0.00-1.00   sec  65.7 MBytes  0.55 Gbits/sec    0    450 KBytes
[ 11]   0.00-1.00   sec  66.0 MBytes  0.55 Gbits/sec    0    468 KBytes
[SUM]   0.00-1.00   sec   268 MBytes  2.25 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec  68.5 MBytes  0.57 Gbits/sec    0    546 KBytes
[  7]   1.00-2.00   sec  68.6 MBytes  0.58 Gbits/sec    0    549 KBytes
[  9]   1.00-2.00   sec  65.7 MBytes  0.55 Gbits/sec    0    450 KBytes
[ 11]   1.00-2.00   sec  65.9 MBytes  0.55 Gbits/sec    0    468 KBytes
[SUM]   1.00-2.00   sec   269 MBytes  2.25 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec  67.3 MBytes  0.56 Gbits/sec    0    546 KBytes
[  7]   2.00-3.00   sec  69.1 MBytes  0.58 Gbits/sec    0    573 KBytes
[  9]   2.00-3.00   sec  64.8 MBytes  0.54 Gbits/sec    0    450 KBytes
[ 11]   2.00-3.00   sec  65.9 MBytes  0.55 Gbits/sec    0    468 KBytes
[SUM]   2.00-3.00   sec   267 MBytes  2.24 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec  68.4 MBytes  0.57 Gbits/sec    0    546 KBytes
[  7]   3.00-4.00   sec  70.2 MBytes  0.59 Gbits/sec    0    573 KBytes
[  9]   3.00-4.00   sec  65.8 MBytes  0.55 Gbits/sec    0    470 KBytes
[ 11]   3.00-4.00   sec  65.9 MBytes  0.55 Gbits/sec    0    468 KBytes
[SUM]   3.00-4.00   sec   270 MBytes  2.27 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec  68.6 MBytes  0.58 Gbits/sec    0    546 KBytes
[  7]   4.00-5.00   sec  69.1 MBytes  0.58 Gbits/sec    0    573 KBytes
[  9]   4.00-5.00   sec  66.3 MBytes  0.56 Gbits/sec    0    470 KBytes
[ 11]   4.00-5.00   sec  65.0 MBytes  0.55 Gbits/sec    0    468 KBytes
[SUM]   4.00-5.00   sec   269 MBytes  2.26 Gbits/sec    0
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec   341 MBytes  0.57 Gbits/sec    0             sender
[  5]   0.00-5.00   sec   342 MBytes  0.57 Gbits/sec                  receiver
[  7]   0.00-5.00   sec   345 MBytes  0.58 Gbits/sec    0             sender
[  7]   0.00-5.00   sec   345 MBytes  0.58 Gbits/sec                  receiver
[  9]   0.00-5.00   sec   328 MBytes  0.55 Gbits/sec    0             sender
[  9]   0.00-5.00   sec   328 MBytes  0.55 Gbits/sec                  receiver
[ 11]   0.00-5.00   sec   329 MBytes  0.55 Gbits/sec    0             sender
[ 11]   0.00-5.00   sec   329 MBytes  0.55 Gbits/sec                  receiver
[SUM]   0.00-5.00   sec  1.31 GBytes  2.25 Gbits/sec    0             sender
[SUM]   0.00-5.00   sec  1.31 GBytes  2.25 Gbits/sec                  receiver

and reverse

iperf3 --client 192.168.1.103 --omit 1 --time 5 --parallel 4 -f g -R
Connecting to host 192.168.1.103, port 5201
Reverse mode, remote host 192.168.1.103 is sending
[  5] local 192.168.40.2 port 35472 connected to 192.168.1.103 port 5201
[  7] local 192.168.40.2 port 35480 connected to 192.168.1.103 port 5201
[  9] local 192.168.40.2 port 42704 connected to 192.168.1.103 port 5201
[ 11] local 192.168.40.2 port 42706 connected to 192.168.1.103 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec  24.6 MBytes  0.21 Gbits/sec                  (omitted)
[  7]   0.00-1.00   sec  66.9 MBytes  0.56 Gbits/sec                  (omitted)
[  9]   0.00-1.00   sec   130 MBytes  1.09 Gbits/sec                  (omitted)
[ 11]   0.00-1.00   sec  45.6 MBytes  0.38 Gbits/sec                  (omitted)
[SUM]   0.00-1.00   sec   267 MBytes  2.24 Gbits/sec                  (omitted)
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   0.00-1.00   sec  29.2 MBytes  0.24 Gbits/sec
[  7]   0.00-1.00   sec  63.2 MBytes  0.53 Gbits/sec
[  9]   0.00-1.00   sec   131 MBytes  1.10 Gbits/sec
[ 11]   0.00-1.00   sec  45.2 MBytes  0.38 Gbits/sec
[SUM]   0.00-1.00   sec   269 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   1.00-2.00   sec  31.9 MBytes  0.27 Gbits/sec
[  7]   1.00-2.00   sec  61.6 MBytes  0.52 Gbits/sec
[  9]   1.00-2.00   sec   129 MBytes  1.08 Gbits/sec
[ 11]   1.00-2.00   sec  46.5 MBytes  0.39 Gbits/sec
[SUM]   1.00-2.00   sec   269 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   2.00-3.00   sec  35.9 MBytes  0.30 Gbits/sec
[  7]   2.00-3.00   sec  57.8 MBytes  0.49 Gbits/sec
[  9]   2.00-3.00   sec   130 MBytes  1.09 Gbits/sec
[ 11]   2.00-3.00   sec  44.6 MBytes  0.37 Gbits/sec
[SUM]   2.00-3.00   sec   269 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   3.00-4.00   sec  41.1 MBytes  0.34 Gbits/sec
[  7]   3.00-4.00   sec  51.4 MBytes  0.43 Gbits/sec
[  9]   3.00-4.00   sec   136 MBytes  1.14 Gbits/sec
[ 11]   3.00-4.00   sec  39.7 MBytes  0.33 Gbits/sec
[SUM]   3.00-4.00   sec   268 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[  5]   4.00-5.00   sec  44.2 MBytes  0.37 Gbits/sec
[  7]   4.00-5.00   sec  54.4 MBytes  0.46 Gbits/sec
[  9]   4.00-5.00   sec   134 MBytes  1.12 Gbits/sec
[ 11]   4.00-5.00   sec  36.4 MBytes  0.31 Gbits/sec
[SUM]   4.00-5.00   sec   269 MBytes  2.25 Gbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-5.00   sec   183 MBytes  0.31 Gbits/sec    0             sender
[  5]   0.00-5.00   sec   182 MBytes  0.31 Gbits/sec                  receiver
[  7]   0.00-5.00   sec   289 MBytes  0.48 Gbits/sec    1             sender
[  7]   0.00-5.00   sec   288 MBytes  0.48 Gbits/sec                  receiver
[  9]   0.00-5.00   sec   660 MBytes  1.11 Gbits/sec    0             sender
[  9]   0.00-5.00   sec   660 MBytes  1.11 Gbits/sec                  receiver
[ 11]   0.00-5.00   sec   213 MBytes  0.36 Gbits/sec    3             sender
[ 11]   0.00-5.00   sec   212 MBytes  0.36 Gbits/sec                  receiver
[SUM]   0.00-5.00   sec  1.31 GBytes  2.26 Gbits/sec    4             sender
[SUM]   0.00-5.00   sec  1.31 GBytes  2.25 Gbits/sec                  receiver


CPU usage is ~ 15% per core, except the main WG core which is 40%
https://imgur.com/a/SDtQK5I
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 11, 2025, 07:18:00 PM
looking at the kernels for pfSense 2.7.2 CE and OPNsense 24.10.2, the wireguard implementation is nearly identical. so it seems like the OPNsense issue is somewhere else in the kernel.

since the behavior is the same for multiple NICs and drivers (igb, igc, ix, ixl, mce), it is probably not a driver issue.

which leaves some sort of pf or networking issue in the OPNsense kernel.

i guess next steps are maybe to try vanilla FreeBSD and see if it also occurs.
Title: Re: Wireguard Speed Issue
Post by: meyergru on February 11, 2025, 09:28:32 PM
I can reach >= 700 Mbits/s between two Linux hosts connected over Wireguard on an N100 box, so there is no problem in principle.

But I need more than one connection with -R and one thread only, so: did you enable RSS?
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 11, 2025, 10:29:22 PM
when i enabled RSS on the Xeon 2278g (8c/8t with hyperthreads disabled)

net.isr.bindthreads = 1
net.isr.maxthreads = -1
net.inet.rss.enabled = 1
net.inet.rss.bits = 3

my throughput went down to ~300 Mbit/sec (but again only in the 1 direction)

these are all fresh installs of 24.7.1, 24.10.2, and 25.1 with no configuration other than cpu-microcode-intel and wireguard.

i just tested
- FreeBSD 14.2 on the supermicro xeon 2278g and running iperf3 server on freebsd and had no problems running in either direction. (so no firewalling)
- OpenWRT x86 and no problem maxing out 2.5g on the N305 system

Title: Re: Wireguard Speed Issue
Post by: meyergru on February 12, 2025, 11:29:47 AM
Disable Spectre Mitigation: "sysctl hw.ibrs_disable=1", should work immediately, without reboot.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 12, 2025, 04:54:54 PM
i have tried with

vm.pmap.pti=0
hw.ibrs_disable=1

and there is no difference. on OPNsense, the CPUs are idle when traffic is flowing..

but even so, all of these setups have far more CPU than 500 Mbit/sec needs. the Xeon E-2278G is 8c/16t 5ghz with 4.3ghz all core turbo. the i3-14100T is 4c/8t with 4.3 Ghz all core turbo. additionally, the other OSes, vanilla FreeBSD 14.2 and pfSense 2.7.2, all have these mitigations on by default.

i have installed FreeBSD, pfSense, OPNsense 24.7.1, 24.10.2, 25.1 on separate SSDs, so at least i can switch back and forth easy to try and test things, the combination of 3 PCs, 3 CPUs, 8 different NICs and only OPNsense has an issue and only in one direction is odd.

Title: Re: Wireguard Speed Issue
Post by: meyergru on February 12, 2025, 05:46:55 PM
Since it is not CPU-limited and also, several NICs show the same behavior, and also, pfSense does not show that, I can only suspect a one-directional problem with the network, like, e.g. flow control or hardware offloading? IDK if the settings differ between OSes.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 12, 2025, 05:51:36 PM
+1 yea, i am going to try and do sysctl -a on all systems and diff those and see if i can find any differences to experiment with as my next steps.

ixl driver defaults to dev.ixl.0.fc=0 and i have all the hardware offloading turned off (OPNsense default), but i have also tried with it all turned on and with just checksumming turned on (pfSense default). no change whatsoever. for the igb and igc drivers i have tried flowcontrol on/off but no change either
Title: Re: Wireguard Speed Issue
Post by: joezeppy on February 14, 2025, 03:49:29 PM
I'm no expert and I'm curious as to why your performance is poor in just one direction.  With OPNsense 25.1.1 on a DEC850 with 10G ports, I'm getting around 2Gbps both Up/Dn single threaded.  I've tried higher -P values, but the results are about the same even though I have applied some RSS tweaks.  Here's a screen shot using my Windows PC with WireGuard activated through the firewall to an iperf instance on my NAS server (all 10G path).

(https://i.postimg.cc/xkB78qkx/WIre-Guard-Speed-Test-Result.png) (https://postimg.cc/xkB78qkx)

My configuration is dual stack (IPv4/IPv6) and I also use a lower WireGuard MTU of 1360 because my phone is on cellular CGNAT, even though my test result above is from my PC, which is not using cellular:

(https://i.postimg.cc/VJL2HxQR/My-Wire-Guard.png) (https://postimg.cc/VJL2HxQR)
(https://i.postimg.cc/7C5dqSLJ/Wire-Guard-Instance.png) (https://postimg.cc/7C5dqSLJ)
(https://i.postimg.cc/MccHrgtQ/Wire-Guard-Outbound.png) (https://postimg.cc/MccHrgtQ)

I also have a firewall normalization entry:

(https://i.postimg.cc/QFcwCcGF/Firewall-Normalization.png) (https://postimg.cc/QFcwCcGF)


root@OPNsense:~ # netstat -Q
Configuration:
Setting                        Current        Limit
Thread count                         8            8
Default queue limit                256        10240
Dispatch policy                 direct          n/a
Threads bound to CPUs          enabled          n/a

Protocols:
Name   Proto QLimit Policy Dispatch Flags
ip         1   4096    cpu   hybrid   C--
igmp       2    256 source  default   ---
rtsock     3    256 source  default   ---
arp        4    256 source  default   ---
ether      5    256    cpu   direct   C--
ip6        6   1000    cpu   hybrid   C--
ip_direct     9    256    cpu   hybrid   C--
ip6_direct    10    256    cpu   hybrid   C--
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 14, 2025, 06:21:31 PM
wow thanks for that info. that will help a lot with me being able to verify my setup.

yea, i will be able to get back to testing this weekend. it does seem like it's something in my environment, but my environment is 3 directly connected PCs and i just swap out the router SSD with another OS like pfSense or OpenWRT and get full speeds.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 15, 2025, 07:25:13 PM
so i have not made any progress.

going back and forth with SSDs, pfSense 2.7.2 CE is always hitting max speeds in both directions. i've tried diff'ing sysctl -a between the two systems and they are not really that different. any changes i made to make OPNsense match pfSense sysctls, made no differences whatsoever.

i shortened all the iperf3 output for display. but i have saved all the data. this is all 100% reproducible. followed the OPNsense official docs to setup Wireguard and OpenVPN.

iperf3 --client <ip> --no-delay --parallel 4 [--reverse]
Supermicro X11SCL-iF with Intel Xeon E-2278G (8c/16t 5Ghz), Intel X710-DA2 SFP+ NIC with v9.53 firmware

pfSense 2.7.2 CE

NAT Port Forward
Connecting to host 192.168.160.10, port 5201
[SUM]   0.00-5.00   sec  5.46 GBytes  9380 Mbits/sec  2010             sender
[SUM]   0.00-5.00   sec  5.46 GBytes  9376 Mbits/sec                  receiver

Reverse mode, remote host 192.168.160.10 is sending
[SUM]   0.00-5.00   sec  5.48 GBytes  9417 Mbits/sec    0             sender
[SUM]   0.00-5.00   sec  5.48 GBytes  9415 Mbits/sec                  receiver

Wireguard
Connecting to host 192.168.1.101, port 5201
[SUM]   0.00-4.00   sec  3.55 GBytes  7616 Mbits/sec  364             sender
[SUM]   0.00-4.00   sec  3.55 GBytes  7615 Mbits/sec                  receiver

Reverse mode, remote host 192.168.1.101 is sending
[SUM]   0.00-4.00   sec  3.12 GBytes  6694 Mbits/sec  200             sender
[SUM]   0.00-4.00   sec  3.12 GBytes  6692 Mbits/sec                  receiver

OPNsense 25.1.1

Wireguard
Connecting to host 192.168.1.101, port 5201
[SUM]   0.00-4.00   sec  3.76 GBytes  8077 Mbits/sec  554             sender
[SUM]   0.00-4.00   sec  3.76 GBytes  8071 Mbits/sec                  receiver

Reverse mode, remote host 192.168.1.101 is sending
[SUM]   0.00-4.00   sec   149 MBytes   312 Mbits/sec  272             sender
[SUM]   0.00-4.00   sec   142 MBytes   299 Mbits/sec                  receiver

Odroid H4 Ultra with Intel n305 (8c), Intel i226V 2.5g NIC

OPNsense 25.1.1

Wireguard
Connecting to host 192.168.1.101, port 5201
[SUM]   0.00-4.00   sec  1.04 GBytes  2229 Mbits/sec  303             sender
[SUM]   0.00-4.00   sec  1.04 GBytes  2228 Mbits/sec                  receiver

Reverse mode, remote host 192.168.1.101 is sending
[SUM]   0.00-4.00   sec   248 MBytes   519 Mbits/sec  1157             sender
[SUM]   0.00-4.00   sec   241 MBytes   506 Mbits/sec                  receiver

at a loss for what to try next, i setup OpenVPN with DCO on the Odroid H4 Ultra.

OpenVPN (DCO)
Connecting to host 192.168.1.101, port 5201
[SUM]   0.00-5.00   sec  1.32 GBytes  2266 Mbits/sec    0             sender
[SUM]   0.00-5.04   sec  1.33 GBytes  2268 Mbits/sec                  receiver

Reverse mode, remote host 192.168.1.101 is sending
[SUM]   0.00-5.00   sec   329 MBytes   551 Mbits/sec    0             sender
[SUM]   0.00-5.00   sec   327 MBytes   548 Mbits/sec                  receiver


So NAT port forwarding tests show line rate, 10g or 2.5g, from both pfSense and OPNsense.

pfSense CE with wireguard shows 7.6 gbit/sec and 6.7 gbit/sec.

OPNsense with wireguard shows 8.0 gbit/sec (2.2 gbit/sec on i226v) and 300-500 mbit/sec (messing with MTU/MSS/normalization rules only reduces throughput)

OPNsense with openvpn (i226v) shows 2.2 gbit/sec and 548 mbit/sec
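
The forward/reverse comparison above can be automated when many hardware combinations are being tested. The following is a minimal sketch, assuming the saved iperf3 text output contains the final `[SUM] ... receiver` line in the format shown in this thread; the function name `receiver_mbps` is made up for illustration and is not part of iperf3.

```python
import re

# Matches the final per-test summary lines, e.g.
# "[SUM]   0.00-4.00   sec   142 MBytes   299 Mbits/sec      receiver"
# Interval lines and "(omitted)" lines do not end in sender/receiver and are skipped.
_SUM = re.compile(
    r"\[SUM\]\s+[\d.]+-[\d.]+\s+sec\s+[\d.]+\s+\w+\s+"
    r"([\d.]+)\s+([KMG]?)bits/sec.*\b(sender|receiver)\s*$"
)
_TO_MBIT = {"": 1e-6, "K": 1e-3, "M": 1.0, "G": 1e3}


def receiver_mbps(iperf_text: str) -> float:
    """Return the receiver-side [SUM] bitrate in Mbit/s from iperf3 text output."""
    result = None
    for line in iperf_text.splitlines():
        m = _SUM.search(line)
        if m and m.group(3) == "receiver":
            result = float(m.group(1)) * _TO_MBIT[m.group(2)]
    if result is None:
        raise ValueError("no '[SUM] ... receiver' line found")
    return result


if __name__ == "__main__":
    # Summary lines copied from the OPNsense 25.1.1 / X710-DA2 run in this post.
    forward = (
        "[SUM]   0.00-4.00   sec  3.76 GBytes  8077 Mbits/sec  554             sender\n"
        "[SUM]   0.00-4.00   sec  3.76 GBytes  8071 Mbits/sec                  receiver\n"
    )
    reverse = (
        "[SUM]   0.00-4.00   sec   149 MBytes   312 Mbits/sec  272             sender\n"
        "[SUM]   0.00-4.00   sec   142 MBytes   299 Mbits/sec                  receiver\n"
    )
    ratio = receiver_mbps(forward) / receiver_mbps(reverse)
    print(f"forward/reverse ratio: {ratio:.1f}")
```

For the X710-DA2 run on OPNsense this gives a forward/reverse ratio of roughly 27 to 1, while the pfSense numbers on the same hardware stay close to 1 to 1.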

using top, when systems are doing 500 mbit/sec the cpus are idle. the power draw from the outlet is even at the idle power watts. there must be some sort of kernel lock in OPNsense??



VPN provider doesn't matter. Wireguard or OpenVPN shows the same issues. reverse direction is locked to 500 mbit/sec. these are vanilla / new installs. same systems. i am just literally swapping the SSDs and rebooting between pfSense and OPNsense and getting 100% reproducible results.

i don't know what to do next. i have tried multiple systems (for clients and servers), but seemingly i am the only person seeing this?

1 reddit post describes similar behavior: https://www.reddit.com/r/opnsense/comments/1gwzkye/opnsense_and_wireguard_why_is_wireguard_limited/ but the user gave up and bought pfSense Plus.
Title: Re: Wireguard Speed Issue
Post by: EricPerl on February 15, 2025, 08:22:00 PM
Using the router that gives you asymmetrical results, I would physically swap the client and server.
You have the choice of initiating the test from either end.

OPNsense actually seems to have better results than pfSense in one direction...

Your results indicate a bad interaction between one machine and the side of the router it is connected to.
I'd be looking for low level statistics on retries.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 15, 2025, 08:30:55 PM
it just appears this is too much information for ppl to read all the text and see that i have tried multiple combinations of hardware, software, nic, cables, etc.... that is a bit frustrating, but understandable, i guess.

i have been swapping multiple PCs for client, server, and router.

AMD 7950X3D
Intel Xeon E-2278G (X11SCL-iF)
Intel Xeon E-2414 (X13SCH-LN4F)
Intel Pentium G7400 (X13SCL-iF)
Odroid H4 Ultra Intel n-305
Lenovo P3 Tiny Intel i3-14100t
Intel i7-13700t (X13SAE-F)

with the exception of the 7950X3D, i have just swapped the SSDs around between client (Ubuntu 22.04), router (OPNsense 24.10.2 and 25.1.1, pfSense 2.7.2 CE), server (FreeBSD 14.2, Ubuntu 22.04)

i have also tried the following NICs in all combinations of PCs and NICs:
Intel E810-XVVA2 (25g)
Mellanox ConnectX-5 (25g)
Mellanox ConnectX-3 (10g)
Intel X710-DA2 fw 9.53
Intel X710-DA2 fw 8.10
Intel X520-DA2
Intel i225V-B3
Intel i226V
Intel i350-T4
Intel i210 (onboard NICs for supermicro motherboards)

and i have used 50 different patch cables, SFP+ transceivers, UniFi DAC cables, etc.

the only common factor i have discovered so far is OPNsense. running through all those combinations took days and days. pfSense and OPNsense each act the same way on every combination. pfSense is line rate or CPU bound. OPNsense is line rate or CPU bound in one direction and then kernel bound to ~500 mbit/sec in the other direction when routing through Wireguard or OpenVPN. OPNsense can NAT port forward at 10g easily.

No matter the NIC speed or driver in OPNsense, routing through OpenVPN or Wireguard results in ~ 500 mbit/sec throughput in one direction
Title: Re: Wireguard Speed Issue
Post by: EricPerl on February 16, 2025, 03:58:20 AM
I saw evidence of swapping the router and HW on the router, less so of swapping client & server or where the test is initiated from.

I question your testing methodology. Going wide might lead to a combination that works, not necessarily for a root cause of the mismatch.
OPN is a common factor but it would be easier to blame if that was the case in both directions.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 16, 2025, 07:55:08 AM
if i disable the firewall (pfctl -d) but still use the wireguard interface, i can see the wireguard kernel threads using CPU and i get symmetrical speeds with intel i350-t4, i226v, x710-da2. i didn't test any more, since the before/after was 100% reproducible.

there is 100% some bug in the OPNsense firewall / pf side of things or some configuration that comes with a vanilla install that is causing this.

i am blaming OPN because the exact setups all work when i try pfSense 2.7.2 CE, OpenWRT x86_64 24.10.0. the same client/servers, hardware, just replace the router software.
Title: Re: Wireguard Speed Issue
Post by: EricPerl on February 16, 2025, 08:26:00 PM
You might be getting somewhere.
So FW + Wireguard + iperf3 reverse?

And the steps are:
Client -> OPN-WAN / OPN-LAN -> Server
Wireguard connection to OPN server initiated from Client
Then from client: iperf3 --client <server ip> --no-delay --parallel 8 --reverse
Rule on the Wireguard interface?

I'm actually a little curious about what the actual traffic looks like so I'll set something up after I get confirmation of the entire test environment.
I have no idea what FW state is going to be created as a result of such experiment so this is a learning opportunity.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 16, 2025, 10:23:46 PM
my basic setup looks like this:

(https://i.imgur.com/kZBRBg7.png)

3 computers are

wireguard is set up using the official pfSense documentation (https://docs.netgate.com/pfsense/en/latest/vpn/wireguard/index.html) and OPNsense road warrior documentation (https://docs.opnsense.org/manual/how-tos/wireguard-client.html).

iperf3 commands on the client:
iperf3 --client 192.168.1.100 --no-delay --omit 1 --time 5 --parallel 4 --format m
iperf3 --client 192.168.1.100 --no-delay --omit 1 --time 5 --parallel 4 --format m --reverse

to verify the setup, i setup a NAT port forward on port 5201 to 192.168.1.100 and run iperf3 against the WAN IP from client
iperf3 --client 192.168.160.10 --no-delay --omit 1 --time 5 --parallel 4 --format m
iperf3 --client 192.168.160.10 --no-delay --omit 1 --time 5 --parallel 4 --format m --reverse

pfSense, OPN 24.10.2, OPN 25.1.1 all showed ~9.45 Gbit/sec in both directions.

wireguard results




if i disable pf via pfctl -d and re-run the iperf3 commands but still going through the wireguard interfaces


For this test the CPUs are nearly 100% in use with the kernel threads and wireguard threads. This is basically the CPU bound max of the 2278g setup.



pfSense and OPN are set up with a wireguard interface and have a single firewall rule: allow from wg0 net to any.

i have tried various MTUs, outbound NAT rules, and firewall normalization rules suggested in the documentation or on these forums. those extra rules made no difference whatsoever to the results. the reverse iperf3 direction is always stuck ~ 500 Mbits and the CPUs are nearly 100% idle during transmission.

no matter which CPU/nic i used in the router, e-2278g, e-2414, G7400, n305, i3-14100t, i5-13400t, the --reverse direction is always ~ 500 Mbit/sec. there is some kernel level delay blocking the CPU from working as hard as it can.



the installs and setups are as out-of-box as possible. pfSense i had to install the wireguard package. OPNsense i install the cpu-microcode-intel package.

i have gone through various tunables, but none make any real impact. 8 Gbps vs 500 Mbps isn't going to be fixed by tweaking, unless it's some sort of bug that can be worked around

i have also tried many other machines as noted earlier for client, router, server. i also put in my AMD 7950X3D windows 11 desktop into the mix as client and server. there is no difference in behavior whatsoever.



i have set up an OpenVPN tunnel via the OPNsense OpenVPN road warrior documentation and i get the same behavior. so it's not wireguard. i did not try an IPsec tunnel.

i also diff'd the output of sysctl -a of pfSense 2.7.2 and OPNsense 25.1.1 and saw no real meaningful differences.
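
Comparing two `sysctl -a` dumps by eye is error-prone, so a small script can help narrow down real differences. The following is a minimal sketch, assuming each dump uses the usual `name: value` line format; `parse_sysctl`, `diff_sysctl` and the `ignore_prefixes` parameter are hypothetical helper names, and the sample values in the usage are invented, not taken from the systems in this thread.

```python
import re

# One sysctl entry per line in the form "name: value".
_LINE = re.compile(r"^([A-Za-z0-9_.%-]+): ?(.*)$")


def parse_sysctl(text: str) -> dict:
    """Parse `sysctl -a` style text into {name: value}.

    Lines that do not look like "name: value" are treated as a continuation
    of the previous value (some entries span several lines). This is a
    heuristic: a continuation line that happens to look like "name: value"
    will be misread as a new entry.
    """
    values = {}
    last = None
    for line in text.splitlines():
        m = _LINE.match(line)
        if m:
            last = m.group(1)
            values[last] = m.group(2)
        elif last is not None:
            values[last] += "\n" + line
    return values


def diff_sysctl(a_text: str, b_text: str, ignore_prefixes=()) -> dict:
    """Return {name: (value_in_a, value_in_b)} for entries that differ.

    A value of None means the name is missing on that side.
    """
    a, b = parse_sysctl(a_text), parse_sysctl(b_text)
    out = {}
    for name in sorted(a.keys() | b.keys()):
        if name.startswith(tuple(ignore_prefixes)):
            continue
        if a.get(name) != b.get(name):
            out[name] = (a.get(name), b.get(name))
    return out


if __name__ == "__main__":
    # Invented sample dumps for illustration only.
    dump_a = "net.inet.ip.forwarding: 1\nhw.ibrs_disable: 0\nkern.hostname: pfSense\n"
    dump_b = "net.inet.ip.forwarding: 1\nhw.ibrs_disable: 1\nnet.inet.rss.enabled: 1\n"
    for name, (left, right) in diff_sysctl(dump_a, dump_b, ignore_prefixes=("kern.",)).items():
        print(f"{name}: {left!r} -> {right!r}")
```

Passing prefixes of constantly changing counters to `ignore_prefixes` keeps the output short enough to read. As this thread shows, a clean diff does not rule out a difference in the generated firewall rules, which sysctl does not cover.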
Title: Re: Wireguard Speed Issue
Post by: EricPerl on February 17, 2025, 07:34:10 PM
Hmm, I didn't get a chance to try this out yesterday and I'm not going to be able to for another couple of days.

This said, something occurred to me overnight.
You mention a static IP on the WAN side and a directly attached machine. So no gateway? Or gateway set to the attached machine?
I ask because I've been bit by a fairly obscure setting (Firewall > Settings > Advanced > Disable reply-to) in the past that affects how WAN traffic is routed.
By default, all outgoing reply traffic on the WAN is directed at the gateway for that network, and it is black-holed if the gateway is another firewall.

Maybe it's irrelevant but I wonder how that "feature" interacts with your test environment.
You might be better off adding a switch, having an actual gateway and disabling reply-to.
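
To make the effect described above concrete, here is a toy model of where a reply packet is sent with and without reply-to. It is a deliberately simplified sketch of the behaviour described in this post, not pf's actual logic; the function name and the client and gateway addresses are made up for illustration (only the 192.168.160.0/24 WAN network appears in this thread).

```python
import ipaddress
from typing import Optional


def reply_next_hop(client_ip: str, wan_network: str,
                   wan_gateway: Optional[str], reply_to_enabled: bool) -> Optional[str]:
    """Toy model: next hop for a reply to a connection that arrived on WAN."""
    if reply_to_enabled and wan_gateway is not None:
        # reply-to pins replies to the WAN gateway, even when the client
        # sits on the same subnet and could be reached directly.
        return wan_gateway
    if ipaddress.ip_address(client_ip) in ipaddress.ip_network(wan_network):
        # Normal routing: an on-link destination is reached directly.
        return client_ip
    # Off-link destination: use the gateway (None if there is none).
    return wan_gateway


if __name__ == "__main__":
    # Hypothetical addresses: client and gateway on the same WAN subnet.
    for enabled in (True, False):
        hop = reply_next_hop("192.168.160.20", "192.168.160.0/24",
                             "192.168.160.1", enabled)
        print(f"reply-to enabled={enabled}: next hop {hop}")
```

In this model a client on the WAN subnet gets its replies by way of the gateway while reply-to is active, and directly once "Disable reply-to" is checked.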
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 17, 2025, 07:44:20 PM
yea i have run with an actual gateway as well, adding the WAN to my normal network, i do this initially so that i can install any needed packages and say update 25.1 to 25.1.1.

in either case, there is no difference in behavior. i tried eliminating the external network after a few days to try and isolate it more, but the results are exactly the same either way, unfortunately
Title: Re: Wireguard Speed Issue
Post by: jezza007 on February 19, 2025, 10:58:23 AM
Poor speed or disconnects, usually MTU is wrong, set too high at either side of the tunnel
Title: Re: Wireguard Speed Issue
Post by: meyergru on February 19, 2025, 12:27:47 PM
Well, if you ignore the official instructions (https://docs.opnsense.org/manual/how-tos/wireguard-s2s.html) and do not set MSS clamping for Wireguard, then, yes, of course...
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 19, 2025, 03:46:02 PM
yea i followed the documentation, verified mtu, with or without the firewall normalization rule, tried messing with MTUs on both sides. there is no difference.

i also verified the out-of-box MTUs are the same as pfSense and OpenWRT according to ifconfig
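
For reference, the MTU and MSS values mentioned in this thread follow from the header sizes. The following is a worked sketch, assuming standard header sizes (20-byte IPv4 or 40-byte IPv6 outer header, 8-byte UDP header, 32 bytes of WireGuard data-message overhead, TCP header without options); the function names are made up for illustration.

```python
# Header sizes in bytes (standard values, not measured in this thread).
IP_HEADER = {4: 20, 6: 40}  # IPv4 / IPv6 header
UDP_HEADER = 8              # outer UDP header
WG_OVERHEAD = 32            # WireGuard data message: 16-byte header + 16-byte auth tag
TCP_HEADER = 20             # TCP header without options


def wg_tunnel_mtu(link_mtu: int, outer_ip_version: int) -> int:
    """Largest inner packet that fits in one outer packet without fragmentation."""
    return link_mtu - IP_HEADER[outer_ip_version] - UDP_HEADER - WG_OVERHEAD


def tcp_mss(tunnel_mtu: int, inner_ip_version: int = 4) -> int:
    """TCP MSS that fits a given tunnel MTU."""
    return tunnel_mtu - IP_HEADER[inner_ip_version] - TCP_HEADER


if __name__ == "__main__":
    print(wg_tunnel_mtu(1500, 6))  # 1420
    print(wg_tunnel_mtu(1500, 4))  # 1440
    print(tcp_mss(1420))           # 1380
    print(tcp_mss(1360))           # 1320
```

With a 1500-byte link MTU, 1420 is the value that still fits when the tunnel itself runs over IPv6, which matches the 1420 wireguard MTU used in these tests. A wrong MTU would therefore not be expected to explain a one-directional 500 Mbit/sec cap on otherwise identical setups.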
Title: Re: Wireguard Speed Issue
Post by: EricPerl on February 20, 2025, 03:56:37 AM
Quote from: dirtyfreebooter on February 17, 2025, 07:44:20 PM
yea i have run with an actual gateway as well, adding the WAN to my normal network, i do this initially so that i can install any needed packages and say update 25.1 to 25.1.1.

in either case, there is no difference in behavior. i tried eliminating the external network after a few days to try and isolate it more, but the results are exactly the same either way, unfortunately

And reply-to was disabled on the OPN being tested, right?
Otherwise, all reply traffic between OPN and the desktop on the WAN side bounces via the WAN gateway.

I managed to run enough of a test to look at states and traffic (for my education).
It's actually pretty darn simple. Simple UDP tunnel between the wireguard client and OPN WAN + one TCP connection per iperf thread from client's wireguard IP and target machine.
It's no surprise loads are negligible when traffic is choked up. I have no idea where the bottleneck could be.
I don't know that a packet capture would reveal anything.

FWIW, my test environment was way worse than yours (yet sufficient for my investigation):
Ubuntu VM on my prod N305 based proxmox (where my OPN also lives), in a separate VLAN so I don't need to deal with reply-to -> main OPN for inter-VLAN -> managed switch -> OPN with Wireguard (also virtualized on N100 hardware) -> unmanaged switch -> target Ubuntu.

I still managed to get 900Mbps in both directions, very symmetrical... It's mindboggling you can't exceed 550Mbps on your hardware.
Title: Re: Wireguard Speed Issue
Post by: dirtyfreebooter on February 21, 2025, 12:40:54 AM
O-M-G ** SOLVED ** THANK YOU eric!!

it was the reply-to... changed it to
(https://i.imgur.com/LkEHFk0.png)
and immediately all OPN installs worked in both directions... from 23.1 to 25.1.1 on all my hardware setups...

The Intel Xeon E-2278G with X710-DA2

Up: 8.60 Gbits/sec
Down: 7.27 Gbits/sec

iperf3 --client 192.168.1.20 --omit 1 --time 5 --parallel 16 --format g
...
[SUM]   0.00-5.00   sec  5.00 GBytes  8.59 Gbits/sec  3327             sender
[SUM]   0.00-5.00   sec  5.00 GBytes  8.60 Gbits/sec                  receiver

iperf3 --client 192.168.1.20 --omit 1 --time 5 --parallel 16 --format g --reverse
Reverse mode, remote host 192.168.1.20 is sending
...
[SUM]   0.00-5.00   sec  4.23 GBytes  7.27 Gbits/sec  7211             sender
[SUM]   0.00-5.00   sec  4.23 GBytes  7.27 Gbits/sec                  receiver
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: dirtyfreebooter on February 21, 2025, 12:45:16 AM
i turned off the firewall normalization rule the Road Warrior Docs say to use and now i consistently get

Up:
[SUM]   0.00-5.00   sec  5.23 GBytes  8.98 Gbits/sec   34             sender
[SUM]   0.00-5.00   sec  5.23 GBytes  8.98 Gbits/sec                  receiver

Down:
[SUM]   0.00-5.00   sec  4.62 GBytes  7.93 Gbits/sec  5610             sender
[SUM]   0.00-5.00   sec  4.62 GBytes  7.93 Gbits/sec                  receiver
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: EricPerl on February 21, 2025, 04:03:51 AM
Nice!

I kinda liked that theory because it explained the discrepancy, but it was arguably just a theory until you verified it.

I don't know what the gateway was but it was likely not multi-gig.
The downgrade all the way to ~550 might have been caused by collisions.

What was going on with the client directly connected is still unknown.
But that's such an atypical use case that it's not worth investigating further.
Sometimes simplifying the test bench to an extreme has unpredictable side effects.

When I got bit by this setting, no traffic went through because my main OPN rejected it (state violation) since it was reply traffic to requests that it never saw. That's usually easier to troubleshoot than performance issues... Especially since I had FW logs.
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: tessus on March 13, 2025, 07:49:56 AM
This was an interesting thread, but I have a question about the "reply-to" setting.

I have only one physical WAN interface, but I use one WG0 for incoming Wireguard clients and I have 2 other interfaces that handle outgoing VPN (OpenVPN and Wireguard) connections via separate gateways.
I guess this means I should also disable "reply-to" on WAN rules, correct? 
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: EricPerl on March 13, 2025, 07:46:52 PM
I'm not entirely sure how you use your VPNs but reply-to is primarily meant to handle multi-WAN use cases.
Per the documentation:
Quote:
With Multi-WAN you generally want to ensure traffic leaves the same interface it arrives on, hence reply-to is added automatically by default.
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: tessus on March 13, 2025, 08:41:37 PM
Thanks for the reply. I've setup something like this https://docs.opnsense.org/manual/how-tos/wireguard-selective-routing.html for Wireguard and something similar for OpenVPN. So certain devices on my network are routed through/via either Wireguard or OpenVPN.

I've read the documentation you quoted, but what I wanted to know was whether this only applies to fail-over multi-wan with multiple physical ports, or also applies to my setup. Because even though I only have one physical WAN port, I have different WAN interfaces (WAN, WAN_WireGuard, WAN_OpenVPN).
I highly suspect that my setup counts as multi-wan, but I wanted to confirm with the experts.
Title: Re: Wireguard Speed Issue ** SOLVED **
Post by: EricPerl on March 14, 2025, 02:45:16 AM
I'm no expert in anything networking related and have 0 experience with multi-WAN or multi-VPN, very limited VPN experience altogether.
I'm more familiar with the side effects of reply-to in the context of OPN deployed on an internal network, which is what this thread was about.

I'd leave it on and suggest you create a new thread if you encounter issues.