Hi everyone,
I have a NAS behind CGNAT that hosts all of my services. To access them externally, I rented a cheap VPS with a public IPv4 address.
Before switching to OPNsense, I ran WireGuard on the NAS itself, connected to the VPS, and forwarded specific ports from the VPS with masquerading.
A few days ago I decided to change my setup. I'm now using a different VPS on which I've installed OPNsense (this is my remote OPNsense). The remote OPNsense is connected to my local OPNsense box via a site-to-site WireGuard tunnel.
On my local OPNsense box I've configured the remote OPNsense as the gateway for my NAS, and on the remote OPNsense I'm forwarding TCP ports to the NAS. The setup no longer requires masquerading, so my NAS sees the real client IPs.
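As a quick sanity check (ifconfig.me is just one example of such a service), the NAS should now egress via the remote OPNsense, and forwarded connections should arrive with their real source addresses:
$ curl https://ifconfig.me    # run on the NAS; should print the remote VPS's public IPv4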
This setup seemed to work really well, until I noticed that I'm getting significantly lower speeds. While I could download a file from my Nextcloud at 6 MB/s with the old setup, it now drops to 1 MB/s.
As a test I spun up the old WireGuard setup again, which connects my NAS directly to the old VPS (that tunnel itself now runs through the new WireGuard gateway and tunnel), and I'm back to the old 6 MB/s download.
So for some reason
NAS -> wireguard tunnel -> local OPNsense -> wireguard tunnel -> remote OPNsense -> VPS -> client
is faster than
NAS -> local OPNsense -> wireguard tunnel -> remote OPNsense -> client.
I'm out of ideas here; this makes no sense to me. I suspected this might have something to do with TCP window scaling, but shouldn't the remote OPNsense simply forward these requests?
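To at least rule out window scaling, the FreeBSD side can be checked like this (wg0 is a placeholder for the actual tunnel interface):
# sysctl net.inet.tcp.rfc1323    # 1 = RFC 1323 window scaling/timestamps enabled
# tcpdump -ni wg0 'tcp[tcpflags] & tcp-syn != 0'    # SYN packets should carry a "wscale" option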
I've also tried avoiding the remote OPNsense by using the VPS as a gateway, which also seems to increase speeds. So I suspect it's something with the remote OPNsense.
Here are the specs of my setup:
Local OPNsense:
OPNsense 25.7.7_4-amd64
Intel N100
16GB RAM
4x 2.5 Gbit Intel NIC
500 Mbit download
50 Mbit upload
Remote OPNsense:
OPNsense 25.7.7_4-amd64
2 Cores
4GB RAM
1 Gbit download
1 Gbit upload
VPS:
Debian 12
1 Core
1GB RAM
1 Gbit download
1 Gbit upload
Did you assign an interface to the local WireGuard instance and define the firewall rules for the forwarded traffic on this interface?
And did you remove all pass rules from the Wireguard rules tab?
Quote from: viragomann on November 18, 2025, 07:54:36 PM
Did you assign an interface to the local WireGuard instance and define the firewall rules for the forwarded traffic on this interface?
Yeah, I did. The setup itself is working, the issue is the slow speed.
Quote from: viragomann on November 18, 2025, 07:54:36 PM
And did you remove all pass rules from the Wireguard rules tab?
Not sure what you mean by this.
Check hardware offloading in the VPS OPNsense:
sysctl hw.vtnet.csum_disable
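For a fuller picture of the offload state you can also look at the interface flags (vtnet0 as an example name):
# ifconfig vtnet0 | grep options    # look for RXCSUM, TXCSUM, TSO4, LRO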
Quote from: Patrick M. Hausen on November 18, 2025, 09:10:32 PM
Check hardware offloading in the VPS OPNsense:
sysctl hw.vtnet.csum_disable
# sysctl hw.vtnet.csum_disable
hw.vtnet.csum_disable: 1
OK, so you are not triggering the offloading bug that can be present in KVM-based clouds.
Sorry, in that case I do not have a suggestion, either.
For the OPNsense VM: is a CPU emulation selected that accelerates WireGuard encryption (e.g. "host" would be best)?
If vtnet "hardware" is being used: is multiqueue enabled on the NICs? Is RSS enabled (https://docs.opnsense.org/troubleshooting/performance.html)?
Also, see: https://forum.opnsense.org/index.php?topic=44159.0, notes on network "hardware".
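From memory, the RSS tunables from the performance guide look roughly like this (set under System > Settings > Tunables, reboot afterwards; rss.bits should fit the core count):
net.isr.bindthreads = 1     # pin netisr threads to CPU cores
net.isr.maxthreads = -1     # one netisr thread per core
net.inet.rss.enabled = 1    # enable receive side scaling
net.inet.rss.bits = 1       # for a 2-core VM; use 2 for 4 cores, 3 for 8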
Quote from: meyergru on November 18, 2025, 10:25:53 PM
For the OPNsense VM: is a CPU emulation selected that accelerates WireGuard encryption (e.g. "host" would be best)?
Unfortunately I don't think the hosting provider (netcup) allows control over things like that. As a test I spun up a cheap cx23 server on Hetzner, just to rule out any issues with netcup, but the issue persists.
Quote from: meyergru on November 18, 2025, 10:25:53 PM
If vtnet "hardware" is being used: is multiqueue enabled on the NICs? Is RSS enabled (https://docs.opnsense.org/troubleshooting/performance.html)?
Also, see: https://forum.opnsense.org/index.php?topic=44159.0, notes on network "hardware".
I'm using the virtio driver. Netcup allows me to choose between the following:
Edit: Uploading screenshots doesn't seem to work. I can choose between virtio, e1000, e1000e, rtl8139 and vmxnet3.
RSS should be enabled; `netstat -Q` shows:
# netstat -Q
Configuration:
Setting                        Current        Limit
Thread count                         2            2
Default queue limit                256        10240
Dispatch policy                 direct          n/a
Threads bound to CPUs          enabled          n/a

Protocols:
Name       Proto QLimit Policy  Dispatch Flags
ip             1   1000 cpu     hybrid   C--
igmp           2    256 source  default  ---
rtsock         3    256 source  default  ---
arp            4    256 source  default  ---
ether          5    256 cpu     direct   C--
ip6            6   1000 cpu     hybrid   C--
ip_direct      9    256 cpu     hybrid   C--
ip6_direct    10    256 cpu     hybrid   C--

Workstreams:
 WSID CPU   Name       Len WMark   Disp'd  HDisp'd  QDrops   Queued  Handled
    0   0   ip           0   351        0  2324816       0  1838889  4163705
    0   0   igmp         0     0        0        0       0        0        0
    0   0   rtsock       0     2        0        0       0       19       19
    0   0   arp          0     0  3644425        0       0        0  3644425
    0   0   ether        0     0  8061941        0       0        0  8061941
    0   0   ip6          0     8        0   179268       0     2410   181678
    0   0   ip_direct    0     0        0        0       0        0        0
    0   0   ip6_direct   0     0        0        0       0        0        0
    1   1   ip           0   642        0   444109       0  1575536  2019645
    1   1   igmp         0     0        0        0       0        0        0
    1   1   rtsock       0     0        0        0       0        0        0
    1   1   arp          0     0    40907        0       0        0    40907
    1   1   ether        0     0   287910        0       0        0   287910
    1   1   ip6          0    29        0     2582       0   214522   217104
    1   1   ip_direct    0     0        0        0       0        0        0
    1   1   ip6_direct   0     0        0        0       0        0        0
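To check whether multiqueue is actually active on the vtnet NIC (assuming the driver exposes these OIDs on this build):
# sysctl dev.vtnet.0 | grep vq_pairs    # act_vq_pairs should match the core count if multiqueue is on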
I just did some iperf3 tests over WAN and there seems to be a massive difference between TCP and UDP.
With UDP I'm getting the full bandwidth of my home connection, while with TCP I'm nowhere near it. That would also explain the speed increase when tunneling the other WireGuard connection through the OPNsense to the other VPS: the remote OPNsense then only forwards UDP.
Any ideas what's going on here?
TCP Client -> Server:
$ iperf3 -c myvps
Connecting to host myvps, port 5201
[ 5] local 192.168.1.10 port 52986 connected to myvps port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 768 KBytes 6.29 Mbits/sec 6 22.0 KBytes
[ 5] 1.00-2.00 sec 768 KBytes 6.29 Mbits/sec 4 22.0 KBytes
[ 5] 2.00-3.00 sec 512 KBytes 4.19 Mbits/sec 2 23.4 KBytes
[ 5] 3.00-4.00 sec 640 KBytes 5.24 Mbits/sec 4 23.4 KBytes
[ 5] 4.00-5.00 sec 384 KBytes 3.15 Mbits/sec 3 26.1 KBytes
[ 5] 5.00-6.00 sec 896 KBytes 7.34 Mbits/sec 1 30.2 KBytes
[ 5] 6.00-7.00 sec 768 KBytes 6.29 Mbits/sec 4 27.5 KBytes
[ 5] 7.00-8.00 sec 768 KBytes 6.29 Mbits/sec 7 16.5 KBytes
[ 5] 8.00-9.00 sec 256 KBytes 2.10 Mbits/sec 4 13.8 KBytes
[ 5] 9.00-10.00 sec 384 KBytes 3.15 Mbits/sec 4 15.1 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.00 sec 6.00 MBytes 5.03 Mbits/sec 39 sender
[ 5] 0.00-10.03 sec 5.88 MBytes 4.91 Mbits/sec receiver
iperf Done.

TCP Server -> Client:
-----------------------------------------------------------
Server listening on 5201 (test #9)
-----------------------------------------------------------
Accepted connection from myhomeip, port 26805
[ 5] local myvps port 5201 connected to myhomeip port 26779
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.01 sec 3.62 MBytes 30.1 Mbits/sec 14 182 KBytes
[ 5] 1.01-2.02 sec 6.00 MBytes 49.7 Mbits/sec 0 208 KBytes
[ 5] 2.02-3.02 sec 4.62 MBytes 39.0 Mbits/sec 4 150 KBytes
[ 5] 3.02-4.02 sec 5.12 MBytes 42.8 Mbits/sec 0 169 KBytes
[ 5] 4.02-5.01 sec 5.50 MBytes 46.5 Mbits/sec 0 193 KBytes
[ 5] 5.01-6.01 sec 4.25 MBytes 35.7 Mbits/sec 9 100 KBytes
[ 5] 6.01-7.01 sec 3.50 MBytes 29.3 Mbits/sec 0 126 KBytes
[ 5] 7.01-8.01 sec 3.75 MBytes 31.6 Mbits/sec 0 153 KBytes
[ 5] 8.01-9.01 sec 4.88 MBytes 40.8 Mbits/sec 0 179 KBytes
[ 5] 9.01-10.01 sec 6.25 MBytes 52.3 Mbits/sec 0 206 KBytes
[ 5] 10.01-10.07 sec 384 KBytes 53.7 Mbits/sec 1 71.5 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.07 sec 48.0 MBytes 40.0 Mbits/sec 28 sender

UDP Client -> Server:
-----------------------------------------------------------
Server listening on 5201 (test #13)
-----------------------------------------------------------
Accepted connection from myhomeip, port 38020
[ 5] local myvps port 5201 connected to myhomeip port 53046
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.01 sec 5.27 MBytes 43.7 Mbits/sec 0.220 ms 31/3954 (0.78%)
[ 5] 1.01-2.00 sec 5.80 MBytes 49.0 Mbits/sec 0.306 ms 126/4445 (2.8%)
[ 5] 2.00-3.01 sec 5.89 MBytes 49.1 Mbits/sec 0.242 ms 274/4659 (5.9%)
[ 5] 3.01-4.00 sec 5.79 MBytes 48.9 Mbits/sec 0.214 ms 74/4384 (1.7%)
[ 5] 4.00-5.00 sec 5.84 MBytes 49.0 Mbits/sec 0.223 ms 50/4399 (1.1%)
[ 5] 5.00-6.00 sec 5.81 MBytes 48.8 Mbits/sec 0.226 ms 149/4474 (3.3%)
[ 5] 6.00-7.00 sec 5.82 MBytes 48.7 Mbits/sec 0.225 ms 93/4425 (2.1%)
[ 5] 7.00-8.00 sec 5.76 MBytes 48.2 Mbits/sec 0.197 ms 173/4459 (3.9%)
[ 5] 8.00-9.01 sec 5.88 MBytes 49.0 Mbits/sec 0.219 ms 69/4451 (1.6%)
[ 5] 9.01-10.00 sec 5.79 MBytes 48.9 Mbits/sec 0.193 ms 115/4425 (2.6%)
[ 5] 10.00-10.07 sec 378 KBytes 48.3 Mbits/sec 0.287 ms 9/284 (3.2%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[SUM] 0.0-10.1 sec 5 datagrams received out-of-order
[ 5] 0.00-10.07 sec 58.0 MBytes 48.3 Mbits/sec 0.287 ms 1163/44359 (2.6%) receiver
UDP Server -> Client:
$ iperf3 -c myvps -R --udp -b 500Mbit
Connecting to host myvps, port 5201
Reverse mode, remote host myvps is sending
[ 5] local 192.168.1.10 port 39579 connected to myvps port 5201
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-1.00 sec 52.1 MBytes 437 Mbits/sec 0.009 ms 5055/43877 (12%)
[ 5] 1.00-2.00 sec 60.5 MBytes 507 Mbits/sec 0.021 ms 249/45287 (0.55%)
[ 5] 2.00-3.00 sec 60.3 MBytes 506 Mbits/sec 0.037 ms 43/44932 (0.096%)
[ 5] 3.00-4.00 sec 59.4 MBytes 499 Mbits/sec 0.020 ms 16/44280 (0.036%)
[ 5] 4.00-5.00 sec 59.5 MBytes 499 Mbits/sec 0.045 ms 1/44321 (0.0023%)
[ 5] 5.00-6.00 sec 59.7 MBytes 501 Mbits/sec 0.020 ms 9/44477 (0.02%)
[ 5] 6.00-7.00 sec 59.5 MBytes 499 Mbits/sec 0.026 ms 2/44301 (0.0045%)
[ 5] 7.00-8.00 sec 59.8 MBytes 501 Mbits/sec 0.019 ms 0/44509 (0%)
[ 5] 8.00-9.00 sec 59.7 MBytes 501 Mbits/sec 0.020 ms 1/44437 (0.0023%)
[ 5] 9.00-10.00 sec 58.8 MBytes 493 Mbits/sec 0.019 ms 0/43798 (0%)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
[ 5] 0.00-10.05 sec 599 MBytes 500 Mbits/sec 0.000 ms 0/445904 (0%) sender
[SUM] 0.0-10.0 sec 7030 datagrams received out-of-order
[ 5] 0.00-10.00 sec 589 MBytes 494 Mbits/sec 0.019 ms 5376/444219 (1.2%) receiver
iperf Done.
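If I apply throughput ≈ cwnd / RTT to the upload test, the numbers line up: with the ~22 KByte congestion window shown above and an assumed RTT of about 30 ms (I haven't measured it precisely), the ceiling is roughly 22 * 1024 * 8 / 0.030 ≈ 6 Mbit/s, which is exactly the range I'm seeing. So the retransmissions seem to keep the window from ever growing.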
You should try with "iperf3 -P 4". Also, you could look at the MTU size of your tunnel (you must deduct the VPN overhead from the normal MTU). As a first check, use 1400.
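Something like this (myvps is a placeholder; WireGuard over IPv4 adds 60 bytes of overhead, so 1440 is the theoretical ceiling, the WireGuard default is 1420, and 1400 is a safe first test):
$ iperf3 -c myvps -P 4          # four parallel TCP streams
$ ping -D -s 1372 myvps         # FreeBSD: don't-fragment probe; 1372 + 28 = 1400 bytes on the wire
$ ping -M do -s 1372 myvps      # Linux equivalent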