Is it a known issue that the Linux WireGuard implementation is almost 2x to 3x faster than the FreeBSD implementation?
I have been benchmarking WireGuard performance to a local datacenter VPS that will soon become my gateway (I have 5G internet at home, behind CGNAT).
I set up WireGuard clients on three platforms and ran the same tests.
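These were plain 30-second iperf3 runs against the VPS; the exact invocation was roughly the following (flags shown for illustration, with the VPS address as a placeholder):
iperf3 -s                              # on the VPS
iperf3 -c <vps-address> -t 30 -R       # on the client: download (VPS to home, reverse mode)
iperf3 -c <vps-address> -t 30          # on the client: upload (home to VPS)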
Debian 10 (WireGuard kernel module):
-- download (data from VPS to home)
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.03 sec 453 MBytes 127 Mbits/sec 49 sender
[ 5] 0.00-30.00 sec 442 MBytes 124 Mbits/sec receiver
-- upload
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 24.1 MBytes 6.75 Mbits/sec 19 sender
[ 5] 0.00-30.05 sec 23.9 MBytes 6.68 Mbits/sec receiver
pfSense 2.5.0 CE (in-kernel WireGuard, according to them):
-- download
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.25 sec 176 MBytes 48.7 Mbits/sec 1 sender
[ 5] 0.00-30.00 sec 170 MBytes 47.5 Mbits/sec receiver
-- upload
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 24.6 MBytes 6.89 Mbits/sec 20 sender
[ 5] 0.00-30.26 sec 23.7 MBytes 6.56 Mbits/sec receiver
OPNsense 21.1 (wireguard-go plugin):
-- download
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.25 sec 104 MBytes 28.9 Mbits/sec 1860 sender
[ 5] 0.00-30.00 sec 101 MBytes 28.4 Mbits/sec receiver
-- upload
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 15.4 MBytes 4.32 Mbits/sec 22 sender
[ 5] 0.00-30.26 sec 15.1 MBytes 4.19 Mbits/sec receiver
iperf3 without the tunnel, from OPNsense:
-- download
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.28 sec 352 MBytes 97.5 Mbits/sec 999 sender
[ 5] 0.00-30.00 sec 343 MBytes 96.0 Mbits/sec receiver
-- upload
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 13.4 MBytes 3.74 Mbits/sec 34 sender
[ 5] 0.00-30.26 sec 13.3 MBytes 3.68 Mbits/sec receiver
For good measure, a speed test from the Debian 10 box outside the tunnel:
-- download
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.08 sec 321 MBytes 89.6 Mbits/sec 1278 sender
[ 5] 0.00-30.00 sec 311 MBytes 87.1 Mbits/sec receiver
-- upload
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 31.0 MBytes 8.67 Mbits/sec 33 sender
[ 5] 0.00-30.04 sec 30.8 MBytes 8.59 Mbits/sec receiver
speedtest.net results https://www.speedtest.net/my-result/d/f2d48b53-2bfe-4ed8-b76e-9c2fecdd8137
I really wish FreeBSD had performance similar to the Debian 10 VM - the setup is identical for all WireGuard clients. The same ISP router is upstream, and the clients connect to the VPS over IPv6 (my ISP runs an IPv6-only network, so it's best to avoid direct connections to IPv4 hosts because of CGNAT). I may end up using this Debian box as a secondary WAN for OPNsense so that I can keep the performance gains for my traffic.
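For context, each client uses an equivalent minimal configuration with the VPS as an IPv6 endpoint; the sketch below is sanitized, with keys and addresses replaced by placeholders:
[Interface]
PrivateKey = <client-private-key>
Address = 10.10.10.2/32

[Peer]
PublicKey = <vps-public-key>
# IPv6 endpoint of the VPS (brackets are required when a port follows)
Endpoint = [2001:db8::1]:51820
AllowedIPs = 0.0.0.0/0, ::/0
PersistentKeepalive = 25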
What hardware does this run on? On a Xeon I can achieve 1.9 Gbit in both directions.
Sadly, WireGuard runs badly on low-priced hardware.
Quote from: mimugmail on March 01, 2021, 05:16:17 PM
What hardware does this run on? On a Xeon I can achieve 1.9 Gbit in both directions.
Sadly, WireGuard runs badly on low-priced hardware.
All tests were run from virtualized pfSense, OPNsense, and Debian 10 boxes. The VM specs for pfSense and OPNsense are exactly the same - 2 GB memory, 4 cores, 32 GB disk, AES CPU passthrough enabled.
The parent host is a 24-core Intel Xeon E5-2697 v2 with 128 GB DDR3, running Proxmox.
I was expecting the performance to be close to the same. The only tweak Linux has that FreeBSD does not is TCP BBR, plus the tweaks below that I applied on Linux (I'm not sure if there are equivalents for BSD - a rough FreeBSD counterpart is sketched after the commands).
# Tune server networking: raise open-file limits, enlarge buffers, adjust TCP behaviour
echo '* soft nofile 51200' >> /etc/security/limits.conf
echo '* hard nofile 51200' >> /etc/security/limits.conf
ulimit -n 51200
# Write the tuned values to /etc/sysctl.conf (note: this overwrites any existing settings)
echo 'fs.file-max = 51200
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.core.netdev_max_backlog = 250000
net.core.somaxconn = 4096
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 30
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.ip_local_port_range = 10000 65000
net.ipv4.tcp_max_syn_backlog = 8192
net.ipv4.tcp_max_tw_buckets = 5000
net.ipv4.tcp_fastopen = 3
net.ipv4.tcp_mem = 25600 51200 102400
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864
net.ipv4.tcp_mtu_probing = 1
net.ipv4.tcp_congestion_control = hybla' > /etc/sysctl.conf
sysctl -p
# Load BBR and make it the default congestion control
# (the appended line overrides the hybla setting written above)
modprobe tcp_bbr
sh -c 'echo "tcp_bbr" >> /etc/modules-load.d/modules.conf'
sh -c 'echo "net.core.default_qdisc=fq" >> /etc/sysctl.conf'
sh -c 'echo "net.ipv4.tcp_congestion_control=bbr" >> /etc/sysctl.conf'
sysctl -p           # re-apply so the appended qdisc/BBR settings take effect
lsmod | grep bbr    # confirm the bbr module is loaded
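For the BSD side, the closest counterparts I could find are the socket/TCP buffer tunables below (set via sysctl, or in OPNsense under System > Settings > Tunables). This is only a rough sketch - the values simply mirror the Linux ones above, and the FreeBSD 12 base used by pfSense 2.5 / OPNsense 21.1 has no readily available BBR equivalent:
# enlarge the maximum socket buffer and the TCP auto-tuning limits
sysctl kern.ipc.maxsockbuf=67108864
sysctl net.inet.tcp.sendbuf_max=33554432
sysctl net.inet.tcp.recvbuf_max=33554432
# larger initial send/receive buffers
sysctl net.inet.tcp.sendspace=65536
sysctl net.inet.tcp.recvspace=65536
# optional: switch congestion control to CUBIC (module must be loaded first)
kldload cc_cubic
sysctl net.inet.tcp.cc.algorithm=cubic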
Linux uses the kernel implementation; OPNsense uses the Go userspace implementation (wireguard-go).
There is a huge performance impact, but a kernel implementation for BSD is on its way.
Edit: never mind, I didn't read properly - but virtio support on BSD is lacking, I think that's the issue.
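If virtio is the suspect, one thing worth ruling out before switching NIC models is hardware offloading on the vtnet driver; disabling checksum/TSO/LRO offload is a commonly suggested workaround for FreeBSD guests on KVM. A quick test, assuming the WAN interface is vtnet0 (adjust the name to your setup; OPNsense also exposes these as checkboxes under Interfaces > Settings):
ifconfig vtnet0 -txcsum -rxcsum -tso -lro    # disable offloads on the virtio NIC until reboot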
Quote from: Voodoo on March 01, 2021, 06:09:21 PM
Edit: never mind, I didn't read properly - but virtio support on BSD is lacking, I think that's the issue.
This made me realize a possible bottleneck. I can try adding E1000 interfaces to both the pfSense and OPNsense VMs, swap the configuration, and re-test.
Would the E1000 driver be the best-performing option for BSD on Proxmox/QEMU/KVM?
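My plan is to change the NIC model per VM from the Proxmox CLI, roughly as below (the VM ID and bridge are just examples; the guest sees a new device, so the interface assignment has to be redone):
qm set 101 --net0 e1000,bridge=vmbr0              # switch net0 of VM 101 to the emulated Intel E1000
qm set 101 --net0 virtio,bridge=vmbr0,queues=4    # or back to paravirtualized virtio with multiple queues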
But it's limited to 1G only.
Quote from: mimugmail on March 02, 2021, 06:11:46 AM
But it's limited to 1G only.
Thanks. I did switch my VM settings to E1000 - the performance bottleneck seems to persist.
When I run iperf3 on the Debian VM, the first intervals come in at double the speed, whereas on pfSense and OPNsense the download starts at 30 Mbps and slowly builds up to 80-120 Mbps. The Linux box gets to those speeds within 2-3 seconds.
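To look at that ramp-up more closely, I can rerun with per-second reporting and discard the slow-start phase; these are standard iperf3 options, with the VPS address as a placeholder:
iperf3 -c <vps-address> -t 30 -i 1 -R    # report every second to watch the ramp-up
iperf3 -c <vps-address> -t 30 -O 5 -R    # omit the first 5 seconds from the final average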
I'm saturating my home bandwidth (500/500 Mbit) with WireGuard in a virtual OPNsense to WireGuard running on TrueNAS (FreeBSD) at my offsite backup location (1 Gbit).
So, looking at your results, there must be something else causing the bad performance, not the WireGuard BSD implementation. You are not wasting CPU cycles by running iperf on the same hosts that run WireGuard, are you?
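If so, it is worth taking the firewall out of the iperf path entirely so it only has to forward and encrypt, e.g. by running the server on the VPS and the client on a LAN machine behind the firewall (addresses are placeholders):
iperf3 -s                                # on the VPS, listening inside the tunnel network
iperf3 -c <vps-tunnel-address> -t 30     # on a LAN client whose traffic is routed through the firewall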