WireGuard Performance - What bandwidth to expect?

Started by o58rHtfJdDiU3p, July 28, 2025, 10:43:41 AM

Hello,

I'm currently experimenting with segmenting my home network using WireGuard as an alternative to VLANs, which would require appropriate hardware support.

I seem to have the WireGuard instances and individual peers under control, but now I'm getting into the nitty-gritty of the expected performance.

My OPNsense runs under Proxmox on an older i7-6700 machine. Both CPU and RAM seem to have sufficient performance margins.
The device has an Intel X520 dual SFP+ network card that supports 10Gbit/s.

I am testing with CrystalDiskMark against a CIFS network share on a file server that is on the 10G network. So all devices (file server, test notebook and OPNsense router) are on the 10G network.
Here are my results.

1.) direct connection (without OPNsense routing)
R: 505.19 MByte/s
W: 479.73 MByte/s

2.) WireGuard activated between notebook and OPNsense.
OPNsense has a paravirtualized NIC, so the Intel X520 is initialized by Proxmox.
R: 67.61 MByte/s
W: 50.55 MByte/s

3.) WireGuard with NIC PCIe passthrough in OPNsense VM. So OPNsense should have exclusive access to the 10Gbit/s network card.
R: 57.13 MByte/s
W: 27.25 MByte/s

So the question is now: What do I see with these results?

Is it totally normal that the performance drops by a factor of 10 with the WG tunnel?
I do know that WG communication is encrypted and that this slows things down, but I don't see any bottleneck on the HW side.

Is it possible that my network card is still running on 1Gbit/s instead of 10Gbit/s?

Hope someone can help!
Thanks!

July 28, 2025, 11:01:01 AM #1 Last Edit: July 28, 2025, 11:02:53 AM by meyergru
First off, I do not get how WireGuard could be an alternative to VLANs: If your switches do not support VLANs, you would have to put all of your clients into one network. In that situation, there is no security benefit if clients may only connect to your OpnSense via WG tunnels.

Consider that within your LAN, any client can communicate / infiltrate any other client directly, without going through OpnSense. That is the whole point of VLAN separation, besides making collision domains smaller, thus having less broadcast traffic.

Your physical connection is not at 1 Gbit/s, otherwise you would not see 500 MByte/s traffic.

The wireguard speed looks about right, with old CPUs, it is usually at ~600 MBit/s, which you see. That is not a factor of 10, but a limit imposed by the encryption speed of your CPU(s).
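
For reference, the unit conversion behind these figures can be sketched as follows. This is only arithmetic on the numbers posted above, assuming the benchmark's MByte/s means 10^6 bytes per second; the helper name and dictionary keys are made up for illustration.

```python
# Convert the posted benchmark results (MByte/s) to line rate (Mbit/s).
# Assumption: 1 MByte = 10**6 bytes.

def mbyte_to_mbit(mbyte_per_s: float) -> float:
    """1 byte = 8 bits, so MByte/s * 8 = Mbit/s."""
    return mbyte_per_s * 8

# Read results from the first post (labels are illustrative).
results = {
    "direct read": 505.19,
    "WG paravirtualized read": 67.61,
    "WG passthrough read": 57.13,
}

for name, mbyte in results.items():
    print(f"{name}: {mbyte_to_mbit(mbyte):.0f} Mbit/s")

# A 1 Gbit/s link cannot carry more than 1000 / 8 = 125 MByte/s of payload.
GIGABIT_CEILING_MBYTE = 1000 / 8
```

The direct-connection result is well above the 125 MByte/s ceiling of a 1 Gbit/s link, while both WireGuard results are below it.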

Depending on the physical CPU and the chosen CPU emulation, you can or cannot use AES instructions. For these types of applications, I would switch the CPU type to "host" in Proxmox.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

July 28, 2025, 01:32:28 PM #2 Last Edit: July 28, 2025, 01:41:53 PM by o58rHtfJdDiU3p
Quote from: meyergru on July 28, 2025, 11:01:01 AM
First off, I do not get how WireGuard could be an alternative to VLANs: If your switches do not support VLANs, you would have to put all of your clients into one network. In that situation, there is no security benefit if clients may only connect to your OpnSense via WG tunnels.

Then please let me explain:
My plan is to put all sensitive clients into WireGuard subnets. Their traffic is then forced through the OPNsense router, and I gain control over which subnet and device is allowed to communicate with which, even though at the HW level everything is connected through the same LAN equipment.
Of course you are right: this is only more secure if the WireGuard tunnels are really used and all of a client's communication is routed through the tunnel.
I would say the point of encrypted VPN tunnels is that other clients can see that something is transmitted but can't see what, until the packets reach the router. There the router decides who is allowed to get in touch with whom. So it's not true if you say that every untrustworthy LAN device can communicate with other sensitive devices... is it?

Quote from: meyergru on July 28, 2025, 11:01:01 AM
Consider that within your LAN, any client can communicate / infiltrate any other client directly, without going through OpnSense. That is the whole point of VLAN separation, besides making collision domains smaller, thus having less broadcast traffic.

Your physical connection is not at 1 Gbit/s, otherwise you would not see 500 MByte/s traffic.

Not sure about that.
The 500 MByte/s only applies when I connect directly without any WG tunnel. Since both devices are in the same subnet in this configuration, the traffic goes directly between them without any routing by the OPNsense device. So I still can't say whether the OPNsense device really makes use of the whole 10Gbit/s.

Quote from: meyergru on July 28, 2025, 11:01:01 AM
The wireguard speed looks about right, with old CPUs, it is usually at ~600 MBit/s, which you see. That is not a factor of 10, but a limit imposed by the encryption speed of your CPU(s).

That's disappointing. The CPU typically runs at 20 or 30% across 6 cores, and from what I have read, WG should be able to make use of multiple cores.

Quote from: meyergru on July 28, 2025, 11:01:01 AM
Depending on the physical CPU and the chosen CPU emulation, you can or cannot use AES instructions. For these types of applications, I would switch the CPU type to "host" in Proxmox.

CPU host was set.

So currently I am still not sure if you are right and that's all I can expect from my setup... O.o

BTW, although it is old, it is still an i7 and faster than even an N350, Pentium Gold or other typical micro appliance CPUs. CPU benchmarks shouldn't be that bad...

The reason I am asking is that I have read that OPNsense, or more precisely FreeBSD, has trouble with many 10Gbit/s cards, especially Realtek chipsets. This is why I got careful when I saw these numbers.
But I can't distinguish between WG performance and NIC performance since I can't force the traffic through the router (no VLAN support).

If your NICs are paravirtualized (vtnetX), then Proxmox does the handling and you should be able to see at what speed they connect. It cannot be 1 Gbit/s, because 500 MByte/s is about 4 Gbit/s, four times that.

Also, OpnSense has problems with Realtek adapters, yes, but that only pertains to devices it controls itself (i.e. passthrough or bare-metal NICs). Mostly, Realtek NICs cannot do 10 GBit/s anyway. There may exist models that can, but more often we are talking about 1 or 2.5 Gbit/s models.

And I still do not get what you try to achieve by using WireGuard in this context. What does "WireGuard subnets" even mean? You will have to connect your clients to the network and that is about it. Any client that is connected to your network can reach any other client by virtue of their LAN IPs. You cannot "force" them to use WireGuard tunnels by external (network) means. The best thing you could do is to force routing through WireGuard tunnels.


Then again, there are many clients where this is not feasible and on those where it is, you obviously have to do some network trickery, which you could as well replace by a local firewall instead if you want to keep other clients from infiltrating them in the first place.

I think you are doing it the hard way without achieving anything relevant.

Quote
So it's not true if you say that every untrustworthy LAN device can communicate with other sensitive devices... is it?
Not so sure ;-) What would prevent them from communicating with the sensitive device via its LAN address? If sensitive and non-sensitive devices are within the same layer 2 subnet, then they can always connect directly using the LAN address. Okay, you could firewall the sensitive boxes to allow sensitive traffic only on the WireGuard interface. But that imho is really a lot of effort and quite error-prone. You would have to maintain firewalls on all sensitive boxes. The only way out of that would be to make two segregated subnets, one for the sensitive and one for the non-sensitive boxes, and connect them via the OpnSense. Then you could firewall the traffic between the two subnets in one central place, although then imho there is no need for WireGuard :-)
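
The same-subnet point can be illustrated with a small sketch. It is a simplified model (a host normally decides based on its own netmask), and all addresses and names below are hypothetical, not taken from the poster's network.

```python
import ipaddress

# Hypothetical addressing, for illustration only.
LAN = ipaddress.ip_network("192.168.1.0/24")

def goes_through_router(src: str, dst: str, net=LAN) -> bool:
    """Return True if traffic from src to dst has to be routed (and can
    therefore be firewalled centrally), False if both hosts sit in the
    same subnet and can reach each other directly on layer 2."""
    src_in = ipaddress.ip_address(src) in net
    dst_in = ipaddress.ip_address(dst) in net
    return not (src_in and dst_in)

# Two hosts on the shared LAN: the router never sees this traffic.
print(goes_through_router("192.168.1.10", "192.168.1.20"))

# LAN host to an address in a separate (e.g. tunnel) subnet: routed.
print(goes_through_router("192.168.1.10", "10.10.10.2"))
```

As long as a sensitive device keeps a reachable address in the shared subnet, the first case applies to it regardless of any tunnel it also uses.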

Really the best way would be to invest in VLAN-aware switches and segregate trusted from untrusted devices via VLAN tags. And one other point (others may not see this as strictly): a firewall on a virtualized host is imho (almost) never the way to go. It adds an unnecessary layer of complexity to the setup, which means the firewall cannot protect you from a bug in the virtualization itself. If there is such a bug, the host system that runs your firewall can be compromised without any chance for the firewall to prevent it. I would really recommend spending some bucks on a bare-metal box for the firewall and on VLAN-aware switches. Most provider routers can be run in bridge mode. Such a router can be connected to the WAN interface of the firewall. Then there is no need for the router to support VLANs.

Cheers

tobi

July 29, 2025, 09:57:47 AM #6 Last Edit: July 29, 2025, 10:02:58 AM by o58rHtfJdDiU3p
Thank you Tobi and meyergru

True, each WG-connected device is still open on the LAN at layer 2. I am aware of that, but I guess this problem exists in any case if you are using WG from a Starbucks or the airport, and there it is just considered an accepted compromise.

I also know that the best option would be separation over different physical LANs. I've seen the migration from VLANs to physical separation at my work.
Second best is VLANs; they were considered the best option in the past but nowadays are also considered a kind of compromise.
And my "third" best solution would be to take what I do have and try to improve on a SW level.
And fourth would be doing nothing and just having a big LAN without any control of who is accessing who.

In my case my thinking was to just consider the whole LAN (direct physical connection to OPNsense) as insecure, pick the sensitive devices and put them in different WG subnets that are considered more secure than the LAN. Then I force routing over the OPNsense router (as it would be with VLANs), so I can control the routing and have options with the OPNsense firewall.

The disadvantage is of course that everything must be encrypted and decrypted through the tunnel and can't communicate directly between the devices.

This brings me back to my original question since I think I have understood the goods and the bads of my planned solution.

The real question was: what performance can I expect from WG?
Where are the bottlenecks and why can't I see them in the stats?
meyergru thinks that I can be sure that my NIC runs at 10G. It would have been a good explanation if he had been wrong and it ran at 1G. But maybe he is right and the WG performance I see is already the end of the road? If so, although I am no expert, I would have expected much more bandwidth in my configuration with a 10G NIC and an old i7 CPU with a good clock and more than enough RAM. Everybody says that WG is so fast, and I am sure it is compared to its predecessors, but I still would have expected more, at least over 200 MByte/s, or at least seeing any CPU running hot due to the cryptography of the tunnel.
But all devices seem to be fine and relaxed.
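
One possible reason the stats can look relaxed is that an average over all cores hides a single busy core. The sketch below uses made-up per-core numbers, not measurements from this system, and whether this is what happens here is an assumption.

```python
# Hypothetical per-core load figures in percent (NOT measured on this system).
per_core = [100, 15, 10, 10, 10, 5]

average = sum(per_core) / len(per_core)
busiest = max(per_core)

# The average looks harmless although one core has no headroom left.
print(f"average: {average:.0f}%  busiest core: {busiest}%")
```

On FreeBSD, `top -P` shows the per-CPU breakdown instead of the overall average, which is one way to check for this during a test run.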

About performance: it depends on a lot of factors, and very much on what and how you test. Generally WireGuard is way faster than OpenVPN and in many cases also faster than IPsec. For reliable testing you should use a tool like iperf(3) on both client and server, and always perform the same test over a non-WireGuard connection to compare. On OPNsense, iperf can be installed as well (from plugins/packages). It can be a good idea to play with the iperf parameters (e.g. parallel connections).
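
A minimal sketch of such a comparison, assuming iperf3 is installed on the client and an `iperf3 -s` server is already running on the other end. The function names and addresses are hypothetical; the flags used (`-c`, `-P`, `-t`, `-J`, `-R`) are standard iperf3 client options.

```python
import json
import subprocess

def build_iperf3_cmd(server: str, parallel: int = 4, seconds: int = 10,
                     reverse: bool = False) -> list[str]:
    """Build an iperf3 client command line with JSON output (-J)."""
    cmd = ["iperf3", "-c", server, "-P", str(parallel),
           "-t", str(seconds), "-J"]
    if reverse:
        cmd.append("-R")  # server sends, client receives
    return cmd

def received_mbit(report_json: str) -> float:
    """Extract the receiver-side TCP throughput in Mbit/s from a JSON report."""
    report = json.loads(report_json)
    return report["end"]["sum_received"]["bits_per_second"] / 1e6

def measure(server: str, **kwargs) -> float:
    """Run one test against `server` and return the throughput in Mbit/s."""
    out = subprocess.run(build_iperf3_cmd(server, **kwargs),
                         capture_output=True, text=True, check=True)
    return received_mbit(out.stdout)
```

Calling `measure()` once with the server's LAN address and once with its tunnel address (for example `measure("192.168.1.20")` and `measure("10.10.10.1")`, both placeholders) gives two numbers from the same tool, and varying `parallel` shows whether more streams help.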
Quote
... or at least seeing any CPU running hot due to the cryptography of the tunnel.
WireGuard is quite efficient in CPU usage, so even if you hit the max of the tunnel it does not necessarily mean that your CPUs are running at 100% usage. I have not found many tests with 10Gb cards, but this Reddit thread has some numbers: https://www.reddit.com/r/linux/comments/9bnowo/wireguard_benchmark_between_two_servers_with_10/ (keep in mind that they used a huge MTU (8.5k) to achieve that speed). This post also has some performance figures, although only with a 1Gb card: https://www.netgate.com/blog/wireguard-in-pfsense-2-5-performance
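
Why the MTU matters can be sketched with some arithmetic: every tunnelled packet has to be encrypted and encapsulated individually, so larger packets mean fewer packets per second for the same throughput. The sketch assumes an IPv4 outer header and treats 8500 bytes as the tunnel MTU of the linked benchmark; 1420 is the usual WireGuard default. Function names are made up.

```python
# Per-packet overhead of WireGuard over IPv4:
# 20 (IPv4) + 8 (UDP) + 32 (WireGuard data header and authentication tag).
WG_OVERHEAD_IPV4 = 60

def packets_per_second(mbit_per_s: float, tunnel_mtu: int) -> float:
    """Full-size packets per second needed to carry mbit_per_s of inner traffic."""
    return mbit_per_s * 1e6 / 8 / tunnel_mtu

def wire_efficiency(tunnel_mtu: int, overhead: int = WG_OVERHEAD_IPV4) -> float:
    """Share of the bytes on the wire that is inner traffic."""
    return tunnel_mtu / (tunnel_mtu + overhead)

for mtu in (1420, 8500):
    pps = packets_per_second(1000, mtu)
    print(f"MTU {mtu}: {pps:.0f} packets/s per Gbit/s, "
          f"{wire_efficiency(mtu):.1%} efficiency")
```

With the larger MTU the per-packet work per Gbit/s drops to roughly a sixth, which is why numbers from jumbo-frame benchmarks do not transfer directly to a standard 1500-byte network.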

Comparing WireGuard performance on different platforms is mostly guesswork. If you search this forum for that, you will find recent discussions that will show you that even OPNsense and pfSense are different because of "reasons".

Suffice it to say that on OPNsense, you will see about the speed you measure. You can probably get more by enabling RSS.

That aside, I still think the approach is unsuitable.