OPNsense on vSphere 8.0.3 Trunk Throughput

Started by colni, November 12, 2024, 11:28:42 AM

Hi All,

I'm in the process of standing up a POC; the hope is to replace our current vFTD with a pair of OPNsense boxes deployed in HA.

I'm having some throughput issues when using a trunked connection, which seems to be limited in speed when running iperf tests with the OPNsense plugin.

The POC is running on vSphere 8.0.3 and OPNsense 24.7.8-amd64.
I have one distributed switch.
I then have a few distributed port groups.
I've set one of the distributed port groups to a trunk and added my VLAN tags.

The OPNsense VM has been deployed with:

4 x vCPU
8 GB RAM
200 GB HD
vmx0 - WAN
vmx1 - trunk to the distributed port group

Under Interfaces I've assigned vmx1 (opt1) to the trunk and enabled it.
Under Interfaces > Other Types > VLAN I created a new VLAN (opt2) and assigned it to the vmx1 trunk interface.
Under Interfaces I've enabled the opt2 interface and given it a static IP (see the sketch below).
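
For reference, this GUI assignment corresponds roughly to the following at the FreeBSD shell level (a minimal sketch only; OPNsense manages the interface names itself, and the VLAN tag 10 and address here are made-up examples, not values from my config):

# create a VLAN child interface with tag 10 on top of the vmx1 trunk (example tag)
ifconfig vlan0 create vlan 10 vlandev vmx1
# give it a static address and bring it up (example address)
ifconfig vlan0 inet 192.168.10.1/24 up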

I then installed the iperf plugin for OPNsense and ran a test:

[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.84 GBytes  1.58 Gbits/sec   46             sender
[  5]   0.00-10.00  sec  1.83 GBytes  1.57 Gbits/sec                  receiver

This seems really low, considering all the interfaces show as connected at 10 Gb and this is only in and out of the same network segment.
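
For what it's worth, the same test can be reproduced without the plugin, assuming iperf3 is installed on the firewall and on a test host (the address and port below are placeholders):

# on the OPNsense box: run an iperf3 server (the plugin does essentially this)
iperf3 -s -p 5201
# on a client in the same VLAN: run a 10 second test against the firewall
iperf3 -c 192.168.10.1 -p 5201 -t 10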

After a lot of looking around I thought I would build a second VM, but without the trunk interface, just connected straight to the distributed port groups for WAN and LAN.

After building the same machine, apart from the trunk interface, and running a test I can see I get:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.9 GBytes  9.34 Gbits/sec  364             sender
[  5]   0.00-10.00  sec  10.9 GBytes  9.34 Gbits/sec                  receiver


Clearly there is something wrong here, but I'm not sure what.

Anybody have any ideas?



I had some further success and failure...

I was able to resolve the trunk issue by going to

Interfaces > Settings and unticking:

Hardware CRC
Hardware TSO
Hardware LRO

then rebooting and rerunning the test (shell equivalent sketched below):

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.9 GBytes  9.36 Gbits/sec   66             sender
[  5]   0.00-10.00  sec  10.9 GBytes  9.35 Gbits/sec                  receiver
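
For reference, the same offloads can also be toggled temporarily from the shell while testing (a sketch only; vmx1 is the trunk NIC from the earlier post, and the GUI settings remain the ones that persist across reboots):

# temporarily disable checksum offload, TSO and LRO on the trunk interface
ifconfig vmx1 -txcsum -rxcsum -tso -lro
# re-enable them again if needed
ifconfig vmx1 txcsum rxcsum tso lro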


However, when using an interface with a virtual IP shared between the HA nodes, the throughput drops again:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.68 GBytes  1.44 Gbits/sec  262             sender
[  5]   0.00-10.00  sec  1.68 GBytes  1.44 Gbits/sec                  receiver

It's generally recommended to create port groups with matching VLAN tags in vSphere and a separate interface for each VLAN in the VM(s).

Can you try that?
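
For what it's worth, if you script the vSphere side, a tagged distributed port group per VLAN can be created with govc, for example (a sketch under the assumption that govc is available; the switch name, port group name and VLAN ID are placeholders):

# create a distributed port group carrying only VLAN 10 on the existing distributed switch
govc dvs.portgroup.add -dvs DSwitch01 -type earlyBinding -vlan 10 PG-VLAN10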
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Hi Patrick,

I built out another HA cluster, added the virtual IP, and tested iperf again; below are the results:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.7 GBytes  9.20 Gbits/sec    0             sender
[  5]   0.00-10.00  sec  10.7 GBytes  9.20 Gbits/sec                  receiver


As you said, when it's not a trunk port it performs at the higher rate.

Would this limitation only apply to virtual appliances?





There is no limitation in OPNsense. Doing all layer 2 in the vSwitches and not in guests is the vSphere recommendation.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Thanks Patrick

I can see how that would make sense.

I've actually found another issue, which I can start another post for if required.


Going from an inside LAN1 address to the virtual IP on the master OPNsense VM in the same VLAN/subnet, I can iperf at 9 Gbits/sec:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  10.7 GBytes  9.19 Gbits/sec    0             sender
[  5]   0.00-10.01  sec  10.7 GBytes  9.18 Gbits/sec                  receiver

Going from an inside LAN1 (subnet & VLAN 1) address to LAN2 (subnet & VLAN 2), using the OPNsense virtual IPs as the default gateways, it's getting:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.85 MBytes  1.55 Mbits/sec  324             sender
[  5]   0.00-10.04  sec  1.75 MBytes  1.46 Mbits/sec                  receiver


When I change the settings in Interfaces > Settings for:

Hardware CRC
Hardware TSO
Hardware LRO

it will bring the speed up to:
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.02 GBytes  1.73 Gbits/sec   74             sender
[  5]   0.00-10.04  sec  2.01 GBytes  1.72 Gbits/sec                  receiver

Toggling any of the above hardware options on seems to impact the iperf testing, but I can't get the line speed back to 9 Gbits without turning all the options off, which in turn leads to the slow speeds between the LAN1 and LAN2 networks.
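
If it helps with narrowing this down, the current offload state can be checked per interface and stack-wide before each run (a sketch, using the same interface name as before):

# show which offload capabilities are still active on the trunk NIC
ifconfig vmx1 | grep -i options
# check whether TCP segmentation offload is enabled in the network stack
sysctl net.inet.tcp.tso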


Which type of virtual interface are you using? VMXNET3 or E1000? E1000 is recommended.

Also, if you have the physical ports available, the really best and most performant solution is PCIe passthrough of physical ports. That moves the VLANs back to OPNsense, of course, but then it's not virtualised anymore.

You can even passthrough just a single port and build a "router on a stick" with all interfaces (WAN, LAN, DMZ, ...) just VLANs. That means shared bandwidth, but depending on the uplink speed that might be enough.
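
If you try the passthrough route, a rough starting point on the ESXi host side looks like this (a sketch only; the actual enable toggle is typically done per device in the vSphere Client under the host's PCI Devices page):

# on the ESXi host: list PCI devices to identify the NIC you want to pass through
esxcli hardware pci list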
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

OK, so I rebuilt the HA using only E1000E (noted you mentioned E1000 :O).

I do get more stable performance, with LAN1 to the OPNsense VIP and LAN1 to LAN2 at roughly the same throughput, just not at or close to the full 10 Gbits:

- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  2.25 GBytes  1.93 Gbits/sec   25             sender
[  5]   0.00-10.04  sec  2.25 GBytes  1.92 Gbits/sec                  receiver

I guess I could redo the HA with E1000, I just need another few hours :D

So my vSphere cluster has two uplinks per ESXi host; the uplinks are trunked into the distributed switch, and the distributed port groups for each VLAN hang off that.

I'm not too sure what effect it might have if I passed through one of the NICs from the host, as it would be a trunk.


Then these 2 Gbit/s are all the hardware, hypervisor (ESXi) and OPNsense stack combined can achieve ... all of this involves a heck of a lot of in-memory copies, context switches, cache misses, etc.

If you really need 10 Gbit/s OPNsense throughput I'd recommend something like this:

https://shop.opnsense.com/dec3800-series-update-2024/

And if you want to run IDS/IPS at 10 Gbit/s, you definitely need the top appliance:

https://shop.opnsense.com/dec4200-series-opnsense-enterprise-datacenter-rack-security-appliance/

But still they guarantee only 7.5 Gbit/s for IDS/IPS ...
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Thanks Patrick.

Certainly in this discovery phase I'm grateful to have options to present for a business case.

Appreciate all the help.

1 Gbit/s virtualised is absolutely no big deal. I got that much on an Atom 3xxx with virtualised network adapters as well as with passthrough and OPNsense running on the FreeBSD bhyve Hypervisor. Great for us home lab folks.

10 Gbit/s is a bit of a challenge even with dedicated hardware. It should be achievable if you have PCIe passthrough network cards. Do your servers have free PCIe slots? You could invest in a dual port Intel X520 or similar first and try that approach. If your hosts have spare CPU cycles and memory that is going to be a lot cheaper than dedicated appliances.

Easier way of course is dedicated hardware. Depending on your environment (I assume business, given the hints at two rather serious vSphere servers) that might not be a budget problem, or it might - I don't know.

Mark that if you go virtualised plus PCIe passthrough, you cannot snapshot the VMs for a live backup and you cannot use dynamic memory allocation (ballooning). For a VM with a PCIe device memory is always fixed and snapshots disabled.

HTH,
Patrick
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Just starting down the vSphere road... I assume with a pass-through PCIe card you also can't vMotion the VM to another host if the first host needs an update and reboot? XCP-NG would be the same; I couldn't even migrate a VM that used a pass-through video card, and all the hosts had that same video card present.

Now if your ESXi hosts all have an extra ethernet that is passed through (WAN to switch to host ports), could you then vMotion the VM to another host in that group that has the pass-through connection to the WAN? I think this might work with XCP-NG, but I haven't tried it.

Quote from: Greg_E on November 13, 2024, 07:16:15 PM
Just starting down the vSphere road... I assume with a pass-through PCIe card you also can't vMotion the VM to another host if the first host needs an update and reboot?

Correct.

Quote from: Greg_E on November 13, 2024, 07:16:15 PM
Now if your ESXi hosts all have an extra ethernet that is passed through (WAN to switch to host ports), could you then vMotion the VM to another host in that group that has the pass-through connection to the WAN?

Still no, but you could - as I did back then - build a two firewall HA cluster on top of two ESXi hosts. I used Sidewinder at the time but of course you can do the same with OPNsense.

No experience with XCP-NG. Found it rather underwhelming, specifically that the "nice UI" is a paid add on and not part of the base package. Running Proxmox currently in my home lab. And of course still TrueNAS CORE and SCALE, one host each. The SCALE one shares the box with Proxmox, NVMe SSDs and network interfaces PCIe passthrough, so two systems sharing CPU and memory but having everything else dedicated.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Quote from: Patrick M. Hausen on November 13, 2024, 06:24:38 PM
10 Gbit/s is a bit of a challenge even with dedicated hardware. It should be achievable if you have PCIe passthrough network cards. Do your servers have free PCIe slots? You could invest in a dual port Intel X520 or similar first and try that approach. If your hosts have spare CPU cycles and memory that is going to be a lot cheaper than dedicated appliances.

Mark that if you go virtualised plus PCIe passthrough, you cannot snapshot the VMs for a live backup and you cannot use dynamic memory allocation (ballooning). For a VM with a PCIe device memory is always fixed and snapshots disabled.

Hi Patrick,

Yes, we may look at adding dedicated ports for testing, though we heavily use vMotion, so we would need to step through that process to see what the effect may be.

:D