OPNsense Forum

English Forums => Hardware and Performance => Topic started by: xiaotuzi on March 08, 2017, 08:20:03 pm

Title: Max throughput on ESXi
Post by: xiaotuzi on March 08, 2017, 08:20:03 pm
Hi,

I have a Mac Mini (i5 with 16 GB RAM) running ESXi, and one of the VMs is OPNsense, which works as my FW.
Recently I had my WAN speed upgraded to 500/500, but when I run a speed test the maximum I get is 180/180.

With the laptop connected directly to my modem I get full speed.

So now my thought is that my current setup is not powerful enough - does that sound reasonable?

I am willing to buy stand-alone hardware to use for OPNsense - what would you estimate the minimum specs are for a decent computer / router to handle approximately 1 Gbps of throughput?
Title: Re: Max throughput on ESXi
Post by: Arakangel Michael on March 20, 2017, 03:19:29 am
In order to run the Suricata IDS / IPS along with the basic firewall, I would go with a gen 3 i5 or better: at least a dual core at 3 GHz+, and at least 4 GB of RAM.

Make it a quad core if you want to do VPN at near that speed, and add more RAM (8-16 GB) if you want to run the proxy (Squid).

The biggest thing for just the firewall would be a multi-port Intel NIC; a 2x or 4x card should do if you want more than just LAN / WAN separation. You can 'multiply' ports using VLANs, but those are less secure than a dedicated interface, and more complicated to set up or remember.
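
Just to show what the VLAN route looks like under the hood, on plain FreeBSD it boils down to something like this (in OPNsense you would normally do it in the GUI instead; igb0, the tag 10 and the address are only example values):

  # Create a tagged child interface on the physical NIC (igb0 / tag 10 are examples)
  ifconfig vlan10 create vlan 10 vlandev igb0
  ifconfig vlan10 inet 192.168.10.1/24 up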

If you want to keep ESXi, at least get something with multiple network ports. For troubleshooting, make sure that you have all available 'cores' assigned. The terms are confusing, but basically if it is a dual-core i5, ESXi will likely show 4 'Logical Processors', because 'hyper-threading' presents two threads per core. For testing, just assign all of them to the OPNsense VM. You can 'over-commit' them by giving a second VM 4 Logical Processors as well. There will only be 'resource contention' when one of them wants to use the whole chip.
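
To see what the host actually exposes before juggling vCPU counts, the ESXi shell can show the package / core / thread layout, roughly like this (the vCPU change itself can then be made in the vSphere client, or with PowerCLI; the VM name below is just an example):

  # From the ESXi shell (SSH enabled): physical packages, cores, threads and HT status
  esxcli hardware cpu global get

  # Rough PowerCLI equivalent for the vCPU change (VM powered off; "OPNsense" is an example name)
  Set-VM -VM "OPNsense" -NumCpu 4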
Title: Re: Max throughput on ESXi
Post by: bartjsmit on March 20, 2017, 09:18:23 am
Creating VMs that use more vCPUs than there are execution units available will reduce performance, not increase it.

Hyper-threading offers two threads on a single core. That is fine if the workload is split into multiple smaller VMs that take advantage of the better scheduling from hyper-threading. A single VM that takes all the threads will suffer a lot of CPU ready wait and co-stop time. If you want to over-commit your CPU resources (and you should), do so by creating multiple smaller VMs.
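
You can see that on the host with esxtop, roughly:

  # From the ESXi shell: run esxtop, press 'c' for the CPU view,
  # then watch the %RDY and %CSTP columns for the OPNsense VM
  esxtop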

Going back to the example: you will find that an OPNsense VM with two vCPUs will perform better than one with four, unless the workload is capable of using all cores, in which case you are better off running it directly on bare metal.

More details are in the VMware KB https://kb.vmware.com/kb/1017926 with a bit more explanation in blogs like this one: http://wahlnetwork.com/2013/09/30/hyper-threading-gotcha-virtual-machine-vcpu-sizing/

Bart...
Title: Re: Max throughput on ESXi
Post by: Arakangel Michael on May 10, 2017, 08:11:50 pm
Quote from: bartjsmit
Creating VMs that use more vCPUs than there are execution units available will reduce performance, not increase it.

The theoretical knowledge doesn't really help anyone if they can't apply it.

You're being pedantic, and your statement isn't technically accurate.

It depends on your workload:

If it is more latency-sensitive and not consuming the whole CPU, then use Hyperthreading / SMT and commit all logical processors.

If it is more compute-intensive, or completely single-threaded, then it may be better to switch Hyperthreading / SMT off in the BIOS and use the exact number of physical cores available.

Blog posts like that discount the nuance of real life. You cannot possibly recommend an ideal configuration without knowing the workload. NUMA locality doesn't apply on a single-socket machine.

In the OP's case, where he is possibly going to buy a dedicated box for this, a dual / quad core single CPU should suffice just fine for all the bells and whistles.

With multiple packages running, it makes sense to enable Hyperthreading and count the logical processors.

This is a logical argument; real-world testing is always better.

Said more nicely:
*BSD, OPNsense, and Suricata are multithreaded, and to my knowledge the other important packages are too. So as many cores as possible with a 'smarter' scheduler (HT / SMT) makes more sense, since 500 Mbps of traffic isn't likely to chew through all the available compute power.
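
If you want to sanity-check that on your own box, watching per-thread CPU use from the OPNsense shell while a speed test runs should show whether any single thread is pinned; something like:

  # From the OPNsense / FreeBSD shell: show individual threads, one-second refresh
  top -H -s 1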

A bare-metal install is better in my book, as you don't have to deal with an additional layer of vulnerabilities and patching. The performance difference should be within a few percent of ESXi if your hardware has room to breathe (the bottlenecks are not the same). HardenedBSD will arguably be more secure than ESXi.


To the OP: basically get a quad core with a dedicated gigabit network interface for at least the LAN and WAN if you want all of the bells and whistles. Your network card is likely the bottleneck.
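
And if you want to confirm where the cap is before you buy anything, an iperf3 run between hosts on either side of the firewall takes the ISP out of the picture; the address below is just a placeholder:

  # On a host on one side of the firewall
  iperf3 -s

  # On a host on the other side, pointing at the first one (placeholder IP), 4 parallel streams for 30 seconds
  iperf3 -c 192.168.1.10 -P 4 -t 30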