[Solved] OPNSense heavily relying on one core?

Started by Gnomesenpai, July 20, 2021, 02:10:09 AM

July 20, 2021, 02:10:09 AM Last Edit: August 09, 2021, 04:53:20 PM by Gnomesenpai
Hey all,

Been eyeing up my core router recently and noticed that, of the 4 virtual cores assigned, only 1 is actually getting any load pushed onto it. The setup is very basic: just a small OSPF area and some basic firewall rules. Is this behaviour normal when pushing at most 500 Mbps of traffic? The VM is using VMX interfaces on ESXi 7.

(sorry about the image size)


Edit:
Updated title to something that wasn't written by 1am me...

In the default settings, all network traffic is handled by core 0 only; this is done to enforce strong ordering for protocols that require it, while keeping CPU affinity. You can set the following sysctl tunables to enable traffic to be handled by all cores: net.isr.maxthreads="-1" and net.isr.bindthreads="1".

See https://calomel.org/freebsd_network_tuning.html for all kinds of tunables to play with ;)
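
For reference, a minimal sketch of what that would look like as boot-time entries (on OPNsense, add them under System > Settings > Tunables and reboot; on plain FreeBSD they would go in /boot/loader.conf):

# One netisr worker thread per core instead of a single thread on core 0
net.isr.maxthreads="-1"
# Pin each netisr worker thread to its own core
net.isr.bindthreads="1"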
In theory there is no difference between theory and practice. In practice there is.

I am also experiencing a "speedtest.net" issue: my fiber is 1G but speedtest is only maxing out at around 500 Mbps up and down. When running htop, it only shows 1 core maxing out and not using the other cores. I have an Intel Xeon D-1518 processor.

Would this tunable potentially work as well? Thoughts?

Thanks!

I'd just give it a go; retest and see what happens. It's an easy change to roll back if you notice any issues.
In theory there is no difference between theory and practice. In practice there is.

Quote from: dinguz on July 24, 2021, 10:15:50 AM
In the default settings, all network traffic is handled by core 0 only; this is done to enforce strong ordering for protocols that require it, while keeping CPU affinity. You can set the following sysctl tunables to enable traffic to be handled by all cores: net.isr.maxthreads="-1" and net.isr.bindthreads="1".

See https://calomel.org/freebsd_network_tuning.html for all kinds of tunables to play with ;)

This appears to have made very minimal difference in my case; most of the load is still sitting on that one core. Is there any other way to balance it?

WAN type is important. I assume this is PPPoE?


Cheers,
Franco

If it isn't PPPoE, you can check Interfaces -> Overview to see the interrupt assignments for VMX - it should at least indicate the use of multiple hardware queues. If you have access to a shell, you can check 'vmstat -i' to do the same.

If only one queue is used, that would be the main problem.
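
As a rough example of what to look for from the shell (assuming the iflib-based vmx(4) driver; interrupt names will vary):

vmstat -i | grep vmx

With multiple hardware queues active you would expect several per-queue vectors such as vmx0:rxq0, vmx0:rxq1 and so on; a single interrupt line per interface suggests only one queue is in use.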

WAN is RJ45 direct to my colocation provider (through an ESXi distributed switch). Surely I wouldn't be able to use hardware queues since this is virtualised, correct? On checking, only one interrupt is assigned to each interface.

Hardware queues can be configured based on the host system configuration. I am unsure what the defaults on ESXi are, but a quick search reveals that multi-queue is supported: https://docs.paloaltonetworks.com/vm-series/9-0/vm-series-deployment/set-up-a-vm-series-firewall-on-an-esxi-server/performance-tuning-of-the-vm-series-for-esxi/enable-multi-queue-support-for-nics-on-esxi

Performance may be hampered by loads of different factors, so rule out any factors leading up to the VM first.

Also, read up on the documentation for VMX: https://www.freebsd.org/cgi/man.cgi?query=vmx&sektion=4 - this reveals a lot about your current situation.

If only core 0 is queued up in your situation, RSS will be the deciding factor in whether packets end up on another queue at all (this must be configured on the host side). Only then will the netisr settings have a noticeable effect. You can furthermore try experimenting with the netisr dispatch setting: net.isr.dispatch=hybrid (https://github.com/opnsense/src/blob/master/sys/net/netisr.c#L132).
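
If you want to experiment from a shell before persisting anything, a sketch of how that could look (persist the value via System > Settings > Tunables once you are happy with it):

# Show the current netisr policy, work streams and per-CPU queues
netstat -Q
# Switch the dispatch policy at runtime (accepted values are direct, hybrid and deferred)
sysctl net.isr.dispatch=hybrid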

Reporting back to let y'all know that I resolved this by adding:
hw.pci.honor_msi_blacklist="0"

This has let the VM use all 4 cores, the load distribution is very even, and I'm getting significantly increased throughput! Happy days!
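
For anyone finding this later, the combined set of tunables from this thread looks roughly like this (added via System > Settings > Tunables, reboot required). As I understand it, the MSI blacklist entry matters on ESXi because FreeBSD otherwise falls back to a single legacy interrupt for vmx, leaving only one queue:

# Allow MSI/MSI-X on platforms that FreeBSD blacklists by default (e.g. under ESXi)
hw.pci.honor_msi_blacklist="0"
# One netisr worker per core, pinned
net.isr.maxthreads="-1"
net.isr.bindthreads="1"

Afterwards 'vmstat -i' should list one vector per vmx queue rather than a single shared interrupt.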

January 27, 2022, 01:42:02 PM #10 Last Edit: January 27, 2022, 02:12:47 PM by JayST
Looking into this as well. Curious: was the fix you reported in addition to setting net.isr.maxthreads="-1" and net.isr.bindthreads="1"?
Did you also have to change the default value of the net.isr.dispatch tunable?

Quote from: tuto2 on August 06, 2021, 09:16:03 AM
If it isn't PPPoE, you can check Interfaces -> Overview to see the interrupt assignments for VMX - it should at least indicate the use of multiple hardware queues. If you have access to a shell, you can check 'vmstat -i' to do the same.

If only one queue is used, that would be the main problem.

Franco,

I am not running a VM; this is a dedicated box. The processor is an Intel(R) Xeon(R) CPU D-1518 @ 2.20GHz (8 cores). Only 1 core is utilized, limiting speeds to 60% of the 1G fiber connection. I tried the max threads tunable, but no dice.

Any other suggestions?

Well, is it PPPoE? And how much are you trying to push in terms of traffic volume?


Cheers,
Franco

Not PPPoE.

Trying to get full throughput and use more cores.

Well, the next question would be whether you've tried RSS yet... and what NIC drivers/cards you have. It's really an entirely different question from what this thread was originally about, and the thread is marked solved...
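
If you do want to try RSS, a rough sketch of the relevant tunables (System > Settings > Tunables, reboot required; the bits value is an assumption and should be sized to your core count, see the OPNsense performance documentation):

# Enable receive-side scaling in the kernel
net.inet.rss.enabled="1"
# Number of RSS buckets as a power of two; 2 is a common starting point for 4-8 cores
net.inet.rss.bits="2"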


Cheers,
Franco