Q20331G9 high CPU usage at 10 Gb/s

Started by alexandre.dulche, December 03, 2024, 03:03:21 AM

Hello,

I decided to move back to physical hardware, from a VMware VM to a Qotom Q20331G9.
I reloaded/validated my configuration bit by bit on the new box and I'm now in the process of benchmarking performance before putting it into production.

My concern is about CPU usage when trying to push to 10 Gb/s using iperf3.

Source VM / VLAN A: iperf3 -c xxx.xxx.xxx.xxx -P 40 -t 600
Destination VM / VLAN B: iperf3 -s
OPNsense 24.7.9_1

Through my OPNsense VM I can reach 8-9 Gbit/s and the CPU barely moves (load average stays <= 1.0).
It runs with 4 vCPUs of a Ryzen 5 3600, 4 GB of RAM and 2 VMXNET3 vNICs, on an ESXi 8 host with Intel X520 10 Gb/s uplinks.

Running the same test through the Qotom box, I reach the same speed (which is good news); however, the CPU load average easily climbs above 30.0, making the web UI and SSH hardly usable  :-[

Test 1 :
Checksum offloading : disabled (=default)
LRO/TSO : disabled  (=default)
VLAN offloading : disabled (=default)
--> Load >= 30.0 (~95% system) & ~9 Gb/s

Test 2 :
Checksum offloading : enabled
LRO/TSO : disabled  (=default)
VLAN offloading : enabled
--> Load ~ 13.0 (~95% system) & ~9 Gb/s
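
Side note in case someone wants to reproduce this: the offload state can be double-checked from a shell. A minimal check, assuming the first SFP+ port shows up as ix0 (adjust the interface name to your box):

# ix0 assumed to be the first SFP+ port; look for RXCSUM/TXCSUM, VLAN_HWTAGGING, TSO4/TSO6, LRO
ifconfig ix0 | grep options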

I'm not running any IDS/IPS (yet), and there are only about a dozen firewall rules on the test interfaces.
For the test, the flow goes in and out of the same NIC (a trunk port carrying both tagged VLANs).
I tried disabling things like the local NetFlow collector, but saw no improvement.

Any game-changing tunables I should look into?
Or is the C3758R just too slow to sustain 10 Gb/s without choking?

Any Qotom Q20331G9 owners who may share their experience with this unit ?

Thank you.

Test 3 :
Checksum offloading : enabled
LRO/TSO : disabled  (=default)
VLAN offloading : enabled
vm.pmap.pti= 1 -> 0
hw.ibrs_disable= 0 -> 1
--> Load ~ 13.0 (~95% system) & ~9 Gb/s

No change compared to test #2, where enabling hardware offloading seems to have helped quite a bit.
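
For completeness, both of those go under System > Settings > Tunables, and vm.pmap.pti only takes effect after a reboot. A quick read-back sketch from the shell (plain sysctl, nothing box-specific):

# confirm the PTI / IBRS mitigation toggles after rebooting
sysctl vm.pmap.pti hw.ibrs_disable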


Test 4:
Checksum offloading : enabled
LRO/TSO : disabled  (=default)
VLAN offloading : enabled
vm.pmap.pti= 1 -> 0
hw.ibrs_disable= 0 -> 1
net.isr.bindthreads=1
net.isr.dispatch=hybrid
net.isr.maxthreads=-1
net.inet.rss.enabled=1
net.inet.rss.bits=3


--> Load <= 10.0 (~20% system ~50% interrupt ~25% idle) & ~9 Gb/s

Throughput is stable and the system definitely feels more usable  :)
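
If anyone wants to verify the netisr/RSS side after a reboot (most of these are boot-time tunables), a quick check with standard FreeBSD tools, assuming the OPNsense kernel exposes the RSS sysctls:

# show the netisr configuration and how work is spread over the per-CPU threads
netstat -Q

# read back the tunables that were set
sysctl net.inet.rss.enabled net.inet.rss.bits net.isr.bindthreads net.isr.maxthreads net.isr.dispatch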

Quote from: alexandre.dulche on December 03, 2024, 03:03:21 AM
Any Qotom Q20331G9 owners who may share their experience with this unit ?

I've seen three units of the rack (1U) version, all failing to recognize a module in the top-left SFP+ cage (ix3). I tried all the usual suspects, genuine Intel, Juniper, Cisco and FS (Intel-coded) modules and DACs: all no-go in that single slot, while the other slots do work.

The platform is well known and has been used for many years by Supermicro in their SYS-E300-9A systems (A2SDi motherboard). Although Qotom uses a slightly newer/faster Atom R variant, the platform itself is almost 6 years old. The idea is/was nice: a cheap (1/3 of the Supermicro price in 2019) multi 2.5/10 Gb platform with decent power consumption.

Because of the lack of documentation, a rather shady support page with unspecified firmware versions, and inconsistent SFP behaviour, it's not a platform I would recommend.

What's your experience with the SFP+ cages ?!?!

Quote from: netnut on December 05, 2024, 05:37:30 PM
What's your experience with the SFP+ cages ?!?!

I have all 4 SFP+ ports up and running without issues so far.

I ordered 4 DACs from 10Gtek on Amazon:
  • 2 Intel-coded (because I feared I might run into issues with my usual DACs)
  • 2 Cisco-coded (the same I usually use for all my Ubiquiti stuff).

All 4 SFP+ ports are connected via those DACs to 2 UniFi US-16-XG switches.
No recognition or bandwidth issues.
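
If it helps with your ix3 cage issue: on recent FreeBSD the ixgbe driver can read the module EEPROM, so recognition can be checked per cage. A quick sketch, assuming the four cages enumerate as ix0-ix3 as you mentioned:

# dump transceiver details for a given cage; a recognized module/DAC should show a "plugged:" section
ifconfig -v ix3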

I put the unit into production a week ago with the following settings.

Checksum offloading : enabled (unchecked)
TSO : disabled (checked)
LRO : disabled (checked)
VLAN hardware filtering : enabled
Hardware acceleration : QAT
PowerD : enabled (Hiadaptive)
hw.ibrs_disable : 1
hw.intr_storm_threshold : 10000
hw.ix.enable_aim : 1
hw.ix.flow_control : 3
hw.ix.unsupported_sfp : 1 (but my 4 DACs worked without that as well)
net.inet.ip.intr_queue_maxlen : 4096
kern.ipc.nmbclusters : 2000000
kern.ipc.nmbjumbop : 524288
machdep.hyperthreading_intr_allowed : 1
net.inet.rss.bits : 3
net.inet.rss.enabled : 1
net.isr.bindthreads : 1
net.isr.dispatch : hybrid
net.isr.maxthreads : -1
vm.pmap.pti : 0

It's been running well so far.
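
Most of those entries are boot-time tunables set under System > Settings > Tunables, so they need a reboot to fully apply. A small spot-check sketch (plain sysctl read-back) to confirm a few of them were picked up:

# verify some of the driver/buffer tunables after reboot
sysctl hw.ix.enable_aim hw.ix.flow_control hw.ix.unsupported_sfp kern.ipc.nmbclusters net.inet.rss.bits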

Quote from: alexandre.dulche on December 16, 2024, 08:44:55 PM
It's been running well so far.

Nice! I see you're using DAC cables (which I couldn't). Did you try any individual SFP+ modules (no DAC) ?!?!