Hi,
As per the subject, I've just noticed that I have no ether dispatched traffic going via CPU1. I'm not sure what's wrong, if indeed anything is wrong; I'd just assumed all 4 of my cores would be somewhat balanced. Key config and netstat results are below. Any ideas?
I should mention I've tried changing the dispatch policy to direct, hybrid and deferred, and it makes no difference.
[Key Config/Info]
OPNsense 24.1.a_38-amd64
FreeBSD 13.2-RELEASE-p2
OpenSSL 1.1.1v 1 Aug 2023
WAN is PPPoE
machdep.hyperthreading_allowed=0
net.isr.dispatch=hybrid
net.inet.rss.bits=2
net.inet.rss.enabled=1
dev.igb.0.iflib.override_nrxds=1024 (1024 rather than 2048, as it's supposed to improve low latency for gaming, which matters more than throughput in my house)
dev.igb.0.iflib.override_nrxqs=2 (amended down from 4 to 2, as I think 2 queues is the max for my NIC)
dev.igb.0.iflib.override_ntxds=1024 (as above, for low latency)
dev.igb.0.iflib.override_ntxqs=2 (as above, 2 queues max)
dev.igb.0.iflib.override_qs_enable=1
dev.igb.0.iflib.rx_budget=65535
dev.igb.1.iflib.override_nrxds=1024
dev.igb.1.iflib.override_nrxqs=2 (as above, 2 queues max for this NIC)
dev.igb.1.iflib.override_ntxds=1024
dev.igb.1.iflib.override_ntxqs=2 (as above)
dev.igb.1.iflib.override_qs_enable=1
dev.igb.1.iflib.rx_budget=65535
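For what it's worth, a quick way to sanity-check what the kernel actually derived from those RSS tunables (and whether the iflib overrides took effect) is something like the below; the exact OIDs are assumed from stock FreeBSD RSS, so adjust if yours differ:

sysctl net.inet.rss.enabled net.inet.rss.bits net.inet.rss.bucket_mapping
sysctl dev.igb.0.iflib | grep override

With net.inet.rss.bits=2 you'd expect 2^2 = 4 buckets spread over the 4 cores in bucket_mapping.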
[netstat -Q Output]
Configuration:
Setting Current Limit
Thread count 4 4
Default queue limit 256 10240
Dispatch policy hybrid n/a
Threads bound to CPUs enabled n/a
Protocols:
Name Proto QLimit Policy Dispatch Flags
ip 1 1000 cpu hybrid C--
igmp 2 256 source default ---
rtsock 3 256 source default ---
arp 4 256 source default ---
ether 5 256 cpu direct C--
ip6 6 1000 cpu hybrid C--
ip_direct 9 256 cpu hybrid C--
ip6_direct 10 256 cpu hybrid C--
Workstreams:
WSID CPU Name Len WMark Disp'd HDisp'd QDrops Queued Handled
0 0 ip 0 574 0 42253356 0 32874971 75128327
0 0 igmp 0 0 0 0 0 0 0
0 0 rtsock 0 5 0 0 0 9249 9249
0 0 arp 0 0 0 0 0 0 0
0 0 ether 0 0 413619592 0 0 0 413619592
0 0 ip6 0 254 0 5024974 0 1157454 6182428
0 0 ip_direct 0 0 0 0 0 0 0
0 0 ip6_direct 0 0 0 0 0 0 0
1 1 ip 0 197 0 1791065 0 112231187 114022252
1 1 igmp 0 0 0 0 0 0 0
1 1 rtsock 0 0 0 0 0 0 0
1 1 arp 0 0 0 0 0 0 0
1 1 ether 0 0 0 0 0 0 0
1 1 ip6 0 96 0 2756 0 5212677 5215433
1 1 ip_direct 0 0 0 0 0 0 0
1 1 ip6_direct 0 0 0 0 0 0 0
2 2 ip 0 182 0 22572352 0 64929554 87501906
2 2 igmp 0 0 0 0 0 0 0
2 2 rtsock 0 0 0 0 0 0 0
2 2 arp 0 3 0 934187 0 3032 937219
2 2 ether 0 0 42907359 0 0 0 42907359
2 2 ip6 0 112 0 2200054 0 9231036 11431090
2 2 ip_direct 0 0 0 0 0 0 0
2 2 ip6_direct 0 0 0 0 0 0 0
3 3 ip 0 85 0 17177460 0 243726123 260903583
3 3 igmp 0 0 0 0 0 0 0
3 3 rtsock 0 0 0 0 0 0 0
3 3 arp 0 0 0 0 0 0 0
3 3 ether 0 0 62508272 0 0 0 62508272
3 3 ip6 0 115 0 1235672 0 4609712 5845384
3 3 ip_direct 0 0 0 0 0 0 0
3 3 ip6_direct 0 0 0 0 0 0 0
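To make the imbalance easier to spot, this one-liner (a rough sketch, assuming the column layout shown above) sums the ether Disp'd counter per CPU:

netstat -Q | awk '$3 == "ether" { sum[$2] += $6 } END { for (c in sum) print "cpu" c ": " sum[c] }'

On the numbers above that gives cpu0: 413619592, cpu1: 0, cpu2: 42907359, cpu3: 62508272, i.e. CPU1 gets nothing, which is exactly what I'm seeing.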
[Key Sysctl Information]
dev.igb.1.iflib.rxq1.cpu: 3
dev.igb.1.iflib.rxq0.cpu: 2
dev.igb.1.iflib.txq1.cpu: 3
dev.igb.1.iflib.txq0.cpu: 2
dev.igb.0.iflib.rxq1.cpu: 1
dev.igb.0.iflib.rxq0.cpu: 0
dev.igb.0.iflib.txq1.cpu: 1
dev.igb.0.iflib.txq0.cpu: 0
[Key dmesg Output]
igb0: Using 2048 TX descriptors and 2048 RX descriptors
igb0: Using 2 RX queues 2 TX queues
igb0: Using MSI-X interrupts with 3 vectors
igb0: netmap queues/slots: TX 2/2048, RX 2/2048
igb1: Using 2048 TX descriptors and 2048 RX descriptors
igb1: Using 2 RX queues 2 TX queues
igb1: Using MSI-X interrupts with 3 vectors
igb1: netmap queues/slots: TX 2/2048, RX 2/2048
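Interestingly, dmesg reports 2048 descriptors even though the overrides above request 1024, so it might be worth confirming the loader tunables actually applied, e.g.:

sysctl dev.igb.0.iflib.override_nrxds dev.igb.0.iflib.override_ntxds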
[vmstat -i Output]
interrupt total rate
irq6: uart4 1 0
irq9: acpi0 102 0
irq42: sdhci_pci0 13 0
cpu0:timer 386864295 995
cpu1:timer 20699302 53
cpu2:timer 27962835 72
cpu3:timer 33559782 86
irq128: ahci0 4333382 11
irq129: igb0:rxq0 181995425 468
irq130: igb0:rxq1 64331389 165
irq131: igb0:aq 2 0
irq132: igb1:rxq0 65561270 169
irq133: igb1:rxq1 143084477 368
irq134: igb1:aq 2 0
irq135: xhci0 50 0
Total 928392327 2388
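To double-check that the rxq interrupts really are bound to the CPUs the iflib sysctls report, I believe cpuset can print the mask per IRQ (IRQ numbers taken from the vmstat -i output above):

cpuset -g -x 129   # igb0:rxq0, expect CPU 0
cpuset -g -x 130   # igb0:rxq1, expect CPU 1
cpuset -g -x 132   # igb1:rxq0, expect CPU 2
cpuset -g -x 133   # igb1:rxq1, expect CPU 3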
It's all documented in the first post of this thread, actually:
Quote:
It is also possible that a driver does not expose this ability to the user, in which case you'd want to look up whether the NIC/driver supports RSS at all – using online datasheets or a simple Google search. For example, igb enables RSS by default, but does not reflect this in any configuration parameter. However, since it uses multiple queues:
dmesg | grep vectors
igb0: Using MSI-X interrupts with 5 vectors
igb1: Using MSI-X interrupts with 5 vectors
igb2: Using MSI-X interrupts with 5 vectors
igb3: Using MSI-X interrupts with 5 vectors
It will most likely have some form of packet filtering to distribute packets over the hardware queues. In fact, igb does RSS by default.
https://forum.opnsense.org/index.php?topic=24409.0
You have
Quote:
Dispatch policy    hybrid
so it looks like you're all set.
This appears to be true for both igb and igc drivers.
I do think this ether dispatch behaviour is related to my enabling RSS via that tunable; I still need to validate that though.
Previously I used a dispatch policy of deferred, as that's a general recommendation for PPPoE connections (although others say to always use direct on a router), and I'm pretty sure that did distribute the traffic evenly over all 4 cores.
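If anyone wants to compare policies themselves, net.isr.dispatch can be flipped at runtime as far as I know (no reboot needed), e.g.:

sysctl net.isr.dispatch=deferred
netstat -Q | head -n 8   # the Configuration block at the top shows the active policy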
I'm not really sure whether having RSS enabled will benefit me or not, to be honest.
I do a lot of GeForce Now cloud gaming, and my son is an avid PS5 gamer; with the above configuration everything seems to run pretty well. However, I like to tinker and squeeze out the best performance possible, and without a proper lab to test this at a more granular level it's hard to say what the best config is.
Also, the iperf implementation in OPNsense is broken, and has been for a while now (I've raised a separate post on that one), so it's not easy for me to test the throughput side of things.
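As a possible stopgap (untested on my box), iperf3 can be run straight from the shell if the package is available on your install:

pkg install iperf3                   # if not already present
iperf3 -s                            # server side, on the firewall
iperf3 -c <firewall-ip> -t 30 -P 4   # from a LAN client; <firewall-ip> is a placeholder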