Latency spikes of seconds (<3s) during speed test on an Atom E3940

Started by thelittleblackbird, June 26, 2026, 10:52:00 PM

Hi all,

I think I have debugged this to the limit of my knowledge, but I have reached a point where I need a bit of support and guidance.

During a regular internet speed test (on a 350 Mbps connection) I noticed that the "swi1: netisr" threads take 100% of the CPU. This only happens when the firewall (FW) is enabled; with the FW disabled, CPU usage for the same test does not go beyond 18%, which is what I would expect.

I don't have any IDS/IPS active, my services are limited to Dnsmasq, Unbound and Tailscale, and the NICs are Intel, so I am out of ideas.

Could somebody be so kind as to point me to where I should look to find the origin of the problem?

Thanks in advance

An interrupt handler issue? I haven't seen one myself. Do you have any unusual sysctls (tunables) configured?

Can you paste a "top" capture?
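
A text capture that includes kernel threads and per-CPU usage tends to be the most useful for this kind of problem. A minimal sketch, assuming the FreeBSD base-system tools that ship with OPNsense and run while the speed test is in progress; the function name and the output path are hypothetical:

```shell
#!/bin/sh
# Sketch only: collect a short load snapshot to attach to a forum post.
# capture_load and the default output path are hypothetical names.
capture_load() {
    out="${1:-/tmp/load-capture.txt}"
    : > "$out"                      # start with an empty file
    if [ "$(uname -s)" != "FreeBSD" ]; then
        echo "not a FreeBSD system, nothing captured" >> "$out"
        return 0
    fi
    {
        echo "== top (system threads, per-CPU) =="
        top -b -SHP -d 2 -s 1 25    # batch mode, 2 displays 1 s apart, first 25 threads
        echo "== interrupt counters =="
        vmstat -i
        echo "== netisr configuration and queues =="
        sysctl net.isr
        netstat -Q
    } >> "$out" 2>&1
}

capture_load
```

The resulting file can be attached as-is. The `sysctl net.isr` part also shows whether any netisr tunables differ from what you expect.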

Here you have it, in the attached file.

I tried to implement the tunables described in the OPNsense performance documentation:
https://docs.opnsense.org/troubleshooting/performance.html

If you need anything else, just ask.

Thanks


Huh. Stalling on interrupts (95%). Are you using Realtek Ethernets with the factory (not the plugin) driver? If so, try the plugin. If not, what Ethernet interfaces do you have?

Nope, Intel NICs:
root@OPNsense:~ # pciconf -lv | grep -A4 ethernet
    subclass   = ethernet
igb1@pci0:2:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x1539 subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'I211 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
igb2@pci0:3:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x1539 subvendor=0x8086 subdevice=0x0000
    vendor     = 'Intel Corporation'
    device     = 'I211 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
igb3@pci0:4:0:0: class=0x020000 rev=0x03 hdr=0x00 vendor=0x8086 device=0x1539 subvendor=0x10f3 subdevice=0x0101
    vendor     = 'Intel Corporation'
    device     = 'I211 Gigabit Network Connection'
    class      = network
    subclass   = ethernet

I don't think it has anything to do with the NICs. Remember that when I disable the FW, the system load under the same test is < 18%.

I am pretty sure it is related to the processing of the rules / NAT, but I am surprised by the numbers I get and I cannot imagine what is going on.

As extra information, I ran a speed test from LAN to DMZ via iperf3:

iperf3 -c clouddocker.dmz.home.internal -p 4000 -M 400 -P 8 -l 9000

I tried to simulate a high packet rate with a small payload to see if this reproduces the issue. While interrupt and system load were significantly higher, both reached only around 70% of the processor (not great, but it could be acceptable).
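
To put the "small payload" in numbers, here is a rough packets-per-second estimate for a given bit rate and segment size. It is only a sketch: it assumes 40 bytes of IPv4 + TCP headers per segment (no TCP options, no Ethernet framing), and it uses the 350 Mbps WAN figure purely as an illustration, not as the measured throughput of the LAN - DMZ run. The `pps` helper is a hypothetical name.

```shell
#!/bin/sh
# Rough packets-per-second estimate for a bit rate and a TCP segment size.
# Assumption: 40 bytes of IPv4 + TCP headers per segment, nothing else counted.
HDR_BYTES=40

# pps RATE_BPS MSS_BYTES -> prints approximate packets per second (integer division)
pps() {
    rate_bps=$1
    mss=$2
    echo $(( rate_bps / (8 * (mss + HDR_BYTES)) ))
}

pps 350000000 1460   # full-size segments on a 1500-byte MTU
pps 350000000 400    # the -M 400 case from the iperf3 command
```

This prints 29166 and 99431, so at the same bit rate the -M 400 test pushes roughly 3.4 times as many packets through the firewall as full-size segments would.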

Could this test suggest that it has something to do with NAT or perhaps with a WAN rule? (Can we rule out a general pf performance issue?)
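
One way to start separating a general pf problem from a NAT or rule-specific one might be to look at pf's own counters and the size of the loaded ruleset right after a test run. A minimal sketch, assuming root shell access on the firewall; the function name is hypothetical:

```shell
#!/bin/sh
# Sketch only: summarize pf after a speed test. pf_summary is a hypothetical name.
pf_summary() {
    if ! command -v pfctl >/dev/null 2>&1; then
        echo "pfctl not found"
        return 0
    fi
    pfctl -si                                   # pf status, state table and counters
    echo "filter rules: $(pfctl -sr | wc -l)"   # number of loaded filter rules
    echo "nat rules:    $(pfctl -sn | wc -l)"   # number of loaded NAT/rdr rules
}

pf_summary
```

Comparing the output of a WAN speed test with that of the LAN - DMZ iperf3 run could show whether the state table or the rule counts behave differently between the two paths.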