Weird CPU usage

Started by heyheyheyhey, November 04, 2021, 01:50:54 PM

I have two machines, both updated to 21.7.4, and both show the same symptom: an ifconfig command takes up 1-3 CPU cores. The CPU cores show as pegged in all monitoring. Rebooting makes no difference, and trying to kill the PID returns "not found". I backed up and reinstalled the whole OS, and the issue immediately returned.

It doesn't persist forever; the process dies at some point, but as soon as I open the UI again it comes back.

Does anyone have an idea how to troubleshoot this further, or has anyone seen something similar?
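
Since the PID is already gone by the time kill runs, one way to dig further is to list the ifconfig processes together with their parent PID, so the process that keeps spawning them can be identified. A minimal sketch, assuming FreeBSD's ps column keywords (pid, ppid, pcpu, command); filter_ifconfig is a hypothetical helper name, not part of OPNsense:

```shell
# Hypothetical helper: keep only ps rows whose command is ifconfig.
# Expects input in the column order: pid ppid %cpu command [args...]
filter_ifconfig() {
  awk '$4 ~ /(^|\/)ifconfig$/ { print }'
}

# Usage on the firewall (not executed here):
#   ps -axo pid,ppid,pcpu,command | filter_ifconfig
# The second column (PPID) can then be inspected with: ps -p <ppid>
```

The header row and unrelated processes are dropped because their fourth column is not an ifconfig binary.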

Hi

After I upgraded to 21.7.4, the whole machine got super slow.
Everything in the GUI took forever.

I found the solution in the German part of the forum:
just roll back the opnsense package to 21.7.3 by running the following command.

# opnsense-revert -r 21.7.3 opnsense

After that my opnsense box is back to normal.




Thank you. When I did the rollback, only two packages changed:
py38-dnspython2: 2.1.0
and pkg-1.16.3 was reinstalled.

Interesting, the rollback immediately resolved the weirdness I was seeing, without a reboot.

Hi,

Comparing the diff between 21.7.3 and 21.7.4, this is the only thing that stands out: https://github.com/opnsense/core/commit/913afdbd196a1ba68d0f5b9e88491b97133157b9

It could be an issue with the -v switch on ifconfig in some scenarios. Can you update to 21.7.4 again and revert this feature using the following command:


opnsense-patch 5acaca4 913afdb


A bit of context about the setup would also be helpful, such as the number and type of configured interfaces.

Thanks in advance,

Best regards,

Ad

Sure, I will try that.

System is a Supermicro X10SLH-LN6TF with an additional X520-DA2 card.  I have 4 physical interfaces configured, and two routed IPsec tunnels configured.

Thanks. Are any of the interfaces combined in a lagg, or is some type of fiber optics used?

No laggs. The X520-DA2 is a dual SFP+ card; one of the ports is populated with a DAC as the LAN interface.

As soon as I upgraded to 21.7.4 the issue appeared. As soon as I applied the patch it was resolved.

The X520-DA2 uses the "ix" driver, right? This sounds like a driver issue when asking for verbose stats. Can you check whether the technical names are ix0, ix1, ...? (They should also be visible as "Device" in the interface settings.)

You are correct, it is using the ix driver. This motherboard has 6 10GbE interfaces using the ix driver, and I have two more on the Intel card.

I am using hardware offloading for CSC, TSO, LRO, and VLAN filtering across the ix interfaces.

Yes, they are ix0, ix1, ix2, ...

I don't expect the offloading features to be related here, but can you try turning them off?

Your inference is correct, turning off the offloading features made no difference.

I think you have it narrowed down: something changed in the commit that you shared is causing the issue.

Not really, actually: the commit exposes a driver issue (try executing "ifconfig -m -v" and see what happens).
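
To check whether the verbose query itself is what stalls, the command can be timed with and without -v on a single interface. A rough sketch; time_cmd is a hypothetical helper, and ix0 is only the example device name from this thread:

```shell
# Hypothetical helper: run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
time_cmd() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo "$((end - start))"
}

# Usage on the firewall (not executed here):
#   time_cmd ifconfig -m -v ix0    # verbose query, suspected to be slow
#   time_cmd ifconfig -m ix0       # same query without -v, for comparison
```

A large gap between the two timings would point at the verbose path in the driver rather than at the GUI.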

Skimming the commits in the ix code doesn't expose an immediate fix, but I think I have a similar card available somewhere to test.

How do I undo

opnsense-patch 5acaca4 913afdb

so that I can test?

I usually reinstall the core package so I know I'm in the upstream state, using:


pkg install -f opnsense


But that doesn't influence the results of the ifconfig command, since ifconfig is part of the base system.