Suricata causing 100% CPU load after upgrade to 23.1

Started by endurance, February 05, 2023, 10:59:15 AM

February 05, 2023, 10:59:15 AM Last Edit: February 06, 2023, 01:13:05 PM by endurance
Hello,

I just updated from the latest 22.7 to 23.1 and noticed the WAN gateways showing offline. Then I spotted 100% CPU load caused by unbound and suricata.
Restarting unbound did not help, but disabling suricata solves the issue immediately.
If suricata is then enabled again, it seems to work at first, but after maybe 10-30 s the CPU goes up again. Currently 100% reproducible.

OPNsense 23.1_6 amd
os-etpro-telemetry (installed)   1.6_1

Any tips on how to troubleshoot this?
Note: this is on my backup OPNsense, which currently carries no traffic. On the live system, 5-10% CPU for suricata was normal, but not 100%, and not with unbound going crazy as well.

suricata enabled:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
15989 unbound       4  52    0    91M    53M kqread   2   1:01 198.03% unbound
31733 root          9  20    0  3093M   341M nanslp   2   0:40  86.33% suricata


normal load, suricata disabled:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
80557 root          1  23    0    60M    33M select   1   0:00   0.74% php-cgi
47553 root          1  20    0    57M    37M accept   3   0:01   0.32% php-cgi
  273 root          2  52    0   109M    62M accept   3   0:17   0.20% python3.9


Tried without any IDS rules enabled - still 100%, but less on unbound:
  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
70580 unbound       2  21    0    91M    54M uwait    1   1:43  99.97% unbound
79961 root          9  20    0  2821M   100M nanslp   3   1:39  95.72% suricata
32905 root          1  20    0    14M  4220K CPU0     0   0:01   0.02% top
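
For comparing snapshots like the ones above, here is a minimal Python sketch that picks out the processes above a CPU threshold from pasted top output. It assumes the column layout shown here (PID first, WCPU second to last, a single-word COMMAND last); the function name busy_processes and the 50% default are only illustrative, not part of any OPNsense tooling.

```python
def busy_processes(top_output, threshold=50.0):
    """Return (command, wcpu) pairs whose WCPU is at or above threshold.

    Assumes the layout shown in this thread: PID in the first column,
    WCPU in the second-to-last column, COMMAND (one word) in the last.
    """
    result = []
    for line in top_output.splitlines():
        fields = line.split()
        # Skip the header row, blank lines, and anything not starting with a PID
        if len(fields) < 2 or not fields[0].isdigit():
            continue
        wcpu = fields[-2]
        if not wcpu.endswith("%"):
            continue
        try:
            value = float(wcpu.rstrip("%"))
        except ValueError:
            continue
        if value >= threshold:
            result.append((fields[-1], value))
    return result
```

Fed the last snapshot above, this would report unbound and suricata and skip top itself.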



February 05, 2023, 03:14:21 PM #2 Last Edit: February 05, 2023, 03:26:56 PM by endurance
Nothing really indicates an error.

suricata log from start until stop (after running at 100% CPU for a while):
<173>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="1"] [100285] <Notice> -- This is Suricata version 6.0.9 RELEASE running in SYSTEM mode
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="2"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol sip enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="3"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol rfb enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="4"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol mqtt enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="5"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol rdp enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="6"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol http2 enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 51681 - [meta sequenceId="7"] [100285] <Warning> -- [ERRCODE: SC_ERR_CONF_YAML_ERROR(242)] - App-Layer protocol http2 enable status not set, so enabling by default. This behavior will change in Suricata 7, so please update your config. See ticket #4744 for more details.
<172>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="8"] [100317] <Warning> -- [ERRCODE: SC_ERR_NO_RULES_LOADED(43)] - 1 rule files specified, but no rules were loaded!
<173>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="9"] [100351] <Notice> -- opened netmap:igb1/R from igb1: 0x8066f4000
<173>1 2023-02-05T11:34:05+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="10"] [100351] <Notice> -- opened netmap:igb1^ from igb1^: 0x8066f4300
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="11"] [100368] <Notice> -- opened netmap:igb1^ from igb1^: 0x8310f4000
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="1"] [100368] <Notice> -- opened netmap:igb1/T from igb1: 0x8310f4300
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="2"] [100374] <Notice> -- opened netmap:igb3/R from igb3: 0x85c0f4000
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="3"] [100374] <Notice> -- opened netmap:igb3^ from igb3^: 0x85c0f4300
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="4"] [100383] <Notice> -- opened netmap:igb3^ from igb3^: 0x886af4000
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="5"] [100383] <Notice> -- opened netmap:igb3/T from igb3: 0x886af4300
<173>1 2023-02-05T11:34:06+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="6"] [100317] <Notice> -- all 4 packet processing threads, 4 management threads initialized, engine started.
<173>1 2023-02-05T11:35:13+01:00 OPNsense-Two.dmz.ok-edv.de suricata 70881 - [meta sequenceId="1"] [100317] <Notice> -- Signal Received.  Stopping engine.


I do not see any issues in unbound either. If there are any logs that would help analyse the high CPU load, let me know. Somehow it seems like suricata triggers the issue. I would assume an extremely high number of DNS requests if unbound is going crazy as well, but I am no longer able to see the stats once it is under high load.
Attached: the normal situation (IDS off) and the unbound stats via the GUI under load (empty).

If any other logs would be helpful, let me know.


Sure, I should read the question more carefully :)

999.000231 [4026] netmap_transmit           igb1 drop mbuf that needs checksum offload

Disabled hardware offload - now everything is back to normal.
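
In case it helps anyone checking for the same symptom, here is a small Python sketch that counts these netmap drop messages per interface in kernel log text. The regular expression is derived only from the single line quoted above, so other kernel versions may word the message differently; the name offload_drops is just illustrative.

```python
import re
import sys
from collections import Counter

# Matches kernel lines like the one quoted above:
#   999.000231 [4026] netmap_transmit   igb1 drop mbuf that needs checksum offload
DROP_RE = re.compile(
    r"netmap_transmit\s+(\S+)\s+drop mbuf that needs checksum offload"
)


def offload_drops(log_text):
    """Count checksum-offload drop messages per interface name."""
    return Counter(match.group(1) for match in DROP_RE.finditer(log_text))


if __name__ == "__main__":
    # Reads kernel messages from standard input
    for iface, count in offload_drops(sys.stdin.read()).items():
        print(f"{iface}: {count} drop message(s), check hardware offload settings")
```

It could be run as something like "dmesg | python3 check_offload.py" (the script file name is hypothetical). Any interface it lists is one netmap is dropping packets on because checksum offload is still active.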

Try turning off Unbound statistics/logging if you have them enabled:
https://forum.opnsense.org/index.php?topic=32331.msg156261#msg156261

I've disabled it and my CPU appears to have returned to normal after a reboot.

February 06, 2023, 01:09:05 PM #6 Last Edit: February 06, 2023, 01:11:45 PM by endurance
Thanks for the link - that might help in some cases. In mine it was definitely caused by network card hardware offload. Either the setting that disables it was lost during the upgrade, or the new version now requires offload to be disabled.

Meanwhile I also upgraded my main OPNsense firewall (hardware offload was already disabled there), and both are working without any issues, including with unbound stats and report graphs turned on.
I like the new graphs and will now also use unbound to visualize ad blocks etc. (I used a dockerized AdGuard before).