21.7.3. - high CPU and MEM usage

Started by chemlud, September 22, 2021, 03:59:49 PM

After a reboot, CPU and memory seem back to normal. My last fresh install was in January 2018  :D

Same here. A reboot was needed. syslog-ng had also crashed and was fine again after the reboot.

Quote from: chemlud on September 23, 2021, 01:38:25 PM
... I do a fresh install when the underlying OS gets a major update, as expected for 22.1, but not every 6 months. That would be overkill imho...

With 21.7 I did it mostly because of the change to ZFS.

September 23, 2021, 07:43:08 PM #18 Last Edit: September 23, 2021, 07:53:36 PM by Fright
imho configd_ctl.py ("configd_ctl.py -e -t 0.5 system event config_changed") goes mad reading stdin when the pipe breaks (after a syslog-ng crash?).
Perhaps adding some exception handling would be useful.
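
To illustrate the kind of handling meant here, a minimal sketch follows. It is not the actual configd_ctl.py code: the function name `drain_until_eof` is hypothetical, and the 0.5 s timeout is only borrowed from the `-t 0.5` argument quoted above. On the reading side, a broken pipe shows up as an empty read (EOF) rather than an exception, and select() keeps reporting the descriptor as readable, so a loop that never checks for the empty read can spin at full CPU.

```python
import os
import select


def drain_until_eof(fd, timeout=0.5):
    """Collect lines from file descriptor fd until the writer closes the pipe.

    Hypothetical helper for illustration only, not taken from OPNsense.
    """
    buffer = b""
    while True:
        # Wait up to `timeout` seconds for data (or EOF) on the descriptor.
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            continue  # timeout expired, nothing to read yet
        chunk = os.read(fd, 4096)
        if chunk == b"":
            # EOF: the writing end is gone. Without this check the loop
            # would be woken by select() again and again and burn CPU.
            break
        buffer += chunk
    return buffer.decode(errors="replace").splitlines()
```

The important part is the `chunk == b""` branch: once the writer (syslog-ng in this theory) has died, leaving the loop or reopening the pipe is the only sane reaction.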

No high memory or CPU issues on my Protectli vault.

As I wrote in another thread, my problem with high load is the following; a reboot did not fix it.

61047 root         85    0    28M    19M CPU0     0   0:06  98.33% /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.8)

After deactivating
Reporting -> Netflow -> Capture local
it runs normally without high load. Maybe some incompatibility with the new Python 3.8?

Quote from: Olli on September 25, 2021, 03:25:04 PM
As I wrote in another thread, my problem with high load is the following; a reboot did not fix it.

After deactivating
Reporting -> Netflow -> Capture local
it runs normally without high load. Maybe some incompatibility with the new Python 3.8?

I have had this happen in the past after Netflow database corruption. If you reset the data files (Reporting -> Settings) and re-enable Netflow, does the high load come back?
In theory there is no difference between theory and practice. In practice there is.

Quote from: dinguz on September 25, 2021, 04:43:17 PM

I have had this happen in the past after Netflow database corruption. If you reset the data files (Reporting -> Settings) and re-enable Netflow, does the high load come back?

Thanks. That seems to work.
Maybe the update crashed the database  :)

Same here. I upgraded to OPNsense 21.7.3_1 yesterday, and CPU and memory are close to full (100%).

The syslog-ng daemon was not started.

Checking top shows that syslog-ng and python3.8 are eating all the resources on the system.

PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
----------------------------------------------------------------------------------------------
74757 root          6  29    0    33M    10M kqread   0   9:57 109.57% syslog-ng
15412 root          1 101    0  1435M  1221M CPU0     0 429:56  97.99% python3.8
90861 root          1 101    0    11M  1916K CPU1     1  15:39  97.89% syslogd
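
A rough way to pick the offenders out of output like this automatically is sketched below. It assumes the 12-column layout shown above (PID, USERNAME, THR, ... WCPU, COMMAND); the names `parse_top_line` and `cpu_hogs` and the 90% threshold are made up for illustration and are not part of OPNsense or top.

```python
def parse_top_line(line):
    """Parse one process row of FreeBSD-style top output (12-column layout).

    Returns None for header, separator, or otherwise unparseable lines.
    """
    fields = line.split()
    if len(fields) < 12 or not fields[0].isdigit():
        return None
    try:
        wcpu = float(fields[10].rstrip("%"))
    except ValueError:
        return None
    return {
        "pid": int(fields[0]),
        "user": fields[1],
        "res": fields[6],
        "wcpu": wcpu,
        "command": " ".join(fields[11:]),
    }


def cpu_hogs(lines, threshold=90.0):
    """Return parsed rows whose WCPU is at or above threshold, busiest first."""
    rows = (parse_top_line(line) for line in lines)
    hogs = [row for row in rows if row is not None and row["wcpu"] >= threshold]
    return sorted(hogs, key=lambda row: row["wcpu"], reverse=True)
```

Fed the lines above, this would flag all three processes at the default threshold. Note that top output with a different column set (for example without the THR column) would need the field indexes adjusted.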

I have not rebooted yet.

It seems this firmware release was not properly tested before it was published.

September 28, 2021, 07:42:47 PM #24 Last Edit: September 28, 2021, 08:12:00 PM by dcol
Same issue here. The CPU seems to get overworked and then the internet slows to a crawl. A reboot fixes it, but that should not be necessary. I know when this happens because I use Monit and it alerts me when the CPU is overloaded. I have a simple setup only using Suricata with a few rules and Monit. I don't use Netflow, Sensei, bridges, or VPN. I updated to 21.7.3_1 hoping for a fix, but it happened again this morning. I don't see anything abnormal in the top logs. It's a mystery.

top shows Suricata averaging 2%, but python3.8 pops in at 60-80% every once in a while, which correlates with the CPU temps rising.

Also, I noticed the temps average 50-55 now. They were 30-35 with 21.7.2.

Have you tried to reset RRD and Netflow data (in Reporting: Settings)? I have seen this behavior occasionally with database corruption.

I have RRD disabled and no interfaces selected in Netflow. I do not care about Insight data. But to test, I turned them on and selected the interface. The issue still remains.

I did a reboot this morning and now memory, swap, and CPU consumption is normal.

So a reboot definitely fixes this issue.

The problem is we shouldn't have to reboot to get OPNsense 'back to normal'. These occurrences are due to a problem that needs to be fixed. We didn't use to see this issue.

https://github.com/opnsense/core/commit/17aec4ed46 is a hotfix pushed today to address this... thanks to kulikov-a for digging into this.


Cheers,
Franco