21.7.3. - high CPU and MEM usage

Started by chemlud, September 22, 2021, 03:59:49 PM

Now when I update to get this fix, I get 'could not authenticate the selected mirror' due to a certificate verification error.



Any chance that you added the cross-signed ISRG Root X1 cert to Trusted? (That may be the reason.)

I'm sorry for the "me too" post that mixes issues, but it might be warranted.
I upgraded today from 21.7.2 to 21.7.3 and rebooted the firewall.
Yesterday I had disabled Netflow by leaving "Listening interfaces" and "WAN interfaces" as "Nothing selected".
Once all services had restarted post-boot, I gave it a few minutes to settle down and checked memory and CPU consumption. The CPU was high for a while.

htop gave me one process that looked interesting to check:
/usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.8)
This had a WCPU value of 98%, constant for some 5 minutes.
A couple of points: fstat showed that the open file the script was using was about 15 MB in size.
From a quick look at the script, I read that it will rotate files at 10 MB, and that its main method will check and repair its sqlite db and log 'start watching flowd' to syslog.
It was there:
2021-09-30T17:58:32   /flowd_aggregate.py[54199]   vacuum done   
2021-09-30T17:58:31   /flowd_aggregate.py[54199]   vacuum interface_086400.sqlite   
2021-09-30T17:58:31   /flowd_aggregate.py[54199]   vacuum interface_003600.sqlite   
2021-09-30T17:58:31   /flowd_aggregate.py[54199]   vacuum interface_000300.sqlite   
2021-09-30T17:58:31   /flowd_aggregate.py[54199]   vacuum interface_000030.sqlite   
2021-09-30T17:58:15   /flowd_aggregate.py[54199]   vacuum dst_port_086400.sqlite   
2021-09-30T17:58:15   /flowd_aggregate.py[54199]   vacuum dst_port_003600.sqlite   
2021-09-30T17:58:15   /flowd_aggregate.py[54199]   vacuum dst_port_000300.sqlite   
2021-09-30T17:58:07   /flowd_aggregate.py[54199]   vacuum src_addr_086400.sqlite   
2021-09-30T17:58:07   /flowd_aggregate.py[54199]   vacuum src_addr_003600.sqlite   
2021-09-30T17:58:07   /flowd_aggregate.py[54199]   vacuum src_addr_000300.sqlite   
2021-09-30T17:57:48   dhclient[74977]   Creating resolv.conf   
2021-09-30T17:57:39   /flowd_aggregate.py[54199]   vacuum src_addr_details_086400.sqlite   
2021-09-30T17:57:08   /flowd_aggregate.py[54199]   start watching flowd   
2021-09-30T17:56:24   opnsense[53768]   plugins_configure newwanip (execute task : webgui_configure_do(,wan))   
2021-09-30T17:56:24   opnsense[53768]   plugins_configure newwanip (execute task : vxlan_configure_interface())   
2021-09-30T17:56:15   opnsense[53768]   plugins_configure newwanip (execute task : unbound_configure_do(,wan))   
2021-09-30T17:56:15   opnsense[53768]   plugins_configure newwanip (execute task : openssh_configure_do(,wan))   
2021-09-30T17:56:15   opnsense[53768]   plugins_configure newwanip (execute task : opendns_configure_do())   
2021-09-30T17:56:14   opnsense[53768]   plugins_configure newwanip (execute task : ntpd_configure_do())   
2021-09-30T17:56:13   opnsense[53768]   /usr/local/etc/rc.newwanip: Curl error occurred: Resolving timed out after 15001 milliseconds   
2021-09-30T17:56:12   /flowd_aggregate.py[54199]   startup, check database.

I thought it took longer than the 2 minutes the logs show, but maybe my recollection is incorrect.
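To double-check the timing, the two relevant timestamps from the log excerpt above can be subtracted directly. This is only a small sketch: the constant names and the `elapsed_seconds` helper are mine, not part of OPNsense, and the timestamps are copied by hand from the "startup, check database." and "vacuum done" lines.

```python
from datetime import datetime

# Timestamps copied from the log excerpt in this post.
STARTUP = "2021-09-30T17:56:12"      # "startup, check database."
VACUUM_DONE = "2021-09-30T17:58:32"  # "vacuum done"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two syslog-style timestamps."""
    fmt = "%Y-%m-%dT%H:%M:%S"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


print(elapsed_seconds(STARTUP, VACUUM_DONE))  # 140.0
```

So the logged check-and-vacuum pass covers 2 minutes 20 seconds; any CPU load beyond that window is not explained by these log lines alone.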

What I see now is that it's the "hoovering" (vacuum) routine that spikes the CPU. At first I thought that, with the new Python version, it maybe needed to convert the data in the files or db as a one-off.
However, the script runs on a loop that seems to be outside the in-script "vacuum_interval = (60*60*8) # 8 hour vacuum cycle".
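For anyone following along, here is roughly how I picture an interval-gated vacuum working. This is a minimal sketch and not the actual flowd_aggregate.py code: only the 8 hour value is quoted from the script, while the function names (`vacuum_due`, `vacuum_database`, `maybe_vacuum`) and the example database file are my own assumptions for illustration.

```python
import os
import sqlite3
import tempfile
import time

# Value quoted from the script; everything else here is illustrative.
VACUUM_INTERVAL = 60 * 60 * 8  # 8 hour vacuum cycle


def vacuum_due(last_vacuum: float, now: float, interval: int = VACUUM_INTERVAL) -> bool:
    """Return True once at least `interval` seconds have passed since the last vacuum."""
    return now - last_vacuum >= interval


def vacuum_database(db_path: str) -> None:
    """Run VACUUM on a single sqlite file."""
    # isolation_level=None keeps the connection in autocommit mode,
    # because VACUUM cannot run inside an open transaction.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("VACUUM")
    finally:
        conn.close()


def maybe_vacuum(db_path: str, last_vacuum: float, now=None) -> float:
    """Vacuum only when due; return the (possibly updated) last-vacuum time."""
    now = time.time() if now is None else now
    if vacuum_due(last_vacuum, now):
        vacuum_database(db_path)
        return now
    return last_vacuum


# Hypothetical usage with a throwaway database file.
with tempfile.TemporaryDirectory() as tmp:
    db = os.path.join(tmp, "example.sqlite")
    sqlite3.connect(db).close()
    print(maybe_vacuum(db, last_vacuum=0.0, now=100.0))                   # not due yet
    print(maybe_vacuum(db, last_vacuum=0.0, now=float(VACUUM_INTERVAL)))  # due, vacuum runs
```

With a gate like this, a vacuum would still run right after every restart if the last-vacuum time starts from zero, which would match seeing it once straight after a reboot.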

Question: is it normal that it runs even when Netflow is disabled?

I've not SIGHUP'ed it, in case it is needed and this is how it is meant to be.

September 30, 2021, 08:06:06 PM #35 Last Edit: September 30, 2021, 08:10:05 PM by dcol
Getting back to the main topic: with 21.7.3_3, I do see slightly reduced CPU usage and lower temps, with fewer CPU spikes. All 4 cores average in the 40C range now. Using an Intel i7-6700. Going to change to an i5-6600T next week to reduce temps even further.

My post is probably too verbose, but it is on topic.
High CPU usage post upgrade to 21.7.3, and I'm trying to point to what might be a problem. Or rather, I'm seeking confirmation of whether it is normal to have flowd_aggregate.py running if there is no Netflow enabled.
Sorry if it is not clear.

@cookiemonster
Quote: if is normal to have flowd_aggregate.py running if there is no netflow enabled
No, it should not, IMHO. But the script determines your Netflow as enabled with a local target. Try clearing the "Capture local" checkbox and apply.

Indeed, Fright. Clearing that seems to have stopped the Python script from running and reduced the CPU usage.
Thank you.