Hi,
I have been experiencing this for quite a long time, and would now like to get to the root of it. I installed Telegraf, InfluxDB and Grafana to see when things start going wrong and what is involved. I can see that the flowd_aggregate.py script, at least, keeps using a lot of CPU. But I can't find anything in the logs that explains the sudden jump in memory usage, which also raises CPU usage. See the Grafana graph:
I couldn't upload the image here, so I posted it on Mastodon: https://mementomori.social/@ikkeT/113957621410576425
(https://media.mementomori.social/media_attachments/files/113/957/620/119/081/032/original/f325ebcf41a5beae.png)
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
99283 root 1 120 0 51M 38M CPU0 0 46.7H 99.05% python3.11
root@OPNsense:~ # ps awfux|grep 99283
root 99283 83.5 0.9 52676 39024 - Rs 24Jan25 2804:23.00 /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.11)
Any ideas what could cause this, or how to find the problem from the logs?
flowd_aggregate is the netflow service. Since it provides statistics about every single connection, it's a memory- and CPU-intensive process, depending on your bandwidth and number of users of course. Do you actually use that data? If yes, you might consider not aggregating the raw data on the firewall but sending it to an NMS such as Elastiflow instead.
Sorry, I only now noticed your reply, and thanks. I have tried disabling the netflow collection before, and as far as I recall it still hung. I will disable it again after the next memory leak to verify.
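If the logs stay silent, one way to narrow this down is to record which processes hold the most resident memory at regular intervals, so the growth can be tied to a PID. Below is a minimal sketch, not an OPNsense tool: the function names (parse_ps, top_by_rss, watch) are made up for illustration, and it assumes that ps on the box accepts `-axo pid,rss,pcpu,command`.

```python
import subprocess
import time


def parse_ps(text):
    """Parse `ps -axo pid,rss,pcpu,command` output into a list of dicts."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        parts = line.split(None, 3)  # the command itself may contain spaces
        if len(parts) < 4:
            continue
        pid, rss, pcpu, command = parts
        try:
            rows.append({
                "pid": int(pid),
                "rss_kb": int(rss),
                "cpu": float(pcpu),
                "command": command,
            })
        except ValueError:
            continue  # ignore lines that do not look like process rows
    return rows


def top_by_rss(rows, n=5):
    """Return the n processes with the largest resident set size."""
    return sorted(rows, key=lambda r: r["rss_kb"], reverse=True)[:n]


def watch(interval=60, samples=10, n=5):
    """Print the top-n memory users every `interval` seconds (hypothetical helper)."""
    for _ in range(samples):
        out = subprocess.run(
            ["ps", "-axo", "pid,rss,pcpu,command"],  # assumed to be supported
            capture_output=True, text=True, check=True,
        ).stdout
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        for r in top_by_rss(parse_ps(out), n):
            print(stamp, r["pid"], r["rss_kb"], r["cpu"], r["command"])
        time.sleep(interval)
```

Calling something like watch(interval=60, samples=1440) from a Python shell and redirecting the output to a file would give roughly a day of samples to compare against the Grafana graph.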