OPNsense Forum
Archive => 20.1 Legacy Series => Topic started by: XabiX on April 23, 2020, 11:13:20 pm
-
Hello Team and Experts,
I am happy to have joined OPNsense after a long time on pfSense!
I was running 20.1.2 without any issue, but since the upgrade to 20.1.5 both cores of my AMD Ryzen 7 3700X 8-Core Processor (2 cores assigned) are at 100% because of Netflow. I tried removing the interfaces (clear all) to deactivate Netflow, but the problem persisted, so I put the settings back as they were.
Any idea what the issue could be?
100.00% /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.7)
ls -lah /var/netflow/*
-rw-r----- 1 root wheel 3.1M Apr 23 22:41 /var/netflow/dst_port_000300.sqlite
-rw-r----- 1 root wheel 61K Apr 23 22:41 /var/netflow/dst_port_000300.sqlite-journal
-rw-r----- 1 root wheel 848K Apr 23 22:41 /var/netflow/dst_port_003600.sqlite
-rw-r----- 1 root wheel 33K Apr 23 22:41 /var/netflow/dst_port_003600.sqlite-journal
-rw-r----- 1 root wheel 2.5M Apr 23 22:41 /var/netflow/dst_port_086400.sqlite
-rw-r----- 1 root wheel 61K Apr 23 22:41 /var/netflow/dst_port_086400.sqlite-journal
-rw-r----- 1 root wheel 7.1M Apr 23 22:41 /var/netflow/interface_000030.sqlite
-rw-r----- 1 root wheel 93K Apr 23 22:41 /var/netflow/interface_000030.sqlite-journal
-rw-r----- 1 root wheel 2.5M Apr 23 22:41 /var/netflow/interface_000300.sqlite
-rw-r----- 1 root wheel 37K Apr 23 22:41 /var/netflow/interface_000300.sqlite-journal
-rw-r----- 1 root wheel 680K Apr 23 22:41 /var/netflow/interface_003600.sqlite
-rw-r----- 1 root wheel 33K Apr 23 22:41 /var/netflow/interface_003600.sqlite-journal
-rw-r----- 1 root wheel 56K Apr 23 22:41 /var/netflow/interface_086400.sqlite
-rw-r----- 1 root wheel 8.5K Apr 23 22:41 /var/netflow/interface_086400.sqlite-journal
-rw-r----- 1 root wheel 12K Apr 23 22:41 /var/netflow/metadata.sqlite
-rw-r----- 1 root wheel 12M Apr 23 22:41 /var/netflow/src_addr_000300.sqlite
-rw-r----- 1 root wheel 145K Apr 23 22:41 /var/netflow/src_addr_000300.sqlite-journal
-rw-r----- 1 root wheel 4.9M Apr 23 22:41 /var/netflow/src_addr_003600.sqlite
-rw-r----- 1 root wheel 61K Apr 23 22:41 /var/netflow/src_addr_003600.sqlite-journal
-rw-r----- 1 root wheel 18M Apr 23 22:41 /var/netflow/src_addr_086400.sqlite
-rw-r----- 1 root wheel 321K Apr 23 22:41 /var/netflow/src_addr_086400.sqlite-journal
-rw-r----- 1 root wheel 98M Apr 23 22:41 /var/netflow/src_addr_details_086400.sqlite
-rw-r----- 1 root wheel 1.1M Apr 23 22:41 /var/netflow/src_addr_details_086400.sqlite-journal
root@OPNsense:/home/xabix # ls -lah /var/log/flowd*
-rw------- 1 root wheel 77K Apr 23 22:58 /var/log/flowd.log
-rw------- 1 root wheel 258M Apr 23 22:56 /var/log/flowd.log.000001
-rw------- 1 root wheel 10M Apr 20 15:35 /var/log/flowd.log.000002
-rw------- 1 root wheel 10M Apr 20 13:05 /var/log/flowd.log.000003
-rw------- 1 root wheel 10M Apr 20 09:55 /var/log/flowd.log.000004
-rw------- 1 root wheel 10M Apr 20 06:24 /var/log/flowd.log.000005
-rw------- 1 root wheel 10M Apr 20 02:35 /var/log/flowd.log.000006
-rw------- 1 root wheel 10M Apr 19 23:00 /var/log/flowd.log.000007
-rw------- 1 root wheel 10M Apr 19 20:11 /var/log/flowd.log.000008
-rw------- 1 root wheel 10M Apr 19 16:58 /var/log/flowd.log.000009
-rw------- 1 root wheel 10M Apr 19 13:46 /var/log/flowd.log.000010
root@OPNsense:/home/xabix # df -h
Filesystem Size Used Avail Capacity Mounted on
/dev/gpt/rootfs 15G 3.1G 10G 23% /
devfs 1.0K 1.0K 0B 100% /dev
fdescfs 1.0K 1.0K 0B 100% /dev/fd
procfs 4.0K 4.0K 0B 100% /proc
devfs 1.0K 1.0K 0B 100% /var/dhcpd/dev
devfs 1.0K 1.0K 0B 100% /var/unbound/dev
I am launching a repair of the Netflow database to see if that fixes anything. It seems there were similar issues/patches in the past depending on the Python release.
Am I the only one facing this issue? Is there a way to reset the Netflow part without reinstalling? I assume deleting the Netflow database would do it, but would that be enough?
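The lingering `-journal` files in the listing above suggest interrupted writes, so checking each database individually may be worthwhile before a full repair. A minimal sketch using Python's built-in sqlite3 module (the paths are the ones from the listing; `check_db` is just a hypothetical helper, not part of OPNsense):

```python
import glob
import sqlite3

def check_db(path):
    """Run SQLite's built-in integrity check; a healthy file returns 'ok'."""
    con = sqlite3.connect(path)
    try:
        return con.execute("PRAGMA integrity_check").fetchone()[0]
    finally:
        con.close()

# Check every aggregate database under /var/netflow.
for db in glob.glob("/var/netflow/*.sqlite"):
    print(db, check_db(db))
```

A database that reports anything other than "ok" is a candidate for deletion rather than repair.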
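For reference, a console equivalent of resetting the Netflow data would look roughly like this (a sketch, assuming flowd and flowd_aggregate have been stopped first, e.g. from Reporting: Settings in the GUI; the paths are the ones shown in the listings above):

```sh
# Stop collection first (GUI: Reporting -> Settings), then remove the
# raw flow logs and the aggregated SQLite databases with their journals.
rm -f /var/log/flowd.log*
rm -f /var/netflow/*.sqlite*
# Re-enable Netflow from the GUI afterwards; the files are recreated empty.
```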
Merci
XabiX
-
Looking better this morning :D
PID USERNAME PRI NICE SIZE RES STATE C TIME WCPU COMMAND
11 root 155 ki31 0 32K CPU1 1 81:05 100.00% [idle{idle: cpu1}]
0 root -16 - 0 880K swapin 0 669:39 0.00% [kernel{swapper}]
17217 root 20 0 26M 23M select 1 2:21 0.00% /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.7)
57611 root 20 0 2750M 664M nanslp 1 0:38 0.00% /usr/local/bin/suricata -D --netmap --pidfile /var/run/suricata.pid -c /usr/local/etc/suricata/suricata.yaml{suricata}
-
I'm seeing this same problem. Netflow is pegging a CPU core at 100%. I just rebooted my firewall, so I'm wondering whether my recent config changes caused this or the problem was already there and I didn't notice.
Anyone know if this is a bug in the code, or is Netflow simply having trouble keeping up with the traffic volume? My firewall pushes an average of about 1 gigabit/s out to the internet (bursting up to 10 Gbit/s), and that doesn't include internal traffic. So it's possible the volume is simply too much for a single-threaded Python process to handle. I've noticed the process does periodically drop to idle, but it doesn't stay that way for long (roughly 5 to 8 minutes at 100% followed by less than 2 minutes at idle, if I'm guesstimating).
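One quick way to tell whether the aggregator is falling behind is to watch whether the flowd spool file keeps growing while flowd_aggregate is running. A rough sketch (the path and the 10-second interval are just examples):

```python
import os
import time

def growth_rate(path, interval=10.0):
    """Sample a file's size twice and return its growth in bytes/second."""
    before = os.path.getsize(path)
    time.sleep(interval)
    after = os.path.getsize(path)
    return (after - before) / interval

# A sustained positive rate on the rotated spool file would mean parsing
# is not keeping up with capture, e.g.:
# print(growth_rate("/var/log/flowd.log.000001"))
```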
Thoughts?
-
Stopping the flowd_aggregate service via the web GUI eliminated the CPU-hogging process. After doing so I noticed that /var/log/flowd.log had grown to over a gigabyte. Not sure where it was before I stopped the aggregator process, though.
Anyway, I cleared the Netflow data via the web GUI, and so far the process isn't hogging a CPU core anymore.
-
Clearing the Netflow data fixed it for a while, but eventually the CPU usage returned. For now I'm just going to renice the process.
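In case it helps anyone, this is the kind of renice I mean (a sketch; `pgrep -f` matches against the full command line, so the script path works, and +20 is the lowest priority on FreeBSD):

```sh
# Find the aggregator's PID and drop its scheduling priority to minimum.
pid=$(pgrep -f flowd_aggregate.py)
if [ -n "$pid" ]; then
    renice 20 -p "$pid"
fi
```

This doesn't fix the underlying load, but it keeps the aggregator from starving everything else on the box.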