Hi,
I've had this problem for several months, but it is now happening more often. OPNsense works fine for several days, then all of a sudden home traffic starts slowing down, I can no longer access the firewall, and the network dies. I keep it up to date, and this is not new: the problem has been around for several releases. I'm now running 24.7.11.
I just had to pull the plug and reboot, so I thought I'd look around a bit. I disabled RRD collection just to make sure that wasn't the cause; it didn't help. I run the following services at home, with not much traffic:
- HAProxy (mainly traffic to a Nextcloud instance)
- dnsmasq for home gadgets
- kea dhcp
- captive portal for guest VLAN, hardly ever used.
I used to have IPv6 enabled, but after moving, the new connection only has IPv4.
So not much is running. I immediately noticed some problems:
1. Flowd is eating CPU:
```
76462 root 1 135 0 58M 44M CPU0 0 16:38 100.00% python3.11
# ps awfux|grep 76462
root 76462 100.0 1.1 59844 44944 - Rs 09:23 16:57.09 /usr/local/bin/python3 /usr/local/opnsense/scripts/netflow/flowd_aggregate.py (python3.11)
```
2. configd errors in the logs
(I have never touched Unbound; it's not running)
2024-12-18T09:44:55 Error configd.py [8741e584-e8e0-47d1-940e-639b0fe9a307] Script action failed with Command '/usr/local/opnsense/scripts/unbound/wrapper.py -s ' returned non-zero exit status 1. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/actions/script_output.py", line 78, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/opnsense/scripts/unbound/wrapper.py -s ' returned non-zero exit status 1.
2024-12-18T09:30:11 Error configd.py Timeout (120) executing : system diag log '20' '0' '' 'core' 'audit' 'Emergency,Alert,Critical,Error,Warning' '1734420490.461'
2024-12-18T08:55:33 Error configd.py [eb377147-ead9-4e22-b070-4066dc2a5e25] Script action failed with Command '/usr/local/opnsense/scripts/interfaces/list_macdb.py ' died with <Signals.SIGBUS: 10>. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/actions/script_output.py", line 78, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/opnsense/scripts/interfaces/list_macdb.py ' died with <Signals.SIGBUS: 10>.
2024-12-18T08:55:33 Error configd.py [47cd8873-4e90-45dd-81a7-66fa3dfee38c] Script action failed with Command '/usr/local/sbin/pluginctl -D ''' died with <Signals.SIGBUS: 10>. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/actions/script_output.py", line 78, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/sbin/pluginctl -D ''' died with <Signals.SIGBUS: 10>.
2024-12-18T08:53:14 Warning configd.py Stopping daemon.
2024-12-18T08:53:14 Error configd.py Configd disconnected while executing : interface list macdb
2024-12-18T08:52:52 Error configd.py Configd disconnected while executing : openvpn connections client,server
2024-12-18T08:52:52 Warning configd.py Stopping daemon.
2024-12-18T08:50:06 Error api no active session, user not found
2024-12-18T08:45:08 Error configd.py Timeout (120) executing : firmware remote
2024-12-18T08:43:06 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:41:28 Error configd.py Timeout (120) executing : firmware remote
2024-12-18T08:38:06 Error configd.py Timeout (120) executing : firmware remote
2024-12-18T08:38:05 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:36:05 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:33:04 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:23:11 Error configd.py Timeout (120) executing : firmware remote
2024-12-18T08:20:03 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:16:03 Error configd.py Timeout (120) executing : firmware tiers
2024-12-18T08:12:01 Error configd.py Timeout (120) executing : firmware tiers
3. Disk space should be OK
```
root@OPNsense:~ # ls -ltrh /var/crash && df -hT
total 4
-rw-r--r-- 1 root wheel 5B Dec 2 21:45 minfree
Filesystem Type Size Used Avail Capacity Mounted on
/dev/gpt/rootfs ufs 13G 8.1G 4.3G 65% /
devfs devfs 1.0K 0B 1.0K 0% /dev
tmpfs tmpfs 2.0G 3.5M 2.0G 0% /tmp
devfs devfs 1.0K 0B 1.0K 0% /var/dhcpd/dev
devfs devfs 1.0K 0B 1.0K 0% /var/captiveportal/zone0/dev
```
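For anyone who wants to compare logs, the configd errors from item 2 can be tallied by failure type to see which actions time out and which ones die on a signal. This is only a rough sketch in Python, assuming the log lines look like the ones quoted above; `tally_configd_errors` and the regular expressions are illustrative helpers, not part of OPNsense.

```python
import re
from collections import Counter

# Illustrative patterns, written against the configd messages quoted in this thread.
# They are assumptions about the log format, not an official OPNsense parser.
TIMEOUT_RE = re.compile(r"Timeout \(\d+\) executing : ([^']+)")
SIGNAL_RE = re.compile(r"Command '([^\s']+).*?died with <Signals\.(\w+): \d+>")
EXIT_RE = re.compile(r"Command '([^\s']+).*?returned non-zero exit status (\d+)")


def tally_configd_errors(lines):
    """Count configd failures per (kind, action) pair.

    kind is "timeout", a signal name such as "SIGBUS", or "exit <code>".
    """
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            # Keep only the action name, dropping the quoted arguments.
            counts[("timeout", match.group(1).strip())] += 1
            continue
        match = SIGNAL_RE.search(line)
        if match:
            counts[(match.group(2), match.group(1))] += 1
            continue
        match = EXIT_RE.search(line)
        if match:
            counts[("exit " + match.group(2), match.group(1))] += 1
    return counts
```

Feeding it the lines of an exported configd log and printing `counts.most_common()` gives a quick picture of whether the failures cluster around one script or are spread across unrelated commands.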
So the question is: what the heck is this flowd doing, and how do I disable it? Perhaps that is what's overcooking the CPU. I found an old thread about removing the interfaces from it and adding them back; I'll try that. Let's see what else is there.
I toggled the NICs off and back on in NetFlow, disabled the local service, and cleared the NetFlow data a few times. That got the CPU usage down, at least for a while. Let's see if it stays that way.
Hi there,
I have a similar issue. My OPNsense router stops working all of a sudden (the Internet dies and I cannot access the OPNsense GUI). It's been happening more frequently lately. To get the Internet back, I need to reboot manually.
Digging around the logs in the UI, I saw a backend error:
```
[506c11e3-fc64-4b1c-89d3-1767a6b76110] Script action failed with Command '/usr/local/opnsense/scripts/firmware/read.sh ' died with <Signals.SIGBUS: 10>. at Traceback (most recent call last): File "/usr/local/opnsense/service/modules/actions/script_output.py", line 78, in execute subprocess.check_call(script_command, env=self.config_environment, shell=True, File "/usr/local/lib/python3.11/subprocess.py", line 413, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '/usr/local/opnsense/scripts/firmware/read.sh ' died with <Signals.SIGBUS: 10>.
```
It seems you are getting `<Signals.SIGBUS: 10>` as well, which suggests maybe corrupt memory?
Following this thread.
Details:
OPNsense 24.7.11_2-amd64
FreeBSD 14.1-RELEASE-p6
OpenSSL 3.0.15
Intel(R) Core(TM) i3-N305 (8 cores, 8 threads) machine from AliExpress
Thanks