Hi together,
after the upgrade to 22.1 I was faced with an died unbound process after new wanip for several times now.
It's obviously that a router without a running nameserver is near to useless ;)
The only thing I found so far are the following lines inside system/latest.log
<13>1 2022-02-02T00:49:02+01:00 fw.local.home opnsense 68773 - [meta sequenceId="444"] plugins_configure newwanip (,opt2)
<13>1 2022-02-02T00:49:02+01:00 fw.local.home opnsense 68773 - [meta sequenceId="445"] plugins_configure newwanip (execute task : ntpd_configure_do())
<13>1 2022-02-02T00:49:12+01:00 fw.local.home opnsense 68773 - [meta sequenceId="446"] plugins_configure newwanip (execute task : opendns_configure_do())
<13>1 2022-02-02T00:49:12+01:00 fw.local.home opnsense 68773 - [meta sequenceId="447"] plugins_configure newwanip (execute task : openssh_configure_do(,opt2))
<13>1 2022-02-02T00:49:12+01:00 fw.local.home opnsense 68773 - [meta sequenceId="448"] plugins_configure newwanip (execute task : unbound_configure_do(,opt2))
<11>1 2022-02-02T00:49:35+01:00 fw.local.home configctl 98849 - [meta sequenceId="449"] error in configd communication Traceback (most recent call last): File "/usr/local/opnsense/service/configd_ctl.py", line 68, in exec_config_cmd line = sock.recv(65536).decode() socket.timeout: timed out
<11>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 88635 - [meta sequenceId="2"] /usr/local/etc/rc.newwanipv6: The command '/usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home'' returned exit code '1', the output was 'Unable to lock on the pidfile.'
<13>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 88635 - [meta sequenceId="3"] plugins_configure newwanip (execute task : vxlan_configure_interface())
<13>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 88635 - [meta sequenceId="4"] plugins_configure newwanip (execute task : webgui_configure_do(,wan))
<11>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 72174 - [meta sequenceId="5"] /usr/local/etc/rc.filter_configure: Ignore down inet gateways : WAN_DHCP6
<11>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 72174 - [meta sequenceId="6"] /usr/local/etc/rc.filter_configure: ROUTING: keeping current default gateway '100.84.0.1'
<11>1 2022-02-02T00:51:41+01:00 fw.local.home opnsense 72174 - [meta sequenceId="7"] /usr/local/etc/rc.filter_configure: Ignore down inet6 gateways : WAN_DHCP6
The error is not reproducible from the command line. There must be some special circumstances leading to the error. The really bad thing is you are left with a system near to useless as unbound is no longer running anymore after the failure :(
Has anybody seen this or any advice how to track this down?
Best regards
Ingo
Hi Ingo,
Well, what happens when you run the command that fails?
# /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home'
You can also always check which part in the Unbound configuration is causing an error:
# configctl unbound check
Cheers,
Franco
Hi Franco,
thanks for picking up my issue.
Here are the results from the commands
root@fw:~ # /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home'
Unable to lock on the pidfile.
root@fw:~ # configctl unbound check
no errors in /var/unbound/unbound.conf
Repated the command /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home' several times. The result is always Unable to lock on the pidfile.
But despite the error unbound is still running this time.
I am a bit lost and helpless :(
Best regards
Ingo
Hmm, would indicate the service is still running?
# cat /var/run/unbound_dhcpd.pid
Dann mal
# ps aux | grep PID-AUS-DER-DATEI
Danke,
Franco
Here we go ....
root@fw:~ # cat /var/run/unbound_dhcpd.pid
11057
root@fw:~ # ps aux | grep 11057
root 11057 0.0 0.0 25416 14940 - Ss 08:58 0:02.56 /usr/local/bin/python3 /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain local.home (python3.8)
Quote from: franco on February 02, 2022, 09:20:18 AM
Hmm, would indicate the service is still running?
# cat /var/run/unbound_dhcpd.pid
then do a decent
# ps aux | grep pid-from-cat-command-above
Thanks,
Franco
...translation by forum-autotranslate.org... :-D
sorry, not the first time this happened :D
Can you see if unbound-control is stuck?
# ps aux | grep unbound-control
Cheers,
Franco
Hi Franco,
nope. Just found my own grep ;)
root@fw:~ # ps aux | grep unbound-control
root 70523 0.0 0.0 12740 2536 0 S+ 09:40 0:00.00 grep unbound-control
What happens on
# kill -TERM 11057
That's what the GUI is trying to do and apparently failing.
Cheers,
Franco
Ok, process termiated.
root@fw:~ # ps aux | grep 11057
root 11057 0.0 0.0 25416 14940 - Ss 08:58 0:03.08 /usr/local/bin/python3 /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain local.home (python3.8)
root 71577 0.0 0.0 12740 2532 0 S+ 09:55 0:00.00 grep 11057
root@fw:~ # kill -TERM 11057
root@fw:~ # ps aux | grep 11057
root 78974 0.0 0.0 12740 2532 0 S+ 09:56 0:00.00 grep 11057
Is this good or bad news?
For debugging bad, in general good :)
I'm still suspecting unbound-control might be hanging and it's for strange reasons which I'm trying to address via https://github.com/opnsense/core/commit/a9a67b42550a
Can you do me a favour and report back if this happens again so we can try the patch as a next step (but not right now).
Cheers,
Franco
The behaviour with the command /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home' is reproducible. I executed the command (after the #kill -TERM 11057) and it succeeded. The next execution fails with 'Unable to lock on the pidfile'.
root@fw:~ # /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home'
root@fw:~ # /usr/local/opnsense/scripts/dns/unbound_dhcpd.py --domain 'local.home'
Unable to lock on the pidfile.
I'll drop you a note once this happenes again out of nowhere.
Best regards
Ingo
Hi Ingo,
That's to be expected honestly. It's already running. What I'm wondering is why a service might not terminate before restarting it. The code is always the same:
https://github.com/opnsense/core/blob/master/src/etc/inc/plugins.inc.d/unbound.inc#L407-L426
see first and last line in markup
Cheers,
Franco