Unbound failed to start after upgrading to 23.7, Domain Overrides Issue

Started by anicoletti, August 01, 2023, 05:50:34 AM

I ran into an issue where Unbound would not start after upgrading to 23.7. I figured I'd share this information, as I have not seen anyone else post this specific issue yet.

I upgraded from 23.1.11_1 to 23.7 on one of our client firewalls this evening. Upon completion, DNS services failed to start on the firewall. I remoted into another system, connected to the firewall from there, and saw that Unbound was not running. Attempting to start it from the GUI spun for 10-15 seconds and then returned with the service still offline. I then connected to the firewall via SSH and ran the following command to see why the service would not start:

Command:
unbound -c /var/unbound/unbound.conf

Results:
/var/unbound/etc/domainoverrides.conf:1: error: syntax error
read /var/unbound/unbound.conf failed: 1 errors in configuration file
[1690860690] unbound[25940:0] fatal error: Could not read config file: /var/unbound/unbound.conf. Maybe try unbound -dd, it stays on the commandline to see more errors, or unbound-checkconf
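
The first line of that output names a file and a line number (domainoverrides.conf, line 1), so printing that line shows what Unbound choked on. A minimal sketch, assuming a POSIX shell; `show_error_line` is a hypothetical helper name, not part of Unbound or OPNsense:

```shell
#!/bin/sh
# show_error_line is a hypothetical helper (not an Unbound or OPNsense tool).
# It takes a message of the form "FILE:LINE: error: ..." and prints
# that line of FILE.
show_error_line() {
    msg=$1
    file=${msg%%:*}     # everything before the first colon
    rest=${msg#*:}      # everything after the first colon
    line=${rest%%:*}    # the line number, up to the next colon
    sed -n "${line}p" "$file"
}

# Example with the message from the output above (run on the firewall):
# show_error_line "/var/unbound/etc/domainoverrides.conf:1: error: syntax error"
```

This only works for paths that do not themselves contain a colon, which holds for the path reported here.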


We still have some clients where the domain overrides are under the Overrides section in the GUI and have not been moved over to Query Forwarding yet. After removing the entries from the Overrides section and adding them back in under Query Forwarding, I was able to start the Unbound service successfully and query the internal domain overrides.

I realize you moved the entries, but is it possible to go back and get verbose output to see the exact breakage? e.g.

unbound -ddvv -c /var/unbound/unbound.conf

after stopping Unbound from the GUI.

Cheers,
Stephan

I have a bunch of domain overrides and didn't encounter this issue, so it must be something more specific. Would indeed be interesting if you could reproduce it.

Cheers
Maurice
OPNsense virtual machine images
OPNsense aarch64 firmware repository

Commercial support & engineering available. PM for details (en / de).

Interesting. So two things. First, after adding the items back under Overrides in the GUI, I'm able to restart Unbound without issue. Second, when I run the command with -ddvv, I get the following error:

root@opnsense:/var/unbound/etc # unbound -ddvv -c /var/unbound/unbound.conf
[1690891368] unbound[46649:0] notice: Start of unbound 1.17.1.
[1690891368] unbound[46649:0] debug: chdir to /var/unbound
[1690891368] unbound[46649:0] debug: chroot to /var/unbound
[1690891368] unbound[46649:0] debug: drop user privileges, run as unbound
[1690891368] unbound[46649:0] debug: switching log to stderr
[1690891368] unbound[46649:0] debug: module config: "python iterator"
[1690891368] unbound[46649:0] notice: init module 0: python
Could not find platform independent libraries <prefix>
Could not find platform dependent libraries <exec_prefix>
Consider setting $PYTHONHOME to <prefix>[:<exec_prefix>]
Python path configuration:
  PYTHONHOME = (not set)
  PYTHONPATH = (not set)
  program name = 'unbound'
  isolated = 0
  environment = 1
  user site = 1
  import site = 0
  sys._base_executable = ''
  sys.base_prefix = '/usr/local'
  sys.base_exec_prefix = '/usr/local'
  sys.platlibdir = 'lib'
  sys.executable = ''
  sys.prefix = '/usr/local'
  sys.exec_prefix = '/usr/local'
  sys.path = [
    '/usr/local/lib/python39.zip',
    '/usr/local/lib/python3.9',
    '/usr/local/lib/lib-dynload',
  ]
Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Python runtime state: core initialized
ModuleNotFoundError: No module named 'encodings'

Current thread 0x0000000829022000 (most recent call first):
<no Python frame>



I can see about pulling a backup of the configuration from prior to the upgrade to see if there is anything odd in how it generates the domainoverrides.conf file. I also have quite a few other units to upgrade so I can monitor those and post additional details if I run into it again.

I'm in much the same boat, trying to set up access control views, but running into the same inability to start unbound, along with the same output from `unbound -ddvv -c /var/unbound/unbound.conf`...

Can you guys run this and see what it reports?

# /usr/local/opnsense/mvc/script/run_migrations.php


Cheers,
Franco

I have the same problem.

I get this message. Where can I find the log?

*** OPNsense\Unbound\Unbound Migration failed, check log for details

I found these lines in System/general log:
2023-07-31T21:23:50 Notice kernel <118>You may need to manually remove /usr/local/etc/unbound/unbound.conf if it is no longer needed.
2023-07-31T21:23:42 Notice kernel <118>*** OPNsense\Unbound\Unbound Migration failed, check log for details
2023-07-31T21:23:42 Error config Model OPNsense\Unbound\Unbound can't be saved, skip ( OPNsense\Phalcon\Filter\Validation\Exception: [OPNsense\Unbound\Unbound:general.active_interface] option not in list{}
2023-07-31T21:23:42 Error config [OPNsense\Unbound\Unbound:general.active_interface] option not in list{}
2023-07-31T21:22:50 Notice kernel <118>[87/214] Extracting unbound-1.17.1_3: .......... done
2023-07-31T21:22:50 Notice kernel <118>Using existing user 'unbound'.
2023-07-31T21:22:50 Notice kernel <118>Using existing group 'unbound'.
2023-07-31T21:22:50 Notice kernel <118>[87/214] Upgrading unbound from 1.17.1_2 to 1.17.1_3...
2023-07-31T21:22:50 Notice kernel <118> unbound: 1.17.1_2 -> 1.17.1_3
2023-07-31T21:22:50 Notice kernel <118>unbound-1.17.1_2: already unlocked

Ok what does this return?

# pluginctl -g unbound.active_interface


Cheers,
Franco


Ok let's try differently:

# grep active_interface /conf/config.xml


Cheers,
Franco


Same commands on a router that is still running 23.1.11:
root@husabyvagen:~ # grep active_interface /conf/config.xml
    <active_interface/>
root@husabyvagen:~ # pluginctl -g unbound.active_interface

root@husabyvagen:~ #

I have a specific interface selected in Unbound, maybe that's why I didn't encounter the issue?

# pluginctl -g unbound.active_interface

# grep active_interface /conf/config.xml
        <active_interface>opt9</active_interface>

No, this is going to be pretty silly...

Can you remove that line "<active_interface/>" from /conf/config.xml and run the migration again?

# /usr/local/opnsense/mvc/script/run_migrations.php


Cheers,
Franco
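
The manual edit suggested above could be sketched roughly as follows, assuming a POSIX shell. `strip_empty_active_interface` is a hypothetical helper name, and the sketch assumes the empty element sits on a line of its own and appears only once, as in the grep output earlier in the thread. It keeps a copy of the original file next to it with a `.bak` suffix.

```shell
#!/bin/sh
# strip_empty_active_interface is a hypothetical helper (not an OPNsense tool).
# It saves a backup as FILE.bak, then rewrites FILE without any line that
# contains the empty <active_interface/> element. A populated element such as
# <active_interface>opt9</active_interface> does not match and is kept.
strip_empty_active_interface() {
    conf=$1
    cp -p "$conf" "$conf.bak" || return 1
    grep -v '<active_interface/>' "$conf.bak" > "$conf"
}

# On the firewall this would be followed by the migration from the post above:
# strip_empty_active_interface /conf/config.xml
# /usr/local/opnsense/mvc/script/run_migrations.php
```

Writing back through a redirect keeps the file's existing owner and permissions, and the `.bak` copy can be moved back into place if the migration still fails.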