Upgrade to 24.1.3 (HA Backup Node) -> Unbound crash

Started by patient0, March 06, 2024, 09:59:36 PM

March 06, 2024, 09:59:36 PM Last Edit: March 06, 2024, 10:05:11 PM by patient0
I run an OPNsense HA cluster on Proxmox in a lab.

The master node still runs 24.1.2_1, and I have now tried upgrading the backup node to 24.1.3. The upgrade itself runs fine with no errors, and Unbound is still running afterwards. But if I restart the service, it fails:

Notice unbound [84291:0] notice: init module 0: python
Error unbound [84291:0] error: pythonmod: can't parse Python script dnsbl_module.py
Error unbound [84291:0] error: pythonmod: python error: NoneType: None
Error unbound [84291:0] error: module init for module python failed


It's a standard HA setup, no blocklists or anything.

I rolled back again to 23.7.12_5, upgraded to 24.1 (it upgrades to 24.1.2_1) => OK; upgrade to 24.1.3 => NOK
Deciso DEC740

You need to reboot or apply unbound configuration. Restart won't do it.

Details about this weird change of shared object use here... https://github.com/opnsense/core/issues/7274


Cheers,
Franco

Quote from: franco on March 06, 2024, 10:13:19 PM
You need to reboot or apply unbound configuration. Restart won't do it.

Details about this weird change of shared object use here... https://github.com/opnsense/core/issues/7274

Thanks Franco for the quick reply. I did reboot too, but I wouldn't know what to 'Apply'. I applied the Unbound config just to make sure, with no change. I also tried the commit mentioned in the issue, but it fails to apply.
Deciso DEC740

The commit is included in 24.1.3.

Unbound should be running after updating and rebooting (or not manually restarting it).

But to be completely sure avoid patching anything and run:

# /usr/core # df -h | grep var.unbound
devfs                       1.0K    1.0K      0B   100%    /var/unbound/dev
/usr/local/lib/python3.9    217G    3.4G    214G     2%    /var/unbound/usr/local/lib/python3.9
/lib                        217G    3.4G    214G     2%    /var/unbound/lib

Should look like this with /lib being new... if it's not there that must be unbound's complaint.


Cheers,
Franco

Well, the mounts are there but unbound is not running after reboot and if I start manually it crashes right away.

root@OPNsense:~ # df -h | fgrep unbound
devfs                       1.0K    1.0K      0B   100%    /var/unbound/dev
/usr/local/lib/python3.9     23G    2.4G     19G    11%    /var/unbound/usr/local/lib/python3.9
/lib                         23G    2.4G     19G    11%    /var/unbound/lib


root@OPNsense:~ # ps aux | grep -i '[u]nbou'
root   84061   0.0  0.2 25732 15212  -  Ss   21:32    0:00.44 /usr/local/bin/python3 /usr/local/opnsense/scripts/dhcp/unbound_watcher.py --domain lab.patient0.xyz (python3.9)


If I disable CARP on the master node, I'm not able to resolve anything on the internet.
Deciso DEC740

Log is not very helpful. Let's try with manual start then:

# /usr/local/sbin/unbound -dc /var/unbound/unbound.conf


Cheers,
Franco

Thanks again Franco,

Here goes the output:

root@OPNsense:~ # /usr/local/sbin/unbound -dc /var/unbound/unbound.conf
Traceback (most recent call last):
  File "dnsbl_module.py", line 46, in <module>
    import dns.name
  File "/usr/local/lib/python3.9/site-packages/dns/name.py", line 33, in <module>
    if dns._features.have("idna"):
  File "/usr/local/lib/python3.9/site-packages/dns/_features.py", line 67, in have
    if not _version_check(requirement):
  File "/usr/local/lib/python3.9/site-packages/dns/_features.py", line 37, in _version_check
    t_version = _tuple_from_text(version)
  File "/usr/local/lib/python3.9/site-packages/dns/_features.py", line 10, in _tuple_from_text
    text_parts = version.split(".")
AttributeError: 'NoneType' object has no attribute 'split'


Upgraded the master node too and got the same error (rolled back now on master)
Deciso DEC740

Odd, can you try reverting this?

# opnsense-revert -r 24.1.2 py39-dnspython


Cheers,
Franco

March 06, 2024, 11:29:54 PM #8 Last Edit: March 07, 2024, 12:01:32 AM by patient0
Quote from: franco on March 06, 2024, 11:25:05 PM
Odd, can you try reverting this?

# opnsense-revert -r 24.1.2 py39-dnspython

Yep, that solved it. Can I help in debugging the cause? Can I get more debugging information to see the arguments that get passed? I'm not used to Python; I thought the traceback showed the actual values that were passed as arguments, but nope.
Deciso DEC740

I think we have enough to fix this. Thanks a lot.


Cheers,
Franco

I had exactly the same issue after upgrading to 24.1.3_1 yesterday and had to revert unbound.

We will try to address it for 24.1.4.


Cheers,
Franco

This might be a leftover from an earlier installation or from the use of external packages. The place where this crashes in dnspython tries to figure out which version of idna is installed:


import importlib.metadata
importlib.metadata.version('idna')


(https://github.com/rthalley/dnspython/blob/main/dns/_features.py)

When py39-idna is installed, this should return its version number. If it is not installed, the call fails (which the upstream code handles properly). But if there is some sort of empty module, or one pretending to be idna without a version number, it returns None, which is what crashes here.
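
To check for that case on an affected box, something along these lines could list every installed distribution that calls itself idna, with its version and where it lives. This is only a rough sketch: the helper name find_distributions and its search_path parameter are made up for illustration and are not part of dnspython or OPNsense. It reads the metadata fields with .get() so that a missing Version shows up as None instead of raising.

```python
import importlib.metadata


def find_distributions(name, search_path=None):
    """Return (version, location) pairs for every distribution named `name`.

    Hypothetical helper for illustration only. `search_path` is an optional
    list of directories; when omitted, the interpreter's sys.path is searched.
    """
    kwargs = {} if search_path is None else {"path": list(search_path)}
    matches = []
    for dist in importlib.metadata.distributions(**kwargs):
        # .get() returns None for a missing field instead of raising.
        dist_name = dist.metadata.get("Name")
        if dist_name is None or dist_name.lower() != name.lower():
            continue
        version = dist.metadata.get("Version")
        # locate_file("") resolves to the directory holding the metadata dir.
        matches.append((version, str(dist.locate_file(""))))
    return matches


if __name__ == "__main__":
    for version, location in find_distributions("idna"):
        flag = "  <-- no version, suspicious" if version is None else ""
        print(f"idna {version!r} in {location}{flag}")
```

Run with the same interpreter Unbound uses (/usr/local/bin/python3 in the outputs above). An entry whose version prints as None, or more than one entry for idna, would point at leftover or damaged package metadata.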

Best regards,

Ad

Did you guys install anything from the ports tree manually?


Cheers,
Franco

Quote from: franco on March 18, 2024, 02:10:33 PM
Did you guys install anything from the ports tree manually?

Sorry, missed your reply: No, I don't have ports installed at all (and 24.1.4 didn't change anything)
Deciso DEC740