Connectivity Check: Non-recoverable resolver failure

Started by soko, August 01, 2022, 11:48:57 AM

Hi guys,

While trying to identify the issues I've been having with v22.7 for the last couple of hours, I found a strange message back when my VM was on v22.1.10_4:

Health Check is sound:

***GOT REQUEST TO AUDIT HEALTH***
Currently running OPNsense 22.1.10_4 (amd64/OpenSSL) at Mon Aug  1 10:35:02 CEST 2022
>>> Check installed kernel version
Version 22.1.9 is correct.
>>> Check for missing or altered kernel files
No problems detected.
>>> Check installed base version
Version 22.1.9 is correct.
>>> Check for missing or altered base files
No problems detected.
>>> Check installed repositories
OPNsense
>>> Check installed plugins
No plugins found.
>>> Check locked packages
No locks found.
>>> Check for missing package dependencies
Checking all packages: .......... done
>>> Check for missing or altered package files
Checking all packages: .......... done
>>> Check for core packages consistency
Core package "opnsense" has 66 dependencies to check.
Checking packages: .................................................................... done
***DONE***


Connectivity Check reports "Non-recoverable resolver failure"

***GOT REQUEST TO AUDIT CONNECTIVITY***
Currently running OPNsense 22.1.10_4 (amd64/OpenSSL) at Mon Aug  1 10:36:10 CEST 2022
Checking connectivity for host: pkg.opnsense.org -> 89.149.211.205
PING 89.149.211.205 (89.149.211.205): 1500 data bytes
1508 bytes from 89.149.211.205: icmp_seq=0 ttl=53 time=188.715 ms
1508 bytes from 89.149.211.205: icmp_seq=1 ttl=53 time=51.155 ms
1508 bytes from 89.149.211.205: icmp_seq=2 ttl=53 time=69.108 ms
1508 bytes from 89.149.211.205: icmp_seq=3 ttl=53 time=68.161 ms

--- 89.149.211.205 ping statistics ---
4 packets transmitted, 4 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 51.155/94.285/188.715/54.985 ms
Checking connectivity for repository (IPv4): https://pkg.opnsense.org/FreeBSD:13:amd64/22.1
Updating OPNsense repository catalogue...
Fetching meta.conf: . done
Fetching packagesite.pkg: .......... done
Processing entries: .......... done
OPNsense repository update completed. 799 packages processed.
All repositories are up to date.
Checking connectivity for host: pkg.opnsense.org -> 2001:1af8:4f00:a005:5::
ping6: UDP connect: No route to host
Checking connectivity for repository (IPv6): https://pkg.opnsense.org/FreeBSD:13:amd64/22.1
Updating OPNsense repository catalogue...
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.1/latest/meta.txz: Non-recoverable resolver failure
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.1/latest/packagesite.pkg: Non-recoverable resolver failure
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.1/latest/packagesite.txz: Non-recoverable resolver failure
Unable to update repository OPNsense
Error updating repositories!
***DONE***


Which I don't quite understand, as the ping to pkg.opnsense.org is OK and I can download https://pkg.opnsense.org/FreeBSD:13:amd64/22.1/latest/meta.txz from my clients with no problem.

The upgrade to v22.7 works. But afterwards I have several issues there:

  • "Dnsmasq DNS" does not work anymore (clients don't get DNS resolved). I have to disable it and enable unbound.
  • The /ui/core/firmware#status page takes 5 minutes to show values after a reboot.
  • Health Audit is stuck at "Core package "opnsense" has 63 dependencies to check." with 1 dot of progress for 30 mins.

Are the issues on v22.7 caused by the "Non-recoverable resolver failure" in v22.1?

thx
Soko

I think in general it seems IPv6 is defunct. Have you disabled IPv6 type in your WAN?


Cheers,
Franco

Yeah... I disabled IPv6 "globally" (according to https://www.thomas-krenn.com/de/wiki/OPNsense_IPv6_deaktivieren) back when I first installed OPNsense, as I don't have or need IPv6.

Is missing IPv6 also causing the troubles in v22.7?

Is it enough if I set "IPv6 Configuration Type" on WAN back to DHCPv6 or do I have to "Allow IPv6" in Advanced Settings and redo the IPv6 rules in the Firewall?

thx heaps!

PS: With IPv6 to DHCPv6 the connectivity check says "No route to host" :(

If IPv6 is disabled properly the connectivity audit for IPv6 can be discarded. Do you have issues with repo updates in general or was this just a reference to the connectivity audit error?


Cheers,
Franco

Ahh OK!
So everything after the line "Checking connectivity for host: pkg.opnsense.org -> 2001:1af8:4f00:a005:5::" is IPv6 stuff. Then yes... it can be discarded.
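
For anyone reading along, the split between the IPv4 and IPv6 parts of the audit can be sketched in a few lines of Python. This is only an illustration based on the header lines visible in the output above; the function names are made up for this example and are not part of OPNsense.

```python
def split_audit_sections(log_text):
    """Group audit output lines under their 'Checking connectivity for ...' header."""
    sections = []
    for line in log_text.splitlines():
        if line.startswith("Checking connectivity for"):
            sections.append({"header": line, "lines": []})
        elif sections:
            sections[-1]["lines"].append(line)
    return sections


def is_ipv6_section(header):
    """Guess the address family of a section from its header line."""
    # Repository checks carry an explicit "(IPv4)" or "(IPv6)" label.
    if "(IPv6)" in header:
        return True
    if "(IPv4)" in header:
        return False
    # Host checks show the resolved address after "->"; a colon there means IPv6.
    if "->" not in header:
        return False
    return ":" in header.rsplit("->", 1)[-1]


# Header and result lines copied from the 22.1 audit in the opening post (shortened).
sample = """Checking connectivity for host: pkg.opnsense.org -> 89.149.211.205
4 packets transmitted, 4 packets received, 0.0% packet loss
Checking connectivity for repository (IPv4): https://pkg.opnsense.org/FreeBSD:13:amd64/22.1
All repositories are up to date.
Checking connectivity for host: pkg.opnsense.org -> 2001:1af8:4f00:a005:5::
ping6: UDP connect: No route to host
Checking connectivity for repository (IPv6): https://pkg.opnsense.org/FreeBSD:13:amd64/22.1
Error updating repositories!"""

for section in split_audit_sections(sample):
    family = "IPv6" if is_ipv6_section(section["header"]) else "IPv4"
    print(family, "|", section["header"])
```

With IPv6 disabled on purpose, only the sections tagged IPv4 matter for judging whether the repository is reachable.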

This was just my attempt to figure out the issues I have once I'm on v22.7 (see opening post). The "Check for Updates" there also gets stuck and does nothing (same as the Health Check).

I reckon I should open a separate Post in the 22.7 folder, right?

Gets stuck where? It's not in the OP.


Cheers,
Franco

OK,

Now I've upgraded to v22.7 again.

First of all I had to disable dnsmasq and enable unbound to make the internet work again (nothing else but unticking and ticking the enable checkboxes).

Then I go to System: Firmware: Status
This takes 1 minute until I see something here. In v22.1 it took 2 seconds. Here's the output:

Type opnsense
Version 22.7_4
Architecture amd64
Flavour OpenSSL
Commit 909dcabd5
Mirror https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Repositories OPNsense
Updated on Mon Aug 1 12:44:24 CEST 2022
Checked on N/A


Then I click "Check for Updates" which is stuck here:

***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Mon Aug  1 12:52:51 CEST 2022
Fetching changelog information, please wait... fetch: transfer timed out
Updating OPNsense repository catalogue...
pkg: Repository OPNsense has a wrong packagesite, need to re-create database


fyi: the "timed out" for fetching changelog info comes immediately.
there is no load on the cpu... so I don't think it is doing a re-create.

I've found: https://forum.opnsense.org/index.php?topic=7742.0
# /usr/bin/time configctl firmware check says:
OK  0.05 real    0.04user   0.00sys

While typing (after 5 mins or so) the messages continued:

***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Mon Aug  1 12:52:51 CEST 2022
Fetching changelog information, please wait... fetch: transfer timed out
Updating OPNsense repository catalogue...
pkg: Repository OPNsense has a wrong packagesite, need to re-create database
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Operation timed out
repository OPNsense has no meta file, using default settings


thx
Soko

Hi Soko,

So now what does the connectivity audit on 22.7 say?

It looks like something is going on, but the cause is not the firmware update itself; rather, something in the firewall rules is interfering.


Cheers,
Franco

So,

Finally the update check finished:

***GOT REQUEST TO CHECK FOR UPDATES***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Mon Aug  1 12:52:51 CEST 2022
Fetching changelog information, please wait... fetch: transfer timed out
Updating OPNsense repository catalogue...
pkg: Repository OPNsense has a wrong packagesite, need to re-create database
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/meta.txz: Operation timed out
repository OPNsense has no meta file, using default settings
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.pkg: Operation timed out
pkg: https://pkg.opnsense.org/FreeBSD:13:amd64/22.7/latest/packagesite.txz: Operation timed out
Unable to update repository OPNsense
Error updating repositories!
pkg: Repository OPNsense cannot be opened. 'pkg update' required
Checking integrity... done (0 conflicting)
Your packages are up to date.
***DONE***


The connectivity check says:

***GOT REQUEST TO AUDIT CONNECTIVITY***
Currently running OPNsense 22.7_4 (amd64/OpenSSL) at Mon Aug  1 13:49:14 CEST 2022
Checking connectivity for host: pkg.opnsense.org -> 89.149.211.205
PING 89.149.211.205 (89.149.211.205): 1500 data bytes

--- 89.149.211.205 ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss
Checking connectivity for repository (IPv4): https://pkg.opnsense.org/FreeBSD:13:amd64/22.7
Updating OPNsense repository catalogue...


Which is kinda weird as an "Interfaces: Diagnostics: Ping" with SourceAddress=default fails...
but with SourceAddress=WAN succeeds. LAN too! Also success from the clients.

"Interfaces: Diagnostics: DNS Lookup" also succeeds of course.

Why is something suddenly a problem in v22.7... weird...

Here are my firewall rules.
Nothing in floating or loopback

Everything is being pushed through the NordVPN it seems. Is this perhaps set as the default gateway under System: Gateways: Single?


Cheers,
Franco

Yes it is:

System: Gateways: Single:
NordVPN (active)   WAN   IPv4   254 (upstream)   192.168.179.254   103.86.96.100   ~   ~   ~   Online   

There is another single gateway, but that one is forced offline.
Both of them are in a group, with NordVPN being Tier 1.

The default IPv4 route also points to 192.168.179.254

System: Routes: Log File: shows something weird though:
CLOG (followed by unreadable binary characters)

I've seen this a couple of times in the logs so far

CLOG reads were removed from 22.7 as we haven't written a CLOG file since 21.7.x. This is sort of expected. Clearing the logs, or manually removing the corresponding /var/log/*.log files, can solve this.

IMO you want to check why NordVPN won't let your pings through or do a traceroute to see what happens. It might be an issue with the kernel as well. It's a bit difficult to pinpoint.


Cheers,
Franco

Hmmm... it seems that I'm not even able to ping NordVPN with SourceAddress=default.
What exactly is this SourceAddress=default?

EDIT: Just tried on v22.1: Ping from SourceAddress=default works there! So I think the root problem is this.

As seen below, ping & traceroute to NordVPN and pkg.opnsense.org work fine!?!

So how can this be an issue of my NordVPN gateway?

Interfaces: Diagnostics: Ping
Host: 89.149.211.205
Fails with SourceAddress: default
Succeeds with SourceAddress: LAN, WAN

Interfaces: Diagnostics: Ping
Host: 192.168.179.254 (NordVPN Gateway)
Fails with SourceAddress: default
Succeeds with SourceAddress: LAN, WAN

Interfaces: Diagnostics: Trace Route
Host: 89.149.211.205
Fails with SourceAddress: default
Succeeds with SourceAddress: LAN, WAN
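
To reproduce the comparison above from a shell rather than the GUI, something like the following Python sketch could be used. It rests on assumptions: that "default" in the GUI means no source address is forced (so the kernel picks one), and that the system has a FreeBSD-style ping where -S sets the source address. The helper names are hypothetical, and 192.168.254.253 is simply the address passed to traceroute -s in this thread.

```python
import subprocess


def build_ping_command(host, source=None, count=4):
    """Return the argv list for a FreeBSD-style ping, optionally pinned to a source address."""
    command = ["ping", "-c", str(count)]
    if source is not None:
        # -S selects the source address on FreeBSD ping (Linux ping uses -I instead).
        command += ["-S", source]
    command.append(host)
    return command


def ping_succeeds(host, source=None):
    """Run the ping and report whether it exited with status 0 (at least one reply)."""
    result = subprocess.run(
        build_ping_command(host, source), capture_output=True, text=True
    )
    return result.returncode == 0


target = "89.149.211.205"  # pkg.opnsense.org as resolved in the audit output
for source in (None, "192.168.254.253"):  # None = let the kernel choose the source
    print(source or "default", "->", " ".join(build_ping_command(target, source)))
```

The loop only prints the two commands; calling ping_succeeds() with the same arguments on the firewall itself would show whether the unpinned variant is the one that fails.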

# /usr/sbin/traceroute -w 2 -n  -m '18' -s '192.168.254.253'   '89.149.211.205'
traceroute to 89.149.211.205 (89.149.211.205) from 192.168.254.253, 18 hops max, 40 byte packets
1  192.168.179.254  0.397 ms  0.246 ms  0.249 ms
2  10.8.0.1  60.508 ms  25.678 ms  25.393 ms
3  37.120.155.225  28.335 ms  28.452 ms  33.759 ms
4  176.10.83.68  64.103 ms  105.169 ms  58.466 ms
5  82.102.29.36  29.961 ms
    82.102.29.30  28.370 ms
    82.102.29.36  25.943 ms
6  176.10.82.102  33.783 ms  37.875 ms  37.917 ms
7  89.44.212.109  52.182 ms
    89.44.212.5  47.829 ms  46.059 ms
8  77.243.176.217  46.043 ms  48.031 ms  43.809 ms
9  195.66.225.56  45.818 ms  44.217 ms  49.654 ms
10  31.31.34.20  48.640 ms
    31.31.34.22  45.396 ms  48.409 ms
11  31.31.38.83  47.555 ms  46.046 ms
    31.31.38.119  45.977 ms
12  81.17.35.183  45.335 ms
    81.17.32.77  44.718 ms
    81.17.35.179  49.768 ms
13  81.17.35.73  47.918 ms
    81.17.35.67  47.924 ms
    81.17.35.71  46.359 ms
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
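
A quick way to see where a trace like the one above stops answering is to parse it. The following is a rough Python sketch, not an official tool; the function names are invented here and the sample is hops 12 to 15 copied from the output above.

```python
import re

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")


def parse_hops(text):
    """Map hop number -> list of distinct router addresses that answered."""
    hops = {}
    current = None
    for line in text.splitlines():
        if line.startswith("traceroute to"):
            continue  # header line, not a hop
        match = HOP_LINE.match(line)
        if match:
            current = int(match.group(1))
            hops[current] = []
            rest = match.group(2)
        elif current is not None:
            rest = line  # continuation line: another router answering the same hop
        else:
            continue
        for address in IPV4.findall(rest):
            if address not in hops[current]:
                hops[current].append(address)
    return hops


def last_responding_hop(hops):
    """Return the highest hop number that produced at least one address, or None."""
    answered = [number for number, addresses in hops.items() if addresses]
    return max(answered) if answered else None


sample = """12  81.17.35.183  45.335 ms
    81.17.32.77  44.718 ms
    81.17.35.179  49.768 ms
13  81.17.35.73  47.918 ms
    81.17.35.67  47.924 ms
    81.17.35.71  46.359 ms
14  * * *
15  * * *"""

hops = parse_hops(sample)
last = last_responding_hop(hops)
print("last responding hop:", last, hops[last])
```

Silent hops at the end only mean that nothing replied to the probes there; on their own they do not prove the destination is unreachable.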

Consistent but not great. It would indicate a kernel issue.

Would you mind posting the following as well:

# netstat -rn | grep default


Thanks,
Franco