OPNsense Forum

English Forums => General Discussion => Topic started by: Northguy on January 12, 2019, 02:04:22 pm

Title: [solved] how to debug DNS resolve error in Unbound<>Bind setup
Post by: Northguy on January 12, 2019, 02:04:22 pm
Hi All,

Does anyone have suggestions on how to debug a hostname that fails to resolve, even though I am quite certain it exists?

I get a DNS_PROBE_FINISHED_NXDOMAIN error for www.synology-forum.nl, which I am sure exists.

OPNsense Setup:
I have set up Unbound with BIND DNSBL according to https://www.routerperformance.net/opnsense/dnsbl-via-bind-plugin/
This works fine in almost all cases, and the sites I expect to be blocked are actively blocked by the DNSBL.

I expect the synology-forum site mentioned above to be legitimate, but it gets blocked and I do not know how or why.

Checks performed:
1) Checked the Unbound log file, which shows a THROWAWAY error from BIND at 127.0.0.1
2) Checked the BIND DNSBL entries at /usr/local/etc/namedb/dnsbl.inc. The URL is not on any blacklist
3) Checked BIND log file, which results in the log shown below.
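
Check 2 can be scripted so that wildcard entries for parent domains are covered as well as the exact name. A minimal sketch in Python, assuming the blocklist file holds one record per line with the blocked name as the first whitespace-separated field and `;` starting a comment. That layout is an assumption about dnsbl.inc, the sample lines are made up, and `find_blocklist_hits` is a hypothetical helper, not part of the plugin:

```python
def candidate_names(domain):
    """Return the exact name plus the wildcard form of every parent domain."""
    labels = domain.rstrip(".").lower().split(".")
    names = [".".join(labels)]
    for i in range(1, len(labels)):
        names.append("*." + ".".join(labels[i:]))
    return names


def find_blocklist_hits(domain, lines):
    """Return (line number, entry) pairs whose first field would cover the domain."""
    wanted = set(candidate_names(domain))
    hits = []
    for lineno, line in enumerate(lines, start=1):
        fields = line.split()
        if not fields or fields[0].startswith(";"):
            continue  # skip blank lines and comments
        name = fields[0].rstrip(".").lower()
        if name in wanted:
            hits.append((lineno, name))
    return hits


# Made-up sample lines, only to show the call; the real file may look different.
sample = [
    "; illustrative entries",
    "ads.example.com CNAME .",
    "*.synology-forum.nl CNAME .",
]
print(find_blocklist_hits("www.synology-forum.nl", sample))
```

For the real file, the lines could be read with `open("/usr/local/etc/namedb/dnsbl.inc").readlines()` and passed in the same way. An empty result would support the conclusion that the blacklist is not involved.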

If I disable the BIND forward, Unbound resolves the URL without problems.

Big Question: what is causing Bind to not resolve the url?

Code: [Select]
12-Jan-2019 13:37:25.625 query-errors: info: client @0x54c39e2d600 127.0.0.1#32753 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:25.624 query-errors: info: client @0x54c39e2d600 127.0.0.1#52141 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:25.624 query-errors: info: client @0x54c39e2d600 127.0.0.1#27259 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:25.623 query-errors: info: client @0x54c39e2d600 127.0.0.1#54978 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:25.621 query-errors: info: client @0x54c3b1f2000 127.0.0.1#34524 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:10644
12-Jan-2019 13:37:17.951 query-errors: info: client @0x54c3ad0f000 127.0.0.1#10908 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.950 query-errors: info: client @0x54c3ad0f000 127.0.0.1#17174 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.948 query-errors: info: client @0x54c3ad0f000 127.0.0.1#43277 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.948 query-errors: info: client @0x54c3ae7aa00 127.0.0.1#24151 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.947 query-errors: info: client @0x54c3ae7aa00 127.0.0.1#11468 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.946 query-errors: info: client @0x54c3ae7aa00 127.0.0.1#12382 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.946 query-errors: info: client @0x54c3ae78e00 127.0.0.1#18119 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.944 query-errors: info: client @0x54c3ae7aa00 127.0.0.1#47096 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.941 query-errors: info: client @0x54c3ae7aa00 127.0.0.1#25162 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:6086
12-Jan-2019 13:37:17.938 query-errors: info: client @0x54c3b1f7400 127.0.0.1#10239 (www.synology-forum.nl): query failed (SERVFAIL) for www.synology-forum.nl/IN/A at query.c:10644
12-Jan-2019 13:37:17.917 lame-servers: info: host unreachable resolving 'www.synology-forum.nl/A/IN': 2001:9a0:2001:1::53:1#53
12-Jan-2019 13:37:17.916 lame-servers: info: host unreachable resolving 'www.synology-forum.nl/A/IN': 2001:9a0:2003:1::53:3#53
12-Jan-2019 13:37:17.916 lame-servers: info: host unreachable resolving 'www.synology-forum.nl/A/IN': 2001:9a0:2002:1::53:2#53


When inspecting the BIND log in more detail, I see more of these resolution issues for hostnames that are known to exist, such as:
Quote
12-Jan-2019 13:42:06.790   lame-servers: info: broken trust chain resolving '236.28.59.37.in-addr.arpa/PTR/IN': 213.251.188.144#53
12-Jan-2019 13:42:05.045   lame-servers: info: host unreachable resolving 'notepad-plus-plus.org/A/IN': 2603:5:2272::18#53
12-Jan-2019 13:36:50.089   lame-servers: info: broken trust chain resolving '165.225.132.31.in-addr.arpa/PTR/IN': 31.132.224.5#53
12-Jan-2019 13:36:50.026   lame-servers: info: SERVFAIL unexpected RCODE resolving '182.244.72.144.in-addr.arpa/PTR/IN': 198.208.42.12#53
12-Jan-2019 13:36:49.752   lame-servers: info: SERVFAIL unexpected RCODE resolving '182.244.72.144.in-addr.arpa/PTR/IN': 198.208.43.11#53
12-Jan-2019 13:36:49.334   lame-servers: info: host unreachable resolving 'ns2.astra-mir.ru/AAAA/IN': 2001:678:17:0:193:232:128:6#53
12-Jan-2019 13:35:53.186   lame-servers: info: host unreachable resolving 'services.sonarr.tv/A/IN': 2400:cb00:2049:1::adf5:3bb8#53
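
The lame-servers entries above can be tallied by failure reason and by the IP version of the upstream server, which makes patterns easier to spot than reading the log by eye. A rough sketch in Python, assuming the log line layout shown in the excerpts; `summarize_lame_servers` is a hypothetical helper name and the sample lines are copied from this post:

```python
import ipaddress
import re
from collections import Counter

# Matches e.g. "lame-servers: info: host unreachable resolving 'name/A/IN': 2001:db8::1#53"
LAME_RE = re.compile(
    r"lame-servers: info: (?P<reason>.+?) resolving "
    r"'(?P<query>[^']+)': (?P<server>[0-9A-Fa-f:.]+)#\d+"
)


def summarize_lame_servers(lines):
    """Count lame-servers entries by (reason, IP version of the upstream server)."""
    counts = Counter()
    for line in lines:
        match = LAME_RE.search(line)
        if not match:
            continue  # not a lame-servers line
        version = ipaddress.ip_address(match["server"]).version
        counts[(match["reason"], f"IPv{version}")] += 1
    return counts


SAMPLE = [
    "12-Jan-2019 13:42:06.790   lame-servers: info: broken trust chain resolving '236.28.59.37.in-addr.arpa/PTR/IN': 213.251.188.144#53",
    "12-Jan-2019 13:42:05.045   lame-servers: info: host unreachable resolving 'notepad-plus-plus.org/A/IN': 2603:5:2272::18#53",
    "12-Jan-2019 13:36:50.026   lame-servers: info: SERVFAIL unexpected RCODE resolving '182.244.72.144.in-addr.arpa/PTR/IN': 198.208.42.12#53",
    "12-Jan-2019 13:35:53.186   lame-servers: info: host unreachable resolving 'services.sonarr.tv/A/IN': 2400:cb00:2049:1::adf5:3bb8#53",
]

for (reason, family), count in sorted(summarize_lame_servers(SAMPLE).items()):
    print(f"{count:3d}  {family}  {reason}")
```

In the excerpts posted here, every "host unreachable" entry points at an IPv6 address. That may only mean the firewall has no working IPv6 route, in which case BIND would normally fall back to IPv4 servers, so it is something to rule out rather than a confirmed cause.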
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: phoenix on January 12, 2019, 03:02:08 pm
I get a DNS_PROBE_FINISHED_NXDOMAIN error for www.synology-forum.nl, which I am sure exists.
That would indicate the domain does not exist according to your DNS resolver.

If I disable the BIND forward, Unbound resolves the URL without problems.

Big Question: what is causing Bind to not resolve the url?
That would indicate that the website is on a 'blacklist' and therefore gives the error I've quoted above, that's what a blacklist is for. You need to remove that domain name entry from your blacklist.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: Northguy on January 12, 2019, 09:03:39 pm
I get a DNS_PROBE_FINISHED_NXDOMAIN error for www.synology-forum.nl, which I am sure exists.
That would indicate the domain does not exist according to your DNS resolver.

That is quite obvious to me and the reason I started this post.

If I disable the BIND forward, Unbound resolves the URL without problems.

Big Question: what is causing Bind to not resolve the url?
That would indicate that the website is on a 'blacklist' and therefore gives the error I've quoted above, that's what a blacklist is for. You need to remove that domain name entry from your blacklist.

This is not true. It is not in the blacklist. I did check. See my other remarks in my post. Even disabling the blacklists does not make BIND resolve the URL.

2) Checked the BIND DNSBL entries at /usr/local/etc/namedb/dnsbl.inc. the URL is not on any blacklist

It seems that BIND does something strange when contacting the root name server.... It is unclear to me what or why.

Code: [Select]
12-Jan-2019 13:42:06.790   lame-servers: info: broken trust chain resolving '236.28.59.37.in-addr.arpa/PTR/IN': 213.251.188.144#53
12-Jan-2019 13:42:05.045   lame-servers: info: host unreachable resolving 'notepad-plus-plus.org/A/IN': 2603:5:2272::18#53
12-Jan-2019 13:36:50.089   lame-servers: info: broken trust chain resolving '165.225.132.31.in-addr.arpa/PTR/IN': 31.132.224.5#53
12-Jan-2019 13:36:50.026   lame-servers: info: SERVFAIL unexpected RCODE resolving '182.244.72.144.in-addr.arpa/PTR/IN': 198.208.42.12#53
12-Jan-2019 13:36:49.752   lame-servers: info: SERVFAIL unexpected RCODE resolving '182.244.72.144.in-addr.arpa/PTR/IN': 198.208.43.11#53
12-Jan-2019 13:36:49.334   lame-servers: info: host unreachable resolving 'ns2.astra-mir.ru/AAAA/IN': 2001:678:17:0:193:232:128:6#53
12-Jan-2019 13:35:53.186   lame-servers: info: host unreachable resolving 'services.sonarr.tv/A/IN': 2400:cb00:2049:1::adf5:3bb8#53
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on May 30, 2019, 09:56:08 pm
Bumping this thread as I'm having a similar issue.
For a few domain names, DNS won't resolve

Setup is: Unbound forwarding to BIND locally
I've disabled DNSBL on BIND, so no blacklist issues occurring.

I typically get this.

- Do a query:

> dig retail.santander.co.uk
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48179
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

- Check bind logging in GUI:

QUERIES
client 127.0.0.1#44237 (retail.santander.co.uk): query: retail.santander.co.uk IN A +E(0)D

GENERAL
query-errors: info: 127.0.0.1#20066 (retail.santander.co.uk): query failed (timed out) for retail.santander.co.uk/IN/A at query.c:6651

(I assume the following is unrelated, as it occurs for a lot of domains that work okay, but there's an entry for this in named.log:
lame-servers: info: host unreachable resolving 'santander.co.uk/NS/IN': 2001:502:cbe4::33#53
I'd really like to have BIND use IPv4 only, with the '-4' option when starting named. Does anyone know which file to add this to for a test?)

Trying an internet service (i.e. 1.1.1.1): query works.
Disable Unbound forwarding to BIND and get Unbound to resolve it: query works

It does seem BIND is causing the issue here ?  :-\

Thanks.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: mimugmail on May 31, 2019, 06:16:06 am
Can you try a port-forward so Unbound isn't used? This way you can check if it's the forwarding from Unbound to BIND or BIND itself.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on May 31, 2019, 01:08:45 pm
Can you try a port-forward so Unbound isn't used? This way you can check if it's the forwarding from Unbound to BIND or BIND itself.

I didn't try port forward, but have specified the port with dig - assume this is pretty much the same outcome ? (BIND ACL allows queries from localnet)

So, Unbound first (running on default port 53) - and Unbound is not forwarding to BIND.

1. A known working query: OK
Code: [Select]
# dig bing.com

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 34645
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; ANSWER SECTION:
bing.com.       3600    IN  A   204.79.197.200
bing.com.       3600    IN  A   13.107.21.200

Now two queries that I've had problems with: OK

Code: [Select]
# dig retail.santander.co.uk

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35260
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; ANSWER SECTION:
retail.santander.co.uk. 600 IN  CNAME   retail.lbi.santander.uk.
retail.lbi.santander.uk. 600    IN  A   193.127.211.1

Code: [Select]
# dig msecardslive.wip.hdd2.co.uk

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 26233
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; ANSWER SECTION:
msecardslive.wip.hdd2.co.uk. 30 IN  A   162.13.74.201

Second, try BIND (running on port 53530):

First, the query that should work: OK

Code: [Select]
# dig bing.com -p 53530

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 60582
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; ANSWER SECTION:
bing.com.       3600    IN  A   13.107.21.200
bing.com.       3600    IN  A   204.79.197.200

Now the two that have been failing: NOT OK

Code: [Select]
# dig retail.santander.co.uk -p 53530

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48374
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

Code: [Select]
# dig msecardslive.wip.hdd2.co.uk -p 53530

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 21906
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

So, seems Unbound works fine, but BIND has issues ??
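
The side-by-side comparison above can be automated by parsing the status from dig's header lines. Note that SERVFAIL means the resolver failed to obtain an answer, while NXDOMAIN means the name does not exist, so the distinction matters when reading these results. A small sketch in Python, assuming dig's default header format as shown in the transcripts; `dig_status` and `failing_resolvers` are hypothetical helper names and the sample text is copied from this post:

```python
import re

STATUS_RE = re.compile(r"status: (?P<status>[A-Z]+),")
ANSWER_RE = re.compile(r"ANSWER: (?P<count>\d+)")


def dig_status(output):
    """Return (status, answer count) from dig's header lines, or None if missing."""
    status = STATUS_RE.search(output)
    answers = ANSWER_RE.search(output)
    if status is None or answers is None:
        return None
    return status["status"], int(answers["count"])


def failing_resolvers(outputs):
    """Given {label: dig output}, return the labels that did not report NOERROR."""
    failed = []
    for label, text in outputs.items():
        parsed = dig_status(text)
        if parsed is None or parsed[0] != "NOERROR":
            failed.append(label)
    return failed


UNBOUND = """;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35260
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1"""

BIND = """;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48374
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1"""

print(failing_resolvers({"unbound:53": UNBOUND, "bind:53530": BIND}))
```

The dig output for each resolver could be captured to a file or collected with `subprocess.run` and fed into the same function for a longer list of test names.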

Any suggestions ??

Thanks.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: mimugmail on May 31, 2019, 03:55:22 pm
tcpdump on wan with port 53 and check the packets
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 03, 2019, 10:10:26 am
tcpdump on wan with port 53 and check the packets

This is what I get from BIND:

- Query for 'retail.santander.co.uk': SERVFAIL

Code: [Select]
~  dig retail.santander.co.uk -p 53530

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 52774
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

- packet capture

Code: [Select]
08:31:24.806470 IP 12.12.12.12.49314 > 209.112.114.33.53: 14635% [1au] A? retail.santander.co.uk. (63)
08:31:24.815639 IP 209.112.114.33.53 > 12.12.12.12.49314: 14635*- 1/0/1 CNAME retail.lbi.santander.uk. (86)
08:31:24.817146 IP 12.12.12.12.64821 > 209.112.114.33.53: 8337% [1au] NS? lbi.santander.uk. (57)
08:31:24.825590 IP 209.112.114.33.53 > 12.12.12.12.64821: 8337- 0/2/3 (113)
08:31:24.826504 IP 12.12.12.12.54931 > 193.127.252.1.53: 43479% [1au] NS? lbi.santander.uk. (57)
08:31:25.655526 IP 12.12.12.12.63628 > 193.127.253.1.53: 50039% [1au] NS? lbi.santander.uk. (57)
08:31:26.482757 IP 12.12.12.12.57190 > 209.112.114.33.53: 61207% [1au] AAAA? ns2.santander.uk. (57)
08:31:26.482770 IP 12.12.12.12.51897 > 193.127.253.1.53: 52624% [1au] NS? lbi.santander.uk. (57)
08:31:26.483226 IP 12.12.12.12.52548 > 209.112.114.33.53: 33151% [1au] AAAA? ns1.santander.uk. (57)
08:31:26.491568 IP 209.112.114.33.53 > 12.12.12.12.57190: 61207*- 0/1/1 (107)
08:31:26.494318 IP 209.112.114.33.53 > 12.12.12.12.52548: 33151*- 0/1/1 (107)
08:31:27.293382 IP 12.12.12.12.63310 > 193.127.252.1.53: 4092% [1au] NS? lbi.santander.uk. (57)
08:31:28.104648 IP 12.12.12.12.59289 > 193.127.253.1.53: 46038% [1au] NS? lbi.santander.uk. (57)
08:31:28.909510 IP 12.12.12.12.62629 > 193.127.252.1.53: 1571% [1au] NS? lbi.santander.uk. (57)
08:31:29.760404 IP 12.12.12.12.58994 > 193.127.253.1.53: 59123% [1au] NS? lbi.santander.uk. (57)
08:31:31.459345 IP 12.12.12.12.51256 > 193.127.252.1.53: 51945% [1au] NS? lbi.santander.uk. (57)
08:31:33.146285 IP 12.12.12.12.65330 > 193.127.253.1.53: 17122% [1au] NS? lbi.santander.uk. (57)
08:31:34.818622 IP 12.12.12.12.62371 > 193.127.253.1.53: 20675% [1au] A? retail.lbi.santander.uk. (64)
08:31:34.834624 IP 193.127.253.1.53 > 12.12.12.12.62371: 20675*- 1/0/1 A 193.127.211.1 (68)

Note: the dig query returns SERVFAIL before the packet capture shows the A record getting returned.
Looks like it wants to do AAAA lookups for the name servers ns1|ns2.santander.uk first?
And then it wants to do NS queries to the auth DNS servers for lbi.santander.uk. - and the auth DNS servers do not respond to those queries (typical load balancer behaviour!)
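
The observation above can be checked mechanically by pairing each outgoing query in the capture with its response and listing what is left over. A minimal sketch in Python, assuming tcpdump's default one-line DNS output as shown in the capture and a resolver that talks to upstream servers on port 53; `unanswered_queries` is a hypothetical helper name and the sample lines are the first six lines of the capture:

```python
import re

LINE_RE = re.compile(
    r"^\S+ IP (?P<src>\S+)\.(?P<sport>\d+) > (?P<dst>\S+)\.(?P<dport>\d+): "
    r"(?P<id>\d+)\S* (?P<rest>.*)$"
)
QUESTION_RE = re.compile(r"(?P<qtype>[A-Z]+)\? (?P<qname>\S+)")


def unanswered_queries(lines):
    """Return (server, qtype, qname) for each query that never got a response."""
    pending = {}
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        if m["dport"] == "53":
            # Outgoing query: remember it by server, client port and DNS id.
            q = QUESTION_RE.search(m["rest"])
            if q:
                key = (m["dst"], m["sport"], m["id"])
                pending[key] = (m["dst"], q["qtype"], q["qname"])
        elif m["sport"] == "53":
            # Response: drop the matching query, if there is one.
            pending.pop((m["src"], m["dport"], m["id"]), None)
    return list(pending.values())


SAMPLE = [
    "08:31:24.806470 IP 12.12.12.12.49314 > 209.112.114.33.53: 14635% [1au] A? retail.santander.co.uk. (63)",
    "08:31:24.815639 IP 209.112.114.33.53 > 12.12.12.12.49314: 14635*- 1/0/1 CNAME retail.lbi.santander.uk. (86)",
    "08:31:24.817146 IP 12.12.12.12.64821 > 209.112.114.33.53: 8337% [1au] NS? lbi.santander.uk. (57)",
    "08:31:24.825590 IP 209.112.114.33.53 > 12.12.12.12.64821: 8337- 0/2/3 (113)",
    "08:31:24.826504 IP 12.12.12.12.54931 > 193.127.252.1.53: 43479% [1au] NS? lbi.santander.uk. (57)",
    "08:31:25.655526 IP 12.12.12.12.63628 > 193.127.253.1.53: 50039% [1au] NS? lbi.santander.uk. (57)",
]

for server, qtype, qname in unanswered_queries(SAMPLE):
    print(f"no reply from {server} for {qtype} {qname}")
```

Run against the full capture, this should list only the NS queries for lbi.santander.uk sent to 193.127.252.1 and 193.127.253.1, which matches the reading that those servers ignore NS queries.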


- Query for 'retail.santander.co.uk' again: WORKS
(BIND didn't seem to cache the A record from the previous query, so goes direct to the Auth DNS server again and gets the response ??)

Code: [Select]
~  dig retail.santander.co.uk -p 53530

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45405
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;retail.santander.co.uk.        IN  A

;; ANSWER SECTION:
retail.santander.co.uk. 451 IN  CNAME   retail.lbi.santander.uk.
retail.lbi.santander.uk. 600    IN  A   193.127.211.1

Code: [Select]
08:33:53.960238 IP 12.12.12.12.62664 > 193.127.253.1.53: 35357% [1au] A? retail.lbi.santander.uk. (64)
08:33:53.976288 IP 193.127.253.1.53 > 12.12.12.12.62664: 35357*- 1/0/1 A 193.127.211.1 (68)


Try the same with Unbound, works straight away. Seems to take a slightly different approach to resolving the name ??

Code: [Select]
~  dig retail.santander.co.uk

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48307
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 1

;; QUESTION SECTION:
;retail.santander.co.uk.        IN  A

;; ANSWER SECTION:
retail.santander.co.uk. 600 IN  CNAME   retail.lbi.santander.uk.
retail.lbi.santander.uk. 600    IN  A   193.127.211.1

Code: [Select]
08:46:35.933375 IP 12.12.12.12.43775 > 69.36.145.33.53: 2739% [1au] A? retail.santander.co.uk. (51)
08:46:35.947940 IP 69.36.145.33.53 > 12.12.12.12.43775: 2739*- 1/0/1 CNAME retail.lbi.santander.uk. (86)
08:46:35.948862 IP 12.12.12.12.15119 > 209.112.114.33.53: 53600% [1au] A? lbi.santander.uk. (45)
08:46:35.958657 IP 209.112.114.33.53 > 12.12.12.12.15119: 53600- 0/2/3 (113)
08:46:35.959638 IP 12.12.12.12.25646 > 193.127.253.1.53: 15001% [1au] A? retail.lbi.santander.uk. (52)
08:46:35.976688 IP 193.127.253.1.53 > 12.12.12.12.25646: 15001*- 1/0/1 A 193.127.211.1 (68)
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 03, 2019, 10:11:47 am
deleted
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 05, 2019, 10:19:42 am
Can anyone confirm how to test out starting BIND's named daemon with '-4' (IPv4 only) - interested to see if this makes any difference.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: mimugmail on June 05, 2019, 10:30:37 am
via console:

killall named
/usr/local/sbin/named -u bind -c /usr/local/etc/namedb/named.conf -4
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 05, 2019, 11:05:10 am
via console:

killall named
/usr/local/sbin/named -u bind -c /usr/local/etc/namedb/named.conf -4

Thanks - made no difference  >:(

Seems the issue is that BIND is trying to query the auth DNS servers (assuming they're some type of load balancer) to confirm the NS records for 'lbi.santander.uk', but those LBs don't respond to NS/SOA queries.

Code: [Select]
$ dig @193.127.252.1 lbi.santander.uk. NS
;; connection timed out; no servers could be reached

BIND then gives up on the NS query and just asks the load balancer for the original A record query - which works:

Code: [Select]
$ dig @193.127.252.1 retail.lbi.santander.uk.

;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 13187
;; flags: qr aa rd ad; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1
;; WARNING: recursion requested but not available

;; ANSWER SECTION:
retail.lbi.santander.uk. 600 IN A 193.127.211.1

Unbound seems to handle this differently and doesn't get hung up with trying to resolve NS record lookups.
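
To find other domains whose authoritative servers behave this way, the two dig tests above can be repeated from a script that sends a plain DNS query over UDP and reports either the response code or a timeout. A rough sketch in Python using only the standard library, assuming IPv4 servers and no EDNS; `build_query`, `parse_header` and `probe` are hypothetical helper names, and the response ID is not verified here:

```python
import socket
import struct

QTYPES = {"A": 1, "NS": 2, "SOA": 6, "AAAA": 28}
RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype, query_id=0x1234, recursion_desired=False):
    """Build a minimal DNS query packet (header plus one question, class IN)."""
    flags = 0x0100 if recursion_desired else 0
    header = struct.pack("!HHHHHH", query_id, flags, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", QTYPES[qtype], 1)


def parse_header(response):
    """Return (query id, rcode name, answer count) from a DNS response header."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    rcode = flags & 0x000F
    return query_id, RCODES.get(rcode, str(rcode)), ancount


def probe(server, name, qtype, port=53, timeout=3.0):
    """Send one query to an IPv4 server and describe the outcome as text."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_query(name, qtype), (server, port))
        try:
            response, _ = sock.recvfrom(4096)
        except socket.timeout:
            return "timeout"
    _query_id, rcode, ancount = parse_header(response)
    return f"{rcode} ({ancount} answers)"


# Example calls (they need network access, so they are left commented out):
# print(probe("193.127.252.1", "lbi.santander.uk.", "NS"))
# print(probe("193.127.252.1", "retail.lbi.santander.uk.", "A"))
```

A server that times out on NS or SOA but answers A for the same zone would fit the load balancer pattern described above.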

Not sure where to go with this ... I want to use the BIND RPZ feature and other good options, but BIND is failing to resolve certain names which is impacting people !
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: mimugmail on June 05, 2019, 12:20:49 pm
This affects only this domain .. it's not BIND's fault that the NS doesn't reply.
You can also use dnscrypt-proxy if you need DNSBL.
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 05, 2019, 01:26:22 pm
This affects only this domain .. it's not BIND's fault that the NS doesn't reply.
You can also use dnscrypt-proxy if you need DNSBL.

It's not just this domain, I've found others that don't work. Seems to be that domains using load balancers that don't respond to standard NS/SOA queries are affected.

Agree - it's not BIND's fault, it is the load balancers' (auth DNS servers') issue. However, I'm only in control of BIND, not the dodgy load balancers  ;)

Will have a look at dnscrypt-proxy - thanks  :)
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: apiods on June 05, 2019, 02:07:42 pm
You can also use dnscrypt-proxy if you need DNSBL.

Installed dnscrypt-proxy plugin, configured and forwarding Unbound to it. Works a treat so far.  :D
Thanks

Code: [Select]
$ dig retail.lbi.santander.uk -p 5353 +short
193.127.210.129
Title: Re: Help request: how to debug DNS resolve error in Unbound<>Bind setup
Post by: mimugmail on June 05, 2019, 02:13:56 pm
Perfect!  8)