dnsmasq and query forwarding

Started by tessus, May 25, 2025, 03:22:59 PM

Thanks for testing it, it looked like a neat spot to put it.

Now we should have most bases covered.
Hardware:
DEC740

@meyergru, re post #13: querying dnsmasq directly with "dig" from my laptop (MacBook), specifying the port (53053) and server (192.168.31.1 in my case), worked as expected for short names, fully qualified names, and reverse lookups. I also confirmed that /var/etc/dnsmasq-hosts contains the expected static assignments.

√ ~ % dig -p 53053 @192.168.31.1 kmbpro

; <<>> DiG 9.10.6 <<>> -p 53053 @192.168.31.1 kmbpro
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36073
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kmbpro.                IN    A

;; ANSWER SECTION:
kmbpro.            1    IN    A    192.168.31.20

;; Query time: 9 msec
;; SERVER: 192.168.31.1#53053(192.168.31.1)
;; WHEN: Tue May 27 20:02:40 EDT 2025
;; MSG SIZE  rcvd: 51

dig -p 53053 @192.168.31.1 kmbpro.mgmt.internal

; <<>> DiG 9.10.6 <<>> -p 53053 @192.168.31.1 kmbpro.mgmt.internal
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22263
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kmbpro.mgmt.internal.        IN    A

;; ANSWER SECTION:
kmbpro.mgmt.internal.    1    IN    A    192.168.31.20

;; Query time: 9 msec
;; SERVER: 192.168.31.1#53053(192.168.31.1)
;; WHEN: Tue May 27 20:42:43 EDT 2025
;; MSG SIZE  rcvd: 65

dig -p 53053 @192.168.31.1 -x 192.168.31.20

; <<>> DiG 9.10.6 <<>> -p 53053 @192.168.31.1 -x 192.168.31.20
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10355
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;20.31.168.192.in-addr.arpa.    IN    PTR

;; ANSWER SECTION:
20.31.168.192.in-addr.arpa. 1    IN    PTR    kMBPro.mgmt.internal.

;; Query time: 9 msec
;; SERVER: 192.168.31.1#53053(192.168.31.1)
;; WHEN: Tue May 27 20:02:06 EDT 2025
;; MSG SIZE  rcvd: 89

Running the same queries without pointing directly at dnsmasq (i.e. against the default server on port 53) always fails for short names and reverse lookups. The FQDN works occasionally; otherwise it also fails with NXDOMAIN.
dig kmbpro

; <<>> DiG 9.10.6 <<>> kmbpro
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 53396
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kmbpro.                IN    A

;; AUTHORITY SECTION:
.            1562    IN    SOA    a.root-servers.net. nstld.verisign-grs.com. 2025052702 1800 900 604800 86400

;; Query time: 17 msec
;; SERVER: 192.168.31.1#53(192.168.31.1)
;; WHEN: Tue May 27 20:45:49 EDT 2025
;; MSG SIZE  rcvd: 110

dig kmbpro.mgmt.internal

; <<>> DiG 9.10.6 <<>> kmbpro.mgmt.internal
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 22874
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;kmbpro.mgmt.internal.        IN    A

;; ANSWER SECTION:
kmbpro.mgmt.internal.    1    IN    A    192.168.31.20

;; Query time: 11 msec
;; SERVER: 192.168.31.1#53(192.168.31.1)
;; WHEN: Tue May 27 20:47:56 EDT 2025
;; MSG SIZE  rcvd: 65

√ ~ % dig -x 192.168.31.20   

; <<>> DiG 9.10.6 <<>> -x 192.168.31.20
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 43607
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;20.31.168.192.in-addr.arpa.    IN    PTR

;; AUTHORITY SECTION:
168.192.in-addr.arpa.    10800    IN    SOA    localhost. nobody.invalid. 1 3600 1200 604800 10800

;; Query time: 16 msec
;; SERVER: 192.168.31.1#53(192.168.31.1)
;; WHEN: Tue May 27 20:48:10 EDT 2025
;; MSG SIZE  rcvd: 114

I believe I have things configured consistently with the online doc examples (see screenshots below).

Am I missing something in unbound configuration or is this a possible bug?







N5105  4GB | 250GB | 2x2.5GbE i226-v

May 28, 2025, 08:48:31 AM #17 Last Edit: May 28, 2025, 09:18:54 AM by tessus
Quote from: Monviech (Cedrik) on May 26, 2025, 08:10:58 PM
You can patch it in from the opnsense shell

Ah, nice. Thanks.

One last question: If I replace ISC/Unbound with dnsmasq and use dnsmasq to resolve my local hostnames, will reverse lookup also work? I never got an answer to my previous questions in other topics whether this works or not:

nslookup mymachine.lan.internal returns 192.168.2.5
nslookup 192.168.2.5 returns mymachine.lan.internal.
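
For anyone who wants to check this round trip on their own network, here is a minimal sketch of the test described above: resolve a name to an address, resolve the address back, and compare. The function name round_trip and the injectable forward/reverse parameters are my own additions for illustration; by default it uses whatever resolver the operating system is configured with, so the result depends entirely on your local DNS setup.

```python
import socket
from typing import Callable, Optional


def round_trip(
    name: str,
    forward: Callable[[str], str] = socket.gethostbyname,
    reverse: Callable[[str], str] = lambda ip: socket.gethostbyaddr(ip)[0],
) -> Optional[bool]:
    """Resolve name -> IP -> name and report whether the two names match.

    Returns None if either lookup fails.
    """
    try:
        ip = forward(name)
        back = reverse(ip)
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return None
    # DNS names are case-insensitive and may carry a trailing dot.
    return back.rstrip(".").lower() == name.rstrip(".").lower()


if __name__ == "__main__":
    # Hostname taken from the example above; replace it with one of your own.
    print(round_trip("mymachine.lan.internal"))
```

The comparison ignores case and a trailing dot because a PTR answer may come back in a different case than the name that was asked for, as in the kMBPro.mgmt.internal. answer earlier in this thread.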

P.S.: After all these discussions, maybe it is time for me to re-think my DNS strategy.
Currently I am using my pi-hole cluster directly (or via redirected means) and use opnsense's dns for local address resolution (via conditional forwarding from pihole). I can't recall in detail why I set it up this way years ago, but there must have been a very good reason. I faintly remember it was a workaround for an issue with pihole or opnsense (or a combination of both).
Anyhoo, IMO it is a better architecture to do it the other way around: use opnsense's dns and use the pi-hole cluster as upstream DNS. I guess I have some thinking and planning to do. I probably will change to this architecture when I migrate away from ISC/Unbound.

Thanks for all the interesting discussions around this topic.

May 28, 2025, 09:50:45 AM #18 Last Edit: May 28, 2025, 01:15:04 PM by meyergru
@stumper: Judging from your statement about /var/etc/dnsmasq-hosts, I assume you applied the patches? But the last patch changes something in how this works by adding a "local" flag to the DHCP host entries.

I also had "Require domain", "Do not forward private reverse lookups" and "DHCP authoritative" checked in my DNSmasq settings, but I currently have deactivated DNSmasq and cannot test it. Maybe you should wait until the next release comes out.
 
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

There is a lot happening, and so many patches are lined up that it is getting hard to add anything specific. Waiting for a release is indeed a good choice.

Also, all hosts in the dnsmasq hosts file get turned into A, AAAA and PTR Records automatically by dnsmasq. So things /should/ work automagically.
Hardware:
DEC740

@monviech and @meyergru, thank you for your responses. I will wait for the next release.

kind regards
N5105  4GB | 250GB | 2x2.5GbE i226-v

Just to say that I am having the same issue as @stumper.  DNS requests from Unbound to DNSmasq for local hostnames work intermittently.  I have applied both patches but still the same issue.  By enabling DNS query logging in DNSmasq, I've determined that when the problem occurs, Unbound is not forwarding the local queries to DNSmasq, instead trying to resolve them recursively (which results in NXDOMAIN since they are local names) despite the Unbound Forwarding configuration.  There are no queries logged in the DNSmasq log.  When it works, I do see the queries from Unbound in the DNSmasq log as expected.
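
A rough sketch of how the log check described above could be automated: count the queries dnsmasq logged for the local domain, so that a count of zero during a failure window points at Unbound not forwarding. It assumes log lines contain dnsmasq's "query[TYPE] name from client" text; the sample lines, the host name nas and its address are made up for illustration, and the log prefix on a real system will look different.

```python
import re
from collections import Counter

# Matches the "query[TYPE] name from client" part of a dnsmasq log line.
QUERY_RE = re.compile(
    r"query\[(?P<qtype>[^\]]+)\]\s+(?P<name>\S+)\s+from\s+(?P<client>\S+)"
)


def queries_for_domain(lines, domain):
    """Count logged queries whose name equals `domain` or lies below it."""
    base = domain.strip(".").lower()
    counts = Counter()
    for line in lines:
        match = QUERY_RE.search(line)
        if not match:
            continue  # replies, DHCP messages and other log lines
        name = match.group("name").rstrip(".").lower()
        if name == base or name.endswith("." + base):
            counts[(name, match.group("qtype"))] += 1
    return counts


# Illustrative lines only, not copied from a real log.
SAMPLE = [
    "dnsmasq[1234]: query[A] nas.home.lan from 127.0.0.1",
    "dnsmasq[1234]: query[AAAA] nas.home.lan from 127.0.0.1",
    "dnsmasq[1234]: DHCP nas.home.lan is 192.168.1.10",
    "dnsmasq[1234]: query[A] www.example.com from 127.0.0.1",
]

if __name__ == "__main__":
    for (name, qtype), n in sorted(queries_for_domain(SAMPLE, "home.lan").items()):
        print(f"{name} {qtype}: {n}")
```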

So it seems that the problem is on the Unbound side, not DNSmasq.

Do these queries ask for names that are proper FQDNs including the local domain for which you configured the forwarding? If they ask for unqualified host names the behaviour you observe is expected. Servers do not append search/default domains. The client's resolver library is supposed to do that.
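
To illustrate the point about the client side doing the appending, here is a small sketch of how a stub resolver might expand a short name with a search list before anything reaches the server. It is loosely modelled on the search/ndots behaviour documented for resolv.conf; the function name candidate_names is made up, and real resolver libraries differ in detail.

```python
from typing import List


def candidate_names(name: str, search: List[str], ndots: int = 1) -> List[str]:
    """Return the absolute names a stub resolver would try, in order."""
    if name.endswith("."):
        # Already absolute: the search list is not applied.
        return [name]
    absolute = name + "."
    expanded = [f"{name}.{domain.strip('.')}." for domain in search]
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: try the search list first, the bare name last.
    return expanded + [absolute]


if __name__ == "__main__":
    print(candidate_names("kmbpro", ["mgmt.internal"]))
    print(candidate_names("kmbpro.mgmt.internal", ["mgmt.internal"]))
```

The server only ever sees one of these expanded names per query, which is why an unqualified name sent as-is (for example by dig) cannot match a forwarding rule for the local domain.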
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Yes, FQDNs for my local domain (which I've defined as home.lan).  Sometimes these queries are forwarded properly from Unbound to DNSmasq, other times they are not.  A restart of Unbound temporarily fixes the issue.

Technically in Windows, asking for unqualified host names is OK because Windows automatically appends the DHCP-configured search suffix (in my case home.lan).  But I'm querying the FQDNs anyway just to make sure.

This leads to all sorts of LAN issues.  For example, I have SMB shares configured on my PCs that point to the local name of my NAS storage.  These shares don't work when the DNS lookups aren't working.

@stumper: first try to help and post here, so bear with me

1. Reverse lookups: dig -p 53053 @192.168.31.1 -x 192.168.31.20
Your reverse-resolution forward entries in Unbound are probably wrong: I guess you want to change *.198.* to *.192.in-addr.arpa. Furthermore, you are probably better off with a single 168.192.in-addr.arpa. entry, as I doubt you want to configure this individually at host level in your setup.

2. Forward lookups non-fqdn: dig -p 53053 @192.168.31.1 kmbpro
AFAIK dig doesn't add search domains (as nslookup would), and kmbpro is not the same name as kmbpro.mgmt.internal, so Unbound doesn't know to forward it to dnsmasq:53053 and fails to resolve it on its own. Just try dig kmbpro.mgmt.internal or use nslookup (and make sure the search domains in your resolv.conf are OK).

3. Forward FQDN lookups working just occasionally:
I had the same issue with FQDNs in my setup. Digging a bit in the logs, it seemed as if Unbound and dnsmasq were ping-ponging in some situations until something broke (esp. on AAAA entries). I was able to solve it in my setup by telling dnsmasq not to attempt any further upstream resolution (create a file with the content shown below, then restart dnsmasq):
# cat /usr/local/etc/dnsmasq.conf.d/01_no-resolv.conf
no-resolv

I haven't tried the patches meyergru and Monviech have recommended, but I guess they won't change anything about 1. and 2.
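
Points 1 and 2 both come down to whether the query name falls under a zone that Unbound has been told to forward. Here is a small sketch of that matching: it picks the most specific configured zone that encloses the query name, comparing whole labels. The zone list is an assumption based on the names in this thread (the actual forwarding entries were only posted as screenshots), and the 168.198 zone is a hypothetical example of the kind of typo suspected in point 1.

```python
import ipaddress
from typing import Iterable, List, Optional


def _labels(name: str) -> List[str]:
    """Split a DNS name into lower-case labels, ignoring a trailing dot."""
    return [label for label in name.lower().rstrip(".").split(".") if label]


def matching_zone(qname: str, zones: Iterable[str]) -> Optional[str]:
    """Return the most specific zone that encloses qname, or None."""
    q = _labels(qname)
    best = None
    for zone in zones:
        z = _labels(zone)
        # The zone matches if its labels are a suffix of the query's labels.
        if len(z) <= len(q) and q[len(q) - len(z):] == z:
            if best is None or len(z) > len(_labels(best)):
                best = zone
    return best


def ptr_name(ip: str) -> str:
    """Return the name that a reverse lookup (dig -x) asks for."""
    return ipaddress.ip_address(ip).reverse_pointer + "."


if __name__ == "__main__":
    zones = ["mgmt.internal.", "168.192.in-addr.arpa."]  # assumed forward zones
    typo = ["168.198.in-addr.arpa."]  # hypothetical mistyped zone
    print(matching_zone("kmbpro.mgmt.internal", zones))
    print(matching_zone("kmbpro", zones))
    print(matching_zone(ptr_name("192.168.31.20"), zones))
    print(matching_zone(ptr_name("192.168.31.20"), typo))
```

A name with no matching zone is resolved the normal way, which for private names ends in NXDOMAIN.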

@cinergi I understand why this is a problem. :-) So it looks like a bug. Maybe open an issue on github.

I don't use DNSmasq and probably never will. Kea and Unbound it is for me. I do not understand the motivation to bring in this piece of software, to be honest.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

May 28, 2025, 07:55:30 PM #26 Last Edit: May 28, 2025, 07:58:09 PM by meyergru
I think the problem is with the DNSmasq / Unbound interaction: I have observed that when you ask DNSmasq for names it cannot resolve and which are not considered "local" (i.e. names that DNSmasq does not think it is authoritative for), it returns REFUSED instead of the usual NXDOMAIN.

Alas, Unbound turns those answers into SERVFAIL and then thinks DNSmasq is broken. It then stops asking it for a short while, even though the custom forwarding tells it to do so.

It all depends upon keeping DNSmasq from ever returning REFUSED answers. The problem here is that, e.g., Windows adds a DNS search domain to any DNS name it is asking for. So, if you ask for www.google.com, it may ask DNSmasq for www.google.com.internal. If that name is forwarded to DNSmasq, but not one of the "local" domains, the problem will occur.
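
To make the failure case concrete, here is a toy model of the behaviour described above. It only encodes what was observed in this thread and is not checked against the dnsmasq source; the function names, the forwarded domain "internal" and the local domain "mgmt.internal" are assumptions chosen to match the www.google.com.internal example.

```python
def under(name, domain):
    """True if name equals domain or lies below it (whole labels only)."""
    n = name.lower().rstrip(".")
    d = domain.lower().strip(".")
    return n == d or n.endswith("." + d)


def expected_outcome(qname, forwarded, local, hosts):
    """Predict the answer for qname according to the description above.

    forwarded: domains Unbound forwards to DNSmasq
    local:     domains DNSmasq treats as "local"
    hosts:     lower-case host names DNSmasq knows, without trailing dot
    """
    if not any(under(qname, d) for d in forwarded):
        return "not forwarded to dnsmasq"
    if qname.lower().rstrip(".") in hosts:
        return "NOERROR"
    if any(under(qname, d) for d in local):
        return "NXDOMAIN"
    # Forwarded, unknown and not local: the case that upsets Unbound.
    return "REFUSED"


if __name__ == "__main__":
    forwarded, local = ["internal"], ["mgmt.internal"]
    hosts = {"kmbpro.mgmt.internal"}
    for name in ("kmbpro.mgmt.internal", "nosuch.mgmt.internal",
                 "www.google.com.internal", "www.google.com"):
        print(name, "->", expected_outcome(name, forwarded, local, hosts))
```

In this model the only way to avoid REFUSED is for every forwarded domain to also be covered by a local domain, which matches the reasoning above.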

@Monviech has changed the scheme on how to determine the "local" domains by his latest patch once again. It requires the user to mark at least one DHCP host entry from each forwarded domain to be marked as "local" (a new flag introduced by the commit).

But still: that commit is also not yet part of any release and you have to apply both previous patches in order first, therefore I am not showing how to do it. Just keep your patience and hope it will work. I have switched back to ISC DHCP / Unbound for the time being as I was struck by the same problem.

Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

Why can't we get proper RFC 2136 integration of e.g. Kea and BIND? Problem solved. There is no need to have at least three different DNS servers. BIND does everything a DNS server can do.
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

May 28, 2025, 07:59:48 PM #28 Last Edit: May 28, 2025, 08:01:30 PM by cinergi
Quote from: medivh on May 28, 2025, 07:51:29 PM
3. Forward FQDN lookups working just occasionally:
I had the same issue with FQDNs in my setup. Digging a bit in the logs, it seemed as if Unbound and dnsmasq were ping-ponging in some situations until something broke (esp. on AAAA entries). I was able to solve it in my setup by telling dnsmasq not to attempt any further upstream resolution (create a file with the content shown below, then restart dnsmasq):
# cat /usr/local/etc/dnsmasq.conf.d/01_no-resolv.conf
no-resolv

I haven't tried the patches meyergru and Monviech have recommended, but I guess they won't change anything about 1. and 2.

I think that's exactly what one of those patches does: adds no-resolv to the default DNSmasq configuration in /usr/local/etc/dnsmasq.conf.  Actually I don't know if it's added by those patches, but no-resolv is definitely there in the config file.

May 28, 2025, 08:05:07 PM #29 Last Edit: May 28, 2025, 08:11:07 PM by meyergru
Quote from: medivh on May 28, 2025, 07:51:29 PM
I haven't tried the patches meyergru and Monviech have recommended, but I guess they won't change anything about 1. and 2.

No, they don't - those problems must be addressed separately. And also, the first two patches are superseded by the latest patch, as explained.

@Patrick: I was not recommending DNSmasq (especially not at this time) - I will gladly use Kea once the missing options are built in... ;-)

P.S.: If you can live with upstream DNS servers without a full DNS resolver, just use DNSmasq alone, then none of the discussed interaction problems occurs.
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+