Dnsmasq+Unbound observations in 25.1.7

Started by OPNenthu, May 19, 2025, 07:13:28 PM

I looked at the config, but everything I thought could cause this was harmless, like the blocklists, hide identity, hide version. I tried enabling those here and it still worked fine. The only time I saw similar timeout problems was with 25.1.7 when my static reservations were gone.

Ah, interesting, the thing with Windows - I only tried from a Linux client.
Intel N100, 4* I226-V, 2* 82559, 16 GByte, 500 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

May 20, 2025, 07:48:42 PM #31 Last Edit: May 20, 2025, 09:20:04 PM by OPNenthu
OPNsense seems to combine the Unbound 'Query Forwarding' and 'DNS over TLS' configs into one /var/unbound/etc/dot.conf file.  So I fed this to ChatGPT for its opinion.  It made two recommendations.

For reference here is the original file:

server:
  do-not-query-localhost: no

# Forward zones
forward-zone:
  name: "1.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "20.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "30.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "40.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "50.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "60.168.192.in-addr.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "guest.internal."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "h1.home.arpa."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "home.internal."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "iot.internal."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "lab.internal."
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "vpn.internal."
  forward-addr: 127.0.0.1@53053

# Forward zones over TLS
server:
  tls-cert-bundle: /usr/local/etc/ssl/cert.pem

forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 9.9.9.9@853#dns.quad9.net
  forward-addr: 149.112.112.112@853#dns.quad9.net
  forward-addr: 2620:fe::fe@853#dns.quad9.net
  forward-addr: 2620:fe::9@853#dns.quad9.net

Feedback #1:

[Attachment not available]

I have yet to look into whether or not Dnsmasq has these server=/domain/ directives (not entirely clear on that), but this also gave me the idea to clean up the Forwarding in Unbound.  I have condensed it to this (although it didn't help).

server:
  do-not-query-localhost: no

# Forward zones
forward-zone:
  name: "168.192.in-addr.arpa"
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "h1.home.arpa"
  forward-addr: 127.0.0.1@53053
forward-zone:
  name: "internal"
  forward-addr: 127.0.0.1@53053



# Forward zones over TLS
server:
  tls-cert-bundle: /usr/local/etc/ssl/cert.pem

forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 9.9.9.9@853#dns.quad9.net
  forward-addr: 149.112.112.112@853#dns.quad9.net
  forward-addr: 2620:fe::fe@853#dns.quad9.net
  forward-addr: 2620:fe::9@853#dns.quad9.net
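
To illustrate why the condensed zones should cover the same names as the long list, here is a minimal Python sketch of closest-enclosing (longest suffix) zone selection, which is how Unbound picks a forward-zone as far as I understand it. The function names are hypothetical and this is not Unbound's code; it only models label-wise suffix matching, with the trailing dot treated as optional.

```python
from typing import List, Optional, Tuple

# Zone names from the condensed dot.conf above, plus the catch-all "." zone.
ZONES = ["168.192.in-addr.arpa", "h1.home.arpa", "internal", "."]


def labels(name: str) -> Tuple[str, ...]:
    """Lower-case a DNS name and split it into labels; a trailing dot is ignored."""
    name = name.strip().lower().rstrip(".")
    return tuple(name.split(".")) if name else ()


def pick_forward_zone(qname: str, zones: List[str]) -> Optional[str]:
    """Return the zone that is the longest label-wise suffix of qname (hypothetical helper)."""
    q = labels(qname)
    best, best_len = None, -1
    for zone in zones:
        z = labels(zone)
        # The root zone (no labels) encloses every name.
        is_suffix = len(z) == 0 or (len(z) <= len(q) and q[-len(z):] == z)
        if is_suffix and len(z) > best_len:
            best, best_len = zone, len(z)
    return best


for name in ("nas.iot.internal", "100.20.168.192.in-addr.arpa",
             "pc.h1.home.arpa.", "example.com"):
    print(name, "->", pick_forward_zone(name, ZONES))
```

Under this model, every *.internal name and every 192.168.x.x reverse lookup lands on 127.0.0.1@53053, and everything else falls through to the "." zone.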

----

EDIT: I looked in /usr/local/etc/dnsmasq.conf for any 'server=' directives, but found none.  I only see the domains, so I'm not sure what GPT is referring to here.

Near the top of the file there is one reference to 'h1.home.arpa', the system default domain for 192.168.1.0/24.  Then further down there are the ranges that I defined for the VLANs.

...
dhcp-fqdn
domain=h1.home.arpa

...

dhcp-range=tag:vlan0.1,192.168.1.100,192.168.1.199,86400

domain=h1.home.arpa,192.168.1.100,192.168.1.199
dhcp-range=tag:vlan0.20,192.168.20.100,192.168.20.199,86400

domain=guest.internal,192.168.20.100,192.168.20.199
dhcp-range=tag:vlan0.30,192.168.30.100,192.168.30.199,86400

domain=home.internal,192.168.30.100,192.168.30.199
dhcp-range=tag:vlan0.40,192.168.40.100,192.168.40.199,86400

domain=iot.internal,192.168.40.100,192.168.40.199
dhcp-range=tag:vlan0.50,192.168.50.100,192.168.50.199,86400

domain=vpn.internal,192.168.50.100,192.168.50.199
dhcp-range=tag:vlan0.60,192.168.60.100,192.168.60.199,86400

domain=lab.internal,192.168.60.100,192.168.60.199
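
For anyone reading along, here is a rough Python sketch of how I read these lines: each 'domain=<name>,<start>,<end>' entry ties a domain to an address range, and the bare 'domain=' line near the top acts as the default. The fallback to the default for addresses outside every range is my assumption, and the helper names are made up for illustration; this is not how Dnsmasq is implemented.

```python
import ipaddress

# (domain, first address, last address), copied from the domain= lines above.
DOMAIN_RANGES = [
    ("h1.home.arpa", "192.168.1.100", "192.168.1.199"),
    ("guest.internal", "192.168.20.100", "192.168.20.199"),
    ("home.internal", "192.168.30.100", "192.168.30.199"),
    ("iot.internal", "192.168.40.100", "192.168.40.199"),
    ("vpn.internal", "192.168.50.100", "192.168.50.199"),
    ("lab.internal", "192.168.60.100", "192.168.60.199"),
]
DEFAULT_DOMAIN = "h1.home.arpa"  # the unconditional domain= line


def domain_for(ip: str) -> str:
    """Return the domain whose range contains ip, else the default (assumed fallback)."""
    addr = ipaddress.ip_address(ip)
    for domain, start, end in DOMAIN_RANGES:
        if ipaddress.ip_address(start) <= addr <= ipaddress.ip_address(end):
            return domain
    return DEFAULT_DOMAIN


def fqdn(hostname: str, ip: str) -> str:
    """Build the fully qualified name a lease would be registered under (dhcp-fqdn)."""
    return f"{hostname}.{domain_for(ip)}"


print(fqdn("printer", "192.168.20.101"))
print(fqdn("camera", "192.168.40.150"))
```

Note that a reservation outside the listed ranges (for example 192.168.40.10) would get the default domain under this reading, not iot.internal.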

I'm going to ignore this feedback because the issue seems to be with Unbound not forwarding correctly, not with Dnsmasq.

Feedback #2:

[Attachment not available]

I can't do anything about this one.  I tried editing the file directly with 'vi' to combine the server blocks as suggested, however on config reload this file just gets overwritten.  There's no way to persist it, so I can't test.

Is this suggestion correct, or is ChatGPT hallucinating?


----

EDIT: I think this is allowed as per the Unbound manual: https://unbound.docs.nlnetlabs.nl/en/latest/manpages/unbound.conf.html#server-options

Quote:
File Format

There must be whitespace between keywords. Attribute keywords end with a colon ':'. An attribute is followed by a value, or its containing attributes in which case it is referred to as a clause. Clauses can be repeated throughout the file (or included files) to group attributes under the same clause.

... which makes sense, else there would be a lot of broken setups and complaints :)
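
A small Python sketch of what that paragraph implies for the file above: attributes from every 'server:' clause end up grouped together, so two separate blocks should be equivalent to one combined block. This is a toy parser written only for illustration (the function name is hypothetical, and it treats any 'keyword:' line without a value as a clause header); it is not Unbound's parser.

```python
SAMPLE = """\
server:
  do-not-query-localhost: no

# Forward zones
forward-zone:
  name: "internal"
  forward-addr: 127.0.0.1@53053

# Forward zones over TLS
server:
  tls-cert-bundle: /usr/local/etc/ssl/cert.pem

forward-zone:
  name: "."
  forward-tls-upstream: yes
  forward-addr: 2620:fe::fe@853#dns.quad9.net
"""


def collect_server_options(text: str) -> dict:
    """Merge the attributes of every 'server:' clause into one dict (toy parser)."""
    options = {}
    in_server = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        key, _, value = line.partition(":")
        value = value.strip()
        if value == "":
            # A keyword with no value opens a new clause, e.g. "server:".
            in_server = key == "server"
        elif in_server:
            options[key] = value
    return options


print(collect_server_options(SAMPLE))
```

Both server options come out in one group regardless of how many 'server:' blocks they were split across. The authoritative check is still to run unbound-checkconf against the generated configuration.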
"The power of the People is greater than the people in power." - Wael Ghonim

Site 1 | N5105 | 8GB | 256GB | 4x 2.5GbE
Site 2 |  J4125 | 8GB | 256GB | 4x 1GbE

May 20, 2025, 07:52:59 PM #32 Last Edit: May 20, 2025, 08:13:51 PM by OPNenthu
It also said this, but I think this is bullsh*t.  It may be required by RFC for technical correctness, but I know that Unbound handles this internally.

Others don't include a trailing dot (.) and it's working for them.  I tried it anyway.  No difference.

[Attachment not available]

@meyergru could you quickly provide some clarity on ranges?

If we set a DHCP range as per the docs of, say, 192.168.1.10 - 192.168.1.100 for dynamic leases, and then set a reservation up for 'Host A' at address 192.168.1.200 - do we need to set up a separate range but with mode 'static' that incorporates the desired reserved addresses?

The docs suggest we need to for DHCP v4, but the OP suggests they haven't done so and addresses are being set/reserved - what are the side effects of not doing so / why do we need a range set?

Just asking as this is contrary to what people are used to with Kea etc - the direction there was to ensure any reservations were explicitly OUTSIDE of any defined ranges on the DHCP server...

Thanks.

Please create separate threads for separate issues, this one is about tracking the weird dns forwarding issues of the OP.
Hardware:
DEC740

Quote from: Monviech (Cedrik) on May 20, 2025, 09:46:43 PM
Please create separate threads for separate issues, this one is about tracking the weird dns forwarding issues of the OP.

Understood. It's not a separate issue as such; I was just trying to understand whether it might be contributing to this one, since the OP indicated that this is one area where they have deviated from the docs' direction, hence asking the question in this thread. Your reply suggests it is not a factor, so noted. Thanks.

@Taunt9930 - where did you see that mentioned in the docs?

I don't think the UI allows creating static ranges.  You get into a circular error loop if you try (see screenshots).  So the only way to create something 'static' at the moment is to omit both the ending address and the domain part, which makes it unclear.

May 20, 2025, 10:16:52 PM #37 Last Edit: May 20, 2025, 10:20:31 PM by Taunt9930
Quote from: OPNenthu on May 20, 2025, 10:06:00 PM
@Taunt9930 - where did you see that mentioned in the docs?


Second Paragraph of the DHCP Reservations section. But, as indicated above by Cedrik, not considered a contributing factor so a red herring perhaps!


Thanks. Two issues, I think:

1) The doc wording isn't clear.  Maybe they mean to say IPv4 dynamic leases require a DHCP range defined, and the range should be set to static if it only serves reservations.  (?)

2) Might be a UI bug.  If this is indeed required in Dnsmasq, then there's no way to do it.

Good that you brought it up, I'd like to know the answer as well.  In either case I think @Monviech was right earlier that my issue is with Unbound misforwarding things.

Quote from: Taunt9930 on May 20, 2025, 09:41:02 PM
If we set a DHCP range as per the docs of, say, 192.168.1.10 - 192.168.1.100 for dynamic leases, and then set a reservation up for 'Host A' at address 192.168.1.200 - do we need to set up a separate range but with mode 'static' that incorporates the desired reserved addresses?

The docs suggest we need to for DHCP v4, but the OP suggests they haven't done so and addresses are being set/reserved - what are the side effects of not doing so / why do we need a range set?

Just asking as this is contrary to what people are used to with Kea etc - the direction there was to ensure any reservations were explicitly OUTSIDE of any defined ranges on the DHCP server...

From my own experience, if you set up a DHCP range and then set a static host reservation for an IP outside of that range (but still in that same subnet), then the static and dynamic DHCP addressing will work just fine. I haven't found a second, separate "static" range to be necessary for that subnet.

If you're setting up a subnet with only static addresses, then creating a DHCP range with the mode "static" will suffice and does not require you to define a start and end address.

Obviously, for dynamic-only DHCP you can just set up the DHCP range and let dnsmasq pull IPs from that available pool.

The behavior @Drinyth describes is the same as my experience. I have VLANs that are configured with all 3 scenarios (static only, dynamic only and a mix).

May 21, 2025, 12:28:26 AM #41 Last Edit: May 21, 2025, 12:30:21 AM by OPNenthu
I think I found it, but it's not great news: DNS might be broken in 25.1.7.

There was a hint in this post: https://forum.opnsense.org/index.php?topic=47126.msg237665;topicseen#msg237665.  @Monviech could you check the recent commits?


My Starting point:
 - No system name servers in System->Settings->General
 - TLS upstream (Quad9) in Unbound
 - Observed: DNS misrouting and timeouts

Second test:
 - Disable TLS upstreams in Unbound (it should become a recursive resolver)
 - Observed: DNS misrouting and timeouts

Third test:
 - Add a name server to System->Settings->General (I chose 1.1.1.1)
 - Leave "Use System Nameservers" unchecked in Unbound
 - Expected: Unbound continues to act as recursive resolver
 - Observed: Unbound forwards to 1.1.1.1:53 (confirmed in pf logs)
 - Observed: DNS timeouts go away, although external queries still get forwarded to internal zones first.

Fourth test:
 - Leave 1.1.1.1 configured in System->Settings->General
 - Re-enable the TLS upstreams in Unbound
 - Expected: Unbound forwards to DoT, doesn't use 1.1.1.1
 - Observed: Both DoT and system nameservers being used.  Also, external queries still getting forwarded to internal zones.

[Attachment not available]

In conclusion, system nameservers are required to resolve the timeouts, but they are clobbering the DoT forwarding and being used when they shouldn't be.  Also, in all cases, external queries are first forwarded to the system default domain (h1.home.arpa).  None of the other internal zones are being used.

Actually, I have system nameservers defined, but I expected them not to be used when I configured DoT.

Check your pf logs, @meyergru.

Yes, with or without DoT configured, I can see outbound traffic on port 53, even with strict DoT on. Toggling "Use System Nameservers" made no difference. The targets were always my system resolvers. I don't know why this is, because even if DoT is disabled, I would expect Unbound to act as a recursive resolver, not using any upstream servers.

And then - oh, well: https://github.com/opnsense/core/issues/7639 and: https://github.com/NLnetLabs/unbound/issues/451