Update: 2025/Mar/14 16:00 hrs
Please see Post 11 for details to recreate this issue
dnsmasq option for "Query DNS servers sequentially" is not working as expected in 25.1.b_20-amd64
A fairly simple setup:
- 192.168.1.111 is a PiHole machine on the same LAN
- 8.8.8.8 is the external Google DNS
The two DNS servers are defined in System > Settings > General in this order:
- 192.168.1.111
- 8.8.8.8
The underlying idea is that OPNsense should first try and resolve the DNS query using PiHole (192.168.1.111) and ONLY if it fails, should then resolve the query using the next DNS server i.e. Google (8.8.8.8)
Working behaviour:
- dnsmasq receives the queries from clients.
- DNS queries are forwarded to 192.168.1.111
- No queries are forwarded to 8.8.8.8 (as 'Query DNS servers sequentially' is set).
- Verified this with dnsmasq logs. All good.
- Just as information, if the 'Query DNS servers sequentially' flag is unset, queries are forwarded to both upstream servers, exactly as expected.
All good so far.
Problematic behaviour:
- Turn the PiHole machine (192.168.1.111) off or remove its network cable (i.e. make PiHole inaccessible)
- dnsmasq should forward query to 192.168.1.111 (It does, all good)
- On failing to resolve the query (i.e. timeout), dnsmasq should now forward the query to 8.8.8.8, but it never does.
- No query is ever sent to 8.8.8.8
- Essentially, all DNS queries from clients now start to fail and dnsmasq never forwards any queries to the next DNS server (8.8.8.8)
As info, this setup was working fine until 24.7 (from what I recall)
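To make the expected behaviour concrete, here is a minimal Python sketch of the failover logic described above. It is only a model of what the option is expected to do, not dnsmasq's actual implementation; `resolve_sequential`, `fake_query` and the answer address 203.0.113.10 are hypothetical and exist only for illustration.

```python
from typing import Callable, Optional, Sequence


def resolve_sequential(
    name: str,
    servers: Sequence[str],
    query: Callable[[str, str], Optional[str]],
) -> Optional[str]:
    """Try each upstream in the configured order; fall through on failure."""
    for server in servers:
        answer = query(server, name)  # None models a timeout / no reply
        if answer is not None:
            return answer
    return None  # every upstream failed


def fake_query(server: str, name: str) -> Optional[str]:
    # Stand-in for a real DNS lookup: the PiHole is "down", Google answers.
    return None if server == "192.168.1.111" else "203.0.113.10"


print(resolve_sequential("example.com", ["192.168.1.111", "8.8.8.8"], fake_query))
```

With the PiHole unreachable, the model falls through to 8.8.8.8 and returns its answer. The problem reported here is that the real setup never takes that second step.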
-----------------------------------------
Additional information:
- Unbound is running on port 53535 (I know it is not needed, but it should not be relevant to this use case)
- Also using a custom dnsmasq config file (/usr/local/etc/dnsmasq.conf.d/0-myfile.conf).
- It contains two entries so that PiHole can identify the client correctly.
add-mac
add-subnet=32,128
Bump:
Anyone else seeing this behaviour in dnsmasq?
Hmm, not that I know of. Also wrong in 24.7.x? At first glance -- if this is an actual issue -- I would still consider Dnsmasq as the culprit.
The only recent change in the binary is https://github.com/opnsense/ports/commit/74191b13c03 but this is a fix for dhcp-relay from upstream itself.
Cheers,
Franco
Hi Franco,
This 'bug/issue' has likely been introduced sometime in 24.7.x, as I was running this setup for many months without any issue and the failover principle always worked.
I then switched over to AGH at some point, so cannot pinpoint in which 24.7.x build this may have crept in.
It is definitely not working as expected in 25.1 beta.
Happy to send any logs or any other information required.
As a side question, is there any plan to build DHCP into dnsmasq itself?
@franco:
Are you aware of the default timeout as used by dnsmasq (in OPNsense) for its forwarded query? Or any way of finding out.
I think there may be an issue with the default timeout or some other part of the code that is causing dnsmasq not to use the next available server (if the first fails).
Hmm, google is full of this particular question if you ask me...
https://community.ui.com/questions/DNS-forwarding-Dnsmasqs-strict-order-option-ignored/ef5e9a3c-e0d5-4e0e-992f-deb8888a40f9
https://unix.stackexchange.com/questions/500900/dnsmasq-dns-nameserver-priority-parameters-strict-order
etc.
Something about reverse order specification? oO
Quote from: gspannu on January 08, 2025, 09:49:35 PM
Hi Franco,
This 'bug/issue' has likely been introduced sometime in 24.7.x, as I was running this setup for many months without any issue and the failover principle always worked.
I then switched over to AGH at some point, so cannot pinpoint in which 24.7.x build this may have crept in.
It is definitely not working as expected in 25.1 beta.
Happy to send any logs or any other information required.
As a side question, is there any plan to build DHCP into dnsmasq itself?
I agree, I believe it came in with 24.7.12. All was fine until I updated then had a dnsmasq issue of some sort. I also noticed that my advanced settings conf file in the dnsmasq.conf.d folder was wiped out after the update. This would have just simplified my client reporting to pihole. Something else must have also happened since my DNS wasn't working at all.
Quote from: opensourcefan on January 18, 2025, 06:49:36 AM
I agree, I believe it came in with 24.7.12. All was fine until I updated then had a dnsmasq issue of some sort. I also noticed that my advanced settings conf file in the dnsmasq.conf.d folder was wiped out after the update. This would have just simplified my client reporting to pihole. Something else must have also happened since my DNS wasn't working at all.
You may be right that something definitely changed around 24.7.x.
I also recall that in earlier versions (24.7.?) the checkbox setting 'Query DNS servers sequentially' did not work at all; the only way to make this work was to write 'strict-order' in a custom conf file.
However, the checkbox setting does work now, but dnsmasq does not use the next server.
There is definitely something that has happened over the last few updates...
Hopefully @Franco or others will look into this.
Sequential server queries continue to work normally for me under 24.7.12
> I agree, I believe it came in with 24.7.12.
Wow, but how? Did you use opnsense-revert to verify, which should be as easy as claiming this here? These blanket statements are not helping this move along.
Quote from: franco on January 20, 2025, 09:35:14 AM
> I agree, I believe it came in with 24.7.12.
Wow, but how? Did you use opnsense-revert to verify, which should be as easy as claiming this here? These blanket statements are not helping this move along.
@franco:
Just an update.
The bug is still present in the recent 24.1.2 update.
Checking the option 'Query DNS servers sequentially' ensures that queries are sent in the order of the specified DNS servers. However, if the first server does not respond, the query just times out, and dnsmasq does not forward the query to the next defined DNS server.
With ISC DHCP planned for deprecation and dnsmasq due to gain more DHCP options in the 25.7 release, may I request that this bug be looked into? Many thanks.
I am happy to test or provide more information...
@franco
I think I have identified the problem....
dnsmasq in OPNsense does not behave as expected if there is a custom .conf file in the `/usr/local/etc/dnsmasq.conf.d` folder.
A simple test to recreate the bug:
1. Add DNS servers to the OPNsense Settings
System > Settings > General > DNS servers
Server: 192.168.99.99
Server: 192.168.22.22
Server: 8.8.4.4
Ensure that the first 2 servers are dummy (i.e. will not respond to any DNS queries) and the 3rd server is a proper DNS server
2. Set strict-order
Services > dnsmasq > Settings
Query DNS servers sequentially - Checked
3. Restart dnsmasq
4. On any client machine, do some nslookup...
e.g. nslookup bbc.com
nslookup google.com
5. After a while, nslookup queries will be resolved.
dnsmasq will try the 1st server, time out, then try the 2nd server, time out, and then finally resolve on 8.8.4.4
Check dnsmasq logs
All working fine as expected until now
Now to recreate the problem
1. Create a custom configuration file in the OPNsense /usr/local/etc/dnsmasq.conf.d/ folder
Create a file e.g. /usr/local/etc/dnsmasq.conf.d/0-custom.conf
Add two simple entries and save the file
add-mac
add-subnet=32,128
2. Restart dnsmasq service
3. Now run the same nslookup test
On any client machine, do a nslookup...
nslookup bbc.com
nslookup google.com
Result:
dnsmasq DOES NOT go to the next server in the sequence
The nslookup query will eventually time out and not resolve.
dnsmasq does not work as expected.
4. Now delete the custom config file from /usr/local/etc/dnsmasq.conf.d/ folder
5. Run the same test again.
Result:
dnsmasq works as expected.
dnsmasq will try the first server in sequence, time out, go to the next one, time out, and will then finally resolve on 8.8.4.4
It appears that dnsmasq does not work as expected when there is a custom configuration file.
dnsmasq was working fine in early versions of 24.7.x, and this incorrect behaviour was introduced sometime later in 24.7.x.
I cannot recall in which exact 24.7.x version this behaviour changed, but dnsmasq used to work fine with custom configurations.
Custom configurations are crucial, as there is no way to send the MAC and IP addresses of requesting clients to upstream DNS servers without the `add-mac` and `add-subnet` directives defined in a custom conf file.
Could I request that this be looked at and addressed please?
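For anyone repeating the test, the nslookup steps above can also be timed with a small stdlib-only Python probe. This is just a sketch: `build_query` and `probe` are hypothetical helper names, the 10 second timeout is an arbitrary choice, and the server address you pass in must be your own OPNsense LAN address.

```python
import socket
import struct
import time


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (recursion desired)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def probe(server: str, name: str, timeout: float = 10.0, port: int = 53):
    """Send one UDP query; return elapsed seconds, or None if no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(build_query(name), (server, port))
            sock.recvfrom(4096)
        except OSError:  # covers the timeout as well as network errors
            return None
        return time.monotonic() - start
```

Calling `probe("<OPNsense LAN IP>", "bbc.com")` with and without the custom conf file in place should show the difference: a slow but successful answer in the working case, and `None` in the failing one.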
> dnsmasq in OPNsense does not behave as expected if there is a custom .conf file in the `/usr/local/etc/dnsmasq.conf.d` folder.
Congratulations, you played yourself?
Cheers,
Franco
Quote from: franco on March 14, 2025, 05:05:19 PM
> dnsmasq in OPNsense does not behave as expected if there is a custom .conf file in the `/usr/local/etc/dnsmasq.conf.d` folder.
Congratulations, you played yourself?
Cheers,
Franco
Hi Franco,
Is there a planned fix for this at some point?
This behaviour was introduced at some point in the 24.7 sub-releases; strict-order used to work as expected with custom configurations earlier.
Thanks for your support...
What exactly makes this an OPNsense bug? You should have the same issue on FreeBSD 14.2 or any Linux distro that runs dnsmasq 2.90_5
I'm not sure why you don't understand that we add custom includes features for people to use because they ask for it but they also own all the changes they do to the system. We're unable to fix something we do not have in code. Your best bet is to ask for your feature to be included into the GUI, but even then it could still be a bug or even documented quirk you're running into with Dnsmasq itself.
Cheers,
Franco
Quote from: franco on March 14, 2025, 05:29:27 PM
I'm not sure why you don't understand that we add custom includes features for people to use because they ask for it but they also own all the changes they do to the system. We're unable to fix something we do not have in code. Your best bet is to ask for your feature to be included into the GUI, but even then it could still be a bug or even documented quirk you're running into with Dnsmasq itself.
Cheers,
Franco
Hi Franco,
I fully understand your logic, and maybe the best way is to get these configs included in the GUI itself (especially since the dnsmasq features are undergoing development within OPNsense).
I think at least the add-mac and add-subnet directives should be added to the dnsmasq GUI. These are likely the most used options for anyone who runs an external PiHole, etc.
What makes it a bit concerning is that this was working in an earlier version of 24.7.x and the change in behaviour was introduced sometime in 2024.
Just to add to the information:
- What I recall is that at some point in 24.7.x, the strict-order checkbox in the GUI was not working (it had no effect), and the only way to get strict-order working was to put the strict-order directive in the custom configuration.
In some 24.7.x sub-release, strict-order in the GUI started working as expected, and my guess is that this is the point at which this behaviour was introduced.
Quote from: newsense on March 14, 2025, 05:28:02 PM
What exactly makes this an OPNsense bug? You should have the same issue on FreeBSD 14.2 or any Linux distro that runs dnsmasq 2.90_5
This issue is not seen in Debian 12 Bookworm (at least in my quick testing)
As said on GitHub, please create an issue with a feature request for these options.
They seem generally useful, yet if you always ask between the lines it won't get added since there's no ticket to solve.
Quote from: Monviech (Cedrik) on March 14, 2025, 08:53:26 PM
As said on GitHub, please create an issue with a feature request for these options.
They seem generally useful, yet if you always ask between the lines it won't get added since there's no ticket to solve.
As advised, I have just created a ticket (https://github.com/opnsense/core/issues/8440) on GitHub requesting the GUI features.
Many thanks for your support.
"strict-order" does not work at all for my case:
server=127.0.0.1#5335
strict-order
server=/mydomain.com/8.8.8.8
I'd expect dnsmasq to only forward queries to 8.8.8.8 for mydomain.com here, but Unbound (running on port 5335) is used as well - mydomain.com also appears in its log file.
Did you do something apart from the above config?
It might also be the combination of conditional forwarding + generic `server` directive, but this setting doesn't seem to be mature enough.
Quote from: cami09 on March 23, 2025, 03:22:56 PM
"strict-order" does not work at all for my case:
server=127.0.0.1#5335
strict-order
server=/mydomain.com/8.8.8.8
I'd expect dnsmasq to only forward queries to 8.8.8.8 for mydomain.com here, but Unbound (running on port 5335) is used as well - mydomain.com also appears in its log file.
Did you do something apart from above config?
It might also be the combination of conditional forwarding + generic `server` directive, but this setting doesn't seem to be mature enough.
I think the dnsmasq settings will (hopefully) all be sorted in the next few OPNsense builds, as dnsmasq support in OPNsense is going through major development.
To assist you in your queries, could you help me understand your setup?
- What is your exact need for strict-order?
If you are trying to specify different servers for different domains, then do this in the GUI itself. You can manage this setting in the GUI and should not require a custom .conf file.
- Also, if you need to specify strict-order, do this in the GUI itself (not in the custom .conf file). OPNsense does not work well with custom definitions of settings that already exist in the GUI.
Share some more details about your end objective, and I will try and help with your setup...
If all goes well I'd expect a dnsmasq update as soon as next week.
https://www.freshports.org/dns/dnsmasq
version 2.91
Fix spurious "resource limit exceeded messages". Thanks to
Dominik Derigs for the bug report.
Fix out-of-bounds heap read in order_qsort().
We only need to order two server records on the ->serial field.
Literal address records are smaller and don't have
this field and don't need to be ordered on it.
To actually provoke this bug seems to need the same server-literal
to be repeated twice, e.g., --address=/a/1.1.1.1 --address-/a/1.1.1.1
which is clearly rare in the wild, but if it did exist it could
provoke a SIGSEGV. Thanks to Daniel Rhea for fuzzing this one.
Fix buffer overflow when configured lease-change script name
is too long.
Thanks to Daniel Rhea for finding this one.
Improve behaviour in the face of non-responsive upstream TCP DNS
servers. Without shorter timeouts, clients are blocked for too long
and fail with their own timeouts.
Set --fast-dns-retries by default when doing DNSSEC. A single
downstream query can trigger many upstream queries. On an
unreliable network, there may not be enough downstream retries
to ensure that all these queries complete.
Improve behaviour in the face of truncated answers to queries
for DNSSEC records. Getting these answers by TCP doesn't now
involve a faked truncated answer to the downstream client to
force it to move to TCP. This improves performance and robustness
in the face of broken clients which can't fall back to TCP.
No longer remove data from truncated upstream answers. If an
upstream replies with a truncated answer, but the answer has some
RRs included, return those RRs, rather than returning and
empty answer.
Fix handling of EDNS0 UDP packet sizes.
When talking upstream we always add a pseudo header, and set the
UDP packet size to --edns-packet-max. Answering queries from
downstream, we get the answer (either from upstream or local
data) If local data won't fit the advertised size (or 512 if
there's not an EDNS0 header) return truncated. If upstream
returns truncated, do likewise. If upstream is OK, but the
answer is too big for downstream, truncate the answer.
Modify the behaviour of --synth-domain for IPv6.
When deriving a domain name from an IPv6 address, an address
such as 1234:: would become 1234--.example.com, which is
not legal in IDNA2008. Stop using the :: compression method,
so 1234:: becomes
1234-0000-0000-0000-0000-0000-0000-0000.example.com
Fix broken dhcp-relay on *BSD. Thanks to Harold for finding
this problem.
Add --dhcp-option-pxe config. This acts almost exactly like
--dhcp-option except that the defined option is only sent when
replying to PXE clients. More importantly, these options are sent
in reply PXE clients when dnsmasq in acting in PXE proxy mode. In
PXE proxy mode, the set of options sent is defined by the PXE standard
and the normal set of options is not sent. This config allows arbitrary
options in PXE-proxy replies. A typical use-case is to send option
175 to iPXE. Thanks to Jason Berry for finding the requirement for
this.
Support PXE proxy-DHCP and DHCP-relay at the same time.
When using PXE proxy-DHCP, dnsmasq supplies PXE information to
the client, which also talks to another "normal" DHCP server
for address allocation and similar. The normal DHCP server may
be on the local network, but it may also be remote, and accessed via
a DHCP relay. This change allows dnsmasq to act as both a
PXE proxy-DHCP server AND a DHCP relay for the same network.
Fix erroneous "DNSSEC validated" state with non-DNSSEC
upstream servers. Thanks to Dominik Derigs for the bug report.
Handle queries with EDNS client subnet fields better. If dnsmasq
is configured to add an EDNS client subnet to a query, it is careful
to suppress use of the cache, since a cached answer may not be valid
for a query with a different client subnet. Extend this behaviour
to queries which arrive a dnsmasq already carrying an EDNS client
subnet.
Handle DS queries to auth zones. When dnsmasq is configured to
act as an authoritative server and has an authoritative zone
configured, and receives a query for that zone _as_forwarder_
it answers the query directly rather than forwarding it. This
doesn't affect the answer, but it saves dnsmasq forwarding the
query to the recursor upstream, which then bounces it back to dnsmasq
in auth mode. The exception should be when the query is for the root
of zone, for a DS RR. The answer to that has to come from the parent,
via the recursor, and will typically be a proof-of-non-existence
since dnsmasq doesn't support signed zones. This patch suppresses
local answers and forces forwarding to the upstream recursor for such
queries. It stops breakage when a DNSSEC validating client makes
queries to dnsmasq acting as forwarder for a zone for which it is
authoritative.
Implement "DNS-0x20 encoding", for extra protection against
reply-spoof attacks. Since DNS queries are case-insensitive,
it's possible to randomly flip the case of letters in a query
and still get the correct answer back.
This adds an extra dimension for a cache-poisoning attacker
to guess when sending replies in-the-blind since it's expected
that the legitimate answer will have the same pattern of upper
and lower case as the query, so any replies which don't can be
ignored as malicious. The amount of extra entropy clearly depends
on the number of a-z and A-Z characters in the query, and this
implementation puts a hard limit of 32 bits to make resource
allocation easy. This about doubles entropy over the standard
random ID and random port combination. This technique can interact
badly with rare broken DNS servers which don't preserve the case
of the query in their reply. The first time a reply is returned
which matches the query in all respects except case, a warning
will be logged. In this release, 0x020-encoding is default-off
and must be explicitly enabled with --do-0x20-encoding. In future
releases it may default on. You can avoid a future release
changing the behaviour of an installation with --no-x20-encode.
Fix a long-standing problem when two queries which are identical
in every repect _except_ case, get combined by dnsmasq. If
dnsmasq gets eg, two queries for example.com and Example.com
in quick succession it will get the answer for example.com from
upstream and send that answer to both requestors. This means that
the query for Example.com will get an answer for example.com, and
in the modern DNS, that answer may not be accepted.
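The DNS-0x20 entry in the changelog above is easier to picture with a toy example. The following Python sketch only illustrates the idea of randomising the case of a query name and rejecting replies that do not echo it; it is not dnsmasq's implementation, and `encode_0x20` and `reply_matches` are hypothetical names.

```python
import random


def encode_0x20(name: str, rng: random.Random) -> str:
    """Randomly flip the case of each letter in a query name."""
    return "".join(
        ch.upper() if rng.random() < 0.5 else ch.lower()
        for ch in name
    )


def reply_matches(sent: str, received: str) -> bool:
    """Accept a reply only if it echoes the exact case pattern we sent."""
    return sent == received


rng = random.Random()  # a real resolver would want a cryptographic RNG
sent = encode_0x20("example.com", rng)
print(sent)                                  # e.g. eXaMPle.CoM
print(reply_matches(sent, sent))             # True: case preserved
print(reply_matches(sent, sent.swapcase()))  # False: pattern differs
```

An upstream that lower-cases the name in its reply would fail the check in this model, which is the interaction with "rare broken DNS servers" that the changelog warns about.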
Appreciate the quick feedback from both of you.
Quote from: gspannu on March 23, 2025, 08:06:00 PM
- What is your exact need for strict-order?
If you are trying to specify different servers for different domains
Yes, exactly.
Conceptually I have two mutually exclusive cases: 1) forward to local Unbound as default, except 2) specific domains get conditional forwarding.
Optimally dnsmasq should only forward to 8.8.8.8 for mydomain.com and not use generic server=127.0.0.1#5335, if there already is a conditional forwarding match.
Unfortunately there seems to be no setting for this, so strict-order is the best option we have right now.
I am using a custom config file /usr/local/etc/dnsmasq.conf.d/custom.conf as well.
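To illustrate what I mean by the two mutually exclusive cases, here is a small Python model of the selection I would expect. It is a sketch of the intended outcome only, not of how dnsmasq actually decides; `pick_upstreams` is a hypothetical name, and the rules simply mirror the two `server` lines from my config.

```python
from typing import Dict, List


def pick_upstreams(qname: str, domain_rules: Dict[str, List[str]],
                   default: List[str]) -> List[str]:
    """Return the upstreams for qname: the longest matching domain rule wins."""
    labels = qname.rstrip(".").lower().split(".")
    # Check "a.b.c", then "b.c", then "c": most specific suffix first.
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in domain_rules:
            return domain_rules[suffix]
    return default


rules = {"mydomain.com": ["8.8.8.8"]}      # server=/mydomain.com/8.8.8.8
default = ["127.0.0.1#5335"]               # server=127.0.0.1#5335

print(pick_upstreams("www.mydomain.com", rules, default))  # ['8.8.8.8']
print(pick_upstreams("example.org", rules, default))       # ['127.0.0.1#5335']
```

In this model a query for mydomain.com never reaches Unbound, which is exactly what I do not see in practice.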
Quote from: gspannu on March 23, 2025, 08:06:00 PM
do this in the GUI itself (not in the custom .conf file). OPNsense does not work well with custom definitions that already exist in the GUI.
Good idea. So basically you are suggesting
- Services -> Dnsmasq DNS -> Query DNS servers sequentially (Enabled)
- Services -> Dnsmasq DNS -> Domains
- First entry
Domain: mydomain.com
IP address: 8.8.8.8
- Second entry
Domain: *
IP address: 127.0.0.1
Port: 5335
I read somewhere about * / asterisk support in the UI, so I'd imagine this entry will mimic server=127.0.0.1#5335?
What would also be nice to know: are entries processed from top to bottom or in reverse? There are conflicting discussions...
Regarding custom config file issues:
Did only /usr/local/etc/dnsmasq.conf.d/0-custom.conf cause problems for you, or also using default /usr/local/etc/dnsmasq.conf?
I definitely still need a custom config file, as dnsmasq's ipset functionality cannot be configured within the OPNsense web interface.
Also thanks for mentioning the changelog, good to know there are continuing improvements. Not sure, though, what the relevant feature/fix for the OP's case is.
Quote from: cami09 on March 24, 2025, 09:02:25 AM
Optimally dnsmasq should only forward to 8.8.8.8 for mydomain.com and not use generic server=127.0.0.1#5335, if there already is a conditional forwarding match.
Why not place the domain override in unbound instead of DNSmasq?
Quote from: Patrick M. Hausen on March 24, 2025, 09:25:23 AM
Why not place the domain override in unbound instead of DNSmasq?
Indeed a good suggestion, thanks! I will investigate this case, too.
The only downside is that the config is now scattered across dnsmasq and Unbound.
To elaborate: there is custom processing logic for specific domains in place, which includes a) forwarding requests to a different DNS server and b) firewall rules for the resolved IPs via ipset.
In dnsmasq it looks like this:
ipset=/mydomain.com/ipset-mydomain-com
server=/mydomain.com/8.8.8.8
With Unbound domain override, filtering logic about mydomain.com etc. would now reside in two different places (not DRY).
Nonetheless it certainly appears to be a reasonable workaround to mitigate dnsmasq issues in short-term.
Btw I am using dnsmasq mainly for its great ipset feature. Unbound now seems to support ipset via compile flag `--enable-ipset` as well: https://fossies.org/linux/unbound/doc/README.ipset.md, https://github.com/NLnetLabs/unbound/blob/master/ipset/ipset.c . Credit to post from different forum: https://community.ipfire.org/t/unbound-with-ipset-support/13140 Maybe that could be useful addition for OPNsense.
You can mess with firewall rules in OPNsense from DNSmasq? WTF? Call me surprised.
Yes, indirectly I would say 😉 . dnsmasq basically leverages BSD PF tables (for OPNsense) to store resolved IPs, which can be referenced in OPNsense firewall rules via external alias.
Great thing is, this allows for domain-based tracking and filtering, no need to deal with hardcoded IPs, or IP white-/blacklists (*that* is the mess in my view).
Huh, isn't ipset a Linux feature for iptables?
--ipset=/<domain>[/<domain>...]/<ipset>[,<ipset>...]
Places the resolved IP addresses of queries for one or more domains in the specified Netfilter IP set. If multiple setnames are given, then the addresses are placed in each of them, subject to the limitations of an IP set (IPv4 addresses cannot be stored in an IPv6 IP set and vice versa). Domains and subdomains are matched in the same way as --address. These IP sets must already exist. See ipset(8) for more details.
And it also works with PF?
I'm asking because I'm interested: where's the documentation showing that ipset and PF work together?
Yes, ipset is part of Linux Netfilter: https://en.wikipedia.org/wiki/Netfilter#ipset
dnsmasq under (Free)BSD just re-uses this name in the config file, but internally implements IP-based storage via BSD's Packet Filter (PF) tables. And OPNsense itself can reference PF tables via an external firewall alias.
Not sure where this is documented (I also wondered). But I can confirm it definitely works. There already was a post in this forum, let me search for it.
// Edit
https://github.com/opnsense/core/issues/4145#issuecomment-1208889357
https://forum.opnsense.org/index.php?topic=27650.0
// Edit 2:
To clarify, as the terminology is a bit confusing: ipset and PF don't work together. Rather, dnsmasq uses a different implementation under BSD (PF tables) than under Linux (ipset) to manage IP sets. dnsmasq on BSD just re-uses the term "ipset" to refer to the general concept of fast IP lookup storage.