[solved] Dnsmasq failing to start with bind error

Started by OPNenthu, May 18, 2025, 06:13:10 PM

The instructions that I followed came from here: https://tcpip.wtf/en/unifi-l3-adoption-with-dhcp-option-43-on-pfsense-mikrotik-and-others.htm

Since that site recently stopped loading, it can be accessed via the Internet Archive: https://web.archive.org/web/20250422225707/https://tcpip.wtf/en/unifi-l3-adoption-with-dhcp-option-43-on-pfsense-mikrotik-and-others.htm

There, the instructions for OPNsense were to enter the value as type "string" with the surrounding quotes. For pfSense it was again "string", but without quotes.

I'm now confused about why this worked for me in the past while the current recommendation (without quotes) did not. Maybe my switch was blocked due to another issue. I'll do some more tests.

Ideally there would be some logging in UniFi to confirm it, but I'm not aware of any at the moment.
"The power of the People is greater than the people in power." - Wael Ghonim

Site 1 | N5105 | 8GB | 256GB | 4x 2.5GbE i226-v
Site 2 |  J4125 | 8GB | 256GB | 4x 1GbE i210

May 25, 2025, 04:10:51 PM #16 Last Edit: May 25, 2025, 04:16:35 PM by meyergru
Interesting. The site answers for me - and the OPNsense settings as proposed there are plain wrong. I just contacted the author.

Maybe it still worked for you if the DNS name "unifi" resolves to your UniFi controller - or if any other layer 3 mechanism is used to provision your devices. After all, you do not need this at all if the UniFi controller is within the same network (i.e. not routed).
Intel N100, 4 x I226-V, 16 GByte, 256 GByte NVME, ZTE F6005

1100 down / 800 up, Bufferbloat A+

May 25, 2025, 04:29:07 PM #17 Last Edit: May 25, 2025, 04:53:11 PM by OPNenthu
Sounds good, I will switch to the method you and Patrick recommend.  Thanks for the feedback.

Yes, at present my UniFi controller is on the same network as the switch, so this shouldn't have been needed. In the past I had the controller on my "Home" VLAN (in VirtualBox). It's presently in a Proxmox container on the 192.168.1.0/24 network, same as the switch.

Some other glitch was preventing the adoption, then.  I'll play with it later.

As for the site not loading, I found the cause: the resolved IPs belong to Cloudflare, and they are present in one of the public DNS lists I use for DoH blocking.

C:\>nslookup tcpip.wtf
Server:  UnKnown
Address:  192.168.30.1

Non-authoritative answer:
Name:    tcpip.wtf
Addresses:  2606:4700:3030::6815:1001
          2606:4700:3030::6815:6001
          2606:4700:3030::6815:3001
          2606:4700:3030::6815:4001
          2606:4700:3030::6815:2001
          2606:4700:3030::6815:5001
          2606:4700:3030::6815:7001
          104.21.96.1
          104.21.80.1
          104.21.48.1
          104.21.32.1
          104.21.64.1
          104.21.112.1
          104.21.16.1

https://whois.domaintools.com/104.21.32.1
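
For anyone wanting to check the same thing, here is a minimal Python sketch that tests resolved addresses against a blocklist. The function name and the blocklist entries are made up for illustration (the real list contents are not shown in this thread); only the resolved addresses come from the nslookup output above.

```python
import ipaddress


def blocked_addresses(resolved, blocklist):
    """Return the resolved addresses covered by any blocklist entry.

    Entries may be single addresses or CIDR networks.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in blocklist]
    hits = []
    for text in resolved:
        address = ipaddress.ip_address(text)
        # Membership is always False when the IP versions differ (v4 vs v6).
        if any(address in network for network in networks):
            hits.append(text)
    return hits


# Two of the addresses returned by nslookup above.
resolved = ["104.21.32.1", "2606:4700:3030::6815:1001"]
# Hypothetical blocklist entries, NOT taken from any real list.
blocklist = ["104.21.32.1", "192.0.2.53"]

print(blocked_addresses(resolved, blocklist))
```

If any address shows up in the result, a blocklist hit would explain a site that resolves but does not load.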


I haven't noticed any other sites getting blocked until now.

Quote from: meyergru on May 25, 2025, 04:10:51 PM
Maybe it still worked for you if the DNS name "unifi" resolves to your Unifi controller

Just on this point-

The recommended Dnsmasq setup, based on the "Observations" thread with the recent patches, results in short name resolution being deactivated. If the switch is looking for "unifi", this might be a contributing factor.

May 25, 2025, 05:23:37 PM #19 Last Edit: May 25, 2025, 05:27:33 PM by meyergru
That is exactly why I mentioned it explicitly. I go to great lengths to ensure that short name resolution works - like providing DNS search lists to clients, but that does not always help.

For this exact reason I have my unifi controller host entry also register a short "unifi" alias and, on top of that, I have a specific override in Unbound for "unifi" without a domain, which I leave in place despite DNSmasq normally having authority over my VLAN domains.

Actually, for the time being, I have gone back to ISC DHCP and Unbound, because I think that there are still too many bugs with the "new way" to use in a production environment:

  • E.g., for me, the upcoming "local" patch 3b8e4a6 is essential - and then some, because I also need DNSmasq to be authoritative for domains that are not on my local DHCP subnets, and I have them delegated from Unbound, yet I cannot specify that.
  • Also, DNScrypt-Proxy is broken at this time, so I cannot use it as an Unbound alternative.

I had hoped for DNSmasq to solve a lot of the problems of the loosely coupled integration of DHCP, DNS and RA, but I will probably opt for Kea/Unbound once its deficits have been addressed (i.e. DNS for dynamic leases and missing DHCP options).

JFYI-

I observed a difference in how Dnsmasq and ISC are encoding and sending vendor-option 43 for UniFi.  In both cases here I used the recommended input format without quotes.  The outputs in packet capture are pasted below.


Dnsmasq:

Option: (43) Vendor-Specific Information
    Length: 6
    Value: 0104c0a80110


ISC:

Vendor-Option (43), length 6: 1.4.192.168.1.16

When no quotes are used both are sending a 6-byte value, but Dnsmasq is just sending the verbatim input value sans the colons.  ISC is converting the last 4 bytes back into the original IP address and separating with dots.
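
For reference, the 6-byte value breaks down as sub-option code 01, length 04, and then the four bytes of the controller address (c0 a8 01 10 is 192.168.1.16). A minimal Python sketch of that layout, assuming the format used in this thread; the function name is made up for illustration:

```python
import ipaddress


def unifi_option43(controller_ip: str) -> bytes:
    """Build the option 43 payload: code 0x01, length 0x04, IPv4 address bytes."""
    return bytes([0x01, 0x04]) + ipaddress.IPv4Address(controller_ip).packed


payload = unifi_option43("192.168.1.16")
print(payload.hex(":"))  # colon-separated form, as entered without quotes
print(len(payload))      # number of bytes on the wire
```

For 192.168.1.16 this prints 01:04:c0:a8:01:10 and 6, matching the input value and the length seen in the captures.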

If we assume that the ISC format is the correct one, then presumably this option is broken in Dnsmasq (?)

How exactly did you produce these code snippets?

When I use tcpdump to watch what ISC is sending, I also get this:
Vendor-Option (43), length 6: 1.4.217.29.45.77

But I am sure that ISC does send 6 bytes and what you see above is just tcpdump's friendly human readable display. It definitely does not send literal dots.

So how did you get the output for DNSmasq?
Deciso DEC750
People who think they know everything are a great annoyance to those of us who do. (Isaac Asimov)

Ah, maybe the difference is due to Wireshark.  I had used that when I did my initial Dnsmasq tests a few days back, but I am using tcpdump from the OPNsense UI now.

I'll sort this out and post back.

Quote from: OPNenthu on May 27, 2025, 11:20:06 AM
but I am using tcpdump from the OPNsense UI now.

There's tcpdump in the UI? :-)

Never noticed. I always SSH in and use, well, tcpdump.

Yes, there is no difference, apart from different tools producing the dump output. I told @OPNenthu this already: https://github.com/opnsense/core/issues/8620#issuecomment-2907679558

I don't remember if I used Wireshark for those examples, but I will retry anyway and note the tool used. You are most likely correct.

@Patrick:  Interfaces->Diagnostics->Packet Capture.   Limited as compared to the CLI but still convenient.

Alright - I don't need to run through every permutation as I can already see the pattern. You guys are right, it's the difference in how each tool displays the data. Nothing to see here :)

Input Value | DHCP Provider | PCAP tool | Output
01:04:c0:a8:01:10 | ISC | tcpdump | Vendor-Option (43), length 6: 1.4.192.168.1.16
"01:04:c0:a8:01:10" | ISC | tcpdump | Vendor-Option (43), length 17: 48.49.58.48.52.58.99.48.58.97.56.58.48.49.58.49.48
01:04:c0:a8:01:10 | ISC | Wireshark | Option: (43) Vendor-Specific Information, Length: 6, Value: 0104c0a80110
"01:04:c0:a8:01:10" | ISC | Wireshark | Option: (43) Vendor-Specific Information, Length: 17, Value: 30313a30343a63303a61383a30313a3130
01:04:c0:a8:01:10 | Dnsmasq | tcpdump | Vendor-Option (43), length 6: 1.4.192.168.1.16
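
The pattern can be reproduced offline. A rough Python sketch, assuming unquoted input is parsed as hex bytes and quoted input is sent as the literal ASCII characters (which is what the lengths 6 and 17 suggest); the helper names are made up, and the output strings only imitate the two tools' formats:

```python
def parse_unquoted(value: str) -> bytes:
    """Unquoted input: colon-separated hex pairs become raw bytes."""
    return bytes.fromhex(value.replace(":", ""))


def parse_quoted(value: str) -> bytes:
    """Quoted input: the text between the quotes is sent as ASCII characters."""
    return value.strip('"').encode("ascii")


def tcpdump_style(payload: bytes) -> str:
    """Each byte as a decimal number, joined with dots."""
    return f"length {len(payload)}: " + ".".join(str(b) for b in payload)


def wireshark_style(payload: bytes) -> str:
    """Each byte as two hex digits, with no separator."""
    return f"Length: {len(payload)}, Value: {payload.hex()}"


for raw in ("01:04:c0:a8:01:10", '"01:04:c0:a8:01:10"'):
    data = parse_quoted(raw) if raw.startswith('"') else parse_unquoted(raw)
    print(raw)
    print("  tcpdump  :", tcpdump_style(data))
    print("  Wireshark:", wireshark_style(data))
```

The same payload produces both renderings, so a dotted-decimal line and a hex line for one input describe identical bytes on the wire.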


So there's still the mystery of why my switch failed to adopt when I used Dnsmasq and the correct input value.

I know that DNS for local hosts was not resolving correctly at the time, but Option 43 should have been picked up. Maybe local hostname resolution is a critical function for UniFi bring-up regardless of Option 43?  I'll probably find out in due time with more testing.