I'm starting the migration from ISC to Dnsmasq w/ Unbound upstream on OPN 25.1.6_4 and I've quickly hit a bind error:
2025-05-18T11:28:54-04:00 Critical dnsmasq FAILED to start up
2025-05-18T11:28:54-04:00 Critical dnsmasq failed to bind DHCP server socket: Address already in use
Checking sockstat I see that service 'dhcpd' is listening on *:67. I believe this is used by ISC?
root@firewall:~ # sockstat -4 -l
USER COMMAND PID FD PROTO LOCAL ADDRESS FOREIGN ADDRESS
_flowd flowd 68062 3 udp4 127.0.0.1:2056 *:*
root mdns-repea 66175 5 udp4 *:5353 *:*
root mdns-repea 66175 6 udp4 192.168.20.1:5353 *:*
root mdns-repea 66175 8 udp4 192.168.30.1:5353 *:*
root mdns-repea 66175 9 udp4 192.168.40.1:5353 *:*
nobody samplicate 24413 5 udp4 127.0.0.1:2055 *:*
nobody samplicate 24413 6 udp4 *:5269 *:*
root ntpd 26908 21 udp4 *:123 *:*
root ntpd 26908 23 udp4 xx.xxx.xxx.xxx:123 *:* *(public IP - redacted)
root ntpd 26908 27 udp4 127.0.0.1:123 *:*
root ntpd 26908 29 udp4 10.2.2.1:123 *:*
root ntpd 26908 32 udp4 192.168.1.1:123 *:*
root ntpd 26908 36 udp4 192.168.20.1:123 *:*
root ntpd 26908 39 udp4 192.168.30.1:123 *:*
root ntpd 26908 43 udp4 192.168.40.1:123 *:*
root ntpd 26908 46 udp4 192.168.50.1:123 *:*
root ntpd 26908 49 udp4 192.168.60.1:123 *:*
unbound unbound 6410 7 udp4 *:53 *:*
unbound unbound 6410 8 tcp4 *:53 *:*
unbound unbound 6410 11 udp4 *:53 *:*
unbound unbound 6410 12 tcp4 *:53 *:*
unbound unbound 6410 15 udp4 *:53 *:*
unbound unbound 6410 16 tcp4 *:53 *:*
unbound unbound 6410 19 udp4 *:53 *:*
unbound unbound 6410 20 tcp4 *:53 *:*
unbound unbound 6410 21 tcp4 127.0.0.1:953 *:*
dhcpd dhcpd 82937 14 udp4 *:67 *:*
root lighttpd 71873 7 tcp4 *:443 *:*
root sshd 47538 7 tcp4 *:22 *:*
? ? ? ? udp4 *:51820 *:*
I have Dnsmasq set to listen only on the specific interface that I'm migrating and its DNS service is on 53053. Unbound is on port 53 (All interfaces). I get the same error both with and without the "Strict Interface Binding" option under advanced settings. I also tried restarting all services from the console with Option 11.
Is it possible to migrate a live system one interface at a time? I was expecting that if I disable an interface from Services->ISC DHCPv4, then there wouldn't be any conflicts.
Thanks!
If you search for migration from ISC to Kea, you will find multiple threads about how ISC takes over the binding, even on interfaces that are not being "used" by ISC.
Appreciate the hint- I'll look for those.
Without doing the homework, I am guessing this will involve setting a static IP on my PC for some time and killing off dhcpd. Hopefully someone found something more elegant.
I do not get what you are trying to achieve. If you want DNSmasq only for DNS, then you can totally do that by disabling all interfaces for DHCP in the advanced settings and still have ISC DHCP running. If you want DNSmasq's DHCP, then you need to disable ISC completely.
What you cannot do is use both ISC and DNSmasq DHCP at the same time, because ISC cannot select the interfaces it runs on (DNSmasq can!).
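For anyone who wants to see the "Address already in use" rule in isolation, here is a minimal Python sketch. It only illustrates the general socket behaviour (one socket holding a wildcard bind blocks a second bind on the same port); it is not how dhcpd or Dnsmasq actually open their sockets. The function name `udp_port_in_use` is made up for this example, and it uses an unprivileged port instead of 67.

```python
import errno
import socket


def udp_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding a UDP socket to host:port fails with EADDRINUSE."""
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        probe.bind((host, port))
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return True
        raise  # e.g. EACCES when probing port 67 without root
    finally:
        probe.close()
    return False


if __name__ == "__main__":
    # Stand-in for dhcpd: hold a wildcard (*:port) bind on an unprivileged port.
    holder = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    holder.bind(("0.0.0.0", 0))  # port 0 lets the OS pick a free port
    port = holder.getsockname()[1]
    # A second server asking for one specific address on the same port.
    print("in use:", udp_port_in_use(port, "127.0.0.1"))
    holder.close()
```

On the firewall itself, `sockstat -4 -l` as shown in the first post remains the simpler check.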
Quote from: meyergru on May 18, 2025, 09:07:23 PM
What you cannot do is use both ISC and DNSmasq DHCP at the same time, because ISC cannot select the interfaces it runs on (DNSmasq can!).
I wasn't aware. I followed the example in the guide which enables Dnsmasq on port 53053 as a first step before adding configurations to it, so I assumed that the migration entails having both DHCP servers active for some time.
As I understand it now I will have to completely disable ISC first. But, won't that disrupt the network? (Anyway I can do it while the family is offline)
EDIT: I guess another option is to fully configure the DHCP ranges in Dnsmasq first and then enable / cut over to it in one shot.
Never mind on adding the DHCP ranges while Dnsmasq is disabled. I don't know if it's by design or a bug, but it's not possible to add DHCP ranges while the service is disabled. The interfaces aren't listed there, even if they are selected in General.
So maybe it will be necessary to fully disable DHCP services during the configuration.
EDIT: Geeez, I'm a fool. My web session was timed out but I didn't know it until I navigated to a different section of the UI and then only got the login prompt.
Interfaces are showing in DHCP ranges tab now.
Quote from: OPNenthu on May 19, 2025, 02:20:30 AM
EDIT: I guess another option is to fully configure the DHCP ranges in Dnsmasq first and then enable / cut over to it in one shot.
Yup, and if you have lots of DNS entries and DHCP reservations, like me: https://github.com/meyergru/iscdhcp_to_dnsmasq
I had less than a dozen static leases so got those done OK enough through the GUI.
BTW, thank you for your contributions together with Ad & Franco on the DHCP vendor options expansion for Dnsmasq (https://github.com/opnsense/core/issues/8620). I found Option 43 after upgrading to 25.1.7 and just copy/pasted the string value (with the quotes) as it was in ISC DHCPv4 for my UniFi controller address. Hopefully this is the expected format in Dnsmasq as well.
unifi_option_43.png
For anyone else needing to do this, it should be prefix "01:04:" + the controller IPv4 address converted to hex and separated with colons. In my case it's at 192.168.1.16 = 0xC0A80110.
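If you don't want to do the hex conversion by hand, a small Python sketch like the one below builds the string. The function name `unifi_option_43` is just something made up for this example; the format (prefix 01:04: plus the IPv4 address in hex) is the one described above.

```python
import ipaddress


def unifi_option_43(controller_ip: str) -> str:
    """Build the colon-separated hex value: code 01, length 04, then the IPv4 bytes."""
    packed = ipaddress.IPv4Address(controller_ip).packed  # 4 raw bytes
    payload = bytes([0x01, len(packed)]) + packed
    return ":".join(f"{b:02x}" for b in payload)


print(unifi_option_43("192.168.1.16"))  # prints 01:04:c0:a8:01:10
```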
Leases are working now and I see all the options included in the Wireshark capture, so the DHCP part of Dnsmasq is fine. As I posted in the other thread though, I'm working through DNS resolution issues now. Thanks for the help here.
I use it the same way, without the quotes, and I verified it to work, but it is passed on directly to DNSmasq, so you should be good.
Actually I think you're right not to use the quotes. With them, the value is not passed through transparently in the packet capture:
Option: (43) Vendor-Specific Information
Length: 17
Value: 30313a30343a63303a61383a30313a3130
Without the quotes:
Option: (43) Vendor-Specific Information
Length: 6
Value: 0104c0a80110
Ah, so with the quotes it is interpreted as a text string, not handled as a binary string, then.
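The two captures can be reproduced with a few lines of Python. This is only a sketch of the two interpretations (hex digits versus literal ASCII text); that the quoted input takes the text path is inferred from the captures above, not from the Dnsmasq or OPNsense source. The helper names are made up for this example.

```python
def as_hex_bytes(value: str) -> bytes:
    """Treat the input as colon-separated hex digits (the unquoted case)."""
    return bytes.fromhex(value.replace(":", ""))


def as_ascii_text(value: str) -> bytes:
    """Treat the input as literal text (what the quoted case appears to do)."""
    return value.encode("ascii")


value = "01:04:c0:a8:01:10"
binary = as_hex_bytes(value)
text = as_ascii_text(value)
print(len(binary), binary.hex())  # 6 0104c0a80110
print(len(text), text.hex())      # 17 30313a30343a63303a61383a30313a3130
```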
@meyergru FYI- I commented on https://github.com/opnsense/core/issues/8620 as the Dnsmasq offer for Option 43 is not allowing my UniFi switch to adopt after reboot.
I am a little confused right now.
Unifi's specification on how this should work lacks substance: https://help.ui.com/hc/en-us/articles/204909754-Remote-Adoption-Layer-3
I always assumed that the option 43 should be specified using RFC 3925, so it must be 6 bytes of Vendor (01), Length (04), followed by 4 bytes IP address, so, say 192.168.1.16 (c0 a8 01 10), however this must be specified for the specific tool.
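To make the 6-byte layout concrete, here is a minimal Python sketch that decodes such a payload back into the controller address. The function name `parse_unifi_option_43` is hypothetical, and the check for code 01 and length 04 reflects only the layout described in this thread, not a complete parser for vendor options.

```python
import ipaddress


def parse_unifi_option_43(payload: bytes) -> ipaddress.IPv4Address:
    """Decode code (01), length (04) and a 4-byte IPv4 address from the option value."""
    if len(payload) < 2:
        raise ValueError("payload too short for a code/length header")
    code, length = payload[0], payload[1]
    value = payload[2:2 + length]
    if code != 0x01 or length != 4 or len(value) != 4:
        raise ValueError(f"not a UniFi controller sub-option: {payload.hex()}")
    return ipaddress.IPv4Address(value)


print(parse_unifi_option_43(bytes.fromhex("0104c0a80110")))  # prints 192.168.1.16
```

Feeding it the 17-byte quoted variant raises ValueError, because the first byte is then 0x30 (the ASCII character "0") rather than 0x01.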
For ISC DHCP, I always used 01:04:c0:a8:01:10 as specification, but that does not say much, because I also have the DNS name "unifi" defined.
You and Patrick (https://forum.opnsense.org/index.php?topic=19726.0) say that "01:04:c0:a8:01:10" works in ISC DHCP, resulting in 17 bytes instead of 6, namely exactly the content of the string between the double quotes?
Yet, I found multiple sources on the internet saying that 6 bytes is correct:
https://community.ui.com/questions/UniFi-FAQ-the-missing-manual-and-beyond/110f3e4a-9994-42b3-a7f0-1ef1eb55c093#answer/e967649e-4dc8-4f39-ba46-fcfa79dba9de
https://www.lancom-forum.de/fragen-zu-lancom-systems-voip-router-f42/dhcp-option-43-fuer-unifi-t19360.html
https://ubiquiti-networks-forum.de/board/thread/4664-dhcp-option-43-am-lancom/
That should be specified without quotes for ISC DHCP and for DNSmasq, as well.
The quotation marks in that old post from 2020 are just that - indicating that this is a quotation. Of course I put the value in the field without them. See screenshot. It's 6 bytes that are sent to the clients.
If you read the whole thread you'll find I was confused about the type "string" combined with 6 hex values separated by colons.
Thanks, Patrick for the clarification. So my premise was correct. 6 bytes it is, and it has to be given without quotes to both ISC DHCP and DNSmasq. I commented to that extent here already: https://github.com/opnsense/core/issues/8620#issuecomment-2907679558
The instructions that I followed came from here: https://tcpip.wtf/en/unifi-l3-adoption-with-dhcp-option-43-on-pfsense-mikrotik-and-others.htm
Since that site very recently stopped loading, we can access it from Internet archive: https://web.archive.org/web/20250422225707/https://tcpip.wtf/en/unifi-l3-adoption-with-dhcp-option-43-on-pfsense-mikrotik-and-others.htm
There the instructions for OPNsense were to input as type "string" with the surrounding quotes. For pfSense it was again "string" but without quotes.
I'm confused now why this worked for me in the past, and the current recommendation (without quotes) did not. Maybe my switch was blocked due to another issue. I'll do some more tests.
Ideally, we should have some logging in UniFi to confirm it but I'm not aware of it at the moment.
Interesting. The site answers for me - and the OPNsense settings as proposed there are plain wrong. I just contacted the author.
Maybe it still worked for you if the DNS name "unifi" resolves to your Unifi controller - or if any other layer3 mechanism is used to provision your devices. After all, you do not need this at all if the Unifi controller is within the same network (i.e. not routed).
Sounds good, I will switch to the method you and Patrick recommend. Thanks for the feedback.
Yes at present my UniFi controller is on the same net as the switch so this shouldn't have been needed. In the past I had the controller on my "Home" VLAN (in VirtualBox). It's presently in a Proxmox container on the 192.168.1.0/24 network, same as the switch.
Some other glitch was preventing the adoption, then. I'll play with it later.
As for the site not loading, I found it: the resolved IPs belong to Cloudflare and these are present in one of the public DNS lists I use for DoH blocking.
C:\>nslookup tcpip.wtf
Server: UnKnown
Address: 192.168.30.1
Non-authoritative answer:
Name: tcpip.wtf
Addresses: 2606:4700:3030::6815:1001
2606:4700:3030::6815:6001
2606:4700:3030::6815:3001
2606:4700:3030::6815:4001
2606:4700:3030::6815:2001
2606:4700:3030::6815:5001
2606:4700:3030::6815:7001
104.21.96.1
104.21.80.1
104.21.48.1
104.21.32.1
104.21.64.1
104.21.112.1
104.21.16.1
https://whois.domaintools.com/104.21.32.1
block.png
I haven't noticed any other sites getting blocked until now.
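For checking this kind of thing quickly, a Python sketch along these lines compares resolved addresses against blocklist networks. The two CIDR entries are illustrative assumptions chosen to cover the addresses in the nslookup output; they are not copied from the actual list I use, and the function name is made up.

```python
import ipaddress

# Illustrative entries only; a real DoH blocklist would be loaded from a file or URL.
BLOCKLIST = [
    ipaddress.ip_network("104.16.0.0/13"),
    ipaddress.ip_network("2606:4700::/32"),
]


def blocked_addresses(addresses, blocklist=BLOCKLIST):
    """Return the addresses that fall inside any blocklist network."""
    hits = []
    for text in addresses:
        addr = ipaddress.ip_address(text)
        # Compare only within the same IP version (IPv4 vs IPv6).
        if any(addr.version == net.version and addr in net for net in blocklist):
            hits.append(text)
    return hits


resolved = ["104.21.32.1", "2606:4700:3030::6815:1001", "192.168.30.1"]
print(blocked_addresses(resolved))
```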
Quote from: meyergru on May 25, 2025, 04:10:51 PM
Maybe it still worked for you if the DNS name "unifi" resolves to your Unifi controller
Just on this point-
The recommended setup for Dnsmasq, based on the "Observations" thread with the recent patches, results in short name resolution being deactivated. If the switch is looking for "unifi" this might be a contributing factor.
That is exactly why I mentioned it explicitly. I go to great lengths to ensure that short name resolution works - like providing DNS search lists to clients, but that does not always help.
For this exact reason I have my unifi controller host entry also register a short "unifi" alias and on top, I have a specific override in Unbound for "unifi" without a domain and leave it in place despite DNSmasq normally having the authority over my VLAN domains.
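A quick way to test from a client whether the short name resolves at all is a sketch like this one in Python. It goes through the system resolver, so the client's search list applies. The name "unifi.home.arpa" is a hypothetical FQDN for comparison, and the `resolver` parameter exists only so the check can be exercised without a live DNS server.

```python
import socket


def resolve_or_none(name, resolver=socket.gethostbyname):
    """Return the resolved IPv4 address as a string, or None if lookup fails."""
    try:
        return resolver(name)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


def check_names(names, resolver=socket.gethostbyname):
    """Map each name to its resolved address (or None)."""
    return {name: resolve_or_none(name, resolver) for name in names}


if __name__ == "__main__":
    # "unifi.home.arpa" is a placeholder; substitute your own domain.
    for name, addr in check_names(["unifi", "unifi.home.arpa"]).items():
        print(name, "->", addr or "does not resolve")
```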
Actually, for the time being, I have gone back to ISC DHCP and Unbound, because I think that there are still too many bugs with the "new way" to use in a production environment:
- E.g., for me, the upcoming "local" patch 3b8e4a6 is essential - and then some, because I also need DNSmasq to be authoritative for domains that are not on my local DHCP subnets, and I have them delegated from Unbound, yet I cannot specify that.
- Also, DNScrypt-Proxy is broken at this time (https://github.com/opnsense/plugins/pull/4698), so I cannot use it as an Unbound alternative.
I had hoped for DNSmasq to solve a lot of problems of the loosely-coupled integration of DHCP, DNS and RA, but I probably will opt for Kea/Unbound, once its deficits have been addressed (i.e. DNS for dynamic leases and missing DHCP options).
JFYI-
I observed a difference in how Dnsmasq and ISC are encoding and sending vendor-option 43 for UniFi. In both cases here I used the recommended input format without quotes. The outputs in packet capture are pasted below.
Dnsmasq:
Option: (43) Vendor-Specific Information
Length: 6
Value: 0104c0a80110
ISC:
Vendor-Option (43), length 6: 1.4.192.168.1.16
When no quotes are used both are sending a 6-byte value, but Dnsmasq is just sending the verbatim input value sans the colons. ISC is converting the last 4 bytes back into the original IP address and separating with dots.
If we assume that the ISC format is the correct one, then presumably this option is broken in Dnsmasq (?)
How exactly did you produce these code snippets?
When I use tcpdump to watch what ISC is sending, I also get this:
Vendor-Option (43), length 6: 1.4.217.29.45.77
But I am sure that ISC does send 6 bytes and what you see above is just tcpdump's friendly human readable display. It definitely does not send literal dots.
So how did you get the output for DNSmasq?
Ah, maybe the difference is due to Wireshark. I had used that when I did my initial Dnsmasq tests a few days back, but I am using tcpdump from the OPNsense UI now.
I'll sort this out and post back.
Quote from: OPNenthu on May 27, 2025, 11:20:06 AM
but I am using tcpdump from the OPNsense UI now.
There's tcpdump in the UI? :-)
Never noticed. I always SSH in and use, well, tcpdump.
Yes, there is no difference, apart from different tools producing the dump output. I told @OPNenthu this already: https://github.com/opnsense/core/issues/8620#issuecomment-2907679558
I don't remember if I used Wireshark for those examples, but I will anyway retry and mark down the tool used. You are most likely correct.
@Patrick: Interfaces->Diagnostics->Packet Capture. Limited as compared to the CLI but still convenient.
Alright- I don't need to run through every permutation as I can already see the pattern. You guys are right, it's the difference in how each tool displays the bytes. Nothing to see here :)
Input Value         | DHCP Provider | PCAP tool | Output
01:04:c0:a8:01:10   | ISC           | tcpdump   | Vendor-Option (43), length 6: 1.4.192.168.1.16
"01:04:c0:a8:01:10" | ISC           | tcpdump   | Vendor-Option (43), length 17: 48.49.58.48.52.58.99.48.58.97.56.58.48.49.58.49.48
01:04:c0:a8:01:10   | ISC           | Wireshark | Option: (43) Vendor-Specific Information, Length: 6, Value: 0104c0a80110
"01:04:c0:a8:01:10" | ISC           | Wireshark | Option: (43) Vendor-Specific Information, Length: 17, Value: 30313a30343a63303a61383a30313a3130
01:04:c0:a8:01:10   | Dnsmasq       | tcpdump   | Vendor-Option (43), length 6: 1.4.192.168.1.16
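The pattern can be reproduced with a short Python sketch: the same bytes, rendered once as dotted decimal and once as a hex string. The two functions only mimic the formatting seen in the outputs above; they are not taken from tcpdump or Wireshark, and their names are made up.

```python
def tcpdump_style(payload: bytes) -> str:
    """Dotted decimal, one number per byte, as in the tcpdump lines."""
    return ".".join(str(b) for b in payload)


def wireshark_style(payload: bytes) -> str:
    """Plain lowercase hex, as in the Wireshark 'Value:' field."""
    return payload.hex()


unquoted = bytes.fromhex("0104c0a80110")  # value entered without quotes
quoted = b"01:04:c0:a8:01:10"             # value entered with quotes (sent as text)

for payload in (unquoted, quoted):
    print(len(payload), tcpdump_style(payload), wireshark_style(payload))
```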
So there's still the mystery of why my switch failed to adopt when I used Dnsmasq and the correct input value.
I know that DNS for local hosts was not resolving correctly at the time, but Option 43 should have been picked up. Maybe local hostname resolution is a critical function for UniFi bring-up regardless of Option 43? I'll probably find out in due time with more testing.