Setting up Dnsmasq DHCP for PXE Booting - Vendor Class Matching

Started by rkubes, May 15, 2025, 05:59:03 AM

May 15, 2025, 05:59:03 AM Last Edit: May 15, 2025, 07:09:45 AM by rkubes Reason: Found new information
I know of course there are tons of threads this release on the transition from ISC DHCP. With that said, I've tried searching and couldn't find this specific answer.

For my use case, it is important that I be able to still support PXE booting a mix of BIOS and UEFI clients on my network before I can transition from ISC DHCP to Dnsmasq. I unfortunately don't have a "test environment" where I can comfortably use trial and error to figure out the right approach on something as important as DHCP.

What is not clear is whether the "match" option for setting tags supports wildcards, either implicitly or explicitly. Typically, to send the right file to the client, I have whatever DHCP server I'm using do a "partial match" of Option 60 (Vendor Class ID). In the OPNsense ISC DHCP settings, this is done transparently, as there are just separate fields for the BIOS vs. UEFI boot program file.

I've seen examples online for dnsmasq specifically that use config entries like: "dhcp-vendorclass=BIOS,PXEClient:Arch:00000" to tag the DHCP entry.
Of course, those familiar with Option 60 and the PXE spec know the above is a partial match, as after that last 0 there are other irrelevant values that can't necessarily be known ahead of time. So that first sub-string of Option 60 is really what's important to identify.

Unfortunately, I don't see a clear path in the UI to specify a "vendor class" match directly.

I considered, of course, using the "match" option that is available in the UI and selecting Option 60. However, as noted above, I'm not clear on the wildcard capability to handle that partial string matching.
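
For reference, the dnsmasq manpage describes dhcp-vendorclass as a substring match against the vendor class the client sends, so a partial match on Option 60 may not need a wildcard at all. A minimal sketch, assuming the lines go into a custom dnsmasq config file rather than the UI; the tag names are only illustrative, and the architecture codes are the usual BIOS and x64 UEFI values:

```conf
# dhcp-vendorclass substring-matches the Option 60 string sent by the client,
# so whatever follows the architecture code does not need to be listed.
dhcp-vendorclass=set:IsBIOS,PXEClient:Arch:00000
dhcp-vendorclass=set:IsEFI,PXEClient:Arch:00007
```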

Lastly, I considered Option 93, as I've read that machines are supposed to set this to the architecture value that I'd need. However, I'm not familiar with how widely used this is. It's been a while since I read the PXE spec document, but I don't recall Option 93 being specifically called out.

Any assistance will be greatly appreciated!

Edit:
I found the dnsmasq manpage, which shows the match directive does indeed support the * character as a wildcard, so I should be able to try that.
I also found that RFC 4578 calls out that option 93 is indeed required for PXE clients to populate. I reviewed the original Intel spec document and it wasn't immediately clear: it doesn't flag option 93 as required in the main chart, but a footnote explains that it is. So I may try that route and see whether I have any issues with clients not in compliance; if so, I can fall back to partial string matching against option 60.
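
A sketch of what matching on option 93 could look like in dnsmasq config, assuming the architecture values listed in RFC 4578 (0 for Intel x86 BIOS, 7 for EFI BC, 9 for EFI x86-64). Which values the clients on a given network actually send is an assumption to verify with a packet capture, and the tag names are illustrative:

```conf
# Tag clients by the architecture they report in option 93.
dhcp-match=set:IsBIOS,93,0
# x64 UEFI firmware commonly reports 7; 9 is also defined for EFI x86-64,
# so both values set the same tag here.
dhcp-match=set:IsEFI,93,7
dhcp-match=set:IsEFI,93,9
```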

I created an option entry for each of my two interfaces that should support network booting to populate DHCP option 66 with the IP of the TFTP server. Then I created tags to identify BIOS or EFI based on option 93, and an option entry for option 67 to populate the correct boot program based on that.

I didn't want to use the built-in boot option UI since it doesn't seem like you can restrict it to specific interfaces. There's nowhere to set an interface tag.

Unfortunately PXE still isn't working for me with dnsmasq, but does if I fall back to ISC.

The client gets as far as getting a DHCP response, then gets to the TFTP stage but times out. The firewall logging doesn't show it trying to connect to the TFTP server (neither allowed nor blocked).

Tomorrow I'm going to try a DHCP test tool to see what options dnsmasq is actually sending out. I might do the same with ISC and see what the difference is.

In the meantime any other feedback is appreciated, otherwise I'll look to post my final working configuration in case it helps others.

May 16, 2025, 07:18:54 AM #2 Last Edit: May 16, 2025, 08:30:27 AM by Monviech (Cedrik)
Maybe you could wait till a new update is out next week since we fixed some bugs in dnsmasq.

E.g.:

https://github.com/opnsense/core/issues/8624

Also, we can probably add the interface there as a tag as well. It's just a tag after all.

EDIT: It looks like dnsmasq only supports one tag per "dhcp-boot" option, so it cannot be a mix of a custom tag and an interface tag, from what I see.

I'm interested in seeing a working config to see what we potentially miss in the GUI.
Hardware:
DEC740

Thanks. I actually just ran into that issue when I started to give up on the approach of setting 66 and 67 directly.

The interesting thing is when I use a Set option for 66 and 67, what I'm seeing when I test the DHCP response is that 67 (filename) gets populated exactly as I've set it. No issue there. However 66 gets sent to the client as an empty string (See edit, this is not actually true). That must be why it times out trying to connect and also why I don't see any activity or attempts to reach the actual TFTP server.

I've also tried setting option 150 for the TFTP server but still get the same issue. The client just gets an empty string instead of the IP I put in.

Looking at the dnsmasq config file, I don't see anything that stands out as "incorrect" for the dhcp-option directives. I don't know if maybe dnsmasq prevents you from setting these values in the option method and instead relies on the boot options specifically.

Edit:
Well, that was a bit frustrating. I was using a PowerShell script provided by 2Pint to do the DHCP Test. But there was actually an error in their script where they referenced the wrong variable name. Once I corrected their script, I do see that the TFTP IP address is actually getting populated correctly.
I might need to figure out how I can do a TCPDump or WireShark capture of the DHCP packets at boot time to see if I can figure out what's going on or missing. I'll probably need to get another PC on that network that I can set up as the listener for that, since the PC I usually have that kind of access for is the one that I need to capture the boot time packets.

Edit2:
I see the difference in the DHCP packets now. I don't understand the DHCP spec enough to speak intelligently to it. However, when I have ISC DHCP (working), the TFTP IP address and boot file name are in a separate section of the DHCP offer packet. The way WireShark decodes it, it looks like they sit in fixed-width spots within the packet, as there's no identifier beforehand. Additionally, the IP address gets stored as a 4-byte value. However, when I use my configuration on dnsmasq, instead of those same fields getting populated, the values appear later in the list of DHCP options returned, as options 66 and 67. Moreover, the IP address there (if it matters) is written as a null-terminated ASCII string rather than a 4-byte IP value.

I probably will need to get dnsmasq's actual "boot" config to work for it to "properly" format the DHCP offer with the bootfile and TFTP server populated in the "correct" spot. Even though options 66 and 67 exist, it doesn't seem like setting them like normal options really works. Probably due to the boot response being a different kind of packet.
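
For anyone comparing the two approaches, the difference described above maps onto two different dnsmasq directives. This is a sketch for illustration only, reusing the TFTP address and BIOS boot file from this thread; the dhcp-option lines are an assumed reconstruction of what setting 66 and 67 as plain options amounts to, not a copy of the generated config:

```conf
# Plain options: the values travel only in the DHCP options list as 66 and 67.
# The quotes make dnsmasq send option 66 as text rather than as an address.
dhcp-option=66,"10.0.64.10"
dhcp-option=67,undionly.kpxe

# dhcp-boot: sets the BOOTP boot parameters instead, in the order
# filename, server name (left empty here), next-server address.
dhcp-boot=undionly.kpxe,,10.0.64.10
```

The next-server address from dhcp-boot ends up in the fixed 4-byte header field that ISC was populating, which is the part the plain option 66 entry never touches.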

It's unfortunate that you can't configure dnsmasq to only offer the boot options on specific interfaces; but I can just manage that with firewall rules.

Edit3:
I looked at the manpage for dnsmasq, and I see the dhcp-boot option described as follows:
-M, --dhcp-boot=[tag:<tag>,]<filename>,[<servername>[,<server address>|<tftp_servername>]]
(IPv4 only) Set BOOTP options to be returned by the DHCP server. Server name and address are optional: if not provided, the name is left empty, and the address set to the address of the machine running dnsmasq. If dnsmasq is providing a TFTP service (see --enable-tftp ) then only the filename is required here to enable network booting. If the optional tag(s) are given, they must match for this configuration to be sent. Instead of an IP address, the TFTP server address can be given as a domain name which is looked up in /etc/hosts. This name can be associated in /etc/hosts with multiple IP addresses, which are used round-robin. This facility can be used to load balance the tftp load among a set of servers.

What is interesting here is that the first line makes it seem like only one tag can be on a dhcp-boot option, but the description seems to indicate that there can be multiple tags. Later I'll try adding a custom config file on the router with dhcp-boot set with an interface tag plus my BIOS/EFI tag to differentiate the file type, and report back if it works. If it does, I'll probably post a separate reply rather than an edit, since it will be a substantial enough update.

I finally got it working.

It's a combination of the mentioned defect (8624 - where Boot settings do not go to the dnsmasq.conf file), as well as the fact that the interface tags aren't listed as an option.

I was able to confirm that the dhcp-boot directive does indeed support multiple tags. I experimented first with the "tag-if" directive, but wound up not needing it.

Below is the separate config file that I dropped into /usr/local/etc/dnsmasq.conf.d/ (I named it 20-pxe.conf):

dhcp-match=set:IsBIOS,93,0
dhcp-match=set:IsEFI,93,7

dhcp-boot=tag:igc3,tag:IsBIOS,undionly.kpxe,10.0.64.10,10.0.64.10
dhcp-boot=tag:igc3,tag:IsEFI,snponly.efi,10.0.64.10,10.0.64.10
dhcp-boot=tag:igc2,tag:IsBIOS,undionly.kpxe,10.0.64.10,10.0.64.10
dhcp-boot=tag:igc2,tag:IsEFI,snponly.efi,10.0.64.10,10.0.64.10

I was able to confirm that a BIOS based client on igc3 got the BIOS boot file, and a UEFI based client on igc3 got the EFI boot file.
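
As a possible simplification (not what was tested above), tag-if could collapse the per-interface lines by setting one shared tag for any PXE-enabled interface. A sketch, assuming the same interface names, boot files, and TFTP address as the working config; the PxeNet tag name is hypothetical:

```conf
dhcp-match=set:IsBIOS,93,0
dhcp-match=set:IsEFI,93,7

# Set PxeNet when the request arrived on either PXE-enabled interface.
tag-if=set:PxeNet,tag:igc2
tag-if=set:PxeNet,tag:igc3

# Both tags on a dhcp-boot line must be set for that line to apply.
dhcp-boot=tag:PxeNet,tag:IsBIOS,undionly.kpxe,10.0.64.10,10.0.64.10
dhcp-boot=tag:PxeNet,tag:IsEFI,snponly.efi,10.0.64.10,10.0.64.10
```

With this layout, adding another interface would take one more tag-if line instead of two more dhcp-boot lines.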

For all the "DHCP Options" that I configured in the GUI to try to fix this manually, I just created a new "tag" called "Disabled" that just never gets set and added that tag to all of them to disable them without having to fully delete them. It might be "nice to have" for the UI to offer an enable/disable function similar to firewall rules so that the options can be toggled without having to completely delete them.

Thanks for the further investigation.

Why is one tag with ":", and the other with "="?

Quote from: Monviech (Cedrik) on May 16, 2025, 09:31:04 PM
Thanks for the further investigation.

Why is one tag with ":", and the other with "="?

Sorry that was a copy paste error. I had that syntax error at first and then corrected it after I had already copied it over.

I edited and fixed my original post in case anyone else copies it from there.

Nice, I think we can fix this in the GUI with this information to additionally allow an interface tag and then we should be good to go.