Unbound has "stuck" DNS entries from a DHCP entry - make it go away?

Started by Linwood, February 20, 2025, 04:35:44 AM

I'm baffled. OPNsense 24.7.12.

I put a new rPi on the network and it got an address of 192.168.130.53. I then logged in and set its static address to 192.168.130.248. Somehow it may also have picked up 192.168.130.73 (an active IP already used by another host, so perhaps it got that address and then gave it up).

Anyway... I then went into unbound and added an override for 192.168.130.248, and deleted the lease for .53 (there was none showing for .73 but there was a legit unbound override for it for a different name).

Unbound is still responding to queries for this name with .53 and .73 (both), and not the override of .248.

I have so far:

- Set Unbound to clear its cache on restart
- Restarted DHCPv4 (ISC) several times
- Restarted Unbound several times
- Rebooted OPNsense
- Deleted the override for 192.168.130.73 (which was legit and different name) and put it back
- Downloaded a backup configuration of OPNSense as XML, searched for the bogus IP and name (found the name with correct IP, nothing bogus)
- Waited much longer than the TTL to see if it would expire and vanish - it doesn't.

If I run dig it shows like it has a real A record:

;; ANSWER SECTION:
zwave4.xxxxx.com.  2099    IN      A       192.168.130.53
zwave4.xxxxx.com.  2099    IN      A       192.168.130.73
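
For anyone repeating this check, here is a rough sketch that parses dig answer lines like the ones above and lists any A record that is not the address you expect. The function names are made up for illustration, and it only handles the plain "name TTL IN type value" layout shown here.

```python
def parse_answer_lines(text):
    """Parse dig ANSWER-section lines into (name, ttl, rtype, value) tuples."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig's ";;" header/comment lines
        fields = line.split()
        # Expect: name, TTL, class "IN", record type, value
        if len(fields) < 5 or fields[2] != "IN" or not fields[1].isdigit():
            continue
        name, ttl, _cls, rtype = fields[:4]
        records.append((name, int(ttl), rtype, " ".join(fields[4:])))
    return records


def unexpected_addresses(text, expected_ip):
    """Return every A-record address in the answer that differs from expected_ip."""
    return [
        value
        for _name, _ttl, rtype, value in parse_answer_lines(text)
        if rtype == "A" and value != expected_ip
    ]


# The answer section quoted in this post; .248 is the intended override.
sample = """;; ANSWER SECTION:
zwave4.xxxxx.com.  2099    IN      A       192.168.130.53
zwave4.xxxxx.com.  2099    IN      A       192.168.130.73
"""
print(unexpected_addresses(sample, "192.168.130.248"))
# prints ['192.168.130.53', '192.168.130.73']
```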

If I run nslookup with debug it shows a regular query and response with both (wrong) answers.  Nothing weird, just wrong.

If I turn off recursion in nslookup I get the same wrong answers, so it's not somehow recursing to elsewhere.

The TTL resets occasionally to 3600, notably sometimes when I query it, making me think that after restarting it's being recreated from something.

I turned on verbose debug logging in unbound and see the query and answer but nothing about where it comes from.  It's not forwarding (I've turned off forwarding and nothing changes).

It looks like there is a record somewhere that the GUI is not showing me, but I do not know where to look.  It also doesn't seem to be anyplace that is part of what is backed up by the system.

Any idea how to make it go away?

Linwood

There are two files generated by DHCPv4 for unbound:

1. /var/unbound/host_entries.conf for static reservations. After creating a new reservation, you have to restart unbound manually to re-read it.
2. /var/unbound/dhcpleases.conf for DHCP leases. This file is changed and re-read dynamically.

The latter file will be created by dhcpd from /var/dhcpd/var/db/dhcpd.leases and sometimes, items do not get deleted from there. I found that to happen if you first have a machine fetch a dynamic lease and then change that to a static reservation with another IP. Maybe this has something to do with separating static and dynamic ranges.

You can search for the conflicting entries in the files and edit them accordingly, but remember to stop the respective service beforehand and restart it afterwards.
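
A minimal sketch of that search, assuming the three paths named above are where your install keeps these files. The helper name find_entries is made up, not part of OPNsense. It only reads the files, so it does not need the services stopped, but you will probably have to run it as root.

```python
from pathlib import Path

# Paths taken from this post; adjust them if your install differs.
CANDIDATE_FILES = [
    "/var/unbound/host_entries.conf",
    "/var/unbound/dhcpleases.conf",
    "/var/dhcpd/var/db/dhcpd.leases",
]


def find_entries(needles, paths=CANDIDATE_FILES):
    """Return (path, line_number, line) for every line containing any needle.

    Matching is a plain substring test, so "192.168.130.5" would also
    match "192.168.130.53"; pass full names or addresses.
    """
    hits = []
    for path in paths:
        p = Path(path)
        if not p.is_file():
            continue  # file missing on this system: nothing to report
        text = p.read_text(errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if any(needle in line for needle in needles):
                hits.append((str(p), number, line.strip()))
    return hits


if __name__ == "__main__":
    # Hostname and addresses from the original question.
    for path, number, line in find_entries(
        ["zwave4", "192.168.130.53", "192.168.130.73"]
    ):
        print(f"{path}:{number}: {line}")
```

If this prints nothing for the stale address, the record is not coming from any of these three files.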

Thank you @meyergru

I have found one issue but still do not quite understand it. First things first (after a night of it just sitting):

/var/unbound/host_entries.conf looks correct, the bogus entries are not there, and a correct entry for a different name is there for 192.168.130.73.

Oddly, /var/unbound/dhcpleases.conf is empty (size zero). /var/dhcpd/var/db/dhcpd.leases is not empty, but it also looks correct and contains neither IP address from the bogus translations.

I grep'd every file in both folders without finding the bogus IP entry. If I grep by name I find only the unbound override (correct) entry.

It just feels like there's some persistent cache I have not found. 

Continuing to hunt, I went to reset "Aggressive NSEC" (just because it had the word cache) and forgot to apply the change, but I did restart unbound (the first restart of today). After that, the bogus entries no longer resolved, and neither did the legitimate one. Note this was JUST the restart; as I saw later, the NSEC change was never saved.

At that point I found a typo -- the legitimate override had the domain name mis-spelled (missing letter).  I fixed that, and restarted unbound again, and now it works fine.

So... the reason my override was not visible is it was mis-spelled.
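
In hindsight, a check like the following might have caught the typo sooner. It is only a sketch: it compares the name being queried against the override names you have configured and flags any that are nearly, but not exactly, the same. The function name, the 0.9 cutoff, and the example.com names are all assumptions for illustration.

```python
import difflib


def near_miss_names(queried, configured, cutoff=0.9):
    """Return configured names that are close to, but not equal to, queried."""
    queried = queried.rstrip(".").lower()
    candidates = [name.rstrip(".").lower() for name in configured]
    # get_close_matches ranks candidates by similarity ratio (0.0 to 1.0).
    close = difflib.get_close_matches(queried, candidates, n=5, cutoff=cutoff)
    return [name for name in close if name != queried]


# Hypothetical override list with one letter missing from the domain.
overrides = ["zwave4.exmple.com", "printer.example.com"]
print(near_miss_names("zwave4.example.com", overrides))
# prints ['zwave4.exmple.com']
```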

But the reason these cached entries continued to appear remains a mystery, as is why they disappeared after about 8 hours of just sitting there.

I did look back and the lease time in DHCP is 12 hours, so that's almost certainly part of this. However, these leases had been manually deleted. Further, the TTL on the DNS entries, per dig, was coming up as under an hour. It seems like whatever was doing this was taking a DHCP lease (despite it being deleted) and forming a DNS entry with a shorter TTL.

When the lease (that was deleted!) expired, AND a restart of unbound occurred, the bogus entry was gone.
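
To check whether a "deleted" lease is still on disk and when it would expire, something like this sketch could be run against the contents of dhcpd.leases. It assumes the usual ISC layout of "lease IP { ... }" blocks with an "ends" line in UTC, and it ignores nested blocks. The sample lease below is invented, not copied from my system.

```python
import re
from datetime import datetime, timezone

LEASE_RE = re.compile(r"^lease\s+(\S+)\s*\{(.*?)^\}", re.DOTALL | re.MULTILINE)
ENDS_RE = re.compile(r"ends\s+\d\s+(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2});")
HOST_RE = re.compile(r'client-hostname\s+"([^"]*)";')


def parse_leases(text):
    """Return a list of dicts with ip, hostname and end time for each lease block."""
    leases = []
    for ip, body in LEASE_RE.findall(text):
        ends = ENDS_RE.search(body)
        host = HOST_RE.search(body)
        end_time = None  # stays None for "ends never;" or a missing ends line
        if ends:
            end_time = datetime.strptime(
                ends.group(1), "%Y/%m/%d %H:%M:%S"
            ).replace(tzinfo=timezone.utc)
        leases.append(
            {"ip": ip, "hostname": host.group(1) if host else None, "ends": end_time}
        )
    return leases


def unexpired_leases(text, now):
    """Return leases whose end time is after now (or that have no end time)."""
    return [l for l in parse_leases(text) if l["ends"] is None or l["ends"] > now]


# Invented sample: a 12-hour lease for the address that kept reappearing.
SAMPLE = """lease 192.168.130.53 {
  starts 4 2025/02/20 04:00:00;
  ends 4 2025/02/20 16:00:00;
  binding state active;
  client-hostname "zwave4";
}
"""

for lease in unexpired_leases(SAMPLE, datetime(2025, 2, 20, 12, 0, tzinfo=timezone.utc)):
    print(lease["ip"], lease["hostname"], lease["ends"].isoformat())
```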

What a mess of flaky cleanup. There are days when I really hate GUIs; I suspect that if this were just plain old text file configs I might have found this.

Anyway, I leave this trail of confusing breadcrumbs in case anyone else runs across something similar and it might help.

Thank you for your info. It's surprisingly hard to find where the config files are via Google; 95% of what I find just points to the GUIs.