Messages - stauf

#1
In the spirit of testing, any idea why I am getting my address pool filled with these bogus entries?  It's happening again, and I can confirm that I have Automatic Discovery disabled (at least that is what the UI is telling me).  Kea is giving away all my IP addresses (even those reserved for specific MAC addresses).  This means that if an address has been given away before my device grabs it, there are no free addresses to give out and my device doesn't get an IP address at all.  This is causing big problems on my network.

All the bogus DHCP table entries in Kea have a lifetime of 86400 seconds (not the lifetime of my standard leases).  This is what I was seeing before, but I thought we had narrowed the problem down to Automatic Discovery being enabled.  I don't mind enabling Automatic Discovery and helping test this (especially since it appears to be happening with Automatic Discovery disabled), but it would be helpful to work with someone who is actively working in this area.  Maybe I am doing something wrong?

Regardless of my setup, I would not expect Kea to dole out IP addresses without associating them with a MAC address or hostname.  What is the point of doing that other than gumming up the works of the DHCP server?
#2
Out of curiosity, why was Automatic Discovery defaulted to enabled?  While I understand this feature could be valuable to some people, in most products I have been involved with, new features default to disabled.  This allows for easier upgrade/downgrade to/from versions of sw that support/don't support the feature and essentially assures users that new features won't break what they currently have working.  I know there were warnings about Automatic Discovery, and shame on me for not diving into this more actively; I guess I am just curious at a higher level than this specific feature.  Is there a case-by-case discussion on new features deciding whether to default them to enabled or disabled?

I can certainly imagine some features aren't as clear cut as this one when it comes to having them enabled or disabled, but this one seems pretty clear, as it is just another daemon sitting there listening for ARP/NDP messages.  It just seems odd to me that it would be enabled by default.
#3
Thanks, that makes total sense based off what I saw.  I disabled Automatic Discovery and cleaned up the csv files, but that did not completely fix my problem.  Some of the erroneous entries were still showing up in the table.  To be fair, once the problem was "mostly" solved, that was enough for me at the time and I didn't pay super close attention to every detail.  All I can say is the next day, all of these 86400 second lifetime, MAC-less and hostname-less entries were totally gone from the KeaDHCP Leases table and, so far, have not come back.

Frank, are you saying you are getting buildup of what appear to be erroneous entries in your KeaDHCP Leases table and you have verified that Automatic Discovery is disabled?  I'm a bit confused by your statements.  You say you aren't using v6, but have v6 addresses assigned?  If you have KeaDHCPv6 disabled, these may just be auto assigned v6 addresses by Proxmox or your containers running on Proxmox.

I have not observed my KeaDHCP pool get used up while Automatic Discovery is Disabled nor was I scrutinizing the logs.  If you have logs referring to Automatic Discovery, sounds to me like you might still have it enabled.
#4
Frank,

For me, there seemed to be two workarounds and my "fix" was to disable Automatic Discovery.

1. Disable Automatic Discovery and wait.
2. If you need a solution faster, you can follow the instructions earlier in this thread.  SSH in (you may have to enable SSH), get into the shell and then go to /var/db/kea/.  There should be at least 1 .csv file in here.  Stop KeaDHCP, edit these files removing the incorrect entries.  This may take a few minutes to ensure you do it correctly as the files don't appear to be in order and there may be multiple (I had 2).  Restart KeaDHCP.  For me, I did this live.  I was stuck with all my addresses "in-use" so there didn't seem much harm to me doing it live.  Didn't appear to make anything worse.  Even with Automatic Discovery disabled, I did have to do this a couple times, but now, a couple days later, I see no erroneous entries in my cache.
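
The manual edit in step 2 can also be scripted.  Below is a minimal Python sketch, assuming the erroneous rows are exactly the ones whose second field (MAC) and ninth field (hostname) are both empty, as in the rows quoted elsewhere in this thread.  The function names are made up for illustration, and the field positions are an assumption about the file layout.  KeaDHCP should still be stopped before the file is touched and restarted afterwards.

```python
import shutil
from pathlib import Path


def is_bogus(line):
    """True for a lease row with neither a MAC (field 2) nor a hostname (field 9)."""
    # Plain split: assumes no field contains a literal comma.
    fields = line.rstrip("\r\n").split(",")
    if len(fields) < 9 or fields[0] == "address":  # blank/short line or header row
        return False
    return fields[1] == "" and fields[8] == ""


def drop_bogus(lines):
    """Return (kept_lines, removed_count); kept lines are passed through unmodified."""
    kept = [line for line in lines if not is_bogus(line)]
    return kept, len(lines) - len(kept)


def clean_lease_file(path):
    """Copy `path` to `<name>.bak`, then rewrite it without the bogus rows."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    with path.open() as fh:
        lines = fh.readlines()
    kept, removed = drop_bogus(lines)
    with path.open("w") as fh:
        fh.writelines(kept)
    return removed
```

Something like `clean_lease_file("/var/db/kea/kea-leases4.csv.2")` would then be run once per .csv file.  The backup copy is there because a wrong guess about the layout would otherwise delete valid leases.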

As a side note, I did experience Automatic Discovery seemingly turning itself on once.  I had disabled and applied but then noticed it was on again sometime later.  I was never able to reproduce this so not sure what caused it.  I will just caution you to double check a few times that it is disabled, just in case.
#5
Well, good to know.  I was going to try turning on Automatic Discovery to see if I could find a rogue ARP request on my network causing OpnSense to misbehave like this, but, for now, I think I am good just leaving it off.  I, personally, don't really need Automatic Discovery, and it appears to do more harm than good at the moment.  Thanks for checking into it.
#6
Interesting, thank you.  Someone was saying that Automatic Discovery sends out pings but I guess that is not the case.  The documentation I can find on it shows that it just listens for ARP and NDP messages.  Feels like there must be some defect(s) in the Automatic Discovery.  If all it does is listen for ARPs and update its cache, then it should only ever learn about devices that actually send them: devices changing their IP -> MAC mappings, or devices with static entries that don't use OpnSense to get an IP address but still need to send out ARP requests to populate their own ARP cache if they want to talk with other devices on the network.  I can't think of anything (other than malicious/malformed ARPs, which I am certainly not sending...at least not intentionally, on my network) that would explain why OpnSense would populate entries in its DHCP table when there is no device on the network that is sending those ARPs.

It might be a nice feature to add a tag to any KeaDHCP MAC->IP entries in the csv files.  Basically, how was this entry learned?  There is already a "Lease Type" column that says it is either static or dynamic.  Might be nice to have an "automatic" type as well, maybe?

As I have Proxmox and multiple hosts on individual ethernet interfaces, there could be multiple IP/MAC combinations on the same interface, but from an OpnSense point of view, that should not matter at all.

I don't have v6 configured on OpnSense so I assume, even if Automatic Discovery is on, any rogue v6 NDP packets on my network would just get dropped?  I suppose in my case, it's a moot point as there are no entries in the KeaDHCPv6 table.

When Automatic Discovery is enabled and it sees ARPs and keeps track, what is the next step?  Does it just populate the KeaDHCP cache?  Does it create a table somewhere else?
#7
I don't know the algorithm Automatic Discovery uses, nor am I 100% sure it caused these problems.  Maybe I can do some testing this weekend.  It certainly appears that Automatic Discovery added DHCP pool entries for IP addresses that are not being responded to on my network.  As I understand it, Automatic Discovery would systematically ping every device on defined subnets and map out what it finds for you?  It certainly appears as if in some cases (my entire dynamic pool and a few of my reserved addresses) it sends out the ping and then decides to add a 24 hour DHCP cache entry for that IP anyway.  Since there is no response to the ping, there is no MAC and no hostname to associate with that entry.  One would assume that if nothing responds, OpnSense should not maintain the DHCP entry for that IP.  I'm guessing at this point, but that certainly appears to be what is happening on my network.

Out of curiosity, is anyone successfully using the Automatic Discovery feature and verifying the discovered devices are correct?
#8
I never understood why, in forums like this, you ask a question about one thing and, invariably, someone has an axe to grind and insists on trying to alter your workflow.  I'm not asking question A in an attempt to get answer B.  If you like Bonjour and not using Reservations, that's great, hooray for you.  However, just because it works for you doesn't mean it should be everyone's solution.  Also, Bonjour or no, Affinity or no, the information is irrelevant to my issue.  No need to hijack someone else's thread.

Anyway, I'm not 100% sure what changed; the only config change I made was disabling Automatic Discovery under Interfaces -> Neighbors.  I still had to manually remove these erroneous entries (they were refreshing even with Automatic Discovery disabled) and still don't understand why they would be populated in the first place.  I have 1 device (IP) in the dynamic pool of addresses that would be responding, and it was set up properly in OpnSense with a 4000 second lifetime and its MAC and hostname were accurate and present.  None of the other addresses that were populated by KeaDHCP were addresses that exist, nor were there responses to pings to those addresses.  Even if I wanted Automatic Discovery enabled, if there is nothing to respond to the ping, why would the DHCP entry be filled?

Furthermore, I did notice that Automatic Discovery re-enabled itself once after I disabled it (and yes, I applied my change).  This may explain why a number of entries re-populated the DHCP cache after I manually removed them.

At this point, things appear to be working properly but I have to say, this experience certainly did not give me a warm/fuzzy feeling about OpnSense.  Hopefully Automatic Discovery remains disabled.  I suppose if this happens again, the workaround to manually clean it up isn't too bad, I just have to SSH in, get into the shell and manually edit these csv files.  Not a problem for me, but it's just more than casual users would want to have to do.
#9
Right, I agree.  While having the "Affinity lifetime" is nice, for me personally it has zero relevance to my desire to use Reservations.  My "need" for reservations is partly my brain (I am a person that remembers numbers and I want to define the addresses my devices are using) and partly for hosts like Proxmox.  It wants a static address defined and does not want to use DHCP (even if the back end of DHCP would always send a consistent IP).  So what I do is set up a reservation that is essentially never used.  Proxmox uses the address statically and, since KeaDHCP is set up with a reservation, it doesn't dole out that address to anyone else (at least it shouldn't).  I suppose I could create another subnet for hosts that have static IP addresses and just not have DHCP running, but my current strategy has always worked, so why complicate things further?

I noticed that even with "Automatic Discovery" disabled, these erroneous DHCP entries refreshed themselves after the 24 hour lifetime expired.  So I decided to go in and delete them manually.  I stopped KeaDHCP, manually removed these entries from the kea-leases4.csv.2 file and re-started KeaDHCP.  The addresses disappeared and the one Proxmox server (I have 4) that was not working is now accessible.

However, now I am noticing something strange starting again.  I have 2 deployed containers on this Proxmox server.  Nothing I "need", just playing around.  One is bentopdf, which allows for PDF editing, and the other is ConvertX, which does file format conversions.  I've used both of these containers multiple times and they have been great.  They have worked and were accessible from the single IP address they are configured for.  For some reason, KeaDHCP is now adding all sorts of entries for these two containers.  What is really strange is that it is adding two entries for each IP address.  One, like before, has a lifetime of 86400 seconds, no MAC address and no hostname, and claims to be dynamic.  The other is a static entry (which I don't have defined), but is mapped to the same IP address.  At this point, I am not quite sure what to do short of starting to do some Wireshark traces when I bounce KeaDHCP.  These containers are set for a single IP address, yet KeaDHCP seems to think they are configured for all sorts of addresses.  If I ping any of the addresses KeaDHCP thinks are associated with that container, nothing responds (as expected).  I have no idea how KeaDHCP could be filling these entries in.

I suppose the good news is that I can stop KeaDHCP and manually edit these csv files in /var/db/kea/ and blow away these erroneous entries.  When I have time, I will try to get a Packet Capture after resetting KeaDHCP and see if I can find any DHCP requests that would account for this behavior.
#10
In /var/db/kea/ there are two files on my router.  kea-leases4.csv does not contain any of these bogus entries.  There is also a kea-leases4.csv.2 which does contain these entries.  Again, no MAC, no hostname:

192.168.6.17,,,86400,1774014491,1,0,0,,1,,0
192.168.6.18,,,86400,1774014501,1,0,0,,1,,0
192.168.6.19,,,86400,1774014511,1,0,0,,1,,0
192.168.6.20,,,86400,1774014521,1,0,0,,1,,0
192.168.6.21,,,86400,1774014531,1,0,0,,1,,0
192.168.6.22,,,86400,1774014541,1,0,0,,1,,0

I will try moving this file in off hours.  See if that helps.
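
To make rows like these easier to read, here is a minimal Python sketch that labels each field and converts the timestamps.  The column order and the state codes are assumptions taken from the Kea documentation for its DHCPv4 lease file, not something confirmed in this thread.  If they hold, the tenth field is the lease state, and a value of 1 would mean Kea has the address marked as "declined" rather than leased to a client.

```python
import csv
import io
from datetime import datetime, timezone

# Assumed column order of a Kea DHCPv4 lease file (not confirmed in this thread).
COLUMNS = ["address", "hwaddr", "client_id", "valid_lifetime", "expire",
           "subnet_id", "fqdn_fwd", "fqdn_rev", "hostname", "state",
           "user_context", "pool_id"]

# Assumed meaning of the state column, per the Kea documentation.
STATES = {"0": "default", "1": "declined", "2": "expired-reclaimed"}

# The six rows quoted above.
SAMPLE = """\
192.168.6.17,,,86400,1774014491,1,0,0,,1,,0
192.168.6.18,,,86400,1774014501,1,0,0,,1,,0
192.168.6.19,,,86400,1774014511,1,0,0,,1,,0
192.168.6.20,,,86400,1774014521,1,0,0,,1,,0
192.168.6.21,,,86400,1774014531,1,0,0,,1,,0
192.168.6.22,,,86400,1774014541,1,0,0,,1,,0
"""


def parse_leases(text):
    """Return one dict per lease row, keyed by the assumed column names."""
    rows = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields or fields[0] == "address":  # skip blank lines and a header row
            continue
        rows.append(dict(zip(COLUMNS, fields)))
    return rows


def describe(lease):
    """Summarize one lease in human-readable terms."""
    expire = datetime.fromtimestamp(int(lease["expire"]), tz=timezone.utc)
    return {
        "address": lease["address"],
        "has_mac": bool(lease["hwaddr"]),
        "has_hostname": bool(lease["hostname"]),
        "lifetime_hours": int(lease["valid_lifetime"]) / 3600,
        "expires_utc": expire.isoformat(),
        "state": STATES.get(lease["state"], "unknown"),
    }


def expiry_gaps(leases):
    """Seconds between consecutive expiry times, in file order."""
    times = [int(lease["expire"]) for lease in leases]
    return [later - earlier for earlier, later in zip(times, times[1:])]


if __name__ == "__main__":
    leases = parse_leases(SAMPLE)
    for lease in leases:
        print(describe(lease))
    print("gaps between expiry times (s):", expiry_gaps(leases))
```

For these six rows, the expiry times are exactly 10 seconds apart and the lifetime works out to 24 hours.  It may be worth comparing that with Kea's `decline-probation-period` setting, whose documented default is 86400 seconds, although nothing in this thread confirms that is the cause.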
#11
If you click on Interfaces: Neighbors, there is a tab for "Discovered Hosts".  One of the 40 addresses in my pool exists in this table, but it is the one that is correct/functioning.  None of the rest of these DHCP entries (the ones in the pool and the ones not in the pool) are in this Discovered Hosts table.  While this seems like a plausible theory, the fact that none are in this table makes me feel like that is not my issue.
#12
Thank you meyergru.  You are the first person to suggest something that could theoretically be relevant to my issue.  While I don't have KeaDHCPv6 enabled, I did have the Automatic Discovery Enabled.  I disabled it to give this a try.  Nothing appears to have changed yet, but I guess I would not expect anything to change until these Leases expire.

When you say "clean out DHCPv6 leases manually", as I don't have DHCPv6 enabled, there are no leases currently under v6.  Is there a mechanism to clean out v4 leases?  On the leases page for these (invalid) entries without a hostname or MAC address, the only option OpnSense gives me for these entries is "add reservation".  While I don't want these entries to have a reservation, I tried anyway (as that appears to be my only option).  I was thinking maybe I could find a "pool" of addresses I wasn't using and assign these to different addresses outside of my DHCP pool.  However, as there is no MAC address associated with these entries, the resulting pop-up to add the reservation fails telling me that a MAC address is required to add a reservation.

This feels like an OpnSense bug to me.  Affinity or no, Automatic Discovery or no, any feature I can think of on or off, why would DHCPv4 add leases to devices without a hostname or MAC address?  I won't claim to know the entire workings of the DHCP protocol, but my understanding is that a device configured for DHCP simply sends out a broadcast DHCP Discover packet with its MAC address as the source.  As OpnSense is the only DHCP server on my network, it receives that broadcast request and replies with an Offer.  As I have "Match client-id" disabled, the DHCP server must use the MAC address of the request as its key to the doled-out IP address (regardless of whether a Client ID is present in the discover).

So I am left with 2 questions:

1. How do I clear out DHCPv4 address mappings?  I've tried rebooting the entire router, I've tried stopping KeaDHCP but whenever it comes back up, the cache remains.
2. Is there a mechanism for me to log a bug here?  Or does anyone have an explanation as to why there are DHCPv4 entries without a hostname or MAC address associated with them?  Especially with "Match client-id" disabled.

As I have been looking at this issue closer, I noticed one more thing.  There are 3 devices on my network that I have existing reservations for but these invalid leases of time 86400 seconds and no MAC or hostname associated are set to use these IP addresses.  So, not only can these leases be given addresses from within my DHCPv4 pool, they can also be given addresses that I have explicitly configured to be reserved for specific MAC addresses (outside of the pool configured for this subnet).  Luckily I don't need to use these specific devices for anything right now but this makes the issue more severe from my point of view.
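
One way to see how far the problem reaches is to sort the MAC-less lease addresses into "inside the pool", "reserved", and "neither".  A minimal Python sketch follows.  The pool boundaries, reservations, and MAC addresses in it are made up for illustration and are not taken from the configuration discussed here.

```python
import ipaddress
from collections import Counter

# Hypothetical example values, for illustration only.
POOL_FIRST = "192.168.6.10"
POOL_LAST = "192.168.6.49"
RESERVATIONS = {  # reserved IP -> MAC it is reserved for
    "192.168.6.100": "aa:bb:cc:00:00:01",
    "192.168.6.110": "aa:bb:cc:00:00:02",
}


def classify(address, pool_first=POOL_FIRST, pool_last=POOL_LAST,
             reservations=RESERVATIONS):
    """Return 'reserved', 'pool', or 'neither' for one lease address."""
    ip = ipaddress.ip_address(address)
    if str(ip) in reservations:
        return "reserved"
    if ipaddress.ip_address(pool_first) <= ip <= ipaddress.ip_address(pool_last):
        return "pool"
    return "neither"


def summarize(addresses, **kwargs):
    """Count how many of the given lease addresses fall into each category."""
    return Counter(classify(address, **kwargs) for address in addresses)
```

Feeding it the addresses of the leases that have no MAC and no hostname gives a quick count of how many reservations are currently blocked, as opposed to ordinary pool addresses.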

One of these devices is a Proxmox server.  It is set to boot using the static IP address I assigned this reservation to, but I can't get to it on my network.  This tells me this problem is pretty recent (around the time I upgraded to 26.1.4) as I was using this Proxmox server just a few days ago.

If whatever is going on here can overlap with any IP address on my network and make it unusable, how do I protect this from happening?

If I have not heard any relevant theories on this issue, I will try restoring my 26.1.3 OpnSense VM this evening.  I make sure to back up my OpnSense VM prior to any upgrades.

Thank you.
#13
Ok, interesting side-note, thank you.  I would still want reservations as the actual address given to specific hosts on my network is important to me (i.e. I know my printer is .110, my PC is .100, etc...).  I don't just want addresses consistent, I want them consistently a specific value.  While this information is interesting, thank you, it doesn't appear to be relevant to my issue.

Kea has doled out 39 IP addresses with a lifetime value that doesn't appear to exist in my configuration, and there is no MAC address or hostname associated with these entries.  The entirety of the information in the DHCP Leases table is the interface they are on, the IP address given, blank MAC address, 86400 second lifetime (seemingly not configured anywhere I can find), expire time (tomorrow), blank hostname, and the fact they are dynamic leases.

I can't find a way to debug why these leases were given out, nor can I find a way to flush these entries.  The only option OpnSense appears to give me is the option to make these entries static.  I'm not even sure what that means though.  With "Match client-id" disabled on this subnet, the entry should be a link between a MAC address and an IP address.  Without a MAC address, how can I make this a reservation?

Thank you for the help.
#14
Technically I have VLANs on my network, but in this case, OpnSense is running on Proxmox, and Proxmox is configured with VLANs and exposes the interfaces to OpnSense.  I also don't understand why KeaDHCP doling out MAC-less IP addresses would have anything to do with VLANs.  Everything with a Static reservation is working fine.

I also just rebooted, hoping that would flush the existing DHCP entries, but it did not.  There are still 39 doled-out IP addresses without a MAC associated with them (even though "Match client-id" is disabled).

Maybe I can ask this another way.  If my "valid-lifetime" setting for the KeaDHCP server is 4000 (the default I believe), what does it mean when there is an entry with a lifetime of 86400?

Is there a way to flush the current DHCP cache?  I've tried stopping and starting KeaDHCP, I have tried rebooting OpnSense but the entries remain.  They are supposed to expire tomorrow.  Is my only option really to wait?

I also checked my secondary VLAN which I have all my cloud-connected devices on (thermostats, smart light-bulbs, etc...).  It has a pool configured as well but there are no DHCP entries other than ones with a lifetime of 4000.

How can I better debug this?  Just hearing "it works for me" isn't very helpful.  Can I attach some sort of config here to be analyzed?  This used to work just fine.  I'm not sure which upgrade caused the problem.  Given 86400 seconds is 24 hours, I suppose these could have been doled out multiple times.  I did recently upgrade to 26.1.4 but this doesn't necessarily mean that was the issue.  As 95+% of my devices have Reservations, I may not have noticed this for a while.

This also all "just worked" in pfSense.  If I can't get DHCP to work reliably in OpnSense, there really isn't a reason to use it.
#15
I appreciate the suggestion but I already have that turned off.