dhcpd consuming the entire address pool for ~15 clients

Started by jeremfg, July 22, 2021, 05:02:16 PM

I have a weird issue on my LAN that I've been unable to explain thus far.

I have about 15 clients and a pool of 205 addresses (192.168.1.50 to 192.168.1.254). Yet I quickly run out of available addresses, because they all get assigned to 3-4 clients that are apparently competing to grab them as quickly as possible.
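As a sanity check on the pool size, here is a quick sketch that counts the addresses in an inclusive range. `pool_size` is just a made-up helper name for illustration; counting both endpoints, 192.168.1.50 to 192.168.1.254 works out to 205 addresses.

```python
import ipaddress


def pool_size(first, last):
    """Return the number of addresses in the inclusive range first..last."""
    start = int(ipaddress.IPv4Address(first))
    end = int(ipaddress.IPv4Address(last))
    if end < start:
        raise ValueError("range end is below range start")
    # Both endpoints are usable by the pool, hence the +1.
    return end - start + 1


print(pool_size("192.168.1.50", "192.168.1.254"))
```

Either way, the pool is more than ten times larger than the number of clients, so it should never come close to running out.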

My logs are filled with the following:

2021-07-22T10:22:38 dhcpd[56624] Abandoning IP address 192.168.1.157: pinged before offer
2021-07-22T10:22:38 dhcpd[56624] ICMP Echo reply while lease 192.168.1.157 valid.
2021-07-22T10:22:38 dhcpd[56624] DHCPDISCOVER from aa:47:7f:51:95:c4 via bridge0
2021-07-22T10:22:38 dhcpd[56624] Reclaiming abandoned lease 192.168.1.157.
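
To get a feel for how widespread this is, a small script can tally the log by abandoned address and by requesting MAC. This is only a rough sketch: the regular expressions are my own guess based on the four lines above, `summarize` is a made-up helper name, and the sample list contains nothing but those four lines.

```python
import re
from collections import Counter

# Patterns derived from the log excerpt above (assumed, not an official format).
ABANDON_RE = re.compile(r"Abandoning IP address (\d+\.\d+\.\d+\.\d+)")
DISCOVER_RE = re.compile(r"DHCPDISCOVER from ([0-9a-f:]{17}) via (\S+)", re.IGNORECASE)


def summarize(lines):
    """Count abandoned addresses and DHCPDISCOVERs per client MAC."""
    abandoned = Counter()
    discovers = Counter()
    for line in lines:
        match = ABANDON_RE.search(line)
        if match:
            abandoned[match.group(1)] += 1
            continue
        match = DISCOVER_RE.search(line)
        if match:
            discovers[match.group(1).lower()] += 1
    return abandoned, discovers


SAMPLE = [
    "2021-07-22T10:22:38 dhcpd[56624] Abandoning IP address 192.168.1.157: pinged before offer",
    "2021-07-22T10:22:38 dhcpd[56624] ICMP Echo reply while lease 192.168.1.157 valid.",
    "2021-07-22T10:22:38 dhcpd[56624] DHCPDISCOVER from aa:47:7f:51:95:c4 via bridge0",
    "2021-07-22T10:22:38 dhcpd[56624] Reclaiming abandoned lease 192.168.1.157.",
]

abandoned, discovers = summarize(SAMPLE)
print("abandoned:", dict(abandoned))
print("discovers:", dict(discovers))
```

Fed with the full log instead of `SAMPLE`, a handful of MACs with very high DISCOVER counts would point at the 3-4 clients doing the grabbing.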


So when dhcpd tries to reclaim these addresses, it finds that they are all still configured somewhere. I can ping them manually too, which confirms they are in use, as the reclaiming process concludes. And sure enough, on one of the Linux clients I see the following:


...
    inet 192.168.1.78/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 771sec preferred_lft 771sec
    inet 192.168.1.133/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 782sec preferred_lft 782sec
    inet 192.168.1.250/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 793sec preferred_lft 793sec
    inet 192.168.1.218/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 804sec preferred_lft 804sec
    inet 192.168.1.220/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 815sec preferred_lft 815sec
    inet 192.168.1.219/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 826sec preferred_lft 826sec
...
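
To count how many leased addresses a single client is holding, the output above can be parsed with a few lines of Python. Again just a sketch under assumptions: `dynamic_addresses` is a hypothetical helper, it expects the text format shown above (an `inet` line with the `dynamic` flag and the interface name as the last field), and the sample is the excerpt from this post.

```python
def dynamic_addresses(ip_output):
    """Return (address, interface) pairs for every dynamic IPv4 address."""
    found = []
    for line in ip_output.splitlines():
        tokens = line.split()
        # Only 'inet' lines carry an IPv4 address; 'dynamic' marks a leased one.
        if len(tokens) >= 3 and tokens[0] == "inet" and "dynamic" in tokens:
            address = tokens[1].split("/")[0]  # drop the /24 prefix length
            found.append((address, tokens[-1]))
    return found


SAMPLE_OUTPUT = """\
    inet 192.168.1.78/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 771sec preferred_lft 771sec
    inet 192.168.1.133/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 782sec preferred_lft 782sec
    inet 192.168.1.250/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 793sec preferred_lft 793sec
    inet 192.168.1.218/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 804sec preferred_lft 804sec
    inet 192.168.1.220/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 815sec preferred_lft 815sec
    inet 192.168.1.219/24 brd 192.168.1.255 scope global secondary dynamic eth0
       valid_lft 826sec preferred_lft 826sec
"""

addresses = dynamic_addresses(SAMPLE_OUTPUT)
print(len(addresses), "dynamic addresses on", {iface for _, iface in addresses})
```

A healthy client should report exactly one dynamic address per interface, so anything above that is a client holding several leases at once.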


Using Wireshark to view the packet capture made by OPNsense on the LAN interface, I see these clients send multiple DHCPDISCOVER messages, but I don't see the ICMP echo requests/replies that the reclaiming process is supposed to generate, nor any DHCP messages other than the DHCPDISCOVERs. This matches the behavior the clients eventually exhibit: they are unable to obtain a lease because I'm out of available IPs.

OPNsense reflects this: I see every address from the pool assigned, although it correctly marks only the ~15 leases I'm expecting as active.

My only suspect at this point is the bridge I use for my LAN interface. It bridges a virtual XCP-ng interface shared between all my VMs (one of which runs OPNsense) and a physical one (passed through via PCI directly to OPNsense) that connects to a switch serving the rest of the external physical clients. I won't be able to confirm this quickly, however, as I have to wait to gain access to the physical server so I can remove the bridge and use two physical NICs instead (one for the OPNsense LAN, the other for the VMs, invisible to OPNsense).
But even if my suspicion is right, I can't explain why dhcpd would assign so many different addresses to the same client(s), especially considering those clients do provide their MAC address in the DHCP options to identify themselves.