KeaDHCP HA Warnings

Started by tofuSCHNITZEL, March 13, 2025, 11:43:01 PM

March 13, 2025, 11:43:01 PM Last Edit: March 19, 2025, 04:01:44 PM by tofuSCHNITZEL
Hi,
I have an OPNsense HA setup with Kea in HA mode serving DHCP on around 150 interfaces (VLANs).
The Kea agents communicate over a dedicated, direct 10G cable that runs between the two hypervisors hosting the two OPNsense instances (the same link is also used for pfsync).

For the past few days we have been getting frequent warnings in the Kea log on the primary with the following content:

WARN [kea-dhcp4.ha-hooks.0x1c8631017b00] HA_LEASE_UPDATE_CONFLICT SensePrimary: lease update [hwtype=1 00:1d:c1:0b:91:2a], cid=[01:00:1d:c1:0b:91:2a], tid=0x34523322 sent to SenseSecondary (http://192.168.105.3:8001) returned conflict status code: ResourceBusy: IP address:172.17.90.229 could not be updated. (error code 4)
On the secondary this is logged:

WARN [kea-dhcp4.lease-cmds-hooks.0x34bf72de2000] LEASE_CMDS_UPDATE4_CONFLICT lease4-update command failed due to conflict (parameters: { "client-id": "01:00:1d:c1:0b:91:2a", "expire": 1742341137, "force-create": true, "fqdn-fwd": false, "fqdn-rev": false, "hostname": "axcdante-0b912a", "hw-address": "00:1d:c1:0b:91:2a", "ip-address": "172.17.90.229", "origin": "ha-partner", "state": 0, "subnet-id": 84, "valid-lft": 1800 }, reason: ResourceBusy: IP address:172.17.90.229 could not be updated.)

These warnings appear roughly every 5-10 minutes. I have already restarted Kea on both machines multiple times. If I call http://192.168.105.3:8001 (the agent listening on the secondary) via curl, I get an answer immediately, so I don't know why the "ResourceBusy" error would occur...
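
A plain curl only shows that the agent answers HTTP. To see what the HA hook itself reports, one option would be to send it an `ha-heartbeat` command, roughly as in the sketch below. This is only a sketch under assumptions: the URL is the secondary's agent from the log above, the helper names `build_command` and `send_command` are made up for illustration, and it assumes the agent accepts unauthenticated JSON POSTs (if basic auth is configured, a header would have to be added).

```python
import json
import urllib.request

# Secondary's agent URL as it appears in the log above.
AGENT_URL = "http://192.168.105.3:8001"


def build_command(name, service="dhcp4"):
    """Build the JSON command envelope the Kea agent forwards to a service."""
    return {"command": name, "service": [service]}


def send_command(url, command, timeout=5.0):
    """POST a command to the agent and return the decoded JSON reply.

    Assumption: no authentication is required on the agent.
    """
    body = json.dumps(command).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send_command(AGENT_URL, build_command("ha-heartbeat"))` from the primary should return the partner's reply, which would at least show whether the secondary considers itself in a normal HA state while the conflicts are being logged.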

Currently I have version OPNsense 25.1.3-amd64 installed.

And now Kea has even terminated, probably because of too many ResourceBusy failures? (see attached screenshot)
I deleted the dhcp4 lease CSV file on both nodes and restarted the service. When tailing the files on both nodes, I can see new IP leases being added and immediately appearing on the other node as well, so the communication is clearly working. I am still getting the ResourceBusy warnings in the log, though...
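
To narrow this down, it might help to check whether the warnings are spread over all clients or concentrated on a few devices. A minimal sketch for tallying them per client is below. It assumes the log lines look exactly like the HA_LEASE_UPDATE_CONFLICT line quoted above; the function name `tally_conflicts` is hypothetical, and where the log file lives is left to the reader.

```python
import re
from collections import Counter

# Matches the primary's warning format quoted above and captures
# the client's MAC address and the IP address that could not be updated.
CONFLICT_RE = re.compile(
    r"HA_LEASE_UPDATE_CONFLICT .*?hwtype=\d+ (?P<mac>[0-9a-f:]+)\]"
    r".*?IP address:(?P<ip>[0-9.]+)"
)


def tally_conflicts(lines):
    """Count conflict warnings per (MAC, IP) pair in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = CONFLICT_RE.search(line)
        if match:
            counts[(match.group("mac"), match.group("ip"))] += 1
    return counts
```

Feeding it the lines of the Kea log (for example via `tally_conflicts(open(path))` with whatever path the log has on the system) and printing `counts.most_common(10)` would show whether a handful of clients account for most of the conflicts.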

Any ideas?