24.1 Legacy Series / Re: DHCP relay stops working in 24.1.6
« on: May 19, 2024, 09:42:43 pm »
Hello,
I have had the same issue since 24.1.6 (I just upgraded to 24.1.7).
Everything works for a while, then one of the dhcp_relay processes goes to 100% CPU usage, which blocks the whole DHCP service.
My setup:
Edge sites (x2):
- ESXi 8
- OPNsense VM as main gateway
- OPNsense VM as "helper" with DHCP relay for multiple VLANs
- Multiple VLANs
- UniFi switches
- Windows Server VM with DHCP server (as standby)
Central site:
- ESXi 8
- OPNsense VM as main gateway
- OPNsense VM as "helper" with DHCP relay for multiple VLANs
- Multiple VLANs
- UniFi switches
- Windows Server VM with DHCP server (as standby)
Site-to-site WireGuard VPN
No DHCP guarding whatsoever on the UniFi side.
The OPNsense VMs (router and helper) all have an interface in each VLAN.
On the edge sites, the relay targets are both the local and the central Windows DHCP servers.
This setup worked flawlessly for months (if not years) before 24.1.6.
My Windows DHCP servers also serve the very VLAN where my OPNsense VMs and my Windows servers have their management interfaces.
I tried deactivating the DHCP relay for this management VLAN as per https://forum.opnsense.org/index.php?topic=40126.0, thinking it would solve the problem (I could live with that workaround, even if it is not ideal and is a step back from how things worked before 24.1.6).
But the issue still occurs now and then, and I have to restart the DHCP relay for some of the other VLANs to bring CPU usage back down to normal.
My latest workaround is a daily reboot of the helper VM. Definitely not a bulletproof approach.
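To catch the problem when it happens instead of rebooting on a schedule, something like the sketch below could be run periodically on the helper VM. It only reports relay processes that are pinned at high CPU; it does not restart anything. This is a rough sketch, not a tested fix: the `find_busy` helper, the 90% threshold, and matching on the string `dhcp_relay` are my own assumptions, so adjust the name to whatever `ps` actually shows on your box.

```python
import subprocess


def find_busy(ps_output: str, name: str, threshold: float) -> list[int]:
    """Return PIDs whose command contains `name` and whose %CPU >= threshold.

    Expects the output of `ps -axo pid,pcpu,comm` (header row first).
    """
    busy = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid, pcpu, comm = parts
        try:
            cpu = float(pcpu)
        except ValueError:
            continue  # not a numeric CPU column, ignore the line
        if name in comm and cpu >= threshold:
            busy.append(int(pid))
    return busy


if __name__ == "__main__":
    # "dhcp_relay" and 90.0 are assumptions; change them to match your system.
    out = subprocess.run(
        ["ps", "-axo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    ).stdout
    for pid in find_busy(out, "dhcp_relay", 90.0):
        print(f"relay process {pid} is above the CPU threshold")
```

The PIDs it prints at least tell you which relay instance (and so which VLAN) is stuck, which might also be useful detail for the bug report.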
1. Do you know whether the development team is aware of the situation and working on it?
2. Do you know where I could find useful logs for the new DHCP relay service?
I've been very happy with the tremendous work on OPNsense.
It's the first time in years that I have encountered such a blocking issue after an upgrade.
Thank you in advance.
PS: I submitted a bug report: https://github.com/opnsense/core/issues/7471