DHCP Relay Status Colors

Started by ajeffco, September 17, 2024, 06:59:09 PM

Hello All,

Do the status colors on the DHCP Relay -> Configuration -> Relays page have a meaning?  Out of 6 defined relays, 2 are green, 4 are red.  All are working.

Dual Virtual OPNsense on PVE with HA via CARP
Node 1: OPNsense 24.7.3_1 - Protectli Vault FW6E (i7)
Node 2: OPNsense 24.7.3_1 - Qotom-Q555G6-S05 (i5)

Red means the dhcrelay service is not running for that relay.


Cheers,
Franco

Thanks Franco!  I just noticed that the "new" services representing the new relays are also red on the dashboard; I hadn't noticed that before.

How can I enable logging for the DHCP relay service to help troubleshoot why DHCP relay is working on some interfaces and not others?  In the General log I see /usr/local/sbin/pluginctl: plugins_configure dhcrelay (execute task : dhcrelay_configure_do(1,))


September 17, 2024, 09:34:09 PM #3 Last Edit: September 17, 2024, 09:50:09 PM by ajeffco
In spite of the service showing red on the GUI, DHCP relay is working.

I have a test VM that I've been using to verify that this change works on all VLANs.  The test VM gets an IP address on every defined relay as I move it between the VLANs.  The Kea logs also show that, for that test VM, DHCP traffic is coming from the OPNsense server to the Kea server.


Sep 17 15:23:38 infra-01 kea-dhcp4[18584]: INFO  DHCP4_QUERY_LABEL received query: [hwtype=1 bc:24:11:0b:2c:8a], cid=[ff:ca:53:09:5a:00:02:00:00:ab:11:fe:32:c4:10:63:9d:d9:b2], tid=0x8904150a
Sep 17 15:23:38 infra-01 kea-dhcp4[18584]: INFO  DHCP4_PACKET_RECEIVED [hwtype=1 bc:24:11:0b:2c:8a], cid=[ff:ca:53:09:5a:00:02:00:00:ab:11:fe:32:c4:10:63:9d:d9:b2], tid=0x8904150a: DHCPREQUEST (type 3) received from 10.10.5.2 to 10.10.2.4 on interface ens18
Sep 17 15:23:38 infra-01 kea-dhcp4[18584]: INFO  DHCP4_INIT_REBOOT [hwtype=1 bc:24:11:0b:2c:8a], cid=[ff:ca:53:09:5a:00:02:00:00:ab:11:fe:32:c4:10:63:9d:d9:b2], tid=0x8904150a: client is in INIT-REBOOT state and requests address 10.10.8.102
Sep 17 15:23:38 infra-01 kea-dhcp4[18584]: INFO  DHCP4_PACKET_SEND [hwtype=1 bc:24:11:0b:2c:8a], cid=[ff:ca:53:09:5a:00:02:00:00:ab:11:fe:32:c4:10:63:9d:d9:b2], tid=0x8904150a: trying to send packet DHCPNAK (type 6) from 10.10.2.4:67 to 10.10.5.2:67 on interface ens18



EDIT: The log above is for the VM attached to the User relay, which is showing red in the GUI.
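
To see which relay addresses the Kea server is actually hearing from, the DHCP4_PACKET_RECEIVED lines can be filtered for their "received from" address. This is only a rough sketch: relay_sources is a hypothetical helper name, and it assumes the log lines have the same shape as the excerpt above.

```shell
#!/bin/sh
# Print the unique source addresses found in DHCP4_PACKET_RECEIVED lines.
# Reads Kea log text on stdin. The pattern assumes the line format shown
# in the log excerpt above ("... received from A to B on interface ...").
relay_sources() {
    grep 'DHCP4_PACKET_RECEIVED' |
        sed -n 's/.* received from \([0-9.][0-9.]*\) to .*/\1/p' |
        sort -u
}

# Example use (the log path is a placeholder, adjust to your Kea setup):
# relay_sources < /path/to/kea-dhcp4.log
```

Each address printed should correspond to one relay interface on the firewall, so a relay that shows red in the GUI but still appears here is forwarding traffic.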


Rebooting the OPNsense node brought all DHCP relays to green status.

They all point to the same servers, maybe one service takes over for the others for one reason or another.

Normally you'd need one per interface, but maybe the VM bridge does something interesting.

When checking for dhcrelay services, it should show exactly as many as there are green dots. If it doesn't, and the VM bridge isn't playing tricks, there is perhaps a race condition with the PID file.
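
One way to look for a PID file problem like this is to compare a PID file against the process table. A minimal sketch follows; pidfile_state is a hypothetical helper name, and the path in the usage comment is a placeholder, not a confirmed OPNsense location.

```shell
#!/bin/sh
# Report whether the PID recorded in a PID file refers to a live process.
# Prints "missing" (no file or empty file), "alive", or "stale".
# Run as root: kill -0 also fails for processes owned by other users.
pidfile_state() {
    file=$1
    if [ ! -s "$file" ]; then
        echo "missing"
        return
    fi
    pid=$(head -n 1 "$file" | tr -d '[:space:]')
    if kill -0 "$pid" 2>/dev/null; then
        echo "alive"
    else
        echo "stale"
    fi
}

# Example use (placeholder path):
# pidfile_state /path/to/dhcrelay.pid
```

A "stale" or "missing" result while the matching dhcrelay process is still running would be consistent with the PID file having been clobbered.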


Cheers,
Franco

Franco, thanks for the replies, they are very much appreciated!!!

There is one relay per gateway interface on the VLANs I want to relay for.

I did find an unrelated problem during testing: my standby Kea server had a very old global name server option that I'd neglected to update during previous changes. I doubt it's related.

Have a great day!

Al




One last observation from some testing with CARP/HA.

I've added DHCRelay to the XMLRPC Sync under HA Settings and performed a sync.

When the CARP interfaces fail over to the standby node (2), the DHCRelay services stay running on node 1 and never become active on node 2.

I'm not sure this is expected behavior.  My assumption was that, since there's an option for it under HA Sync, it would work with HA.



Hmm, both should be active at all times. It's not CARP aware and never has been from what I can tell.


Cheers,
Franco

I'll do some more testing.  On the CLI I did see 2 processes for each defined relay on the first node. I don't recall checking the status on the second node, only that the dashboard showed them as not running.

Hello Franco,

I tested this further.  DHCP Relay works as you described.  The only "oddity" is that when modifying (enabling/disabling) a relay, it very intermittently comes up red, yet DHCP relay still works.  A reboot fixes that.  When I tried stopping and starting again, the status showed green 9 out of 10 times.

I then did an HA Sync. Nothing was green on the standby dashboard, and no matter how I enabled or disabled the relays, the status would not change. A reboot of the standby node fixed it: the relays came up green and stayed that way even after enable/disable.

Thanks for the feedback and clarification on how DHCP Relay works in HA/CARP.

Have a good one!

Al

Hi Al,

Ok, one thing to note: if everything is green after booting, then one of two things could be happening when it goes red:

1. the daemon stops but cannot start again (something in the dhcrelay log perhaps)
2. the daemon does start but a race condition ends up clobbering the PID file

In both cases you could check with

# pgrep dhcrelay
# pgrep dhcrelay6

This shows whether you have all the PIDs you would expect (case 2) or some are missing (case 1). In case 1 it would be interesting to know why those relays react differently. In case 2 we perhaps need to improve the scripting.
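
The pgrep check above can be wrapped in a small script that compares the running count with the count you expect. This is only a sketch: count_procs and classify are hypothetical helper names, and the expected number is something you supply yourself (earlier in this thread, 2 processes per green relay were observed).

```shell
#!/bin/sh
# Count running processes whose name is exactly $1 (prints 0 if none).
count_procs() {
    pgrep -x "$1" 2>/dev/null | wc -l | tr -d ' '
}

# Compare an observed process count with an expected one and name the
# likely case from the post above.
#   $1 = expected count (an assumption you supply), $2 = observed count
classify() {
    expected=$1
    actual=$2
    if [ "$actual" -lt "$expected" ]; then
        echo "case 1: processes missing ($actual of $expected running)"
    else
        echo "case 2: processes present ($actual of $expected); suspect PID file"
    fi
}

# Example use while a relay shows red (12 is a placeholder):
# classify 12 "$(count_procs dhcrelay)"
```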


Cheers,
Franco

Hi Franco,

I can have CheckMK look for the processes to ensure all are running.
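
A rough sketch of what such a check could look like as a CheckMK local check, which (as far as I understand the format) prints one line of status, service name, metrics and text. The service name DHCP_Relay_Procs, the helper local_check_line and the expected count are all assumptions for illustration.

```shell
#!/bin/sh
# Sketch of a CheckMK local check line for dhcrelay process counts.
# SERVICE and the expected count are illustrative assumptions.
SERVICE="DHCP_Relay_Procs"

# $1 = expected process count, $2 = observed process count
local_check_line() {
    expected=$1
    actual=$2
    if [ "$actual" -ge "$expected" ]; then
        state=0; label="OK"
    else
        state=2; label="CRIT"
    fi
    echo "$state $SERVICE count=$actual $label - $actual of $expected dhcrelay processes running"
}

# Example use on the firewall (12 is a placeholder for your expected count):
# local_check_line 12 "$(pgrep -x dhcrelay | wc -l | tr -d ' ')"
```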

I see the log in the GUI but it's always blank and I do not know how to have it start logging.  Is there a log in the CLI I can look at?

I did see 2 processes per relay for anything green.  I didn't check (my bad) for the red ones.

I'll poke it some more and see how it goes.

Thanks!
Al

> I see the log in the GUI but it's always blank and I do not know how to have it start logging.  Is there a log in the CLI I can look at?

In that case it's probably alright. It does not log a lot.


Cheers,
Franco

Hi Franco,

I poked these 2 nodes a fair amount over the weekend, trying to get the issue to occur again, and have been unable to do so.  The screenshot I posted was taken immediately after first setup, before rebooting both nodes.  Since the reboot, I've been unable to get the defined relays, or new relays, to turn red.  And even when they were red, DHCP relay appeared to work.

So maybe the trick is to reboot the node after first setup.

Thanks again and best of luck!

Al