PPPOE hijack by slave HA node - seems to be still unresolved.

Started by tomk_1313, November 29, 2024, 02:08:56 AM

Hi Chaps, I'm kinda at the end of my rope with this one. The problem is relatively academic:
Set up two instances of OPNsense (let's call them opnsense1 & opnsense2).
Set up both in pretty much identical form, using PPPoE as the WAN interface, as vanilla as possible - not even an MSS clamp!
Set both up as per the HA guide - everything works OK, they can synchronize their configs, all is great in router land.
Reboot one machine, and everything fails over to the second one just perfectly.

The problem is that if the PPPoE disconnects for ANY reason on the node marked as "master carp", the "slave carp" will gladly jump in and establish the PPPoE connection. The catch is that the virtual IP did change between opnsense1 & opnsense2 ... I know that a WAN failure shall not cause a CARP failover, but why does selecting "Disconnect dialup interfaces" in the HA settings mean absolutely nothing, so that the slave can still try to use PPPoE?!

I know that I'm pretty dumb, so please let me know if there is some magical setting or secret handshake that I'm not privy to - because PPPoE disconnects happen pretty regularly on my FTTP.


OK, some details:

Both routers are VMs.

Both have 4 identical interfaces, 3 physical and 1 virtual:
first interface is connected to the switch connected to the ONT
second interface is for pfsync
third interface is for LAN
fourth interface - this one is for future fun stuff with VPN etc.

WAN is set up as IPv4 PPPoE (which leaves vtnet0 free in the assignments window) and DHCPv6 (Request the IPv6 information through the IPv4 PPP connectivity link).

On both routers there are two gateways, one for PPPoE and one for IPv6, and on the CARP slave the PPPoE gateway shows as "defunct".

If the PPPoE connection disconnects on the CARP master, PPPoE rapidly connects on the slave and shows a valid IP address in the dashboard, WHILE the gateway still shows as "defunct" - beats me, but that might be a bug?! Dunno, but the PPPoE daemon might've been accidentally left running even though it's supposed to be "defunct"?
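
In case anyone wants to check for the same symptom on their own pair, here is a rough Python sketch of the test I have in mind: a node that is not CARP MASTER for any VHID but still has an IPv4 address on its PPPoE interface. It is not part of OPNsense. The function names, the interface names (vtnet2, pppoe0) and the sample ifconfig text are made up for illustration, and I'm assuming the usual FreeBSD "carp: STATE vhid N" status line.

```python
import re

# Assumed format of the CARP status line in FreeBSD ifconfig output, e.g.
# "carp: BACKUP vhid 1 advbase 1 advskew 100"
CARP_LINE = re.compile(r"^\s*carp:\s+(MASTER|BACKUP|INIT)\s+vhid\s+(\d+)", re.MULTILINE)
# An IPv4 address line such as "inet 203.0.113.10 --> 203.0.113.1 ..."
INET_LINE = re.compile(r"^\s*inet\s+\d+\.\d+\.\d+\.\d+", re.MULTILINE)


def carp_states(ifconfig_output: str) -> dict[int, str]:
    """Return {vhid: state} for every CARP status line in the given text."""
    return {int(vhid): state for state, vhid in CARP_LINE.findall(ifconfig_output)}


def pppoe_is_hijacked(carp_ifconfig: str, pppoe_ifconfig: str) -> bool:
    """True if this node holds no MASTER VHID but its PPPoE link has an IPv4 address."""
    states = carp_states(carp_ifconfig)
    is_master = any(state == "MASTER" for state in states.values())
    pppoe_has_ip = INET_LINE.search(pppoe_ifconfig) is not None
    return bool(states) and not is_master and pppoe_has_ip


# Illustrative (made-up) ifconfig output for a backup node with PPPoE connected.
SAMPLE_CARP_BACKUP = (
    "vtnet2: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> metric 0 mtu 1500\n"
    "\tinet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255\n"
    "\tcarp: BACKUP vhid 1 advbase 1 advskew 100\n"
)
SAMPLE_PPPOE_UP = (
    "pppoe0: flags=88d1<UP,POINTOPOINT,RUNNING,NOARP,SIMPLEX,MULTICAST> metric 0 mtu 1492\n"
    "\tinet 203.0.113.10 --> 203.0.113.1 netmask 0xffffffff\n"
)

if __name__ == "__main__":
    print(pppoe_is_hijacked(SAMPLE_CARP_BACKUP, SAMPLE_PPPOE_UP))
```

On a real node the two strings would come from running ifconfig against the CARP-carrying interface and the PPPoE interface. The check only reports the mismatch, it does not touch the connection.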