PPPoE hijack by slave HA node - seems to be still unresolved.

Started by tomk_1313, November 29, 2024, 02:08:56 AM

Hi Chaps, I'm kinda at the end of my rope with this one. The problem is relatively academic:
Set up two instances of OPNsense (let's call them opnsense1 & opnsense2).
Set both up in pretty much identical fashion, using PPPoE as the WAN interface, as vanilla as possible - not even an MSS clamp!
Set both up as per the HA guide - everything works OK, they can synchronise their configs, all is great in router land.
Reboot one machine, and everything fails over to the second one just perfectly.

TWO problems that I've discovered:
1.
IF the PPPoE connection on the node that is marked as CARP "master" disconnects for ANY reason, the node that is CARP "slave" will gladly jump in and establish the PPPoE connection itself.
So you end up in a situation where the PPPoE connection has shifted to the other node, while the virtual IP did NOT move between opnsense1 & opnsense2 ... I know that a WAN failure is not supposed to cause a CARP failover, but why does selecting "Disconnect dialup interfaces" in the HA settings do absolutely nothing, so that the slave can still try to use PPPoE ?!?!
2.
If you reboot opnsense1, everything nicely fails over to opnsense2. After the reboot of opnsense1 has finished, opnsense2 will let go of CARP, but it will hold on to the PPPoE connection. The end state is that opnsense1 is CARP master, while opnsense2 has the PPPoE link.
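
To make the two broken states above concrete, here is a minimal Python sketch of the consistency rule I would expect to hold on each node: PPPoE should be up if and only if the node is CARP master. This is only an illustration under assumptions, not how OPNsense implements anything. The function names are hypothetical, the sample text is made up, and it assumes the FreeBSD-style `ifconfig` output where a CARP VIP shows up as a line like `carp: MASTER vhid 1 ...`.

```python
import re

def carp_states(ifconfig_output: str) -> list[str]:
    """Return every CARP state (e.g. MASTER, BACKUP) found in ifconfig-style text."""
    # Assumed line format: "carp: MASTER vhid 1 advbase 1 advskew 0"
    return re.findall(r"^\s*carp:\s+(\w+)\s+vhid\s+\d+",
                      ifconfig_output, flags=re.MULTILINE)

def pppoe_should_be_up(ifconfig_output: str) -> bool:
    """Expected rule: only a node that is MASTER on all its VHIDs dials PPPoE."""
    states = carp_states(ifconfig_output)
    return bool(states) and all(state == "MASTER" for state in states)

def is_inconsistent(ifconfig_output: str, pppoe_is_up: bool) -> bool:
    """True when the actual PPPoE state contradicts the CARP role."""
    return pppoe_is_up != pppoe_should_be_up(ifconfig_output)

# Made-up sample resembling what the slave node might report.
SAMPLE_BACKUP = (
    "\tinet 192.168.1.3 netmask 0xffffff00 broadcast 192.168.1.255\n"
    "\tcarp: BACKUP vhid 1 advbase 1 advskew 100\n"
)
SAMPLE_MASTER = SAMPLE_BACKUP.replace("BACKUP", "MASTER")

if __name__ == "__main__":
    # Problem 1 and 2 from the slave's side: BACKUP, yet PPPoE is connected.
    print(is_inconsistent(SAMPLE_BACKUP, pppoe_is_up=True))
    # Problem 2 from the master's side: MASTER, yet PPPoE is not connected.
    print(is_inconsistent(SAMPLE_MASTER, pppoe_is_up=False))
```

Both cases I described come out as inconsistent under this rule, which is exactly the state I keep ending up in.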


I know that I'm pretty dumb, so please let me know if there is some magical setting or secret handshake that I'm not privy to - because PPPoE disconnects happen pretty regularly on my FTTP.


OK, some details about the OPNsense instances:

Both routers are VMs.

Both have 4 identical interfaces, 3 physical and 1 virtual.
The first interface is connected to the switch that is connected to the ONT.
The second interface is for pfsync.
The third interface is for LAN.
The fourth interface is for future fun stuff with VPN etc.

WAN is set up as IPv4 PPPoE (which leaves vtnet0 free in the assignments window) and DHCPv6 (Request the IPv6 information through the IPv4 PPP connectivity link).

On both routers there are two gateways, one for PPPoE and one for IPv6, and on the CARP slave the PPPoE gateway shows as "defunct".

IF the PPPoE connection disconnects on the CARP master, PPPoE rapidly connects on the slave and shows a valid IP address in the dashboard, WHILE the gateway there still shows as "defunct" - beats me, but that might be a bug ?! Dunno, but the PPPoE daemon might've been accidentally left running even though it's supposed to be "defunct" ?